Dataset repository


FRACS (FAIR Repository for Annotations, Corpora and Schemas) is a repository of annotations that facilitates the creation, storage, retrieval, manipulation and distribution of annotation sets on corpora of documents. FRACS conforms to the FAIR principles, allowing the discovery, accessibility, interoperability and reuse of your research data.

About the FAIR principles

In the context of Internet accessibility, Big Data, Open Science and, more broadly, data sharing and dissemination, the notion of FAIR data covers the ways in which data can be constructed, stored, presented or published. (Source: Wikipedia)


Findable: Data and supplementary materials have sufficiently rich metadata and a unique and persistent identifier.


Accessible: Metadata and data are understandable to humans and machines. Data is deposited in a trusted repository.


Interoperable: Metadata use a formal, accessible, shared, and broadly applicable language for knowledge representation.


Reusable: Data and collections have clear usage licenses and provide accurate information on provenance.

The FRACS Project

FRACS is a multi-faceted project.

Connect FRACS to your research application

Generated data will be stored in a structured manner and be quickly accessible.

Use FRACS as a storage engine for your platform

An easy-to-use REST API allows for seamless interoperability.

Generate distributions of your datasets

Build a public or private archive of a version of your datasets to share it easily with other teams.

Publish your dataset in the web catalog

Enrich your data with metadata to make it easier to index and discover.


FRACS is offered as interoperable modules ready to be deployed and instantiated individually or as a whole. It is part of the Canadian National Data Services Framework initiative. The RACS storage module was developed by CRIM as part of the Adnotare project, to store the data generated by the PACTE text annotation platform. As part of the FRACS project, RACS will be enhanced to meet the needs of other research disciplines and complemented by a rights management module (UsAc), a binary archive storage module (MSS) and a metadata catalogue (Catá).


RACS: Storage module based on Elasticsearch and offering a REST API for the management of corpora, documents, schemas and annotations. The API has been carefully designed to provide a quality developer experience.
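To give a rough idea of what talking to such a storage API from a research application might look like, here is a minimal sketch that builds (without sending) a request to store one annotation. The base URL, the route layout, the JSON field names and the helper `build_create_annotation_request` are all hypothetical and chosen for illustration only; the actual routes and payloads of the storage module may differ.

```python
import json
from urllib.request import Request

# Hypothetical base URL; replace with the address of a real deployment.
BASE_URL = "https://fracs.example.org/api"


def build_create_annotation_request(corpus_id, document_id, schema_type, begin, end):
    """Build, but do not send, a POST request that would store one annotation.

    The route and the payload fields are assumptions made for this sketch.
    """
    payload = {
        "schemaType": schema_type,
        "documentId": document_id,
        "offsets": {"begin": begin, "end": end},
    }
    return Request(
        url=f"{BASE_URL}/corpora/{corpus_id}/annotations",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_create_annotation_request("corpus-1", "doc-7", "person", 0, 5)
print(request.get_method(), request.full_url)
```

Building the request separately from sending it keeps the sketch runnable offline; in practice the object would be passed to `urllib.request.urlopen` or an equivalent HTTP client.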


UsAc: Authentication and access rights management module based on Apache Shiro. It allows granular management of access rights per user for each type of resource in the system. It can work as a library or as a service behind a REST API.
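Apache Shiro commonly expresses fine-grained rights as colon-separated wildcard permission strings, typically of the form `resource:action:instance`. The sketch below re-implements that matching logic in Python to illustrate how per-user, per-resource-type rights can be checked. Whether this module uses this permission scheme is an assumption, and the resource names (`corpus`, `annotation`) and the functions `parse_permission` and `implies` are hypothetical.

```python
WILDCARD = "*"


def parse_permission(text):
    """Split 'corpus:read,write:42' into a list of sets, one per colon-separated part."""
    return [set(part.split(",")) for part in text.strip().lower().split(":")]


def implies(granted, requested):
    """Return True if the granted permission string covers the requested one."""
    granted_parts = parse_permission(granted)
    requested_parts = parse_permission(requested)
    for index, requested_part in enumerate(requested_parts):
        if index >= len(granted_parts):
            # The grant is shorter than the request, so it covers everything below it.
            return True
        granted_part = granted_parts[index]
        if WILDCARD not in granted_part and not requested_part <= granted_part:
            return False
    # Extra granted parts must all be wildcards, otherwise the grant is narrower.
    return all(WILDCARD in part for part in granted_parts[len(requested_parts):])


user_grants = ["corpus:read", "annotation:*"]
can_delete = any(implies(grant, "annotation:delete:7") for grant in user_grants)
print(can_delete)
```

A grant such as `corpus:read` covers reading any corpus, while `corpus:read:42` would restrict the right to a single instance.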


MSS: Binary object storage module originally developed for video and audio files in the VESTA platform. The archives (distributions) generated from the datasets contained in the RACS module will be stored in this module.


Catá: Web catalogue of datasets with rich metadata and compliant with FAIR principles. Through Catá, the datasets will be findable and reusable.

Copyright © 2019 - Computer Research Institute of Montreal