FAIR Genomes metadata schema promoting Next Generation Sequencing data reuse in Dutch healthcare and research

Authors: K. Joeri van der Velde
Gurnoor Singh
Rajaram Kaliyaperumal
XiaoFeng Liao
Sander de Ridder
Susanne Rebers
Hindrik H. D. Kerstens
Fernanda de Andrade
Jeroen van Reeuwijk
Fini E. De Gruyter
Saskia Hiltemann
Maarten Ligtvoet
Marjan M. Weiss
Hanneke W. M. van Deutekom
Anne M. L. Jansen
Andrew P. Stubbs
Lisenka E. L. M. Vissers
Jeroen F. J. Laros
Esther van Enckevort
Daphne Stemkens
Peter A. C. ’t Hoen
Jeroen A. M. Beliën
Mariëlle E. van Gijn
Morris A. Swertz
Document type: Article
Publication date: 2022
Series/Periodical: Scientific Data, Vol 9, Iss 1, Pp 1-13 (2022)
Publisher: Nature Portfolio
Keywords: Science / Q
Language: English
Permalink: https://search.fid-benelux.de/Record/base-29401225
Data source: BASE; original catalog
Link(s) : https://doi.org/10.1038/s41597-022-01265-x

Abstract: The genomes of thousands of individuals are profiled within Dutch healthcare and research each year. However, this valuable genomic data, associated clinical data and consent are captured in different ways and stored across many systems and organizations. This makes it difficult to discover rare disease patients, reuse data for personalized medicine and establish research cohorts based on specific parameters. FAIR Genomes aims to enable NGS data reuse by developing metadata standards for the data descriptions needed to FAIRify genomic data while also addressing ELSI issues. We developed a semantic schema of essential data elements harmonized with international FAIR initiatives. The FAIR Genomes schema v1.1 contains 110 elements in 9 modules. It reuses common ontologies such as NCIT, DUO and EDAM, only introducing new terms when necessary. The schema is represented by a YAML file that can be transformed into templates for data entry software (EDC) and programmatic interfaces (JSON, RDF) to ease genomic data sharing in research and healthcare. The schema, documentation and MOLGENIS reference implementation are available at https://fairgenomes.org.
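
The abstract describes a single schema file that is transformed into data-entry templates and programmatic formats. The sketch below is only a rough illustration of that idea, not the project's actual tooling. It assumes the schema has already been parsed into Python objects (a YAML parser would yield a similar structure). The schema layout, the module and element names, the example.org IRIs, and the helper functions to_identifier, to_entry_template and to_triples are all hypothetical.

```python
import json

# Hypothetical schema fragment. The real FAIR Genomes schema is a YAML file
# whose exact layout is not shown in this record; every name and IRI below
# is an illustrative assumption.
schema = {
    "name": "example-schema",
    "modules": [
        {
            "name": "Material",
            "elements": [
                {"name": "Sampling timestamp", "type": "datetime",
                 "iri": "https://example.org/onto/SamplingTimestamp"},
                {"name": "Anatomical source", "type": "string",
                 "iri": "https://example.org/onto/AnatomicalSource"},
            ],
        },
    ],
}


def to_identifier(label):
    """Turn a human-readable label into a lowercase snake_case key."""
    return "_".join(label.lower().split())


def to_entry_template(schema):
    """Build an empty data-entry template: one object per module,
    with one null-valued field per element."""
    return {
        to_identifier(module["name"]): {
            to_identifier(element["name"]): None
            for element in module["elements"]
        }
        for module in schema["modules"]
    }


def to_triples(schema):
    """Emit (subject, predicate, object) tuples that link each field
    to the ontology IRI it is annotated with."""
    triples = []
    for module in schema["modules"]:
        for element in module["elements"]:
            field = f"{to_identifier(module['name'])}.{to_identifier(element['name'])}"
            triples.append((field, "isDefinedBy", element["iri"]))
    return triples


template = to_entry_template(schema)
print(json.dumps(template, indent=2))
for triple in to_triples(schema):
    print(triple)
```

Keeping one machine-readable source and deriving every other representation from it means that a change to an element propagates to the entry template and the semantic annotations alike, which is the benefit the abstract attributes to the YAML-based approach.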