Knowledge Graph for Microdata of Statistics Netherlands ...

Author: Sun, Chang
Document type: Article
Publication date: 2021
Publisher: arXiv
Keywords: Digital Libraries cs.DL / Databases cs.DB / FOS: Computer and information sciences
Language: unknown
Permalink: https://search.fid-benelux.de/Record/base-29161348
Data source: BASE; original catalog
Link(s): https://dx.doi.org/10.48550/arxiv.2101.07622

Statistics Netherlands (CBS) hosts a large amount of data, not only at the statistical level but also at the individual level. As data science technologies have developed, more and more researchers ask to conduct their research using high-quality individual-level data from CBS (called CBS Microdata), either alone or combined with other data sources. Putting these data to good use for research and scientific purposes can greatly benefit society as a whole. However, CBS Microdata has been collected and maintained in different ways by different departments inside and outside CBS, and the representation, quality, and metadata of the datasets are not sufficiently harmonized. The project converts the descriptions of all CBS microdata sets into one knowledge graph with comprehensive metadata in Dutch and English, using text mining and semantic web technologies. Researchers can then query the metadata, explore the relations among multiple datasets, and find the variables they need. For example, if a researcher searches a dataset ...
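
To make the idea of querying bilingual dataset metadata concrete, the following is a minimal sketch in plain Python. It is not the project's implementation: a real knowledge graph would presumably be stored as RDF and queried with SPARQL, and every identifier here (the `dataset:` and `var:` names, the `hasVariable` and `label@…` predicates, and the helper functions) is a hypothetical stand-in invented for illustration.

```python
# Hypothetical metadata triples (subject, predicate, object).
# None of these identifiers or labels come from the actual CBS catalog.
TRIPLES = [
    ("dataset:A", "hasVariable", "var:income"),
    ("dataset:B", "hasVariable", "var:income"),
    ("dataset:B", "hasVariable", "var:birth_year"),
    ("var:income", "label@en", "income"),
    ("var:income", "label@nl", "inkomen"),
    ("var:birth_year", "label@en", "year of birth"),
    ("var:birth_year", "label@nl", "geboortejaar"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def datasets_with_variable(triples, label):
    """Find datasets holding a variable whose Dutch or English label matches."""
    wanted = label.casefold()
    variables = {
        s for s, p, o in triples
        if p.startswith("label@") and o.casefold() == wanted
    }
    return sorted(
        s for v in variables
        for s, _, _ in match(triples, p="hasVariable", o=v)
    )


def shared_variables(triples, first, second):
    """List the variables two datasets have in common (a simple 'relation')."""
    in_first = {o for _, _, o in match(triples, s=first, p="hasVariable")}
    in_second = {o for _, _, o in match(triples, s=second, p="hasVariable")}
    return sorted(in_first & in_second)
```

With this toy data, `datasets_with_variable(TRIPLES, "inkomen")` and `datasets_with_variable(TRIPLES, "income")` return the same two datasets, which is the practical benefit of attaching labels in both languages to a single variable node.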