A Privacy-Preserving Infrastructure for Analyzing Personal Health Data in a Vertically Partitioned Scenario
It is widely anticipated that the use and analysis of health-related big data will enable further understanding and improvements in human health and wellbeing. Here, we propose an innovative infrastructure that supports secure and privacy-preserving analysis of personal health data from multiple providers with different governance policies. Our objective is to use this infrastructure to explore the relationship between Type 2 Diabetes Mellitus status and healthcare costs. Our approach uses distributed machine learning to analyze vertically partitioned data, in which each provider holds different variables about the same individuals: data from the Maastricht Study, a prospective population-based cohort study, and data from the official statistics agency of the Netherlands, Statistics Netherlands (Centraal Bureau voor de Statistiek; CBS). This project seeks an optimal solution accounting for scientific, technical, and ethical/legal challenges. We describe these challenges, our progress towards addressing them in a practical use case, and a simulation experiment.
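To make the term "vertically partitioned" concrete, the following is a minimal sketch and not the infrastructure described above. It assumes two hypothetical parties that hold different variables for overlapping individuals, linked through a salted hash of a shared identifier. The function names, the salt, and all values are invented for illustration; a real deployment would avoid pooling linked records in one place, which is the problem the distributed approach is meant to address.

```python
import hashlib
from statistics import mean


def pseudonymize(identifier: str, salt: str) -> str:
    """Replace a raw identifier with a salted SHA-256 digest (illustrative only)."""
    return hashlib.sha256((salt + identifier).encode("utf-8")).hexdigest()


def link_vertical(party_a: dict, party_b: dict) -> list:
    """Pair the values of both parties for every pseudonym they have in common."""
    shared = sorted(set(party_a) & set(party_b))
    return [(party_a[key], party_b[key]) for key in shared]


def mean_cost_by_status(linked: list) -> dict:
    """Average the cost column within each diabetes-status group."""
    groups = {}
    for status, cost in linked:
        groups.setdefault(status, []).append(cost)
    return {status: mean(costs) for status, costs in groups.items()}


# Hypothetical salt agreed on by both parties (assumption).
SALT = "demo-salt"

# Party A (cohort-like source): diabetes status per person. Toy values.
cohort = {
    pseudonymize(pid, SALT): t2dm
    for pid, t2dm in [("p1", True), ("p2", False), ("p3", True), ("p4", False)]
}

# Party B (registry-like source): yearly healthcare cost per person. Toy values.
agency = {
    pseudonymize(pid, SALT): cost
    for pid, cost in [("p1", 4200.0), ("p2", 1500.0), ("p3", 3800.0), ("p5", 900.0)]
}

# Only individuals present in both sources (p1, p2, p3) can be linked.
linked = link_vertical(cohort, agency)
result = mean_cost_by_status(linked)
print(result)
```

In this toy setup neither source alone can relate diabetes status to cost; the relation only becomes visible once records are matched across the two parties.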