Jaideep Vaidya, Distinguished Professor, Rutgers University
The End of Anonymity in Genomics: Rethinking Privacy, Sharing, and Trust
Abstract: Genomic research is built on the premise that data can be shared safely, yet advances in machine learning are beginning to erode this foundation. In this talk, I examine how modern models can memorize and expose sensitive genomic information, enabling membership inference and related attacks even in high-dimensional biological settings. I argue that privacy in genomics is fundamentally relational, since shared ancestry allows information about one individual to reveal information about others, complicating standard notions of anonymization and consent. I will also discuss the limits of synthetic genomic data as a privacy solution, showing how generative models can reproduce fine-grained population structure and rare variation.
At the same time, I will present emerging approaches for enabling collaborative genomic studies under realistic privacy constraints, including a sandbox framework for genome-wide association studies that combines technical advances with policy-aligned risk assessment to support IRB decision-making and broaden access to data. These developments suggest that anonymity in genomics is not a guarantee but an assumption that must be reconsidered, with significant implications for how collaborative genomic studies are designed and governed in the future.
Bio: Jaideep Vaidya is a Distinguished Professor of Computer Information Systems and Vice Dean for Faculty Affairs and Research at Rutgers University, and the Director of the Rutgers Institute for Data Science, Learning, and Applications. He received the B.E. degree in Computer Engineering from the University of Mumbai and the M.S. and Ph.D. degrees in Computer Science from Purdue University. His general area of research is at the intersection of security, privacy, data mining, data management, and artificial intelligence. He has published over 200 technical papers in peer-reviewed journals and conference proceedings, and has received best paper awards from the premier conferences in data mining, databases, digital government, security, and informatics. He is a Fellow of the AAAS, ACMI, AIMBE, IAHSI, IEEE, and IFIP as well as an ACM Distinguished Scientist.
Marco Lorenzi, Research Director, Inria Center at Université Côte d'Azur
Federated Learning in Healthcare: From Theory to Practice
Abstract: Is federated learning (FL) ready to move from academic experiment to clinical infrastructure? While FL enables collaborative machine learning across institutions without sharing raw patient data, translating this approach into clinical practice requires navigating complex technical, regulatory, and governance challenges. In this talk, I present the Fed-BioMed project, which aims to provide the technical foundations for translating FL into real-world healthcare applications. Fed-BioMed is an open-source, healthcare-first FL framework designed to support the deployment of FL across real multi-hospital consortia, spanning domains from oncology to neurology and cardiac imaging. Healthcare-first means that security, governance, and regulatory compliance are engineered into the system from the ground up. Beyond infrastructure, the Fed-BioMed platform provides a unique testbed for innovative research, as demonstrated by recent advances in federated data harmonization and machine unlearning. Our vision is an open, interdisciplinary ecosystem for medical AI, governed transparently and built collaboratively across academia, healthcare, and industry.
Bio: Marco Lorenzi is Research Director (DR) in the EPIONE team at Inria Sophia Antipolis and Université Côte d’Azur. Prior to this position, he held an affiliate appointment within the School of Biomedical Engineering & Imaging Sciences at King's College London as Visiting Senior Lecturer, served as Research Associate in the Centre for Medical Image Computing at University College London, and worked as a Researcher at the Hospital San Giovanni di Dio Fatebenefratelli in Italy. His research focuses on statistical learning methods for modeling heterogeneous data in biomedical applications. Together with his team, he develops approaches to model disease progression from multimodal biomedical data in order to support clinical decision-making and simulate intervention strategies. His work also addresses the analysis of multi-centric biomedical information, with a particular emphasis on developing methods and software for collaborative learning in healthcare applications.