Scalable and Privacy-Preserving Federated Principal Component Analysis

Sep 1, 2023
David Froelicher, Hyunghoon Cho, Manaswitha Edupalli, Joao Sa Sousa, Jean-Philippe Bossuat, Apostolos Pyrgelis, Juan R. Troncoso-Pastoriza, Jean-Pierre Hubaux
Abstract
Principal component analysis (PCA) is an essential algorithm for dimensionality reduction in many data science domains. We address the problem of performing a federated PCA on private data distributed among multiple data providers while ensuring data confidentiality. Our solution, SF-PCA, is an end-to-end secure system that preserves the confidentiality of both the original data and all intermediate results in a passive-adversary model with up to all-but-one colluding parties. SF-PCA jointly leverages multiparty homomorphic encryption, interactive protocols, and edge computing to efficiently interleave computations on local cleartext data with operations on collectively encrypted data. SF-PCA obtains results as accurate as non-secure centralized solutions, independently of the data distribution among the parties. It scales linearly or better with the dataset dimensions and with the number of data providers. SF-PCA is more precise than existing approaches that approximate the solution by combining local analysis results, and between 3x and 250x faster than privacy-preserving alternatives based solely on secure multiparty computation or homomorphic encryption. Our work demonstrates the practical applicability of secure and federated PCA on private distributed datasets.
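The claim that accuracy is independent of how the data is distributed among the parties can be illustrated in cleartext. The sketch below is not SF-PCA's protocol: it uses no encryption, and pooling per-party Gram matrices followed by power iteration is an illustrative stand-in chosen here, not a method the abstract describes. All function names and the toy dataset are hypothetical.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]
Stats = Tuple[int, List[float], Matrix]


def local_stats(rows: Matrix) -> Stats:
    """Per-party step (assumed): row count, column sums, and Gram matrix X^T X."""
    d = len(rows[0])
    sums = [0.0] * d
    gram = [[0.0] * d for _ in range(d)]
    for r in rows:
        for i in range(d):
            sums[i] += r[i]
            for j in range(d):
                gram[i][j] += r[i] * r[j]
    return len(rows), sums, gram


def pooled_covariance(stats: List[Stats]) -> Matrix:
    """Aggregation step (assumed): sample covariance from summed statistics."""
    d = len(stats[0][1])
    n = sum(s[0] for s in stats)
    mean = [sum(s[1][i] for s in stats) / n for i in range(d)]
    return [
        [
            (sum(s[2][i][j] for s in stats) - n * mean[i] * mean[j]) / (n - 1)
            for j in range(d)
        ]
        for i in range(d)
    ]


def top_component(cov: Matrix, iters: int = 200) -> List[float]:
    """Power iteration for the leading eigenvector, returned with unit norm.

    Assumes the start vector is not orthogonal to the leading eigenvector
    and that the covariance matrix is non-zero.
    """
    d = len(cov)
    v = [1.0] + [0.0] * (d - 1)
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Toy dataset (made up for illustration): 6 samples, 3 features.
rows = [
    [2.5, 2.4, 0.5],
    [0.5, 0.7, 1.9],
    [2.2, 2.9, 0.8],
    [1.9, 2.2, 1.1],
    [3.1, 3.0, 0.2],
    [2.3, 2.7, 0.9],
]

# Centralized baseline: a single party holds every row.
central = top_component(pooled_covariance([local_stats(rows)]))

# Two different ways of splitting the same rows between two parties.
uneven = top_component(
    pooled_covariance([local_stats(rows[:2]), local_stats(rows[2:])])
)
even = top_component(
    pooled_covariance([local_stats(rows[:3]), local_stats(rows[3:])])
)
```

Because the pooled statistics are sums over rows, `central`, `uneven`, and `even` agree up to floating-point rounding regardless of the split. This differs from approaches that run PCA locally and then merge the local results, which the abstract describes as approximations. In a secure setting the aggregated values would have to stay protected, which this sketch makes no attempt to do.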
Type
Publication
2023 IEEE Symposium on Security and Privacy