A1 Peer-reviewed original article in a scientific journal
FEVERLESS: Fast and Secure Vertical Federated Learning Based on XGBoost for Decentralized Labels
Authors: Wang, Rui; Ersoy, Oguzhan; Zhu, Hangyu; Jin, Yaochu; Liang, Kaitai
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Publication year: 2024
Journal: IEEE Transactions on Big Data
Volume: 10
Issue: 6
First page: 1001
Last page: 1015
ISSN: 2372-2096
eISSN: 2332-7790
DOI: https://doi.org/10.1109/TBDATA.2022.3227326
Open access status at time of recording: Not openly available
Openness of publication channel: Partially open publication channel
Web address: https://ieeexplore.ieee.org/document/9973381
Abstract
Vertical Federated Learning (VFL) enables multiple clients to collaboratively train a global model over vertically partitioned data without leaking private local information. Tree-based models, such as XGBoost and LightGBM, have been widely used in VFL to enhance the interpretability and efficiency of training. However, there has been a fundamental lack of research on how to conduct VFL securely over distributed labels. This work is the first to fill this gap by designing a novel protocol, called FEVERLESS, based on XGBoost. FEVERLESS leverages secure aggregation via an information-masking technique, together with global differential privacy provided by a fairly and randomly selected noise leader, to prevent private information from being leaked during training. Furthermore, it provides label and data privacy against honest-but-curious adversaries even when up to n-2 out of n clients collude. We present a comprehensive security and efficiency analysis of our design, and the empirical results from our experiments demonstrate that FEVERLESS is fast and secure. In particular, it outperforms the solution based on additive homomorphic encryption in runtime cost and provides better accuracy than the local differential privacy approach.
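The abstract describes secure aggregation through information masking, with a single randomly selected noise leader contributing the differential-privacy noise. The following is a minimal, self-contained sketch of that general idea (pairwise additive masking that cancels in the aggregate, plus one noise term from a leader); all names, the modulus, and the noise handling are illustrative assumptions, not the actual FEVERLESS protocol.

```python
import random

P = 2**31 - 1  # modulus for masking arithmetic (illustrative choice)

def pairwise_masks(n_clients, seed=0):
    """Each pair (i, j), i < j, shares a random mask: client i adds it,
    client j subtracts it, so all masks cancel in the aggregate."""
    rng = random.Random(seed)
    masks = [[0] * n_clients for _ in range(n_clients)]
    for i in range(n_clients):
        for j in range(i + 1, n_clients):
            m = rng.randrange(P)
            masks[i][j] = m
            masks[j][i] = -m
    return masks

def masked_update(value, client, masks):
    """A client's local value (e.g., a gradient sum) hidden by its masks."""
    return (value + sum(masks[client])) % P

def aggregate(values, leader_noise=0):
    """Server sums masked updates; pairwise masks cancel modulo P.
    `leader_noise` stands in for the noise leader's DP contribution."""
    n = len(values)
    masks = pairwise_masks(n)
    masked = [masked_update(v, i, masks) for i, v in enumerate(values)]
    return (sum(masked) + leader_noise) % P

# With zero noise, the aggregate equals the plain sum of local values.
local_values = [3, 5, 7]
print(aggregate(local_values))  # → 15
```

In this toy version each individual masked update is uniformly random modulo P, so the server learns only the sum; in the paper's setting the noise leader would additionally inject calibrated noise to achieve global differential privacy.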