EdgeDetect: Importance-Aware Gradient Compression with Homomorphic Aggregation for Federated Intrusion Detection
Authors: Noor Islam S. Mohammad
Affiliation: Department of Computer Science, Istanbul Technical University, Maslak, TR
Email: islam23@itu.edu.tr
arXiv: 2604.14663v1 [cs.CR]
Date: 16 Apr 2026
Abstract
Federated learning (FL) enables collaborative intrusion detection without raw data exchange, but conventional FL incurs high communication overhead from full-precision gradient transmission and remains vulnerable to gradient inference attacks. This paper presents EdgeDetect, a communication-efficient and privacy-aware federated IDS for bandwidth-constrained 6G-IoT environments.
EdgeDetect introduces gradient smartification, a median-based statistical binarization that compresses local updates to {+1, −1} representations, reducing uplink payload by 32× while preserving convergence. We further integrate Paillier homomorphic encryption over binarized gradients, protecting against honest-but-curious servers without exposing individual updates.
Key Results:
- 98.0% multi-class accuracy and 97.9% macro F1-score on CIC-IDS2017 (2.8M flows, 7 attack classes)
- 96.9% communication reduction (450 MB → 14 MB per round)
- Raspberry Pi 4 deployment: 4.2 MB memory, 0.8 ms latency, 12 mJ per inference with <0.5% accuracy loss
- Robustness: Maintains 87% accuracy even with 20% of clients poisoned, and 0.95 minority-class F1 under severe imbalance (p < 0.001)
Index Terms: PPFL, IDS, Edge Computing, 6G Security, IoT Networks, Communication Efficiency, Machine Learning
1. Introduction
Next-generation wireless technologies (5G, 6G, IoT) enable massive machine-type communications while expanding attack surfaces for sophisticated cyber threats. Traditional centralized IDS face:
- Scalability bottlenecks
- Communication latency
- Single points of failure
- Difficulty handling high dimensionality and severe class imbalance
Federated Learning addresses this but faces two critical challenges:
- Communication overhead - High-dimensional gradient vectors consume excessive bandwidth
- Gradient leakage - Shared updates may be reverse-engineered to reconstruct sensitive training samples
Contributions
1. Alignment-Aware Federated IDS Architecture
- Privacy-preserving federated intrusion detection framework for 6G-IoT
- Integrates PCA-based dimensionality reduction, imbalance-aware sampling, and secure aggregation
- Enables collaborative learning without sharing raw network traffic
2. Adaptive Median-Based Gradient Smartification with Encrypted Aggregation
- Statistically adaptive median-threshold binarization strategy
- Compresses gradients into {+1, −1} while preserving directional alignment
- Combined with Paillier homomorphic encryption
- Achieves up to 32× communication reduction while mitigating gradient inversion risks
3. Quantified Privacy–Utility–Efficiency Trade-off
- Extensive ablation and adversarial analyses
- 98.0% multi-class accuracy with 96.9% communication reduction
- Performance comparable to centralized baselines
- Cryptographic privacy guarantees
- Maintains >85% accuracy with 20% malicious clients
- Reduces inversion PSNR from 31.7 dB to 15.1 dB
4. Edge-Validated Deployment
- Real-world deployment on Raspberry Pi 4
- Only 4.2 MB memory, 0.8 ms latency
- 12 mJ per inference with <0.5% accuracy degradation
- Validates suitability for resource-constrained 6G-IoT environments
2. Related Work
A. Deep Learning-Based Anomaly Detection
- CNN–RNN and LSTM architectures for DDoS and zero-day detection
- Image-based encodings of time-series traffic for spatial feature extraction
- SVMs and random forests remain competitive for structured features
B. Federated Learning in IoT Networks
- Enables decentralized training without sharing raw data
- Applications: IoT security, industrial sensor networks, cross-domain intrusion detection
- Edge–cloud collaborative architectures reduce response latency
- Challenge: Standard FL (FedAvg) relies on full-precision gradient exchange
C. Privacy Preservation and Gradient Compression
- Differential Privacy (DP) and Homomorphic Encryption (HE) improve confidentiality
- Communication-efficient methods: signSGD, gradient sparsification
- Few approaches jointly optimize gradient compression and encrypted aggregation
D. Distinction from signSGD and Quantized FL
Unlike fixed-threshold quantizers (QSGD, TernGrad):
- Adaptive per-client threshold adapts to gradient distribution
- Preserves relative ordering within each gradient vector
- Exploits heavy-tailed distributions typical in IDS models
3. System Architecture
Protocol Flow
EdgeDetect comprises K resource-constrained edge clients and a central aggregation server.
Phase 1: Client-Side Local Training
Each client initializes from the global model, W_{i}^{(r)} ← W^{(r)}, then performs local SGD:
W_{i}^{(r+1)} = W_{i}^{(r)} − η∇L(W_{i}^{(r)}, D_i)
Δ_{i}^{(r)} = W_{i}^{(r+1)} − W^{(r)}
Phase 2: Gradient Smartification
θ_i = median(|Δ_{i}^{(r)}|)
Δ^{bin}_{i,j} = +1 if Δ_{i,j} ≥ θ_i, −1 otherwise
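Phase 2 can be sketched in a few lines of NumPy (an illustrative reconstruction, not the authors' code). Each coordinate afterwards costs 1 bit instead of a 32-bit float, which is where the 32× payload reduction comes from:

```python
import numpy as np

def smartify(delta: np.ndarray):
    """Median-threshold binarization (Phase 2).

    theta is the median of |delta|; entries at or above theta map to +1,
    all others to -1. Returns the binarized update and the threshold.
    """
    theta = float(np.median(np.abs(delta)))
    binary = np.where(delta >= theta, 1.0, -1.0)
    return binary, theta

# Toy local update: only entries >= the median magnitude survive as +1.
delta = np.array([0.9, -0.7, 0.05, -0.01, 0.3])
binary, theta = smartify(delta)
print(theta)   # 0.3 (median of |delta|)
print(binary)  # [ 1. -1. -1. -1.  1.]
```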
Phase 3: Privacy-Preserving Encryption
C_{i}^{(r)} = E(Δ^{bin}_{i}) # Paillier encryption
Phase 4: Secure Aggregation and Global Update
Δ^{bin}_{agg} = (1/|S_r|) × D(⊕_{i∈S_r} C_{i}^{(r)})  # ⊕ = homomorphic addition (ciphertext multiplication mod n²); only the aggregate is ever decrypted
W^{(r+1)} = W^{(r)} + α · Δ^{bin}_{agg}
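Phases 3–4 can be illustrated with a textbook Paillier implementation. The primes below are tiny and utterly insecure (a real deployment would use a key of 2048 bits or more); the point is only that multiplying ciphertexts modulo n² decrypts to the sum of the plaintexts, so the server never handles any individual client's update:

```python
import math, random

# Toy Paillier cryptosystem (Phase 3/4 sketch). INSECURE demo parameters;
# shown only so that ciphertext multiplication mod n^2 visibly decrypts
# to the SUM of the plaintexts.
p, q = 293, 433                      # demo primes; real keys are far larger
n = p * q
n2 = n * n
g = n + 1                            # standard choice g = n + 1
lam = math.lcm(p - 1, q - 1)
mu = pow(lam, -1, n)                 # valid precisely because g = n + 1

def encrypt(m: int) -> int:
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m % n, n2) * pow(r, n, n2)) % n2

def decrypt(c: int) -> int:
    u = pow(c, lam, n2)
    m = ((u - 1) // n) * mu % n
    return m - n if m > n // 2 else m    # recover signed values

# Three clients encrypt one binarized coordinate each; the server only
# multiplies ciphertexts (homomorphic addition) and decrypts the total.
coords = [+1, -1, +1]
agg = 1
for m in coords:
    agg = (agg * encrypt(m)) % n2
print(decrypt(agg))  # 1  (= +1 - 1 + 1)
```

In Phase 4 the server would then divide the decrypted sum by |S_r| to obtain the averaged update before applying the global step.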
4. Methodology
A. Data Exploration and Preprocessing
CIC-IDS2017 Dataset:
- 2,830,743 records with 79 features
- 308,381 duplicate rows (removed)
- 0.06% missing/infinite values (imputed via median)
- 47.5% memory reduction via numerical downcasting
- 20% stratified sample retained for tractability; the severe class imbalance itself is handled by the balancing strategies in Sec. 4-D
B. Feature Engineering and Selection
Temporal Features
Δt_mean = (1/(n-1)) × Σ(t_i − t_{i-1})
Δt_std = √[(1/(n-1)) × Σ(Δt_i − Δt_mean)²]
Entropy-Based Features
H(S) = −Σ p(s) log₂ p(s)
Captures distributional randomness in packet sizes.
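Both feature families above can be sketched as follows (helper names are ours; the paper's exact normalization for Δt_std may differ slightly):

```python
import numpy as np

def temporal_features(timestamps: np.ndarray):
    """Mean and std of inter-arrival times t_i - t_{i-1} (n-1 intervals)."""
    dt = np.diff(timestamps)
    return float(dt.mean()), float(dt.std(ddof=0))

def packet_size_entropy(sizes: np.ndarray) -> float:
    """Shannon entropy H(S) = -sum p(s) log2 p(s) over observed sizes."""
    _, counts = np.unique(sizes, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

ts = np.array([0.0, 0.1, 0.3, 0.6])
print(temporal_features(ts))                                 # mean 0.2, std ~0.082
print(packet_size_entropy(np.array([64, 64, 1500, 1500])))   # 1.0 bit
```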
Feature Selection
- Recursive Feature Elimination (RFE) using Random Forest permutation importance
- Ranking: I_j = (1/T) × Σ_t 1(f_t(D) ≠ f_t^{(−j)}(D)), where f_t^{(−j)} denotes tree t evaluated with feature j permuted and 1(·) is the indicator function
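The ranking formula can be illustrated with a toy permutation-importance loop. The stand-in `predict` below is hypothetical; in the paper, the role of f_t is played by the trees of the trained Random Forest:

```python
import numpy as np

rng = np.random.default_rng(0)

def predict(X):
    # Stand-in classifier: thresholds feature 0 only (hypothetical; the
    # paper uses the trees of a trained Random Forest here).
    return (X[:, 0] > 0.5).astype(int)

def permutation_importance(X, n_repeats=20):
    """I_j = average fraction of predictions flipped by permuting feature j."""
    base = predict(X)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])
            scores[j] += np.mean(predict(Xp) != base)
    return scores / n_repeats

X = rng.random((200, 3))
print(permutation_importance(X))  # feature 0 matters; features 1-2 score 0
```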
C. Dimensionality Reduction via Incremental PCA
Cov(Z) = (1/(n-1)) × Z^T Z = V Λ V^T
Z_PCA = Z V_k
Result: Reduced from 78 to 35 principal components, retaining 99.3% variance while reducing feature dimensionality by 55%.
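A self-contained sketch of the reduction step on synthetic low-rank data. The paper uses Incremental PCA over mini-batches; this plain eigendecomposition version only illustrates how k is chosen to hit a 99.3% variance target:

```python
import numpy as np

rng = np.random.default_rng(42)

def pca_reduce(Z: np.ndarray, var_target: float = 0.993):
    """Project centered data onto the top-k eigenvectors of Cov(Z),
    choosing the smallest k that retains var_target of total variance."""
    Zc = Z - Z.mean(axis=0)
    cov = (Zc.T @ Zc) / (len(Zc) - 1)          # Cov(Z) = Z^T Z / (n-1)
    vals, vecs = np.linalg.eigh(cov)           # ascending eigenvalues
    vals, vecs = vals[::-1], vecs[:, ::-1]     # sort descending
    ratio = np.cumsum(vals) / vals.sum()
    k = int(np.searchsorted(ratio, var_target) + 1)
    return Zc @ vecs[:, :k], k, float(ratio[k - 1])

# Synthetic redundancy: 10 observed dims driven by 3 latent factors + noise.
latent = rng.normal(size=(500, 3))
Z = latent @ rng.normal(size=(3, 10)) + 0.01 * rng.normal(size=(500, 10))
Z_pca, k, retained = pca_reduce(Z)
print(k, retained)  # k is small (~3) because the data is effectively low-rank
```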
D. Class Balancing Strategies
Binary Classification
- Random under-sampling: D_bal = D_min ∪ Sample(D_max, |D_min|)
- Result: 15,000 balanced instances (7,500 benign, 7,500 attack)
Multi-Class Classification
- SMOTE: x_new = x_i + λ(x_{ij} − x_i), where λ ~ U(0, 1)
- Adaptive SMOTE: λ ~ Beta(α, β), where α = 1 + ρ_i, β = 1 + (1 − ρ_i)
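Both interpolation rules above fit in one hedged sketch. The `rho` argument stands in for the per-class ratio ρ_i, whose exact definition we take as an assumption:

```python
import random

def smote_sample(x_i, x_nn, adaptive=False, rho=0.0):
    """One synthetic minority sample: x_new = x_i + lam * (x_nn - x_i).

    Standard SMOTE draws lam ~ U(0, 1); the adaptive variant draws
    lam ~ Beta(1 + rho, 1 + (1 - rho)), where rho is a (hypothetical)
    per-class imbalance ratio, skewing samples within the segment.
    """
    lam = random.betavariate(1 + rho, 1 + (1 - rho)) if adaptive else random.random()
    return [a + lam * (b - a) for a, b in zip(x_i, x_nn)]

random.seed(0)
x_new = smote_sample([0.0, 0.0], [1.0, 2.0])
print(x_new)  # lies on the segment between the two minority neighbors
```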
E. Machine Learning Models
| Model | Configuration |
|---|---|
| Logistic Regression (Elastic Net) | α = 0.01, ρ = 0.5 |
| SVM (RBF Kernel) | γ = 0.001, C = 1.0 |
| Random Forest | T = 100 trees, max depth 20 |
| Gradient Boosting | ν = 0.1 |
| Neural Network (MLP) | 35 → 128 → 64 → K, dropout=0.5, Adam |
F. Evaluation Metrics
- Accuracy, Precision, Recall, F1-Score
- Matthews Correlation Coefficient (MCC)
- Cohen's Kappa (κ)
- Area Under ROC Curve (AUC-ROC)
5. Experimental Setup
A. Dataset Construction and Sampling Validation
CIC-IDS2017 Sampling:
- Original: N = 2,830,540 flows
- Stratified 20% subset: n = 504,472
- Kolmogorov–Smirnov tests: p > 0.05 (no significant deviations)
- 92% of features: <5% mean deviation
- After PCA: k = 35 components (99.3% variance retained)
Train-Test Split:
- 80:20 stratified split (seed 42)
- Binary: 15,000 samples (7,500 benign, 7,500 attack)
- Multi-class: 35,000 samples via SMOTE (5,000 per class)
B. Hyperparameter Optimization
Configurations:
- Config 1 (Efficiency): Computational efficiency prioritized
- Config 2 (Expressiveness): Accuracy maximized via 3-fold grid search
Key Hyperparameters:
- Logistic Regression: C ∈ {0.1, 100}
- SVM: RBF kernel with γ = 0.1
- Random Forest: n ∈ {100, 200}, depth=20
- Decision Tree: depth ∈ {6, 10, 15}
- KNN: k ∈ {3, 5, 7}
C. Evaluation Protocol
Stage 1: Cross-Validation
- 5-fold stratified cross-validation on training partition (n = 12,000)
- Stratification preserves 50:50 benign-to-attack ratio
- Fold-to-fold variability: σ_CV = √[(1/(K-1)) × Σ(Acc_i − Acc̄)²]
Stage 2: Hold-Out Testing
- Best configuration retrained on full training set
- Evaluated on held-out test set (n = 3,000, 20%)
- Metrics: Accuracy, Precision, Recall, F1, ROC-AUC, Confusion matrices
Statistical Reliability
- Three independent random seeds: 42, 123, 456
- 95% confidence intervals: CI_95% = x̄ ± 1.96 × (σ/√n)
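The interval computation, as a sketch. Note that with only three seeds the normal-approximation constant 1.96 is optimistic (a t-quantile would widen the interval); we follow the formula as stated:

```python
import math, statistics

def ci95(samples):
    """95% normal-approximation CI: mean +/- 1.96 * s / sqrt(n)."""
    m = statistics.mean(samples)
    s = statistics.stdev(samples)            # sample std, n-1 denominator
    h = 1.96 * s / math.sqrt(len(samples))
    return m - h, m + h

# Accuracies from the three seeds (42, 123, 456) would be fed in like this:
lo, hi = ci95([0.980, 0.979, 0.981])
print(round(lo, 4), round(hi, 4))
```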
6. Experimental Results
A. Binary Classification Performance
Linear Models
- Logistic Regression: 92.21% accuracy (σ = 5.81 × 10⁻³)
- Config 2 improvement: +0.30% to 92.51%
Kernel-Based Methods
- SVM (Linear): 83.00% (underfits)
- SVM (RBF): 96.14% (+13.14%, σ = 3.89 × 10⁻³)
Tree-Based Ensembles
- Random Forest Config 1: 95.98%
- Random Forest Config 2: 98.09% (+2.11%, σ = 1.72 × 10⁻³) ✓ BEST
Instance-Based Learning
- KNN (k=5): 97.40% (σ = 0.89 × 10⁻³)
- KNN (k=3): 97.93% (+0.53%, σ = 1.27 × 10⁻³)
B. Multi-Class Classification Performance
| Model | CV Acc. | Test Acc. | Precision | Recall | F1 |
|---|---|---|---|---|---|
| Random Forest (T=10, d=6) | 96.0±0.009 | 97.1 | 96.9 | 97.0 | 96.9 |
| Random Forest (T=15, d=8, m=20) | 98.0±0.007 | 98.0 | 97.9 | 98.0 | 97.9 |
| Decision Tree (d=10) | 96.0±0.012 | 90.3 | 90.1 | 90.2 | 90.1 |
| KNN (k=7, distance-wt) | 94.0±0.014 | 95.2 | 95.0 | 95.3 | 95.1 |
C. Per-Class Breakdown (Random Forest Config 2)
| Attack Class | Precision | Recall | F1-Score |
|---|---|---|---|
| BENIGN | 99.2% | 98.5% | 98.9% |
| DoS | 98.8% | 99.0% | 98.9% |
| DDoS | 98.6% | 98.9% | 98.7% |
| Port Scan | 95.7% | 97.6% | 96.6% |
| Brute Force | 95.1% | 97.5% | 96.3% |
| Web Attack | 91.9% | 96.0% | 93.9% |
| Bot | 90.2% | 95.3% | 92.7% |
7. Federated Learning Convergence Analysis
A. Convergence and Compression Trade-off
EdgeDetect achieves convergence parity with full-precision FedAvg at 32× compression:
- Across 2.8M CIC-IDS2017 samples
- No measurable accuracy degradation (Δ < 0.2 pp)
- Cosine similarity: 0.87 ± 0.04
B. Privacy Enhancement Through Smartification
| Method | Technique | PSNR (dB) | Label Recovery |
|---|---|---|---|
| FedAvg (Undefended) | None | 31.7 | High-fidelity |
| signSGD | Zero-threshold | 16.8 | Partial recovery |
| EdgeDetect | Median-threshold | 15.1 | 14.3% (random) |
C. Theoretical Convergence Analysis
Lemma 1 (Descent under Median-Threshold Smartification):
Let L(W) be L-smooth and bounded below. Let g̃_t denote the smartified gradient with cosine similarity cos(θ_t) = ⟨g_t, g̃_t⟩ / (∥g_t∥ ∥g̃_t∥) ≥ γ > 0.
For sufficiently small step size η:
E[L(W_{t+1})] ≤ L(W_t) − ηγ∥g_t∥² + (Lη²/2)∥g̃_t∥²
Theorem 1 (Convergence under Bounded Variance):
Assume bounded stochastic gradient variance σ² and cosine similarity γ > 0. Then after T rounds:
min_{t≤T} E[∥∇L(W_t)∥²] = O(1 / (γ√T))
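A sketch of how Theorem 1 would follow from Lemma 1 (our reconstruction, not the paper's proof), assuming the second moment of the smartified gradient is bounded, E∥g̃_t∥² ≤ G² (absorbing σ²), and taking step size η = c/√T:

```latex
% Telescoping the descent inequality of Lemma 1 over T rounds:
\mathbb{E}[L(W_T)] \le L(W_0)
  - \eta \gamma \sum_{t<T} \mathbb{E}\|\nabla L(W_t)\|^2
  + \frac{L \eta^2}{2} \sum_{t<T} \mathbb{E}\|\tilde g_t\|^2 .
% Rearranging, using boundedness below (L(W_T) \ge L^{*}) and
% \mathbb{E}\|\tilde g_t\|^2 \le G^2:
\min_{t<T} \mathbb{E}\|\nabla L(W_t)\|^2
  \le \frac{L(W_0) - L^{*}}{\eta \gamma T} + \frac{L \eta G^2}{2\gamma} .
% Substituting \eta = c/\sqrt{T} makes both terms O(1/(\gamma\sqrt{T})).
```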
8. Federated Learning Scalability
A. Convergence Under Different Heterogeneity Levels
| Distribution | Method (K=50) | R₉₅ | R₉₈ | Accuracy | Bandwidth |
|---|---|---|---|---|---|
| IID | FedAvg | 142 | 287 | 98.2% | 129.15 GB |
| IID | EdgeDetect | 145 | 289 | 98.0% | 4.05 GB |
| Non-IID (α=1.0) | FedAvg | 201 | 423 | 96.4% | 190.35 GB |
| Non-IID (α=1.0) | EdgeDetect | 192 | 398 | 96.8% | 5.57 GB |
| Non-IID (α=0.1) | FedAvg | 312 | 687 | 93.8% | 309.15 GB |
| Non-IID (α=0.1) | EdgeDetect | 287 | 612 | 94.2% | 8.57 GB |
| Non-IID (α=0.1) | EdgeDetect+FedProx | 264 | 563 | 95.1% | 7.88 GB |
B. Scalability with Number of Clients
| K Clients | Distribution | R₉₈ | Accuracy | Total Bandwidth |
|---|---|---|---|---|
| 10 | IID | 201 | 98.1% | 2.81 GB |
| 25 | IID | 254 | 98.0% | 3.56 GB |
| 100 | IID | 356 | 97.9% | 4.98 GB |
| 500 | IID | 467 | 97.7% | 6.54 GB |
Sublinear scaling: increasing clients from K=10 to K=500 raises R₉₈ only from 201 to 467, far below the 50× growth in K.
9. Ablation Study
Component-wise Impact Analysis
| Configuration | Accuracy | Communication | PSNR (dB) | Invertible? |
|---|---|---|---|---|
| Full EdgeDetect | 98.0% | 14.0 MB | 15.1 | No |
| – Smartification | 98.2% | 450.0 MB ↑32× | 15.1 | Protected |
| – Encryption (HE) | 98.0% | 14.0 MB | 31.7 ↑ | Yes |
| – DP Noise | 98.1% | 14.0 MB | 14.2 | Protected |
| – PCA (78 features) | 97.9% | 58.2 MB ↑4× | 15.3 | Protected |
| – SMOTE | 94.2% ↓ | 14.0 MB | 15.1 | Protected |
| FedAvg (No Protection) | 98.2% | 450.0 MB | 31.7 | Yes |
| signSGD | 97.8% | 14.1 MB | 16.8 | Partial |
Key Findings:
- Smartification: Essential for communication efficiency (32×), negligible accuracy loss
- Encryption: Critical for privacy (PSNR 31.7 → 15.1 dB)
- SMOTE: Essential for accuracy (+3.8 pp gain)
- PCA: Reduces dimensionality (4.16×) with negligible impact
10. Comparison with State-of-the-Art
| Study | Year | Model | Accuracy | F1 | Dataset | Classes | Privacy | Comm. (MB) |
|---|---|---|---|---|---|---|---|---|
| Centralized Approaches | | | | | | | | |
| Alam et al. | 2023 | CNN | 97.2% | 96.8 | CIC-IDS2017 | Binary | ✗ | N/A |
| Ghani et al. | 2023 | XGBoost | 96.1% | 95.4 | CIC-IDS2017 | 7-class | ✗ | N/A |
| Savic et al. | 2021 | LSTM-AE | 95.5% | 94.2 | NSL-KDD | Binary | ✗ | N/A |
| Federated Learning Approaches | | | | | | | | |
| Liu et al. | 2023 | Fed-DNN | 96.3% | 95.1 | UNSW-NB15 | 5-class | DP | 380 |
| Wang et al. | 2022 | Fed-CNN | 94.7% | 93.8 | CIC-IDS2017 | Binary | ✗ | 520 |
| Zhang et al. | 2022 | FedAvg-LSTM | 93.5% | 92.4 | KDD-CUP99 | 4-class | DP | 410 |
| Chen et al. | 2021 | Fed-XGB | 95.8% | 94.9 | IoT-23 | Binary | SecAgg | 290 |
| This Work | | | | | | | | |
| EdgeDetect (multi-class) | 2026 | Fed-RF | 98.0% | 97.9 | CIC-IDS2017 | 7-class | HE | 14 |
| EdgeDetect (binary) | 2026 | Fed-RF | 96.0% | 96.0 | CIC-IDS2017 | Binary | HE | 14 |
Key Advantages:
- Highest accuracy on CIC-IDS2017 (98.0% vs 96.3%)
- 96.9% communication reduction vs federated baselines (14 MB vs 290-520 MB)
- Strongest cryptographic privacy (Paillier HE vs DP/SecAgg)
- Practical edge deployment (4.2 MB, 0.8 ms on Raspberry Pi 4)
11. Edge Deployment Evaluation
A. Raspberry Pi 4 Deployment
| Metric | Random Forest | KNN | SVM | Logistic Reg. |
|---|---|---|---|---|
| Memory | 234 MB | 412 MB | 178 MB | 45 MB |
| Training Time | 12.3 s | 0.3 s* | 18.7 s | 2.4 s |
| Inference Latency | 0.87 ms | 3.21 ms | 1.45 ms | 0.12 ms |
| Energy per Inference | 12 mJ | — | — | — |
| Accuracy | 98.0% | 95.2% | 96.0% | 93.0% |
*KNN training is instantaneous (lazy learning) but requires 412 MB for storage.
B. Resource-Constrained Feasibility
- Memory footprint: 4.2 MB per client for gradient storage
- Encryption overhead: 156.4 ms per round (per-round encryption complexity O(d log n))
- Total bandwidth per round: 14 MB (vs 450 MB for full-precision)
- Accuracy loss on edge: <0.5% when deployed on Raspberry Pi 4
12. Robustness Analysis
A. Poisoning Attack Resilience
Setting: 5% to 20% of clients send poisoned updates
| Poisoning Rate | Accuracy | Macro F1 | p-value |
|---|---|---|---|
| 0% (Clean) | 98.0% | 0.979 | — |
| 5% | 96.4% | 0.961 | <0.001 |
| 10% | 92.1% | 0.918 | <0.001 |
| 15% | 89.3% | 0.887 | <0.001 |
| 20% | 87.0% | 0.850 | <0.001 |
Conclusion: Maintains >85% accuracy even with 20% malicious clients (p < 0.001).
B. Differential Privacy-Utility Trade-off
| ε | δ | Accuracy | F1 | Privacy Loss |
|---|---|---|---|---|
| 10.0 | 10⁻⁵ | 98.2% | 0.980 | Weak |
| 1.0 | 10⁻⁵ | 98.1% | 0.979 | Moderate |
| 0.1 | 10⁻⁵ | 96.8% | 0.965 | Strong |
13. Discussion
Key Insights for Federated IDS in 6G-IoT
PCA reveals strong redundancy: 35 components retain 99.3% variance with negligible performance loss, enabling efficient computation and communication.
Random Forest optimal: Best stability–accuracy trade-off (98.0% accuracy, 97.9% macro F1, σ = 0.0017).
Imbalance handling essential: SMOTE–undersampling improves minority recall from 0.39 to 0.98.
Gradient smartification superior to signSGD:
- Preserves gradient alignment (0.87±0.04 cosine similarity)
- Achieves 96.9% communication reduction
- Improves privacy by reducing the information content of the shared updates
Paillier encryption effective: Complete inversion resistance while retaining 98.7% of centralized accuracy.
Challenges and Future Work
- Non-convex convergence: Theoretical analysis for deep learning architectures
- Concept drift: Adaptation to evolving attack patterns
- White-box robustness: Defense against adversarial gradient attacks
- Cumulative privacy loss: Formal composition under differential privacy
14. Conclusion
EdgeDetect introduces a privacy-preserving federated intrusion detection framework for resource-constrained 6G-IoT environments. The framework employs:
- Gradient smartification: Median-based binarization achieving 32× communication reduction
- Paillier homomorphic encryption: Only aggregated updates visible to server
- Adaptive class balancing: SMOTE for minority-class robustness
- Secure federated aggregation: Protection against inference and poisoning attacks
Performance Summary
| Metric | Value |
|---|---|
| Multi-class Accuracy | 98.0% |
| Macro F1-Score | 97.9% |
| Communication Reduction | 96.9% (450 MB → 14 MB) |
| Edge Memory | 4.2 MB |
| Edge Latency | 0.8 ms |
| Poisoning Resilience (20% attackers) | 87% accuracy |
| Gradient Inversion PSNR | 15.1 dB (vs 31.7 dB undefended) |
EdgeDetect demonstrates that secure federated IDS can meet strict privacy, efficiency, and reliability requirements of next-generation 6G-IoT edge networks.
Acknowledgments
We thank the Canadian Institute for Cybersecurity for providing the CIC-IDS2017 dataset and the anonymous reviewers for their valuable feedback.