EdgeDetect: Importance-Aware Gradient Compression with Homomorphic Aggregation for Federated Intrusion Detection
Authors: Noor Islam S. Mohammad
Affiliation: Department of Computer Science, Istanbul Technical University, Maslak, TR
Email: islam23@itu.edu.tr
arXiv: 2604.14663v1 [cs.CR]
Date: 16 Apr 2026
Abstract
Federated learning (FL) enables collaborative intrusion detection without raw data exchange, but conventional FL incurs high communication overhead from full-precision gradient transmission and remains vulnerable to gradient inference attacks. This paper presents EdgeDetect, a communication-efficient and privacy-aware federated IDS for bandwidth-constrained 6G-IoT environments.
EdgeDetect introduces gradient smartification, a median-based statistical binarization that compresses local updates to {+1, −1} representations, reducing uplink payload by 32× while preserving convergence. We further integrate Paillier homomorphic encryption over binarized gradients, protecting against honest-but-curious servers without exposing individual updates.
Key Results:
- 98.0% multi-class accuracy and 97.9% macro F1-score on CIC-IDS2017 (2.8M flows, 7 attack classes)
- 96.9% communication reduction (450 MB → 14 MB per round)
- Raspberry Pi 4 deployment: 4.2 MB memory, 0.8 ms latency, 12 mJ per inference with <0.5% accuracy loss
- Robustness: Maintains 87% accuracy even with 20% of clients poisoned, and 0.95 minority-class F1 under severe imbalance (p < 0.001)
Index Terms: PPFL, IDS, Edge Computing, 6G Security, IoT Networks, Communication Efficiency, Machine Learning
1. Introduction
Next-generation wireless technologies (5G, 6G, IoT) enable massive machine-type communications while expanding attack surfaces for sophisticated cyber threats. Traditional centralized IDS face:
- Scalability bottlenecks
- Communication latency
- Single points of failure
- Difficulty handling high dimensionality and severe class imbalance
Federated Learning addresses this but faces two critical challenges:
- Communication overhead - High-dimensional gradient vectors consume excessive bandwidth
- Gradient leakage - Shared updates may be reverse-engineered to reconstruct sensitive training samples
Contributions
1. Alignment-Aware Federated IDS Architecture
- Privacy-preserving federated intrusion detection framework for 6G-IoT
- Integrates PCA-based dimensionality reduction, imbalance-aware sampling, and secure aggregation
- Enables collaborative learning without sharing raw network traffic
2. Adaptive Median-Based Gradient Smartification with Encrypted Aggregation
- Statistically adaptive median-threshold binarization strategy
- Compresses gradients into {+1, −1} while preserving directional alignment
- Combined with Paillier homomorphic encryption
- Achieves up to 32× communication reduction while mitigating gradient inversion risks
3. Quantified Privacy–Utility–Efficiency Trade-off
- Extensive ablation and adversarial analyses
- 98.0% multi-class accuracy with 96.9% communication reduction
- Performance comparable to centralized baselines
- Cryptographic privacy guarantees
- Maintains >85% accuracy with 20% malicious clients
- Reduces inversion PSNR from 31.7 dB to 15.1 dB
4. Edge-Validated Deployment
- Real-world deployment on Raspberry Pi 4
- Only 4.2 MB memory, 0.8 ms latency
- 12 mJ per inference with <0.5% accuracy degradation
- Validates suitability for resource-constrained 6G-IoT environments
2. Related Work
A. Deep Learning-Based Anomaly Detection
- CNN–RNN and LSTM architectures for DDoS and zero-day detection
- Image-based encodings of time-series traffic for spatial feature extraction
- SVMs and random forests remain competitive for structured features
B. Federated Learning in IoT Networks
- Enables decentralized training without sharing raw data
- Applications: IoT security, industrial sensor networks, cross-domain intrusion detection
- Edge–cloud collaborative architectures reduce response latency
- Challenge: Standard FL (FedAvg) relies on full-precision gradient exchange
C. Privacy Preservation and Gradient Compression
- Differential Privacy (DP) and Homomorphic Encryption (HE) improve confidentiality
- Communication-efficient methods: signSGD, gradient sparsification
- Few approaches jointly optimize gradient compression and encrypted aggregation
D. Distinction from signSGD and Quantized FL
Unlike fixed-threshold quantizers (QSGD, TernGrad):
- Adaptive per-client threshold adapts to gradient distribution
- Preserves relative ordering within each gradient vector
- Exploits heavy-tailed distributions typical in IDS models
3. System Architecture
Protocol Flow
EdgeDetect comprises K resource-constrained edge clients and a central aggregation server.
Phase 1: Client-Side Local Training
Each client initializes from the global model, W_{i}^{(r)} ← W^{(r)}, then performs local SGD:
W_{i}^{(r+1)} = W_{i}^{(r)} − η∇L(W_{i}^{(r)}, D_i)
Δ_{i}^{(r)} = W_{i}^{(r+1)} − W^{(r)}
Phase 2: Gradient Smartification
θ_i = median(|Δ_{i}^{(r)}|)
Δ^{bin}_{i,j} = +1 if Δ_{i,j} ≥ θ_i, −1 otherwise
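Phase 2 can be sketched in a few lines of NumPy (an illustrative reconstruction, not the authors' code). Each coordinate afterwards costs 1 bit instead of a 32-bit float, which is where the 32× payload reduction comes from:

```python
import numpy as np

def smartify(delta: np.ndarray):
    """Median-threshold binarization (Phase 2).

    theta is the median of |delta|; entries at or above theta map to +1,
    all others to -1. Returns the binarized update and the threshold.
    """
    theta = float(np.median(np.abs(delta)))
    binary = np.where(delta >= theta, 1.0, -1.0)
    return binary, theta

# Toy local update: only entries >= the median magnitude survive as +1.
delta = np.array([0.9, -0.7, 0.05, -0.01, 0.3])
binary, theta = smartify(delta)
print(theta)   # 0.3 (median of |delta|)
print(binary)  # [ 1. -1. -1. -1.  1.]
```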
Phase 3: Privacy-Preserving Encryption
C_{i}^{(r)} = E(Δ^{bin}_{i}) # Paillier encryption
Phase 4: Secure Aggregation and Global Update
Δ^{bin}_{agg} = (1/|S_r|) × D(⊕_{i∈S_r} C_{i}^{(r)})  # ⊕ = homomorphic addition (ciphertext multiplication mod n²); only the aggregate is ever decrypted
W^{(r+1)} = W^{(r)} + α · Δ^{bin}_{agg}
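Phases 3–4 can be illustrated with a textbook Paillier implementation. The primes below are tiny and utterly insecure (a real deployment would use a key of 2048 bits or more); the point is only that multiplying ciphertexts modulo n² decrypts to the sum of the plaintexts, so the server never handles any individual client's update:

```python
import math, random

# Toy Paillier cryptosystem (Phase 3/4 sketch). INSECURE demo parameters;
# shown only so that ciphertext multiplication mod n^2 visibly decrypts
# to the SUM of the plaintexts.
p, q = 293, 433                      # demo primes; real keys are far larger
n = p * q
n2 = n * n
g = n + 1                            # standard choice g = n + 1
lam = math.lcm(p - 1, q - 1)
mu = pow(lam, -1, n)                 # valid precisely because g = n + 1

def encrypt(m: int) -> int:
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m % n, n2) * pow(r, n, n2)) % n2

def decrypt(c: int) -> int:
    u = pow(c, lam, n2)
    m = ((u - 1) // n) * mu % n
    return m - n if m > n // 2 else m    # recover signed values

# Three clients encrypt one binarized coordinate each; the server only
# multiplies ciphertexts (homomorphic addition) and decrypts the total.
coords = [+1, -1, +1]
agg = 1
for m in coords:
    agg = (agg * encrypt(m)) % n2
print(decrypt(agg))  # 1  (= +1 - 1 + 1)
```

In Phase 4 the server would then divide the decrypted sum by |S_r| to obtain the averaged update before applying the global step.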
4. Methodology
A. Data Exploration and Preprocessing
CIC-IDS2017 Dataset:
- 2,830,743 records with 79 features
- 308,381 duplicate rows (removed)
- 0.06% missing/infinite values (imputed via median)
- 47.5% memory reduction via numerical downcasting
- 20% stratified sample retained for tractability; the severe class imbalance itself is handled by the balancing strategies in Sec. 4-D
B. Feature Engineering and Selection
Temporal Features
Δt_mean = (1/(n-1)) × Σ(t_i − t_{i-1})
Δt_std = √[(1/(n-1)) × Σ(Δt_i − Δt_mean)²]
Entropy-Based Features
H(S) = −Σ p(s) log₂ p(s)
Captures distributional randomness in packet sizes.
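Both feature families above can be sketched as follows (helper names are ours; the paper's exact normalization for Δt_std may differ slightly):

```python
import numpy as np

def temporal_features(timestamps: np.ndarray):
    """Mean and std of inter-arrival times t_i - t_{i-1} (n-1 intervals)."""
    dt = np.diff(timestamps)
    return float(dt.mean()), float(dt.std(ddof=0))

def packet_size_entropy(sizes: np.ndarray) -> float:
    """Shannon entropy H(S) = -sum p(s) log2 p(s) over observed sizes."""
    _, counts = np.unique(sizes, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

ts = np.array([0.0, 0.1, 0.3, 0.6])
print(temporal_features(ts))                                 # mean 0.2, std ~0.082
print(packet_size_entropy(np.array([64, 64, 1500, 1500])))   # 1.0 bit
```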
Feature Selection
- Recursive Feature Elimination (RFE) using Random Forest permutation importance
- Ranking: I_j = (1/T) × Σ_t 1(f_t(D) ≠ f_t^{(−j)}(D)), where f_t^{(−j)} denotes tree t evaluated with feature j permuted and 1(·) is the indicator function
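The ranking formula can be illustrated with a toy permutation-importance loop. The stand-in `predict` below is hypothetical; in the paper, the role of f_t is played by the trees of the trained Random Forest:

```python
import numpy as np

rng = np.random.default_rng(0)

def predict(X):
    # Stand-in classifier: thresholds feature 0 only (hypothetical; the
    # paper uses the trees of a trained Random Forest here).
    return (X[:, 0] > 0.5).astype(int)

def permutation_importance(X, n_repeats=20):
    """I_j = average fraction of predictions flipped by permuting feature j."""
    base = predict(X)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])
            scores[j] += np.mean(predict(Xp) != base)
    return scores / n_repeats

X = rng.random((200, 3))
print(permutation_importance(X))  # feature 0 matters; features 1-2 score 0
```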
C. Dimensionality Reduction via Incremental PCA
Cov(Z) = (1/(n-1)) × Z^T Z = V Λ V^T
Z_PCA = Z V_k
Result: Reduced from 78 to 35 principal components, retaining 99.3% variance while reducing feature dimensionality by 55%.
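A self-contained sketch of the reduction step on synthetic low-rank data. The paper uses Incremental PCA over mini-batches; this plain eigendecomposition version only illustrates how k is chosen to hit a 99.3% variance target:

```python
import numpy as np

rng = np.random.default_rng(42)

def pca_reduce(Z: np.ndarray, var_target: float = 0.993):
    """Project centered data onto the top-k eigenvectors of Cov(Z),
    choosing the smallest k that retains var_target of total variance."""
    Zc = Z - Z.mean(axis=0)
    cov = (Zc.T @ Zc) / (len(Zc) - 1)          # Cov(Z) = Z^T Z / (n-1)
    vals, vecs = np.linalg.eigh(cov)           # ascending eigenvalues
    vals, vecs = vals[::-1], vecs[:, ::-1]     # sort descending
    ratio = np.cumsum(vals) / vals.sum()
    k = int(np.searchsorted(ratio, var_target) + 1)
    return Zc @ vecs[:, :k], k, float(ratio[k - 1])

# Synthetic redundancy: 10 observed dims driven by 3 latent factors + noise.
latent = rng.normal(size=(500, 3))
Z = latent @ rng.normal(size=(3, 10)) + 0.01 * rng.normal(size=(500, 10))
Z_pca, k, retained = pca_reduce(Z)
print(k, retained)  # k is small (~3) because the data is effectively low-rank
```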
D. Class Balancing Strategies
Binary Classification
- Random under-sampling: D_bal = D_min ∪ Sample(D_max, |D_min|)
- Result: 15,000 balanced instances (7,500 benign, 7,500 attack)
Multi-Class Classification
- SMOTE: x_new = x_i + λ(x_{ij} − x_i), where λ ~ U(0, 1)
- Adaptive SMOTE: λ ~ Beta(α, β), where α = 1 + ρ_i, β = 1 + (1 − ρ_i)
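Both interpolation rules above fit in one hedged sketch. The `rho` argument stands in for the per-class ratio ρ_i, whose exact definition we take as an assumption:

```python
import random

def smote_sample(x_i, x_nn, adaptive=False, rho=0.0):
    """One synthetic minority sample: x_new = x_i + lam * (x_nn - x_i).

    Standard SMOTE draws lam ~ U(0, 1); the adaptive variant draws
    lam ~ Beta(1 + rho, 1 + (1 - rho)), where rho is a (hypothetical)
    per-class imbalance ratio, skewing samples within the segment.
    """
    lam = random.betavariate(1 + rho, 1 + (1 - rho)) if adaptive else random.random()
    return [a + lam * (b - a) for a, b in zip(x_i, x_nn)]

random.seed(0)
x_new = smote_sample([0.0, 0.0], [1.0, 2.0])
print(x_new)  # lies on the segment between the two minority neighbors
```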
E. Machine Learning Models
| Model | Configuration |
|---|---|
| Logistic Regression (Elastic Net) | α = 0.01, ρ = 0.5 |
| SVM (RBF Kernel) | γ = 0.001, C = 1.0 |
| Random Forest | T = 100 trees, max depth 20 |
| Gradient Boosting | ν = 0.1 |
| Neural Network (MLP) | 35 → 128 → 64 → K, dropout=0.5, Adam |
F. Evaluation Metrics
- Accuracy, Precision, Recall, F1-Score
- Matthews Correlation Coefficient (MCC)
- Cohen's Kappa (κ)
- Area Under ROC Curve (AUC-ROC)
5. Experimental Setup
A. Dataset Construction and Sampling Validation
CIC-IDS2017 Sampling:
- Original: N = 2,830,540 flows
- Stratified 20% subset: n = 504,472
- Kolmogorov–Smirnov tests: p > 0.05 (no significant deviations)
- 92% of features: <5% mean deviation
- After PCA: k = 35 components (99.3% variance retained)
Train-Test Split:
- 80:20 stratified split (seed 42)
- Binary: 15,000 samples (7,500 benign, 7,500 attack)
- Multi-class: 35,000 samples via SMOTE (5,000 per class)
B. Hyperparameter Optimization
Configurations:
- Config 1 (Efficiency): Computational efficiency prioritized
- Config 2 (Expressiveness): Accuracy maximized via 3-fold grid search
Key Hyperparameters:
- Logistic Regression: C ∈ {0.1, 100}
- SVM: RBF kernel with γ = 0.1
- Random Forest: n ∈ {100, 200}, depth=20
- Decision Tree: depth ∈ {6, 10, 15}
- KNN: k ∈ {3, 5, 7}
C. Evaluation Protocol
Stage 1: Cross-Validation
- 5-fold stratified cross-validation on training partition (n = 12,000)
- Stratification preserves 50:50 benign-to-attack ratio
- Fold-to-fold variability: σ_CV = √[(1/(K-1)) × Σ(Acc_i − Acc̄)²]
Stage 2: Hold-Out Testing
- Best configuration retrained on full training set
- Evaluated on held-out test set (n = 3,000, 20%)
- Metrics: Accuracy, Precision, Recall, F1, ROC-AUC, Confusion matrices
Statistical Reliability
- Three independent random seeds: 42, 123, 456
- 95% confidence intervals: CI_95% = x̄ ± 1.96 × (σ/√n)
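The interval computation, as a sketch. Note that with only three seeds the normal-approximation constant 1.96 is optimistic (a t-quantile would widen the interval); we follow the formula as stated:

```python
import math, statistics

def ci95(samples):
    """95% normal-approximation CI: mean +/- 1.96 * s / sqrt(n)."""
    m = statistics.mean(samples)
    s = statistics.stdev(samples)            # sample std, n-1 denominator
    h = 1.96 * s / math.sqrt(len(samples))
    return m - h, m + h

# Accuracies from the three seeds (42, 123, 456) would be fed in like this:
lo, hi = ci95([0.980, 0.979, 0.981])
print(round(lo, 4), round(hi, 4))
```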
6. Experimental Results
A. Binary Classification Performance
Linear Models
- Logistic Regression: 92.21% accuracy (σ = 5.81 × 10⁻³)
- Config 2 improvement: +0.30% to 92.51%
Kernel-Based Methods
- SVM (Linear): 83.00% (underfits)
- SVM (RBF): 96.14% (+13.14%, σ = 3.89 × 10⁻³)
Tree-Based Ensembles
- Random Forest Config 1: 95.98%
- Random Forest Config 2: 98.09% (+2.11%, σ = 1.72 × 10⁻³) ✓ BEST
Instance-Based Learning
- KNN (k=5): 97.40% (σ = 0.89 × 10⁻³)
- KNN (k=3): 97.93% (+0.53%, σ = 1.27 × 10⁻³)
B. Multi-Class Classification Performance
| Model | CV Acc. | Test Acc. | Precision | Recall | F1 |
|---|---|---|---|---|---|
| Random Forest (T=10, d=6) | 96.0±0.009 | 97.1 | 96.9 | 97.0 | 96.9 |
| Random Forest (T=15, d=8, m=20) | 98.0±0.007 | 98.0 | 97.9 | 98.0 | 97.9 |
| Decision Tree (d=10) | 96.0±0.012 | 90.3 | 90.1 | 90.2 | 90.1 |
| KNN (k=7, distance-wt) | 94.0±0.014 | 95.2 | 95.0 | 95.3 | 95.1 |
C. Per-Class Breakdown (Random Forest Config 2)
| Attack Class | Precision | Recall | F1-Score |
|---|---|---|---|
| BENIGN | 99.2% | 98.5% | 98.9% |
| DoS | 98.8% | 99.0% | 98.9% |
| DDoS | 98.6% | 98.9% | 98.7% |
| Port Scan | 95.7% | 97.6% | 96.6% |
| Brute Force | 95.1% | 97.5% | 96.3% |
| Web Attack | 91.9% | 96.0% | 93.9% |
| Bot | 90.2% | 95.3% | 92.7% |
7. Federated Learning Convergence Analysis
A. Convergence and Compression Trade-off
EdgeDetect achieves convergence parity with full-precision FedAvg at 32× compression:
- Across 2.8M CIC-IDS2017 samples
- No measurable accuracy degradation (Δ < 0.2 pp)
- Cosine similarity: 0.87 ± 0.04
B. Privacy Enhancement Through Smartification
| Method | Technique | PSNR (dB) | Label Recovery |
|---|---|---|---|
| FedAvg (Undefended) | None | 31.7 | High-fidelity |
| signSGD | Zero-threshold | 16.8 | Partial recovery |
| EdgeDetect | Median-threshold | 15.1 | 14.3% (random) |
C. Theoretical Convergence Analysis
Lemma 1 (Descent under Median-Threshold Smartification):
Let L(W) be L-smooth and bounded below. Let g̃_t denote the smartified gradient with cosine similarity cos(θ_t) = ⟨g_t, g̃_t⟩ / (∥g_t∥ ∥g̃_t∥) ≥ γ > 0.
For sufficiently small step size η:
E[L(W_{t+1})] ≤ L(W_t) − ηγ∥g_t∥² + (Lη²/2)∥g̃_t∥²
Theorem 1 (Convergence under Bounded Variance):
Assume bounded stochastic gradient variance σ² and cosine similarity γ > 0. Then after T rounds:
min_{t≤T} E[∥∇L(W_t)∥²] = O(1 / (γ√T))
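A sketch of how Theorem 1 would follow from Lemma 1 (our reconstruction, not the paper's proof), assuming the second moment of the smartified gradient is bounded, E∥g̃_t∥² ≤ G² (absorbing σ²), and taking step size η = c/√T:

```latex
% Telescoping the descent inequality of Lemma 1 over T rounds:
\mathbb{E}[L(W_T)] \le L(W_0)
  - \eta \gamma \sum_{t<T} \mathbb{E}\|\nabla L(W_t)\|^2
  + \frac{L \eta^2}{2} \sum_{t<T} \mathbb{E}\|\tilde g_t\|^2 .
% Rearranging, using boundedness below (L(W_T) \ge L^{*}) and
% \mathbb{E}\|\tilde g_t\|^2 \le G^2:
\min_{t<T} \mathbb{E}\|\nabla L(W_t)\|^2
  \le \frac{L(W_0) - L^{*}}{\eta \gamma T} + \frac{L \eta G^2}{2\gamma} .
% Substituting \eta = c/\sqrt{T} makes both terms O(1/(\gamma\sqrt{T})).
```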
8. Federated Learning Scalability
A. Convergence Under Different Heterogeneity Levels
| Distribution | Method (K=50) | R₉₅ | R₉₈ | Accuracy | Bandwidth |
|---|---|---|---|---|---|
| IID | FedAvg | 142 | 287 | 98.2% | 129.15 GB |
| IID | EdgeDetect | 145 | 289 | 98.0% | 4.05 GB |
| Non-IID (α=1.0) | FedAvg | 201 | 423 | 96.4% | 190.35 GB |
| Non-IID (α=1.0) | EdgeDetect | 192 | 398 | 96.8% | 5.57 GB |
| Non-IID (α=0.1) | FedAvg | 312 | 687 | 93.8% | 309.15 GB |
| Non-IID (α=0.1) | EdgeDetect | 287 | 612 | 94.2% | 8.57 GB |
| Non-IID (α=0.1) | EdgeDetect+FedProx | 264 | 563 | 95.1% | 7.88 GB |
B. Scalability with Number of Clients
| K Clients | Distribution | R₉₈ | Accuracy | Total Bandwidth |
|---|---|---|---|---|
| 10 | IID | 201 | 98.1% | 2.81 GB |
| 25 | IID | 254 | 98.0% | 3.56 GB |
| 100 | IID | 356 | 97.9% | 4.98 GB |
| 500 | IID | 467 | 97.7% | 6.54 GB |
Sublinear scaling: increasing clients from K=10 to K=500 raises R₉₈ only from 201 to 467, far below the 50× growth in K.
9. Ablation Study
Component-wise Impact Analysis
| Configuration | Accuracy | Communication | PSNR (dB) | Invertible? |
|---|---|---|---|---|
| Full EdgeDetect | 98.0% | 14.0 MB | 15.1 | No |
| – Smartification | 98.2% | 450.0 MB ↑32× | 15.1 | Protected |
| – Encryption (HE) | 98.0% | 14.0 MB | 31.7 ↑ | Yes |
| – DP Noise | 98.1% | 14.0 MB | 14.2 | Protected |
| – PCA (78 features) | 97.9% | 58.2 MB ↑4× | 15.3 | Protected |
| – SMOTE | 94.2% ↓ | 14.0 MB | 15.1 | Protected |
| FedAvg (No Protection) | 98.2% | 450.0 MB | 31.7 | Yes |
| signSGD | 97.8% | 14.1 MB | 16.8 | Partial |
Key Findings:
- Smartification: Essential for communication efficiency (32×), negligible accuracy loss
- Encryption: Critical for privacy (PSNR 31.7 → 15.1 dB)
- SMOTE: Essential for accuracy (+3.8 pp gain)
- PCA: Reduces dimensionality (4.16×) with negligible impact
10. Comparison with State-of-the-Art
| Study | Year | Model | Accuracy | F1 | Dataset | Classes | Privacy | Comm. (MB) |
|---|---|---|---|---|---|---|---|---|
| Centralized Approaches | | | | | | | | |
| Alam et al. | 2023 | CNN | 97.2% | 96.8 | CIC-IDS2017 | Binary | ✗ | N/A |
| Ghani et al. | 2023 | XGBoost | 96.1% | 95.4 | CIC-IDS2017 | 7-class | ✗ | N/A |
| Savic et al. | 2021 | LSTM-AE | 95.5% | 94.2 | NSL-KDD | Binary | ✗ | N/A |
| Federated Learning Approaches | | | | | | | | |
| Liu et al. | 2023 | Fed-DNN | 96.3% | 95.1 | UNSW-NB15 | 5-class | DP | 380 |
| Wang et al. | 2022 | Fed-CNN | 94.7% | 93.8 | CIC-IDS2017 | Binary | ✗ | 520 |
| Zhang et al. | 2022 | FedAvg-LSTM | 93.5% | 92.4 | KDD-CUP99 | 4-class | DP | 410 |
| Chen et al. | 2021 | Fed-XGB | 95.8% | 94.9 | IoT-23 | Binary | SecAgg | 290 |
| This Work | | | | | | | | |
| EdgeDetect (multi-class) | 2026 | Fed-RF | 98.0% | 97.9 | CIC-IDS2017 | 7-class | HE | 14 |
| EdgeDetect (binary) | 2026 | Fed-RF | 96.0% | 96.0 | CIC-IDS2017 | Binary | HE | 14 |
Key Advantages:
- Highest accuracy on CIC-IDS2017 (98.0% vs 96.3%)
- 96.9% communication reduction vs federated baselines (14 MB vs 290-520 MB)
- Strongest cryptographic privacy (Paillier HE vs DP/SecAgg)
- Practical edge deployment (4.2 MB, 0.8 ms on Raspberry Pi 4)
11. Edge Deployment Evaluation
A. Raspberry Pi 4 Deployment
| Metric | Random Forest | KNN | SVM | Logistic Reg. |
|---|---|---|---|---|
| Memory | 234 MB | 412 MB | 178 MB | 45 MB |
| Training Time | 12.3 s | 0.3 s* | 18.7 s | 2.4 s |
| Inference Latency | 0.87 ms | 3.21 ms | 1.45 ms | 0.12 ms |
| Energy per Inference | 12 mJ | — | — | — |
| Accuracy | 98.0% | 95.2% | 96.0% | 93.0% |
*KNN training is instantaneous (lazy learning) but requires 412 MB for storage.
B. Resource-Constrained Feasibility
- Memory footprint: 4.2 MB per client for gradient storage
- Encryption overhead: 156.4 ms per round (per-round encryption complexity O(d log n))
- Total bandwidth per round: 14 MB (vs 450 MB for full-precision)
- Accuracy loss on edge: <0.5% when deployed on Raspberry Pi 4
12. Robustness Analysis
A. Poisoning Attack Resilience
Setting: 5% to 20% of clients send poisoned updates
| Poisoning Rate | Accuracy | Macro F1 | p-value |
|---|---|---|---|
| 0% (Clean) | 98.0% | 0.979 | — |
| 5% | 96.4% | 0.961 | <0.001 |
| 10% | 92.1% | 0.918 | <0.001 |
| 15% | 89.3% | 0.887 | <0.001 |
| 20% | 87.0% | 0.850 | <0.001 |
Conclusion: Maintains >85% accuracy even with 20% malicious clients (p < 0.001).
B. Differential Privacy-Utility Trade-off
| ε | δ | Accuracy | F1 | Privacy Loss |
|---|---|---|---|---|
| 10.0 | 10⁻⁵ | 98.2% | 0.980 | Weak |
| 1.0 | 10⁻⁵ | 98.1% | 0.979 | Moderate |
| 0.1 | 10⁻⁵ | 96.8% | 0.965 | Strong |
13. Discussion
Key Insights for Federated IDS in 6G-IoT
PCA reveals strong redundancy: 35 components retain 99.3% variance with negligible performance loss, enabling efficient computation and communication.
Random Forest optimal: Best stability–accuracy trade-off (98.0% accuracy, 97.9% macro F1, σ = 0.0017).
Imbalance handling essential: SMOTE–undersampling improves minority recall from 0.39 to 0.98.
Gradient smartification superior to signSGD:
- Preserves gradient alignment (0.87±0.04 cosine similarity)
- Achieves 96.9% communication reduction
- Improves privacy by reducing the information content of the shared updates
Paillier encryption effective: Complete inversion resistance while retaining 98.7% of centralized accuracy.
Challenges and Future Work
- Non-convex convergence: Theoretical analysis for deep learning architectures
- Concept drift: Adaptation to evolving attack patterns
- White-box robustness: Defense against adversarial gradient attacks
- Cumulative privacy loss: Formal composition under differential privacy
14. Conclusion
EdgeDetect introduces a privacy-preserving federated intrusion detection framework for resource-constrained 6G-IoT environments. The framework employs:
- Gradient smartification: Median-based binarization achieving 32× communication reduction
- Paillier homomorphic encryption: Only aggregated updates visible to server
- Adaptive class balancing: SMOTE for minority-class robustness
- Secure federated aggregation: Protection against inference and poisoning attacks
Performance Summary
| Metric | Value |
|---|---|
| Multi-class Accuracy | 98.0% |
| Macro F1-Score | 97.9% |
| Communication Reduction | 96.9% (450 MB → 14 MB) |
| Edge Memory | 4.2 MB |
| Edge Latency | 0.8 ms |
| Poisoning Resilience (20% attackers) | 87% accuracy |
| Gradient Inversion PSNR | 15.1 dB (vs 31.7 dB undefended) |
EdgeDetect demonstrates that secure federated IDS can meet strict privacy, efficiency, and reliability requirements of next-generation 6G-IoT edge networks.
Acknowledgments
We thank the Canadian Institute for Cybersecurity for providing the CIC-IDS2017 dataset and the anonymous reviewers for their valuable feedback.