Network Anomaly Detection - Transfer Learning Model
A transfer-learning-based anomaly detection system designed to identify zero-day-style attacks in network traffic.
Model Description
This model combines:
- Variational Autoencoder (VAE): Pre-trained on normal network traffic, then fine-tuned on the target domain
- Isolation Forest: Trained on latent space representations for anomaly scoring
Architecture
- Input Features: 7 network traffic features (protocol, packet length, inter-arrival time, ports, TCP flags)
- Latent Dimension: 8
- Encoder: 7 → 16 → 8
- Decoder: 8 → 16 → 7
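To give a feel for how small this network is, the sketch below counts trainable parameters for the layer widths listed above. It assumes plain fully connected layers with biases, which the model card does not confirm; the function name is illustrative only.

```python
# Layer widths from the model card; plain dense layers with biases are an assumption.
ENCODER_WIDTHS = [7, 16, 8]
DECODER_WIDTHS = [8, 16, 7]


def dense_param_count(widths):
    """Count weights and biases for a stack of fully connected layers."""
    total = 0
    for n_in, n_out in zip(widths, widths[1:]):
        total += n_in * n_out + n_out  # weight matrix plus bias vector
    return total


encoder_params = dense_param_count(ENCODER_WIDTHS)
decoder_params = dense_param_count(DECODER_WIDTHS)
print(encoder_params, decoder_params, encoder_params + decoder_params)
```

Under these assumptions the encoder has 264 parameters and the decoder 263. A VAE that emits both a mean and a log-variance vector would add a second 16 → 8 head, which is 136 more parameters.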
Training Details
Dataset
- Source: NSL-KDD Intrusion Detection Dataset
- Pre-training: 6,400 normal network samples
- Fine-tuning: 8,000 mixed samples (80% normal, 20% attacks)
- Test Set: 2,000 samples (80% normal, 20% attacks)
Training Configuration
- Pre-training epochs: 50
- Fine-tuning epochs: 10
- Optimizer: Adam (learning rate: 0.001)
- Loss: Reconstruction error (MSE)
- Contamination parameter: 0.05 (5% expected anomaly rate)
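The contamination parameter controls how many samples end up labelled as anomalies. In scikit-learn's Isolation Forest, for instance, it places the decision threshold so that roughly that fraction of the training scores falls below it. The following is a simplified, library-free sketch of that idea, assuming lower scores mean "more anomalous"; the function name and the scores are made up for illustration.

```python
def flag_lowest_fraction(scores, contamination=0.05):
    """Label the lowest-scoring fraction as anomalies (-1) and the rest as normal (1)."""
    n_flagged = round(len(scores) * contamination)
    # Indices ordered from most anomalous (lowest score) to least.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    flagged = set(order[:n_flagged])
    return [-1 if i in flagged else 1 for i in range(len(scores))]


# Made-up scores for 100 samples: lower means more anomalous.
example_scores = [i / 100 for i in range(100)]
labels = flag_lowest_fraction(example_scores, contamination=0.05)
print(labels.count(-1))
```

With 100 samples and a contamination of 0.05, five samples are flagged. A real implementation would use a percentile of the scores and may treat ties differently.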
Performance
Metrics
| Metric | Value |
|---|---|
| Accuracy | 82.95% |
| Precision | 74.38% |
| Recall | 22.50% |
| F1-Score | 34.55% |
| ROC-AUC | 0.96 |
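The figures in the table are mutually consistent with a single confusion matrix on the 2,000-sample test set (400 attacks, 1,600 normal). The counts below are inferred from the reported percentages, not taken from the model card, so treat them as a plausibility check.

```python
# Counts inferred from the reported metrics and the 2,000-sample test set
# (400 attacks, 1,600 normal); they are not stated in the model card.
tp, fp, fn, tn = 90, 31, 310, 1569

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
accuracy = (tp + tn) / (tp + fp + fn + tn)
false_positive_rate = fp / (fp + tn)

for name, value in [("accuracy", accuracy), ("precision", precision),
                    ("recall", recall), ("f1", f1), ("fpr", false_positive_rate)]:
    print(f"{name}: {value:.2%}")
```

These counts reproduce the accuracy, precision, recall, and F1 values above, and they imply a false positive rate of about 1.9% on normal traffic.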
Interpretation
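- Precision (74.38%): About three in four flagged samples are real attacks, so alerts are usually worth investigating
- High ROC-AUC (0.96): The anomaly score ranks anomalous samples above normal ones well across thresholds, which suggests the low recall stems from the conservative decision threshold rather than from poor score separation
- Optimized for zero-day detection: Flags only high-confidence anomalies instead of trying to catch every attack, so 77.5% of attacks go undetected at the default threshold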
Feature Specifications
Input Features (7 total)
- Protocol: Encoded as TCP=1, UDP=2, ICMP=3, IGMP=4, GRE=5
- Packet Length: Bytes of packet payload
- Packet Length Variance: Variability of packet sizes, reported as a standard deviation
- Inter-arrival Time: Time between consecutive packets (seconds)
- High Port Indicator: Binary flag for high-numbered ports
- Low Port Indicator: Binary flag for low-numbered ports
- TCP Flags: Present/absent indicator
All features are standardized via StandardScaler before model input.
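As a rough illustration of the feature layout, the sketch below encodes traffic records into 7-value rows and z-scores each column. Several details are assumptions not stated in this card: the 1024 cutoff between low and high ports, the use of the destination port, and the record keys (`packet_length_std` in particular). In practice the scaler fitted during training should be reused; the batch statistics here are for demonstration only.

```python
from statistics import mean, pstdev

PROTOCOL_CODES = {"TCP": 1, "UDP": 2, "ICMP": 3, "IGMP": 4, "GRE": 5}
HIGH_PORT_MIN = 1024  # assumption: the card does not define "high" vs "low" ports


def encode_record(record):
    """Turn one traffic record (a dict) into the 7-value feature row."""
    port = record["dst_port"]  # assumption: port flags describe the destination port
    return [
        float(PROTOCOL_CODES[record["protocol"]]),
        float(record["packet_length"]),
        float(record["packet_length_std"]),
        float(record["inter_arrival"]),
        1.0 if port >= HIGH_PORT_MIN else 0.0,
        1.0 if port < HIGH_PORT_MIN else 0.0,
        1.0 if record.get("flags") else 0.0,
    ]


def standardize(rows):
    """Column-wise z-score using the population standard deviation."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # A constant column keeps scale 1.0 so it maps to zeros instead of dividing by zero.
    scales = [pstdev(col) or 1.0 for col in columns]
    return [[(x - m) / s for x, m, s in zip(row, means, scales)] for row in rows]


# Made-up records for illustration.
records = [
    {"protocol": "TCP", "packet_length": 512, "packet_length_std": 20.0,
     "inter_arrival": 0.001, "dst_port": 443, "flags": "SYN"},
    {"protocol": "UDP", "packet_length": 128, "packet_length_std": 5.0,
     "inter_arrival": 0.050, "dst_port": 53000, "flags": ""},
]
X = standardize([encode_record(r) for r in records])
```

Like `StandardScaler`, this uses the population standard deviation and leaves zero-variance columns with a scale of 1.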
Usage
Python API
import joblib
import numpy as np
from tensorflow import keras
# Load model components
detector = joblib.load("detector.pkl")
autoencoder = keras.models.load_model("autoencoder/model.h5")
# Prepare your data (shape: [n_samples, 7])
X_new = np.array([[...]]) # Your traffic features
# Get predictions
predictions, latent_scores, recon_errors = detector.predict(X_new)
# predictions: -1 = anomaly, 1 = normal
# latent_scores: anomaly scores in latent space
# recon_errors: reconstruction errors from autoencoder
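Once predictions are available, a common next step is to triage the flagged samples. The helper below is a hypothetical sketch, not part of this repository: it ranks samples labelled -1 by reconstruction error, highest first. The example arrays are made up and stand in for the outputs of `detector.predict`.

```python
def rank_anomalies(predictions, recon_errors):
    """Return indices of samples labelled -1, highest reconstruction error first."""
    flagged = [i for i, label in enumerate(predictions) if label == -1]
    return sorted(flagged, key=lambda i: recon_errors[i], reverse=True)


# Made-up outputs standing in for detector.predict(X_new).
example_predictions = [1, -1, 1, -1, -1]
example_recon_errors = [0.02, 0.91, 0.05, 0.40, 1.37]
print(rank_anomalies(example_predictions, example_recon_errors))
```

The same function accepts NumPy arrays, since it only iterates and indexes.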
FastAPI Server
python -m src.api.server
Then POST to /predict:
{
  "src_ip": "192.168.1.100",
  "dst_ip": "10.0.0.50",
  "src_port": 54321,
  "dst_port": 443,
  "protocol": "TCP",
  "packet_length": 512,
  "inter_arrival": 0.001,
  "flags": "SYN"
}
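A request like this can be built with the Python standard library. The sketch below assumes the server listens on `localhost:8000`, which this card does not specify, and it only constructs the request; the response schema is not documented here, so the send step is left commented out.

```python
import json
import urllib.request

# Assumption: the server listens on localhost:8000; the card does not give host or port.
PREDICT_URL = "http://localhost:8000/predict"


def build_predict_request(payload, url=PREDICT_URL):
    """Build a JSON POST request for the /predict endpoint without sending it."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = {
    "src_ip": "192.168.1.100",
    "dst_ip": "10.0.0.50",
    "src_port": 54321,
    "dst_port": 443,
    "protocol": "TCP",
    "packet_length": 512,
    "inter_arrival": 0.001,
    "flags": "SYN",
}
request = build_predict_request(payload)

# Sending requires the server to be running:
# with urllib.request.urlopen(request, timeout=5) as response:
#     print(json.load(response))
```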
Model Limitations
- Trained on specific dataset: Best performance on NSL-KDD or similar network traffic patterns
- Contamination parameter: Assumes ~5% of traffic is anomalous; adjust for different environments
- Feature dependencies: Requires exact 7 features in standardized form
- Recall trade-off: Conservative detection (22.5% recall) to minimize false alarms
Fine-tuning for Your Domain
To adapt this model to your network:
from src.models.trainer import ModelTrainer
trainer = ModelTrainer()
# Your domain-specific traffic data
X_target = load_your_traffic_data() # shape: [n, 7]
# Fine-tune the autoencoder
history = trainer.finetune_with_strategy(
    'freeze_encoder',  # or 'progressive', 'layer_wise'
    X_target=X_target,
    epochs=10,
    learning_rate=0.0001
)
# Retrain Isolation Forest on new latent space
trainer.train_transfer_learning(
    X_pretrain=X_target,  # Use your data
    X_finetune=X_target,
    ae_epochs=0,  # Already fine-tuned
    finetune_epochs=10
)
Citation
If you use this model in your research, please cite:
@software{anomaly_detector_2025,
  title={Network Anomaly Detection - Transfer Learning Model},
  author={CyberSecurityTL Contributors},
  year={2025},
  url={https://huggingface.co/{repo_name}}
}
License
Apache License 2.0 - See LICENSE file for details
Related Work
- NSL-KDD Dataset: https://www.unb.ca/cic/datasets/nsl-kdd.html
- Isolation Forest: Liu et al., 2008 (https://doi.org/10.1109/ICDM.2008.17)
- Autoencoder Anomaly Detection: Sakurada & Yairi, 2014
Disclaimer
This model is trained on NSL-KDD, a dataset released in 2009 whose traffic derives from the 1998 DARPA captures behind KDD Cup 1999. It may not detect modern attack techniques. Always use it in conjunction with other security tools and manual analysis.
Model created: 2025-12-21 10:22:44