Network Anomaly Detection - Transfer Learning Model

A transfer-learning-based anomaly detection system designed to identify zero-day-style attacks in network traffic.

Model Description

This model combines:

Variational Autoencoder (VAE): Pre-trained on a large corpus of normal network traffic, then fine-tuned on the target domain
Isolation Forest: Trained on latent space representations for anomaly scoring

Architecture

Input Features: 7 network traffic features (protocol, packet length, packet length variance, inter-arrival time, high/low port indicators, TCP flags)
Latent Dimension: 8
Encoder: 7 → 16 → 8
Decoder: 8 → 16 → 7
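
The layer widths above can be traced with a small NumPy sketch. It uses random weights and a plain deterministic forward pass, so it only illustrates tensor shapes and parameter counts. The ReLU activations and the omission of the VAE sampling step are assumptions made for illustration, not details taken from the trained model.

```python
import numpy as np

# Layer widths from the Architecture section: encoder 7 -> 16 -> 8, decoder 8 -> 16 -> 7.
LAYER_SIZES = [7, 16, 8, 16, 7]
RELU_LAYERS = {0, 2}  # assumption: ReLU on the two 16-unit layers only


def init_layers(sizes, seed=0):
    """Create random (weights, bias) pairs for each pair of consecutive widths."""
    rng = np.random.default_rng(seed)
    return [
        (rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def forward(x, layers):
    """Return the activations produced after every layer."""
    activations = []
    for i, (w, b) in enumerate(layers):
        x = x @ w + b
        if i in RELU_LAYERS:
            x = np.maximum(x, 0.0)
        activations.append(x)
    return activations


layers = init_layers(LAYER_SIZES)
batch = np.zeros((4, 7))  # 4 dummy samples, 7 features each
shapes = [a.shape for a in forward(batch, layers)]
n_params = sum(w.size + b.size for w, b in layers)
print(shapes)    # shape after each layer
print(n_params)  # weights + biases in this plain (non-variational) sketch
```

A true VAE encoder would emit a mean and a log-variance vector for the latent layer, which adds parameters beyond the count printed here.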

Training Details

Dataset

Source: NSL-KDD Intrusion Detection Dataset
Pre-training: 6,400 normal network samples
Fine-tuning: 8,000 mixed samples (80% normal, 20% attacks)
Test Set: 2,000 samples (80% normal, 20% attacks)

Training Configuration

Pre-training epochs: 50
Fine-tuning epochs: 10
Optimizer: Adam (learning rate: 0.001)
Loss: Reconstruction error (MSE)
Contamination parameter: 0.05 (5% expected anomaly rate)
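
The contamination setting controls what fraction of the training points the Isolation Forest labels as anomalies. The sketch below shows the effect with scikit-learn on synthetic 8-dimensional vectors standing in for latent codes. The synthetic data, `n_estimators`, and `random_state` are assumptions, not the values used to train this model.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)
# Synthetic stand-in for 8-dimensional latent codes (not real model output).
latent = rng.normal(size=(1000, 8))

# contamination=0.05 places the decision threshold so that about 5% of the
# training points fall on the anomalous side.
forest = IsolationForest(n_estimators=100, contamination=0.05, random_state=42)
forest.fit(latent)

labels = forest.predict(latent)            # -1 = anomaly, 1 = normal
scores = forest.decision_function(latent)  # negative = anomalous side of the threshold
flagged_fraction = float(np.mean(labels == -1))
print(f"flagged fraction: {flagged_fraction:.3f}")
```

Raising `contamination` moves the threshold so that more samples are flagged, which generally trades precision for recall.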

Performance

Metrics

Metric	Value
Accuracy	82.95%
Precision	74.38%
Recall	22.50%
F1-Score	34.55%
ROC-AUC	0.96
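
The reported figures are consistent with a single confusion matrix on the 2,000-sample test set. The sketch below reconstructs that matrix from the published recall and precision. The integer counts are inferred by rounding and were not published with the model, so treat them as an estimate.

```python
# Test-set composition from the Dataset section: 2,000 samples, 20% attacks.
n_total = 2000
n_attacks = round(n_total * 0.20)
n_normal = n_total - n_attacks

recall, precision = 0.2250, 0.7438  # reported values

tp = round(recall * n_attacks)   # attacks correctly flagged
fn = n_attacks - tp              # attacks missed
fp = round(tp / precision) - tp  # normal samples flagged by mistake
tn = n_normal - fp               # normal samples correctly passed

accuracy = (tp + tn) / n_total
f1 = 2 * tp / (2 * tp + fp + fn)
false_positive_rate = fp / n_normal
flagged_fraction = (tp + fp) / n_total

print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"accuracy={accuracy:.4f} f1={f1:.4f}")
print(f"false positive rate={false_positive_rate:.4f} flagged={flagged_fraction:.4f}")
```

Under these inferred counts the detector flags about 6% of the test set, close to the 5% contamination setting, and most of the accuracy comes from correctly passing normal traffic.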

Interpretation

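High Precision (74.38%): About three of every four flagged samples are real attacks, so roughly one alert in four is a false alarm
Strong ROC-AUC (0.96): Anomaly scores rank attacks above normal traffic in most cases, independent of the decision threshold
Optimized for Zero-Day Detection: Flags only high-confidence anomalies; at the default threshold it catches 22.5% of attacks rather than trying to catch them all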
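
A high ROC-AUC and a low recall can coexist because ROC-AUC measures ranking quality across all thresholds, while recall depends on the one threshold actually used. The pure-Python sketch below illustrates this with made-up scores; the numbers are illustrative only and are not model output.

```python
def roc_auc(scores, labels):
    """Probability that a random attack outscores a random normal sample (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def recall_at(scores, labels, threshold):
    """Fraction of attacks whose score reaches the threshold."""
    caught = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    return caught / sum(labels)


# Made-up anomaly scores (higher = more anomalous); 1 = attack, 0 = normal.
scores = [0.10, 0.15, 0.20, 0.25, 0.30, 0.45, 0.40, 0.50, 0.60, 0.90]
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]

auc = roc_auc(scores, labels)
strict_recall = recall_at(scores, labels, threshold=0.80)
loose_recall = recall_at(scores, labels, threshold=0.35)
print(f"auc={auc:.4f} recall@0.80={strict_recall:.2f} recall@0.35={loose_recall:.2f}")
```

In this toy case the ranking is nearly perfect, yet the strict threshold catches only one attack in four. Lowering the threshold catches every attack at the cost of flagging one normal sample.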

Feature Specifications

Input Features (7 total)

Protocol: Encoded as TCP=1, UDP=2, ICMP=3, IGMP=4, GRE=5
Packet Length: Bytes of packet payload
Packet Length Variance: Spread of packet sizes, computed as a standard deviation despite the name
Inter-arrival Time: Time between consecutive packets (seconds)
High Port Indicator: Binary flag for high-numbered ports
Low Port Indicator: Binary flag for low-numbered ports
TCP Flags: Present/absent indicator

All features are standardized via StandardScaler before model input.
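
The feature list above can be assembled into a raw 7-element vector along the following lines. `build_feature_vector` is a hypothetical helper, not part of this repository. The port cutoff of 1024 and the choice of which port to test are assumptions, since the model card does not state them.

```python
# Protocol codes as listed in the feature specification.
PROTOCOL_CODES = {"TCP": 1, "UDP": 2, "ICMP": 3, "IGMP": 4, "GRE": 5}


def build_feature_vector(protocol, packet_length, packet_length_std,
                         inter_arrival, port, has_tcp_flags,
                         high_port_threshold=1024):
    """Assemble the 7 raw features in the documented order (hypothetical helper).

    high_port_threshold is an assumption: the model card does not say where
    "low" ports end and "high" ports begin.
    """
    is_high_port = int(port >= high_port_threshold)
    return [
        float(PROTOCOL_CODES[protocol.upper()]),  # Protocol
        float(packet_length),                     # Packet Length (bytes)
        float(packet_length_std),                 # Packet Length Variance
        float(inter_arrival),                     # Inter-arrival Time (seconds)
        float(is_high_port),                      # High Port Indicator
        float(1 - is_high_port),                  # Low Port Indicator
        float(bool(has_tcp_flags)),               # TCP Flags present/absent
    ]


raw = build_feature_vector("tcp", 512, 0.0, 0.001, 443, True)
print(raw)
```

The resulting vector is still unscaled; it would need to pass through the same StandardScaler that was fitted at training time before reaching the model.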

Usage

Python API

import joblib
import numpy as np
from tensorflow import keras

# Load model components
detector = joblib.load("detector.pkl")
autoencoder = keras.models.load_model("autoencoder/model.h5")

# Prepare your data (shape: [n_samples, 7])
X_new = np.array([[...]])  # Your traffic features

# Get predictions
predictions, latent_scores, recon_errors = detector.predict(X_new)
# predictions: -1 = anomaly, 1 = normal
# latent_scores: anomaly scores in latent space
# recon_errors: reconstruction errors from autoencoder

FastAPI Server

python -m src.api.server

Then POST to /predict:

{
  "src_ip": "192.168.1.100",
  "dst_ip": "10.0.0.50",
  "src_port": 54321,
  "dst_port": 443,
  "protocol": "TCP",
  "packet_length": 512,
  "inter_arrival": 0.001,
  "flags": "SYN"
}
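
A request like the one above can be built from Python with the standard library alone. This is a minimal sketch: the host and port (`http://localhost:8000`) are assumptions, and the response schema is not documented here, so the code only constructs the request and leaves sending it commented out.

```python
import json
import urllib.request


def build_predict_request(base_url, packet):
    """Build (but do not send) a JSON POST request for the /predict endpoint."""
    body = json.dumps(packet).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


packet = {
    "src_ip": "192.168.1.100",
    "dst_ip": "10.0.0.50",
    "src_port": 54321,
    "dst_port": 443,
    "protocol": "TCP",
    "packet_length": 512,
    "inter_arrival": 0.001,
    "flags": "SYN",
}

# Assumed address; use whatever host and port the server reports at startup.
request = build_predict_request("http://localhost:8000", packet)
print(request.get_method(), request.full_url)

# To send it once the server is running:
# with urllib.request.urlopen(request, timeout=5) as response:
#     print(json.loads(response.read()))
```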

Model Limitations

Trained on specific dataset: Best performance on NSL-KDD or similar network traffic patterns
Contamination parameter: Assumes ~5% of traffic is anomalous; adjust for different environments
Feature dependencies: Requires exact 7 features in standardized form
Recall trade-off: Conservative detection (22.5% recall) keeps false alarms low but misses roughly 78% of attacks at the default threshold

Fine-tuning for Your Domain

To adapt this model to your network:

from src.models.trainer import ModelTrainer

trainer = ModelTrainer()

# Your domain-specific traffic data
X_target = load_your_traffic_data()  # shape: [n, 7]

# Fine-tune the autoencoder
history = trainer.finetune_with_strategy(
    'freeze_encoder',  # or 'progressive', 'layer_wise'
    X_target=X_target,
    epochs=10,
    learning_rate=0.0001
)

# Retrain Isolation Forest on new latent space
trainer.train_transfer_learning(
    X_pretrain=X_target,  # Use your data
    X_finetune=X_target,
    ae_epochs=0,  # Already fine-tuned
    finetune_epochs=10
)

Citation

If you use this model in your research, please cite:

@software{anomaly_detector_2025,
  title={Network Anomaly Detection - Transfer Learning Model},
  author={CyberSecurityTL Contributors},
  year={2025},
  url={https://huggingface.co/{repo_name}}
}

License

Apache License 2.0 - See LICENSE file for details

Related Work

NSL-KDD Dataset: https://www.unb.ca/cic/datasets/nsl.html
Isolation Forest: Liu et al., 2008 (https://doi.org/10.1109/ICDM.2008.17)
Autoencoder Anomaly Detection: Sakurada & Yairi, 2014

Disclaimer

This model is trained on the NSL-KDD dataset, which was released in 2009 and derives from the KDD Cup 1999 traffic captures. It may not detect modern attack techniques. Always use it in conjunction with other security tools and manual analysis.

Model created: 2025-12-21 10:22:44
