Security Event Detector - CodeBERT with LoRA
Model Description
This model detects security-relevant events in system logs using CodeBERT fine-tuned with LoRA (Low-Rank Adaptation).
Task: Binary classification (Normal vs Security Event)
Base Model: microsoft/codebert-base
Fine-tuning Method: LoRA (about 98% fewer trainable parameters than full fine-tuning)
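The 98% figure follows from the parameter counts listed under Model Details (about 2M trainable out of about 125M). As a rough sketch, the arithmetic can be checked as below. The `lora_params` helper uses the standard LoRA count of `rank * (d_in + d_out)` per adapted weight matrix; the rank of 8 and the choice of a 768x768 projection are hypothetical values for illustration, not settings stated for this model.

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def trainable_reduction(total: int, trainable: int) -> float:
    """Fraction of the model's parameters that are NOT updated during training."""
    return 1.0 - trainable / total


# Approximate figures taken from this model card.
reduction = trainable_reduction(total=125_000_000, trainable=2_000_000)
print(f"Trainable parameter reduction: {reduction:.1%}")

# Hypothetical example: one 768x768 projection adapted at rank 8.
print(lora_params(768, 768, 8))
```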
Training Data
Trained on synthetic and real security logs including:
- Authentication failures
- Exploit attempts
- Buffer overflows
- Network attacks
- Privilege escalation attempts
Performance
- Accuracy: ~95%
- F1 Score: ~0.94
- Inference Speed: ~50ms per log (GPU)
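For reference, accuracy and F1 for a binary classifier can be computed from labels and predictions as sketched below. This only illustrates the metric definitions on made-up labels; it is not the evaluation script or data behind the figures above, and the function name `binary_metrics` is hypothetical.

```python
def binary_metrics(y_true: list[int], y_pred: list[int]) -> dict[str, float]:
    """Accuracy, precision, recall and F1, treating class 1 (Security) as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels and predictions (0=Normal, 1=Security), for illustration only.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(labels, predictions)
print(metrics)
```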
Usage
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
# Load model
tokenizer = AutoTokenizer.from_pretrained("Swapnanil09/security-event-detector")
model = AutoModelForSequenceClassification.from_pretrained("Swapnanil09/security-event-detector")
# Analyze log
log = "Failed password for root from 192.168.1.100 port 22 ssh2"
inputs = tokenizer(log, return_tensors="pt", truncation=True, max_length=128, padding=True)
with torch.no_grad():
    outputs = model(**inputs)
    prediction = torch.argmax(outputs.logits, dim=-1)
is_security = prediction.item() == 1
print(f"Security Event: {is_security}")
Model Details
- Parameters: ~125M (only ~2M trainable with LoRA)
- Input: System log text (max 128 tokens)
- Output: Binary classification (0=Normal, 1=Security)
- Confidence Scores: Applying softmax to the two output logits gives per-class probabilities
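One way to turn the two output logits into a confidence score is a softmax followed by a threshold on the Security class. The sketch below uses plain Python so the arithmetic is visible. The name `security_confidence`, the 0.5 threshold, and the example logits are illustrative assumptions; in the Usage code the same probabilities can be obtained with `torch.softmax(outputs.logits, dim=-1)`.

```python
import math


def softmax(logits: list[float]) -> list[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def security_confidence(logits: list[float],
                        threshold: float = 0.5) -> tuple[bool, float]:
    """Return (is_security, probability of class 1) for [normal, security] logits."""
    p_security = softmax(logits)[1]
    return p_security >= threshold, p_security


# Hypothetical logits in the order [Normal, Security],
# e.g. from outputs.logits[0].tolist() in the Usage code.
is_security, confidence = security_confidence([-1.2, 2.3])
print(f"Security Event: {is_security} (confidence {confidence:.2f})")
```

Raising the threshold above 0.5 trades recall for fewer false alarms, which may suit noisy log sources.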
Limitations
- Trained primarily on English logs
- May not detect novel/zero-day attacks
- Performance depends on log format similarity to training data
Citation
@misc{security-event-detector,
author = {Your Name},
title = {Security Event Detector with CodeBERT and LoRA},
year = {2025},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/Swapnanil09/security-event-detector}}
}
License
MIT License