Pendo GPT-2 Medium Teacher Model
Model Description
This is a GPT2-MEDIUM (355M parameters) model fine-tuned on WikiText-103 (full) + Wikipedia EN (20231101, 30%) for text generation and prediction tasks. It serves as part of the Pendo Text Editor's predictive text system.
Key Features:
- ๐ฏ Fine-tuned on full WikiText-103 (full) + Wikipedia EN (20231101, 30%) dataset
- โก Optimized training on 2x NVIDIA H100 80GB
- ๐ Excellent text generation quality
- ๐ Production-ready for real-time predictions
Project: Pendo Text Editor - A modern text editor with AI-powered predictive text
Model Details
Architecture: GPT2-MEDIUM
- Parameters: 355M
- Layers: 24
- Hidden size: 1024
- Attention heads: 16
- Context length: 1024 tokens
Training Infrastructure:
- Hardware: 2x NVIDIA H100 80GB
- Training time: ~3 hours
- Mixed precision: bf16
- Framework: PyTorch + HuggingFace Transformers
Training Details
Dataset
- Training Data: WikiText-103 (full) + Wikipedia EN (20231101, 30%)
- Total Size: ~100M+ tokens
- Train/Validation Split: 90% train, 10% validation
- Data Quality: High-quality Wikipedia-style text from curated sources
- Knowledge Cutoff: 2023
Hyperparameters
Training Configuration:
โโ Epochs: 3
โโ Batch size: 16 per device (effective: 128 with gradient accumulation)
โโ Learning rate: 3e-5 (cosine with 1000 warmup steps)
โโ Block size: 512 tokens
โโ Weight decay: 0.01
โโ Gradient clipping: 1.0
โโ Optimizer: AdamW
Optimizations
- โ bf16 mixed precision training (2-3x speedup)
- โ Gradient accumulation (stable large-batch training)
- โ cosine with 1000 warmup steps learning rate schedule
- โ Multi-GPU training with Distributed Data Parallel
- โ Proper train/validation split (no data leakage)
Performance
Metrics (WikiText-103 (full) + Wikipedia EN (20231101, 30%) Test Set)
| Metric | Value |
|---|---|
| Validation Loss | 2.706 |
| Training Loss | 2.822 |
| Perplexity | ~15 |
16% improvement over baseline
No overfitting detected - model shows healthy generalization!
Usage
Basic Text Generation
from transformers import AutoTokenizer, AutoModelForCausalLM
# Load model
tokenizer = AutoTokenizer.from_pretrained("bekalebendong/pendo-gpt2-medium-teacher")
model = AutoModelForCausalLM.from_pretrained("bekalebendong/pendo-gpt2-medium-teacher")
# Generate text
prompt = "The history of"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
**inputs,
max_new_tokens=50,
do_sample=True,
top_k=50,
top_p=0.95,
temperature=0.8
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
For Text Prediction (Pendo Editor)
from transformers import pipeline
# Create prediction pipeline
predictor = pipeline('text-generation', model="bekalebendong/pendo-gpt2-medium-teacher")
# Get next word predictions
text = "Machine learning is"
predictions = predictor(
text,
max_new_tokens=1,
num_return_sequences=5,
return_full_text=False
)
for pred in predictions:
print(pred['generated_text'])
Intended Use
Primary Use Cases
- Text Prediction: Real-time text suggestions in editors
- Text Generation: General-purpose text completion
- Fine-tuning Base: Starting point for domain-specific models
- Research: Educational and research purposes
Deployment Targets
- Local applications (desktop/laptop)
- Cloud inference APIs
- Edge devices (with quantization)
Limitations
- Domain: Primarily trained on Wikipedia-style text
- Recency: Knowledge cutoff at 2023 (based on training data: WikiText-103 (full) + Wikipedia EN (20231101, 30%))
- Bias: May reflect biases present in Wikipedia
- Size: 355M parameters requires storage
- Languages: English only
Ethical Considerations
- Bias Mitigation: Model may perpetuate biases from Wikipedia
- Fact Accuracy: Generated text should not be assumed factual
- Misuse Prevention: Not intended for generating misleading content
- Attribution: Generated text should not be presented as human-written
Citation
If you use this model in your research, please cite:
@misc{pendo-gpt2-medium-wikitext-103 (full) + wikipedia en (20231101, 30%),
author = {Dimitri Bekale},
title = {Pendo GPT-2 Medium Teacher Model},
year = {2025},
publisher = {HuggingFace},
howpublished = {\url{https://huggingface.co/bekalebendong/pendo-gpt2-medium-teacher}}
}
Acknowledgments
- Training Hardware: 2x NVIDIA H100 80GB
- Framework: PyTorch + HuggingFace Transformers
- Datasets:
- WikiText-103 (full) + Wikipedia EN (20231101, 30%)
- WikiText-103: Salesforce Research
- Wikipedia: Wikimedia Foundation
- Base Model: gpt2-medium (OpenAI)
- Project: Pendo Text Editor
Model Card Authors
Dimitri Bekale
Links
- GitHub Repository: https://github.com/dimitribekale/pendo-text-editor
- Model on HuggingFace: https://huggingface.co/bekalebendong/pendo-gpt2-medium-teacher
Model Status: โ Production Ready Generation Quality: โ Verified Last Updated: 2025
Generated with Claude Code
- Downloads last month
- 2
Dataset used to train bekalebendong/pendo-gpt2-medium-teacher
Evaluation results
- Validation Loss on WikiText-103 (full) + Wikipedia EN (20231101, 30%)self-reported2.706
- Perplexity on WikiText-103 (full) + Wikipedia EN (20231101, 30%)self-reported~15