Supertron-embedding-300M
Supertron-embedding-300M is a high-performance, compact embedding model fine-tuned from google/embeddinggemma-300m. It is designed to provide state-of-the-art semantic representations for Retrieval-Augmented Generation (RAG), semantic search, and document clustering while keeping a computational footprint low enough for production environments.
Supertron-embedding-300M demonstrates competitive performance across the Massive Text Embedding Benchmark (MTEB). It is particularly effective on Semantic Textual Similarity (STS) tasks, outperforming many larger models.
| Task Category | Task Name | Metric | Score |
|---|---|---|---|
| Semantic Similarity | STSBenchmark | cos_sim_spearman | 87.10 |
| Semantic Similarity | STS12 | cos_sim_spearman | 80.18 |
| Semantic Similarity | BIOSSES | cos_sim_spearman | 82.98 |
| Retrieval | NFCorpus | NDCG@10 | 37.07 |
| Classification | AmazonCounterfactual | Accuracy | 83.34 |
| Clustering | TwentyNewsgroups | V-Measure | 50.01 |
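The scores above can typically be checked with the `mteb` package. The snippet below is a minimal sketch, assuming a recent `mteb` version and that the model is published under the ID shown in the usage example; the output folder is illustrative.

```python
import mteb
from sentence_transformers import SentenceTransformer

# Load the embedding model from the Hub
model = SentenceTransformer("surpem/Supertron-embedding-300M")

# Select one of the reported tasks, e.g. STSBenchmark
tasks = mteb.get_tasks(tasks=["STSBenchmark"])
evaluation = mteb.MTEB(tasks=tasks)

# Scores are written as JSON files under the output folder
results = evaluation.run(model, output_folder="mteb_results")
```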
The model can be used directly with the sentence-transformers library:
```python
from sentence_transformers import SentenceTransformer

model_id = "surpem/Supertron-embedding-300M"

# Load the model
model = SentenceTransformer(model_id)

# Define target text
sentences = [
    "The financial results exceeded market expectations.",
    "The company reported better than expected quarterly earnings.",
]

# Compute embeddings
embeddings = model.encode(sentences)

# Calculate cosine similarity
similarity = model.similarity(embeddings[0], embeddings[1])
print(f"Semantic Similarity: {similarity.item():.4f}")
```
Training Procedure
Hyperparameters
- Precision: bfloat16
- Max Sequence Length: 256 tokens
- Optimizer: AdamW
- Batch Size: 256
- Learning Rate: 2e-5
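For reference, the sketch below shows how the listed values could be expressed with the sentence-transformers v3 training API. This is not the published training script; the output directory is illustrative, and the training data and loss are not specified here.

```python
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainingArguments

model = SentenceTransformer("google/embeddinggemma-300m")
model.max_seq_length = 256  # max sequence length from the list above

args = SentenceTransformerTrainingArguments(
    output_dir="supertron-embedding-300m",  # illustrative path
    per_device_train_batch_size=256,        # batch size
    learning_rate=2e-5,                     # learning rate
    optim="adamw_torch",                    # AdamW optimizer
    bf16=True,                              # bfloat16 precision
)
```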
Citation
```bibtex
@misc{surpem2026supertron,
  title={Supertron-embedding-300M: High-Efficiency Semantic Representation Model},
  author={Surpem},
  year={2026},
  url={https://huggingface.co/surpem/Supertron-embedding-300M}
}
```
Base model
google/embeddinggemma-300m