Supertron-embedding-300M: High-Efficiency Semantic Representation Model

Model Description

Supertron-embedding-300M is a high-performance, compact embedding model fine-tuned from google/embeddinggemma-300m. It is designed to provide strong semantic representations for Retrieval-Augmented Generation (RAG), semantic search, and document clustering while maintaining a low computational footprint suitable for production environments.

  • Developed by: Surpem
  • Model Type: Sentence Transformer
  • Architecture: Gemma-based Dense Transformer
  • Base Model: google/embeddinggemma-300m
  • License: Apache 2.0
  • Language: English (en)

Results

Supertron-embedding-300M demonstrates competitive performance across the Massive Text Embedding Benchmark (MTEB). It is particularly strong on Semantic Textual Similarity (STS) tasks, where it outperforms many models in its weight class and even some larger ones.

| Task Category       | Task Name            | Metric           | Score |
|---------------------|----------------------|------------------|-------|
| Semantic Similarity | STSBenchmark         | cos_sim_spearman | 87.10 |
| Semantic Similarity | STS12                | cos_sim_spearman | 80.18 |
| Semantic Similarity | BIOSSES              | cos_sim_spearman | 82.98 |
| Retrieval           | NFCorpus             | NDCG@10          | 37.07 |
| Classification      | AmazonCounterfactual | Accuracy         | 83.34 |
| Clustering          | TwentyNewsgroups     | V-Measure        | 50.01 |
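
The cos_sim_spearman metric is the Spearman rank correlation between the model's cosine similarities and human similarity judgments over sentence pairs. A minimal sketch of that computation (the pairs and gold scores below are illustrative toy values, not MTEB data; the official benchmark harness handles this automatically):

from scipy.stats import spearmanr
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("surpem/Supertron-embedding-300M")

# Toy sentence pairs with hypothetical human similarity ratings
pairs = [
    ("A man is playing a guitar.", "Someone strums a guitar."),
    ("A child is riding a bicycle.", "A kid rides a bike."),
    ("A dog runs in the park.", "The stock market fell sharply."),
]
gold = [4.6, 4.8, 0.2]

left = model.encode([a for a, _ in pairs])
right = model.encode([b for _, b in pairs])

# One cosine similarity per pair, then rank-correlate against the gold ratings
cos_sims = model.similarity_pairwise(left, right)
score, _ = spearmanr(cos_sims.tolist(), gold)
print(f"cos_sim_spearman: {score:.4f}")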

Get Started

This model can be used with the sentence-transformers library; the snippet below relies on the model.similarity helper available in recent releases.

from sentence_transformers import SentenceTransformer

model_id = "surpem/Supertron-embedding-300M"

# Load the model
model = SentenceTransformer(model_id)

# Define target text
sentences = [
    "The financial results exceeded market expectations.",
    "The company reported better than expected quarterly earnings."
]

# Compute embeddings
embeddings = model.encode(sentences)

# Calculate cosine similarity
similarity = model.similarity(embeddings[0], embeddings[1])
print(f"Semantic Similarity: {similarity.item():.4f}")

Training Procedure

Hyperparameters

  • Precision: bfloat16
  • Max Sequence Length: 256 tokens
  • Optimizer: AdamW
  • Batch Size: 256
  • Learning Rate: 2e-5

Citation

@misc{surpem2026supertron,
      title={Supertron-embedding-300M: High-Efficiency Semantic Representation Model},
      author={Surpem},
      year={2026},
      url={https://huggingface.co/surpem/Supertron-embedding-300M},
}