# Email Processing ModernBERT Model

Fine-tuned ModernBERT model for email processing tasks.

## Model Capabilities

This model can compute semantic similarity between questions and answers related to:

- Email addresses
- Subject lines

## Recommended Thresholds

Based on extensive testing, the following thresholds are recommended:

- For email questions: 0.85
- For subject questions: 0.70
- For other questions: 0.80

Additional content-aware checks are recommended for best results (see the sketch at the end of this card).

## Usage

```python
from sentence_transformers import SentenceTransformer
import torch

# Load the model
model = SentenceTransformer('sugiv/email-processing-modernbert')

# Encode questions and answers
q_embed = model.encode("What's your email address?", convert_to_tensor=True)
a1_embed = model.encode("My email is user@example.com", convert_to_tensor=True)
a2_embed = model.encode("The weather is nice today", convert_to_tensor=True)

# Calculate similarity
similarity1 = torch.nn.functional.cosine_similarity(q_embed.unsqueeze(0), a1_embed.unsqueeze(0)).item()
similarity2 = torch.nn.functional.cosine_similarity(q_embed.unsqueeze(0), a2_embed.unsqueeze(0)).item()

print(f'Similarity with relevant answer: {similarity1:.4f}')
print(f'Similarity with irrelevant answer: {similarity2:.4f}')

# Apply threshold
threshold = 0.85  # For email questions
print(f'Is relevant: {similarity1 >= threshold}')
print(f'Is irrelevant: {similarity2 < threshold}')
```

## Training Information

- Base model: [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base)
- Published date: 2025-04-24
- Training approach: Fine-tuned on a balanced dataset of email and subject questions
- Framework: sentence-transformers with PyTorch
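
## Example: Combining Thresholds with a Content-Aware Check

The sketch below shows one way to combine the per-question-type thresholds from the "Recommended Thresholds" section with a simple content-aware check. The `is_relevant` helper, the `question_type` labels, and the email regex are illustrative assumptions, not part of the model's API; adapt them to your own routing and validation logic.

```python
import re

import torch
from sentence_transformers import SentenceTransformer

# Thresholds from the "Recommended Thresholds" section above.
THRESHOLDS = {"email": 0.85, "subject": 0.70, "other": 0.80}

# Hypothetical content check: a rough pattern for spotting an email address.
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def is_relevant(model, question, answer, question_type="other"):
    """Combine cosine similarity with a lightweight content-aware check."""
    q_embed = model.encode(question, convert_to_tensor=True)
    a_embed = model.encode(answer, convert_to_tensor=True)
    similarity = torch.nn.functional.cosine_similarity(
        q_embed.unsqueeze(0), a_embed.unsqueeze(0)
    ).item()

    passes_threshold = similarity >= THRESHOLDS.get(question_type, THRESHOLDS["other"])

    # For email questions, additionally require that the answer actually
    # contains something that looks like an email address.
    if question_type == "email":
        return passes_threshold and bool(EMAIL_PATTERN.search(answer))
    return passes_threshold


model = SentenceTransformer('sugiv/email-processing-modernbert')
print(is_relevant(model, "What's your email address?", "My email is user@example.com", "email"))
print(is_relevant(model, "What's your email address?", "The weather is nice today", "email"))
```

The first call should return `True` (high similarity plus a matching address), while the second should return `False` even if the similarity happened to clear the threshold, since the answer contains no email address.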