# Model Card for picket-cliff/deepl-project-model
A text classification model for spam detection (Deep Learning Project).
## Model Details

### Model Description
Model developed for the "Deep Learning with Python" course project.
- Developed by: Diavila Rostaing Engandzi
- Model type: Binary Text Classification
- Language(s) (NLP): English
- Finetuned from model: DistilBERT
### Model Sources

- Demo: https://huggingface.co/picket-cliff/deepl-project
## Uses

The model is intended to sort spam from legitimate email. Clone the demo repository and run the `app.py` file to see it in action.
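As a quick illustration, the published model can be loaded through the Transformers `pipeline` API. This is a minimal sketch, assuming the fine-tuned weights are available on the Hub under `picket-cliff/deepl-project-model` (the repo named in the model tree below); the exact label names returned depend on how the model was exported.

```python
# Minimal inference sketch using the Hugging Face pipeline API.
# Assumes the fine-tuned weights live at "picket-cliff/deepl-project-model".
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="picket-cliff/deepl-project-model",
)

emails = [
    "Congratulations! You have won a free prize, click here to claim.",
    "Hi team, the meeting is moved to 3pm tomorrow.",
]
# Each prediction is a dict with a "label" and a confidence "score".
for email, pred in zip(emails, classifier(emails)):
    print(f"{pred['label']} ({pred['score']:.2f})  {email[:40]}")
```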
## Training Details

### Training Data

A subset of the email_data.csv dataset: a benchmark email classification dataset with around 5,000 emails labeled as either "ham" or "spam". To evaluate the model, the data was split into training and test sets (80-20 split).
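The 80-20 split described above can be sketched with scikit-learn's `train_test_split`. The column names `"text"` and `"label"` are assumptions for illustration; adapt them to the actual schema of email_data.csv.

```python
# Sketch of the 80-20 train/test split, with placeholder rows standing in
# for the real email_data.csv contents.
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.DataFrame({
    "text": ["win a free prize now", "see you at lunch"] * 10,
    "label": ["spam", "ham"] * 10,  # hypothetical labels, not the real CSV
})

X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["label"],
    test_size=0.2,          # hold out 20% for evaluation
    random_state=42,        # reproducible split
    stratify=df["label"],   # preserve the ham/spam ratio in both sets
)
print(len(X_train), len(X_test))  # 16 4
```

Stratifying on the label is a sensible default for an imbalanced ham/spam corpus, so both splits keep roughly the same class ratio.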
### Preprocessing

Deep learning models cannot process raw text; they require numerical tensors. We used the Hugging Face DistilBertTokenizer.
- **Sub-word tokenization:** Instead of splitting on spaces (which struggles with typos and rare words), DistilBERT uses WordPiece tokenization. An out-of-vocabulary word is broken into known sub-words, preventing the model from encountering `[UNK]` (unknown) tokens.
- **Special tokens:** The tokenizer automatically prepends the `[CLS]` (classification) token to the start of the sequence and appends the `[SEP]` (separator) token at the end. The final hidden state corresponding to the `[CLS]` token is what the model uses for the binary classification decision.
- **Truncation and padding:** Transformer models require fixed-size input matrices for batch processing. Based on the length distribution from our EDA, we set `max_length = 128`:
  - Sequences longer than 128 tokens were truncated.
  - Sequences shorter than 128 tokens were padded with the `[PAD]` token (ID 0).
- **Attention masks:** To prevent the model from performing self-attention over meaningless padding tokens, the tokenizer generates an `attention_mask` (an array of 1s for real tokens and 0s for padding).
## Evaluation

Results were obtained by training on the training split and then evaluating the model on the held-out test data. Results are compared against a baseline (dummy classifier) for reference.
### Testing Data, Factors & Metrics

#### Testing Data

The held-out 20% test split described above.

#### Metrics

Accuracy and F1 score (macro and weighted averages).
### Results

When evaluated on the 80-20 split we obtained:

- Accuracy: 99.10%
- Macro average F1 score: 0.98
- Weighted average F1 score: 0.99

Meanwhile, the dummy classifier achieved 86.6% accuracy.
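These metrics (and the majority-class dummy baseline) can be reproduced with scikit-learn. The `y_true`/`y_pred` arrays below are small illustrative placeholders, not the real test set.

```python
# Sketch of how the reported metrics are computed with scikit-learn.
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score, f1_score

y_true = ["ham"] * 8 + ["spam", "spam"]
y_pred = ["ham"] * 8 + ["spam", "ham"]  # one spam email missed

print("accuracy:   ", accuracy_score(y_true, y_pred))              # 0.9
print("macro F1:   ", f1_score(y_true, y_pred, average="macro"))
print("weighted F1:", f1_score(y_true, y_pred, average="weighted"))

# Baseline: always predict the majority class ("ham" here).
dummy = DummyClassifier(strategy="most_frequent")
dummy.fit([[0]] * 10, y_true)  # features are ignored by this strategy
print("dummy accuracy:", accuracy_score(y_true, dummy.predict([[0]] * 10)))  # 0.8
```

On a ham-heavy corpus the majority-class baseline already scores high accuracy, which is why the macro F1 (averaging ham and spam equally) is the more informative comparison.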
#### Summary

The model substantially outperforms the dummy baseline, achieving near-perfect accuracy and F1 scores on the held-out test set.
## Model tree for picket-cliff/deepl-project-model

Base model: distilbert/distilbert-base-uncased