Upload folder using huggingface_hub
- README.md +33 -85
- checkpoint-22500/config.json +36 -0
- checkpoint-22500/model.safetensors +3 -0
- checkpoint-22500/optimizer.pt +3 -0
- checkpoint-22500/rng_state.pth +3 -0
- checkpoint-22500/scheduler.pt +3 -0
- checkpoint-22500/trainer_state.json +0 -0
- checkpoint-22500/training_args.bin +3 -0
- config.json +36 -0
- model.safetensors +3 -0
- runs/May27_07-55-36_r-verifiedprompts-context-detector-71s4c8h7-06974-pp7jf/events.out.tfevents.1748332538.r-verifiedprompts-context-detector-71s4c8h7-06974-pp7jf.69.0 +2 -2
- runs/May27_07-55-36_r-verifiedprompts-context-detector-71s4c8h7-06974-pp7jf/events.out.tfevents.1748336058.r-verifiedprompts-context-detector-71s4c8h7-06974-pp7jf.69.1 +3 -0
- special_tokens_map.json +7 -0
- tokenizer.json +0 -0
- tokenizer_config.json +56 -0
- training_args.bin +3 -0
- training_params.json +30 -0
- vocab.txt +0 -0
README.md
CHANGED
|
@@ -1,91 +1,39 @@
|
|
-📘 Training Report: Mistakes & Lessons from Context Classification Project (cntxt-class-final)
-
-🧠 Project Summary
-
-The primary objective of this project was to develop a prompt context detector that classifies input prompts into one of three categories:
-
-"has context"
-"Intent is unclear, Please input more context"
-"missing platform, audience, budget, goal"
-
-The project initially used a text-to-text model (flan-t5-base). To optimize for speed, cost-efficiency, and stability, the approach was switched to a text classification model (distilbert-base-uncased).
-
-🔴 All Mistakes Made (Chronological)
-
-🔹 Phase 1: Using flan-t5-base (Seq2Seq)
-
-| Mistake | Description | Fix / Outcome |
-| --- | --- | --- |
-| ❌ No NLTK awareness | The AutoTrain UI used ROUGE/BLEU by default, which triggered an NLTK error (punkt_tab). | Switched the task to classification to avoid the dependency and the error. |
-| ❌ Default metric usage | Failed to disable the default metrics, which crashed training. | Removed the metrics, then moved to a classification task where they do not apply. |
-| ❌ Used the full dataset immediately | Attempted to train on 200,000 rows from the start, wasting significant cost and time. | Later adopted smaller "burn-in" runs (e.g., 50,000 rows). |
-
-| Mistake | Description | Fix / Outcome |
-| --- | --- | --- |
-| ❌ Assumed automatic label casting | Believed that the AutoTrain UI would automatically detect and cast string labels to ClassLabel. | The AutoTrain UI does not auto-cast; it reads labels as raw Value types. |
-| ❌ Tried using dataset.py logic with CSVs | Expected Hugging Face to run dataset.py scripts during training for CSV-based repositories. | AutoTrain ignores custom scripts such as dataset.py when datasets are provided as raw CSVs. |
-
-| Mistake | Description | Fix |
-| --- | --- | --- |
-| ❌ Repeated uploads of CSVs | Kept uploading new versions of the CSVs, mistakenly believing the label formatting was wrong. | Realized the core issue was the expected label type (ClassLabel), not the formatting. |
-| ❌ Tried casting in place | Attempted to use cast_column(...) inside AutoTrain UI-compatible ZIPs. | This does not work unless the dataset is pre-processed with the datasets library. |
-| ❌ Trusted UI log errors too literally | Error messages in the UI logs sometimes pointed at the wrong column (e.g., label, token). | Verified that the columns themselves were correct; the actual issue was the label's data type. |
-| ❌ Spent hours debugging CSVs | Spent extensive time debugging CSV files that turned out to be correctly formatted. | The problem was the label type; the fix was to push the dataset with ClassLabel via datasets.push_to_hub(). |
-
-| Fix | Description |
-| --- | --- |
-| ✅ Used the datasets library | The label column was correctly cast to ClassLabel using the datasets library. |
-| ✅ Used DatasetDict + push_to_hub() | Ensured the dataset was formatted as a DatasetDict so AutoTrain could read .names properly. |
-| ✅ Pushed the dataset to VerifiedPrompts/cntxt-class-final | This finally resolved the AttributeError and let training proceed. |
-| ✅ Used distilbert-base-uncased | Selected a lightweight, T4-friendly model, which completed training successfully. |
-| ✅ Set up AutoTrain with clean splits + logging | The full training pipeline ran smoothly with healthy logs. |
-
-AutoTrain UI's CSV loader limitations: the loader is highly literal. It performs no automatic type casting (e.g., string to ClassLabel) and does not run custom Python scripts such as dataset.py in CSV-based repositories.
-
-Strict label type expectation: AutoTrain expects the label column to be of type ClassLabel, not raw integers or strings. The .names error arises when the system expects class names (provided by ClassLabel) but receives a raw Value type instead.
-
-Dataset format dependency: the correct label type (ClassLabel) must be defined in the dataset's own features; it cannot be inferred or cast during the AutoTrain UI's processing.
-
-✅ Lessons Learned
-
-These experiences yielded key takeaways for future projects:
-
-| Lesson | Impact |
-| --- | --- |
-| Always push classification datasets via datasets.push_to_hub() | Ensures proper label casting and avoids data-type issues during AutoTrain ingestion. |
-| Never assume the AutoTrain UI reads dataset.py | Custom dataset logic in dataset.py is ignored for CSV-based repositories. |
-| Set ClassLabel early and test with .features["label"] | Explicitly defining and verifying the ClassLabel type guarantees compatibility and avoids runtime errors. |
-| Start burn-in runs with 50k rows | Smaller subsets for initial runs sharply reduce compute cost and time during experimentation. |
-| Prefer distilbert for text classification on a T4 | distilbert models are lightweight, cost-effective, and avoid most tokenizer issues on T4 GPUs. |
+---
+library_name: transformers
+tags:
+- autotrain
+- text-classification
+base_model: distilbert/distilbert-base-uncased
+widget:
+- text: "I love AutoTrain"
+datasets:
+- VerifiedPrompts/cntxt-class-final
+---
+
+# Model Trained Using AutoTrain
+
+- Problem type: Text Classification
+
+## Validation Metrics
+
+loss: 0.0
+
+f1_macro: 1.0
+
+f1_micro: 1.0
+
+f1_weighted: 1.0
+
+precision_macro: 1.0
+
+precision_micro: 1.0
+
+precision_weighted: 1.0
+
+recall_macro: 1.0
+
+recall_micro: 1.0
+
+recall_weighted: 1.0
+
+accuracy: 1.0
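For reference, the macro and micro averages reported above only diverge when per-class performance differs (here every metric is 1.0, so they coincide). A toy 3-class sketch with invented labels, not this model's data, shows the difference:

```python
# Toy per-class F1 computation for a 3-class problem; for single-label
# multi-class data, micro-F1 equals plain accuracy.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

def f1(cls):
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    return 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 0.0

f1_macro = sum(f1(c) for c in (0, 1, 2)) / 3          # unweighted class mean
f1_micro = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(round(f1_macro, 4), round(f1_micro, 4))
```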
checkpoint-22500/config.json
ADDED
|
@@ -0,0 +1,36 @@
+{
+  "_name_or_path": "distilbert/distilbert-base-uncased",
+  "_num_labels": 3,
+  "activation": "gelu",
+  "architectures": [
+    "DistilBertForSequenceClassification"
+  ],
+  "attention_dropout": 0.1,
+  "dim": 768,
+  "dropout": 0.1,
+  "hidden_dim": 3072,
+  "id2label": {
+    "0": "Intent is unclear, Please input more context",
+    "1": "has context",
+    "2": "missing platform, audience, budget, goal"
+  },
+  "initializer_range": 0.02,
+  "label2id": {
+    "Intent is unclear, Please input more context": 0,
+    "has context": 1,
+    "missing platform, audience, budget, goal": 2
+  },
+  "max_position_embeddings": 512,
+  "model_type": "distilbert",
+  "n_heads": 12,
+  "n_layers": 6,
+  "pad_token_id": 0,
+  "problem_type": "single_label_classification",
+  "qa_dropout": 0.1,
+  "seq_classif_dropout": 0.2,
+  "sinusoidal_pos_embds": false,
+  "tie_weights_": true,
+  "torch_dtype": "float32",
+  "transformers_version": "4.48.0",
+  "vocab_size": 30522
+}
checkpoint-22500/model.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e6cdd9c518a84c5978bbcabd1ab1471f2bf99b2037cca35e40e439f5035bb528
+size 267835644
checkpoint-22500/optimizer.pt
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e9608059a4e05af8de926fecb80612e2bdacc98e8849cfe550f782489f49db2f
+size 535733434
checkpoint-22500/rng_state.pth
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b12d01f98c4bd57244722a532a4fef3ced73279bc6c66e1769c9f41674dbe5ab
+size 14244
checkpoint-22500/scheduler.pt
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:95e0c874c4b32ab2996744d1e36277aaf7f5b82859a8468bb4b4eba2319905f3
+size 1064
checkpoint-22500/trainer_state.json
ADDED
|
The diff for this file is too large to render.
checkpoint-22500/training_args.bin
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cbd2abe463b6fb49523cf5f62c2c06d6c102108d469f3e3f49abe3ffc7808caf
+size 5368
config.json
ADDED
|
@@ -0,0 +1,36 @@
+{
+  "_name_or_path": "distilbert/distilbert-base-uncased",
+  "_num_labels": 3,
+  "activation": "gelu",
+  "architectures": [
+    "DistilBertForSequenceClassification"
+  ],
+  "attention_dropout": 0.1,
+  "dim": 768,
+  "dropout": 0.1,
+  "hidden_dim": 3072,
+  "id2label": {
+    "0": "Intent is unclear, Please input more context",
+    "1": "has context",
+    "2": "missing platform, audience, budget, goal"
+  },
+  "initializer_range": 0.02,
+  "label2id": {
+    "Intent is unclear, Please input more context": 0,
+    "has context": 1,
+    "missing platform, audience, budget, goal": 2
+  },
+  "max_position_embeddings": 512,
+  "model_type": "distilbert",
+  "n_heads": 12,
+  "n_layers": 6,
+  "pad_token_id": 0,
+  "problem_type": "single_label_classification",
+  "qa_dropout": 0.1,
+  "seq_classif_dropout": 0.2,
+  "sinusoidal_pos_embds": false,
+  "tie_weights_": true,
+  "torch_dtype": "float32",
+  "transformers_version": "4.48.0",
+  "vocab_size": 30522
+}
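The id2label map in this config is what turns the classifier head's argmax index back into one of the three label strings; a minimal sketch with hypothetical logits (the model's real outputs would come from a forward pass):

```python
# id2label copied from the config above; the logits are hypothetical.
id2label = {
    0: "Intent is unclear, Please input more context",
    1: "has context",
    2: "missing platform, audience, budget, goal",
}

logits = [-0.7, 3.1, 0.4]  # made-up scores for one prompt
pred = max(range(len(logits)), key=logits.__getitem__)  # argmax

print(id2label[pred])  # "has context"
```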
model.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e6cdd9c518a84c5978bbcabd1ab1471f2bf99b2037cca35e40e439f5035bb528
+size 267835644
runs/May27_07-55-36_r-verifiedprompts-context-detector-71s4c8h7-06974-pp7jf/events.out.tfevents.1748332538.r-verifiedprompts-context-detector-71s4c8h7-06974-pp7jf.69.0
CHANGED
|
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:8f6ee793f588b5524becafcc1fc194c5327606a4e5e948e17927e02ffae242c4
+size 586073
runs/May27_07-55-36_r-verifiedprompts-context-detector-71s4c8h7-06974-pp7jf/events.out.tfevents.1748336058.r-verifiedprompts-context-detector-71s4c8h7-06974-pp7jf.69.1
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c1a4aaae518ee46210470e64b1ff86bcff4eec3e23d7b40cb26fbbf1c7e92b9c
+size 936
|
special_tokens_map.json
ADDED
|
@@ -0,0 +1,7 @@
+{
+  "cls_token": "[CLS]",
+  "mask_token": "[MASK]",
+  "pad_token": "[PAD]",
+  "sep_token": "[SEP]",
+  "unk_token": "[UNK]"
+}
tokenizer.json
ADDED
|
The diff for this file is too large to render.
tokenizer_config.json
ADDED
|
@@ -0,0 +1,56 @@
+{
+  "added_tokens_decoder": {
+    "0": {
+      "content": "[PAD]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "100": {
+      "content": "[UNK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "101": {
+      "content": "[CLS]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "102": {
+      "content": "[SEP]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "103": {
+      "content": "[MASK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "clean_up_tokenization_spaces": false,
+  "cls_token": "[CLS]",
+  "do_lower_case": true,
+  "extra_special_tokens": {},
+  "mask_token": "[MASK]",
+  "model_max_length": 512,
+  "pad_token": "[PAD]",
+  "sep_token": "[SEP]",
+  "strip_accents": null,
+  "tokenize_chinese_chars": true,
+  "tokenizer_class": "DistilBertTokenizer",
+  "unk_token": "[UNK]"
+}
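The added_tokens_decoder ids above (0, 100-103) are the standard BERT/DistilBERT special tokens. A sketch of how a single input is framed with them; the body wordpiece ids are hand-picked for illustration, not a real tokenization:

```python
# Special-token ids taken from the tokenizer config above.
PAD, UNK, CLS, SEP, MASK = 0, 100, 101, 102, 103

# A single sequence is encoded as [CLS] tokens [SEP], then padded to a
# fixed length with [PAD]; the 1s/0s form the attention mask.
body = [2003, 2023, 11662]  # hypothetical wordpiece ids for a short prompt
max_len = 8

ids = [CLS] + body + [SEP]
attention_mask = [1] * len(ids) + [0] * (max_len - len(ids))
ids = ids + [PAD] * (max_len - len(ids))

print(ids)             # [101, 2003, 2023, 11662, 102, 0, 0, 0]
print(attention_mask)  # [1, 1, 1, 1, 1, 0, 0, 0]
```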
training_args.bin
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cbd2abe463b6fb49523cf5f62c2c06d6c102108d469f3e3f49abe3ffc7808caf
+size 5368
training_params.json
ADDED
|
@@ -0,0 +1,30 @@
+{
+  "data_path": "VerifiedPrompts/cntxt-class-final",
+  "model": "distilbert/distilbert-base-uncased",
+  "lr": 5e-05,
+  "epochs": 3,
+  "max_seq_length": 128,
+  "batch_size": 8,
+  "warmup_ratio": 0.1,
+  "gradient_accumulation": 1,
+  "optimizer": "adamw_torch",
+  "scheduler": "linear",
+  "weight_decay": 0.0,
+  "max_grad_norm": 1.0,
+  "seed": 42,
+  "train_split": "train",
+  "valid_split": "validation",
+  "text_column": "text",
+  "target_column": "label",
+  "logging_steps": -1,
+  "project_name": "autotrain-8z0a6-ohqum",
+  "auto_find_batch_size": false,
+  "mixed_precision": "fp16",
+  "save_total_limit": 1,
+  "push_to_hub": true,
+  "eval_strategy": "epoch",
+  "username": "VerifiedPrompts",
+  "log": "tensorboard",
+  "early_stopping_patience": 5,
+  "early_stopping_threshold": 0.01
+}
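The warmup_ratio above only becomes a concrete step count once the dataset size is known. Assuming roughly 60,000 training rows (an inference from the checkpoint-22500 folder name, since 22,500 total steps over 3 epochs at batch size 8 implies 7,500 steps per epoch; the row count is not stated in this config), the arithmetic works out as:

```python
import math

# Values from training_params.json above; n_rows is an assumption
# back-derived from the checkpoint-22500 folder name.
epochs, batch_size, grad_accum, warmup_ratio = 3, 8, 1, 0.1
n_rows = 60_000

steps_per_epoch = math.ceil(n_rows / (batch_size * grad_accum))
total_steps = steps_per_epoch * epochs          # 22500, matching checkpoint-22500
warmup_steps = int(total_steps * warmup_ratio)  # 2250 linear-warmup steps

print(total_steps, warmup_steps)
```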
vocab.txt
ADDED
|
The diff for this file is too large to render.