---
base_model: t5-small
tags: [act, wikitext]
metrics: [loss, perplexity]
---
# HRM-Text1 (WikiText-103)
This repository contains weights for an experimental model trained on the [WikiText-103 dataset](https://huggingface.co/datasets/wikitext/viewer/wikitext-103-raw-v1/train).
## Model Description
- **Architecture:** CMBA
- **Training Data:** [wikitext/wikitext-103-raw-v1](https://huggingface.co/datasets/wikitext)
- **Tokenizer:** `t5-small` (slow T5 SentencePiece; see the loading sketch after this list)
- **Vocab Size:** 32100
- **Objective:** Causal Language Modeling
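
A minimal sketch of loading the tokenizer used for training, assuming the standard `transformers` API. Note that the model weights themselves use a custom architecture, so they are not loadable through `AutoModel`; only the tokenizer step below is shown.

```python
from transformers import AutoTokenizer

# Load the slow (SentencePiece-based) T5 tokenizer; requires the `sentencepiece` package.
tokenizer = AutoTokenizer.from_pretrained("t5-small", use_fast=False)

# Tokenize a sample string into input IDs for causal language modeling.
ids = tokenizer("The quick brown fox jumps over the lazy dog.", return_tensors="pt").input_ids
print(ids.shape)        # (1, sequence_length)
print(len(tokenizer))   # 32100, matching the vocab size above
```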
### Latest Performance (Epoch 0)
- **Validation Loss**: `29.7877`
- **Validation Perplexity**: `8642211872768.00`
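
For reference, the reported perplexity is the exponential of the validation loss, as the quick check below shows:

```python
import math

val_loss = 29.7877
print(math.exp(val_loss))  # ≈ 8.64e12, matching the reported validation perplexity
```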