---
base_model: t5-small
tags: [act, wikitext]
metrics: [loss, perplexity]
---
# HRM-Text1 (WikiText-103)

This repository contains weights for an experimental model trained on the [WikiText-103 dataset](https://huggingface.co/datasets/wikitext/viewer/wikitext-103-raw-v1/train).

## Model Description

- **Architecture:** CMBA
- **Training Data:** [wikitext/wikitext-103-raw-v1](https://huggingface.co/datasets/wikitext)
- **Tokenizer:** `t5-small` (slow T5 SentencePiece)
- **Vocab Size:** 32100
- **Objective:** Causal Language Modeling
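The causal language modeling objective above means the model is trained to predict each next token from the tokens before it, minimizing the average negative log-likelihood. A minimal sketch of that loss (toy probabilities, no real model or tokenizer involved):

```python
import math

def causal_lm_loss(token_probs):
    """Mean negative log-likelihood (in nats) of the true next tokens.

    token_probs: probabilities the model assigned to the correct next
    token at each position (hypothetical toy values here).
    """
    return -sum(math.log(p) for p in token_probs) / len(token_probs)

# Toy example: probabilities assigned to three true next tokens.
probs = [0.5, 0.25, 0.125]
loss = causal_lm_loss(probs)   # mean NLL
ppl = math.exp(loss)           # perplexity = exp(mean NLL) → 4.0
```

Perplexity is the exponential of this loss, i.e. the geometric-mean inverse probability per token, which is how the validation perplexity below relates to the validation loss.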

### Latest Performance (Epoch 0)
- **Validation Loss**: `29.7877`
- **Validation Perplexity**: `8642211872768.00`
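The reported perplexity is consistent with the reported loss, since perplexity is simply `exp(loss)` for a cross-entropy loss measured in nats:

```python
import math

val_loss = 29.7877
val_ppl = math.exp(val_loss)  # ≈ 8.64e12, matching the reported value
```

A loss near 29.8 at epoch 0 corresponds to an essentially untrained model, hence the astronomically large perplexity.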