tinyBenchmarks

community

https://github.com/felipemaiapolo/tinyBenchmarks

Recent Activity

borgr submitted a paper about 1 month ago

General Agent Evaluation

moonfolk authored a paper 10 months ago

Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective

borgr authored a paper about 1 year ago

Pretraining Language Models for Diachronic Linguistic Change Discovery

tinyBenchmarks's datasets (7)

tinyBenchmarks/tinyMMLU

Viewer • Updated Jul 8, 2024 • 385 rows • 10.4k downloads • 24 likes

tinyBenchmarks/tinyHellaswag

Viewer • Updated May 25, 2024 • 50k rows • 6.22k downloads • 5 likes

tinyBenchmarks/tinyTruthfulQA

Preview • Updated May 25, 2024 • 5.84k downloads • 4 likes

tinyBenchmarks/tinyWinogrande

Preview • Updated May 25, 2024 • 2.13k downloads • 5 likes

tinyBenchmarks/tinyGSM8k

Preview • Updated May 25, 2024 • 6.51k downloads • 9 likes

tinyBenchmarks/tinyAI2_arc

Preview • Updated May 25, 2024 • 6.16k downloads • 4 likes

tinyBenchmarks/tinyAlpacaEval

Viewer • Updated Apr 19, 2024 • 100 rows • 149 downloads • 7 likes