---
title: CP-Bench Leaderboard
emoji: 🏆
colorFrom: green
colorTo: indigo
sdk: docker
#sdk_version: 5.30.0
#python_version: 3.12
#app_file: app.py
pinned: true
license: apache-2.0
---
# 🏆 CP-Bench Leaderboard
This repository contains the leaderboard for the [CP-Bench](https://huggingface.co/datasets/kostis-init/CP-Bench) dataset.
## Structure
- `app.py` – Launches the Gradio interface.
- `src/` – Contains the main logic for fetching and displaying leaderboard data:
  - `config.py` – Configuration for the leaderboard.
  - `eval.py` – Evaluation logic for model submissions.
  - `hf_utils.py` – Utilities for interacting with the Hugging Face Hub.
  - `ui.py` – UI components for displaying the leaderboard.
  - `user_eval.py` – Evaluation logic for submitted models; it can also be used to evaluate models locally.
- `README.md` – (you are here)
## How It Works
1. Users submit a `.jsonl` file with their generated models.
2. The submission is uploaded to a storage repository (Hugging Face Hub).
3. An evaluation script is triggered, which:
- Loads the submission.
- Evaluates the models against the benchmark dataset.
- Computes metrics.
4. The results are stored and displayed on the leaderboard.
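The load-and-evaluate steps above can be sketched as follows. This is a minimal illustration, not the actual evaluation code: the field names (`id`, `solution`) and the exact-match success criterion are assumptions, not the real CP-Bench submission schema.

```python
import json

def load_submission(path):
    """Load a .jsonl submission: one JSON object per line."""
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]

def evaluate(entries, expected):
    """Compare each entry's solution against the expected one (hypothetical schema)."""
    solved = sum(1 for e in entries if expected.get(e["id"]) == e.get("solution"))
    return {
        "solved": solved,
        "total": len(entries),
        "accuracy": solved / len(entries) if entries else 0.0,
    }
```

See `src/user_eval.py` for how submissions are actually parsed and scored.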
## Development
To run locally:
```bash
pip install -r requirements.txt
python app.py
```
If you wish to contribute or modify the leaderboard, feel free to open discussions or pull requests.
To add support for more modelling frameworks, modify `src/user_eval.py` to include the execution code for the new framework.
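One way such an extension point could look is a registry mapping framework names to runner functions. This is a hypothetical sketch, not the structure of the real `src/user_eval.py`; the registry, runner signature, and the `"cpmpy"` example are all assumptions.

```python
import subprocess
import sys

# Hypothetical registry: framework name -> function that executes a model file.
FRAMEWORK_RUNNERS = {}

def register_framework(name):
    """Decorator that registers an execution function for a framework."""
    def wrap(fn):
        FRAMEWORK_RUNNERS[name] = fn
        return fn
    return wrap

@register_framework("cpmpy")
def run_cpmpy(model_path):
    """Run a Python-based model file in a subprocess and capture its stdout."""
    result = subprocess.run(
        [sys.executable, model_path],
        capture_output=True, text=True, timeout=60,
    )
    return result.stdout

def run_model(framework, model_path):
    """Dispatch execution to the registered runner for the given framework."""
    try:
        runner = FRAMEWORK_RUNNERS[framework]
    except KeyError:
        raise ValueError(f"Unsupported framework: {framework}")
    return runner(model_path)
```

With this shape, supporting a new framework would only require registering one more runner function.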