---
title: CP-Bench Leaderboard
emoji: 🚀📑
colorFrom: green
colorTo: indigo
sdk: docker
#sdk_version: 5.30.0
#python_version: 3.12
#app_file: app.py
pinned: true
license: apache-2.0
---

# 🚀 CP-Bench Leaderboard

This repository contains the leaderboard of the [CP-Bench](https://huggingface.co/datasets/kostis-init/CP-Bench) dataset.

## 📁 Structure

- `app.py` — Launches the Gradio interface.
- `src/` — Contains the main logic for fetching and displaying leaderboard data.
  - `config.py` — Configuration for the leaderboard.
  - `eval.py` — Evaluation logic for model submissions.
  - `hf_utils.py` — Utilities for interacting with the Hugging Face Hub.
  - `ui.py` — UI components for displaying the leaderboard.
  - `user_eval.py` — Evaluation logic for submitted models; it can also be used to evaluate models locally.
- `README.md` — (you are here)

## 🧠 How It Works

1. Users submit a `.jsonl` file with their generated models (an illustrative example is given at the end of this README).
2. The submission is uploaded to a storage repository (Hugging Face Hub).
3. An evaluation script is triggered, which:
   - Loads the submission.
   - Evaluates the models against the benchmark dataset.
   - Computes metrics.
4. The results are stored and displayed on the leaderboard.

## 🛠️ Development

To run locally:

```bash
pip install -r requirements.txt
python app.py
```

If you wish to contribute or modify the leaderboard, feel free to open discussions or pull requests. To add support for more modelling frameworks, modify `src/user_eval.py` to include the execution code for the new framework; a sketch of what such a handler might look like is given below.
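The exact integration points depend on the current contents of `src/user_eval.py`, but a handler for a new framework could look roughly like the following. Everything in this sketch is an illustrative assumption: the function name, its signature, and the choice of MiniZinc as the example framework are not part of the actual codebase.

```python
import subprocess
import tempfile
from pathlib import Path


def run_minizinc_model(model_code: str, timeout: int = 60) -> str:
    """Hypothetical handler for a new framework (here: MiniZinc).

    Writes the submitted model to a temporary file, runs the
    framework's solver CLI, and returns its stdout so the caller
    can compare it against the benchmark's expected solution.
    """
    with tempfile.TemporaryDirectory() as tmp:
        model_path = Path(tmp) / "model.mzn"
        model_path.write_text(model_code)
        result = subprocess.run(
            ["minizinc", str(model_path)],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
        if result.returncode != 0:
            raise RuntimeError(f"Solver failed: {result.stderr}")
        return result.stdout
```

The evaluation code would then need to dispatch on the submission's framework so that models written for the new framework are routed to this handler.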
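Finally, for reference, each line of a submission `.jsonl` file is a JSON object pairing a benchmark problem with its generated model. The field names below (`id`, `model`) are assumptions for illustration only; consult the submission instructions on the leaderboard for the exact schema expected by the evaluation code.

```python
import json

# Illustrative only: "id" and "model" are assumed field names,
# not the leaderboard's confirmed submission schema.
entry = {
    "id": "problem_001",           # benchmark problem identifier
    "model": "<model source code>" # the generated model for this problem
}

# Each submission line is one compact JSON object.
with open("submission.jsonl", "a") as f:
    f.write(json.dumps(entry) + "\n")
```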