---
title: CP-Bench Leaderboard
emoji: 🚀📑
colorFrom: green
colorTo: indigo
sdk: docker
#sdk_version: 5.30.0
#python_version: 3.12
#app_file: app.py
pinned: true
license: apache-2.0
---

# 🚀 CP-Bench Leaderboard

This repository contains the leaderboard of the [CP-Bench](https://huggingface.co/datasets/kostis-init/CP-Bench) dataset.

## 📁 Structure

- `app.py` — Launches the Gradio interface.
- `src/` — Contains the main logic for fetching and displaying leaderboard data.
  - `config.py` — Configuration for the leaderboard.
  - `eval.py` — Evaluation logic for model submissions.
  - `hf_utils.py` — Utilities for interacting with the Hugging Face Hub.
  - `ui.py` — UI components for displaying the leaderboard.
  - `user_eval.py` — Evaluation logic for submitted models; it can also be used to evaluate models locally.
- `README.md` — (you are here)

## 🧠 How It Works

1. Users submit a `.jsonl` file with their generated models (an illustrative example is given at the end of this README).
2. The submission is uploaded to a storage repository (Hugging Face Hub).
3. An evaluation script is triggered, which:
   - Loads the submission.
   - Evaluates the models against the benchmark dataset.
   - Computes metrics.
4. The results are stored and displayed on the leaderboard.

## 🛠️ Development

To run locally:

```bash
pip install -r requirements.txt
python app.py
```

If you wish to contribute or modify the leaderboard, feel free to open discussions or pull requests. To add support for more modelling frameworks, modify `src/user_eval.py` to include the execution code for the new framework; a sketch of what such a handler might look like is given below.
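The exact integration points depend on the current contents of `src/user_eval.py`, but a handler for a new framework could look roughly like the following. Everything in this sketch is an illustrative assumption: the function name, its signature, and the choice of MiniZinc as the example framework are not part of the actual codebase.

```python
import subprocess
import tempfile
from pathlib import Path


def run_minizinc_model(model_code: str, timeout: int = 60) -> str:
    """Hypothetical handler for a new framework (here: MiniZinc).

    Writes the submitted model to a temporary file, runs the
    framework's solver CLI, and returns its stdout so the caller
    can compare it against the benchmark's expected solution.
    """
    with tempfile.TemporaryDirectory() as tmp:
        model_path = Path(tmp) / "model.mzn"
        model_path.write_text(model_code)
        result = subprocess.run(
            ["minizinc", str(model_path)],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
        if result.returncode != 0:
            raise RuntimeError(f"Solver failed: {result.stderr}")
        return result.stdout
```

The evaluation code would then need to dispatch on the submission's framework so that models written for the new framework are routed to this handler.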
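Finally, for reference, each line of a submission `.jsonl` file is a JSON object pairing a benchmark problem with its generated model. The field names below (`id`, `model`) are assumptions for illustration only; consult the submission instructions on the leaderboard for the exact schema expected by the evaluation code.

```python
import json

# Illustrative only: "id" and "model" are assumed field names,
# not the leaderboard's confirmed submission schema.
entry = {
    "id": "problem_001",           # benchmark problem identifier
    "model": "<model source code>" # the generated model for this problem
}

# Each submission line is one compact JSON object.
with open("submission.jsonl", "a") as f:
    f.write(json.dumps(entry) + "\n")
```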