# EditCoder

The EditCoder models are the fine-tuned models described in the following paper:
```
@inproceedings{cassano2023edit,
  title={{Can It Edit? Evaluating the Ability of Large Language Models to Follow Code Editing Instructions}},
  author={Federico Cassano and Luisa Li and Akul Sethi and Noah Shinn and Abby Brennan-Jones and Anton Lozhkov and Carolyn Jane Anderson and Arjun Guha},
  booktitle={The First International Workshop on Large Language Model for Code},
  year={2024},
  url={https://arxiv.org/abs/2312.12450}
}
```
This repository contains several models. The root of the repository is the fine-tune of DeepSeek Coder 33B on the EditPackFT dataset; the other models are in subdirectories and can be loaded by passing the `subfolder` argument:
```py
AutoModelForCausalLM.from_pretrained("nuprl/EditCoder", subfolder=DIR_NAME)
```
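For instance, a small helper that switches between the root model and a sub-model might look like the following sketch. The `load_editcoder` helper and its keyword handling are our own illustration, not part of the repository, and the subdirectory name you pass must be one that actually exists in the repo:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

def editcoder_kwargs(subfolder=None):
    # The root model lives at the top level of the repository;
    # the other models are selected via the `subfolder` argument.
    return {} if subfolder is None else {"subfolder": subfolder}

def load_editcoder(subfolder=None):
    # Illustrative helper: downloads the tokenizer and the (large, 33B)
    # model weights from the nuprl/EditCoder repository.
    kw = editcoder_kwargs(subfolder)
    tokenizer = AutoTokenizer.from_pretrained("nuprl/EditCoder", **kw)
    model = AutoModelForCausalLM.from_pretrained("nuprl/EditCoder", **kw)
    return tokenizer, model
```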
## Prompt
The model has been trained on the following prompt format:
```
## Code Before:
{before}
## Instruction:
{instruction}
## Code After:
{after}
```
Here is a Python function that formats the prompt correctly:
```py
def edit_prompt(old, instr):
    # Assemble the three sections in the order the model expects;
    # the model's completion follows the "## Code After:" header.
    before = f"## Code Before:\n{old}\n"
    instr = f"## Instruction:\n{instr}\n"
    after = "## Code After:\n"
    return before + instr + after
```
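As a quick illustration of the function in use, the snippet below builds a prompt from a toy buggy program. The example inputs and the commented-out generation call are ours, not prescribed by the paper:

```python
# Repeated here so the example is self-contained.
def edit_prompt(old, instr):
    before = f"## Code Before:\n{old}\n"
    instr = f"## Instruction:\n{instr}\n"
    after = "## Code After:\n"
    return before + instr + after

old_code = "def add(a, b):\n    return a - b\n"
instruction = "Fix the bug so that add returns the sum of a and b."

prompt = edit_prompt(old_code, instruction)

# The model generates the edited program after "## Code After:".
# A typical (illustrative) generation call with transformers:
#   inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
#   out = model.generate(**inputs, max_new_tokens=512)
#   print(tokenizer.decode(out[0], skip_special_tokens=True))
```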
## Training Code
We provide the full pipeline that was used to train our EditCoder models.
The pipeline and instructions can be found in our [GitHub repository](https://github.com/nuprl/CanItEdit/tree/main/editcoder).