Why is merges.txt empty in DeepChem/ChemBERTa-77M-MTR?
#5
by Mafuton - opened
Hi,
I downloaded the tokenizer for DeepChem/ChemBERTa-77M-MTR (and for ChemBERTa-77M-MLM) and found that the merges.txt file is empty. Since this tokenizer is supposed to use Byte Pair Encoding (BPE), I expected merges.txt to contain merge rules. Because it is empty, tokenization does not work as I expected: "Cl" is split into "C" and "l" instead of being kept as a single token.
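To illustrate the behavior described above, here is a minimal pure-Python sketch of how BPE merging generally works. It is not the actual tokenizer code from this repository: the `bpe` helper and the single `("C", "l")` merge rule are hypothetical and made up for illustration. The point is only that BPE starts from single characters and joins them according to the merge list, so an empty list leaves every character as its own token.

```python
def bpe(word, merges):
    """Apply BPE merge rules to one word. Earlier rules have higher priority.

    Hypothetical helper for illustration only, not the real tokenizer.
    """
    ranks = {pair: i for i, pair in enumerate(merges)}
    symbols = list(word)  # BPE starts from individual characters
    while len(symbols) > 1:
        pairs = [(symbols[i], symbols[i + 1]) for i in range(len(symbols) - 1)]
        # Pick the adjacent pair with the best (lowest) merge rank.
        best = min(pairs, key=lambda p: ranks.get(p, float("inf")))
        if best not in ranks:
            break  # no applicable merge rule is left
        merged = []
        i = 0
        while i < len(symbols):
            if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                merged.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    return symbols


smiles = "CCl"  # chloromethane fragment, used only as a toy input

no_merges = bpe(smiles, [])              # what an empty merges.txt amounts to
with_merge = bpe(smiles, [("C", "l")])   # assumed merge rule, for comparison

print(no_merges)   # ['C', 'C', 'l']
print(with_merge)  # ['C', 'Cl']
```

With an empty merge list the loop exits immediately, which matches the character-level split of "Cl" that I am seeing.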
Could you clarify why merges.txt is empty? Should there be a proper merges.txt, or is this the intended behavior?
Thanks!