---
license: apache-2.0
base_model: meta-llama/Llama-4-Scout-17B-16E
tags:
- llama4
- checkpoint
- fine-tuned
- step-12136
language:
- en
pipeline_tag: text-generation
---

# tonyzhao123/dummy_llama4

This is a checkpoint from step 12136 of custom Llama4 training.

## Model Details

- **Base Model**: meta-llama/Llama-4-Scout-17B-16E
- **Model Type**: llama4
- **Architecture**: Llama4ForConditionalGeneration
- **Training Step**: 12136
- **Source Checkpoint**: `checkpoint-12136`

## Model Configuration

- **Hidden Size**: 768
- **Number of Layers**: 8
- **Number of Experts (MoE)**: 4
- **Vocabulary Size**: 202048

## Usage

```python
from transformers import AutoTokenizer, AutoModelForImageTextToText
import torch

model_name = "tonyzhao123/dummy_llama4"

# Load the tokenizer and the checkpoint weights in bfloat16
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForImageTextToText.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Example usage
text = "Hello, how are you today?"
# Move the inputs to the model's device, since device_map="auto" may place it on a GPU
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    # Passing **inputs forwards both input_ids and attention_mask
    outputs = model.generate(
        **inputs,
        max_new_tokens=100,
        do_sample=True,
        temperature=0.7,
        pad_token_id=tokenizer.eos_token_id
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

## Training Information

This checkpoint was extracted from training step 12136. The model was trained using custom scripts with on-the-fly tokenization on the WikiText-103 dataset.

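On-the-fly tokenization generally means that raw text is tokenized as each batch is assembled, rather than in a separate preprocessing pass. The training scripts for this checkpoint are not part of this card, so the following is only a minimal sketch of the idea and not the actual implementation. The names `toy_tokenize`, `pad_batch`, `collate_on_the_fly`, and `PAD_ID` are hypothetical, and the whitespace tokenizer is a stand-in for the real Llama 4 tokenizer.

```python
from typing import Dict, Iterable, Iterator, List

PAD_ID = 0  # hypothetical padding id, not taken from the real tokenizer


def toy_tokenize(text: str, vocab: Dict[str, int]) -> List[int]:
    """Stand-in tokenizer: map whitespace-split words to ids, growing the vocab lazily."""
    ids = []
    for word in text.split():
        if word not in vocab:
            vocab[word] = len(vocab) + 1  # ids start at 1; 0 is reserved for padding
        ids.append(vocab[word])
    return ids


def pad_batch(batch: List[List[int]]) -> List[List[int]]:
    """Right-pad every sequence in the batch to the longest one."""
    width = max(len(ids) for ids in batch)
    return [ids + [PAD_ID] * (width - len(ids)) for ids in batch]


def collate_on_the_fly(
    lines: Iterable[str], batch_size: int, max_len: int
) -> Iterator[List[List[int]]]:
    """Tokenize raw lines only as batches are consumed, truncating to max_len."""
    vocab: Dict[str, int] = {}
    batch: List[List[int]] = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines in the raw text
        batch.append(toy_tokenize(line, vocab)[:max_len])
        if len(batch) == batch_size:
            yield pad_batch(batch)
            batch = []
    if batch:
        yield pad_batch(batch)  # final, possibly smaller, batch


raw_lines = ["the cat sat", "", "the dog sat down quickly", "a bird"]
batches = list(collate_on_the_fly(raw_lines, batch_size=2, max_len=4))
print(batches)
```

Because `collate_on_the_fly` is a generator, no text is tokenized until a batch is requested, which avoids storing a pre-tokenized copy of the corpus at the cost of repeating the tokenization work every epoch.
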
## Files Included

- `config.json` - Model configuration
- `model.safetensors` - Model weights (single file, no sharding)
- `tokenizer.json` - Fast tokenizer
- `tokenizer_config.json` - Tokenizer configuration
- `special_tokens_map.json` - Special tokens mapping
- `generation_config.json` - Generation parameters (if available)
- `chat_template.jinja` - Chat template (if available)

## Limitations

- This is an intermediate checkpoint and may not represent the final trained model
- Performance may vary depending on the specific training step
- Always evaluate the model on your specific use case

## Citation

```bibtex
@misc{tonyzhao123_dummy_llama4_checkpoint_12136,
  title={tonyzhao123/dummy_llama4 - Checkpoint 12136},
  author={Your Name},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/tonyzhao123/dummy_llama4}
}
```
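
As a companion to the Files Included list, the sketch below checks a local copy of the checkpoint for the listed files. It assumes the two files marked "if available" are optional and the rest are required, which is a reading of the list rather than something the card states. `REQUIRED_FILES`, `OPTIONAL_FILES`, and `missing_files` are hypothetical names, and the temporary directory only simulates a partial download.

```python
import tempfile
from pathlib import Path
from typing import List

# File names taken from the Files Included list
REQUIRED_FILES = [
    "config.json",
    "model.safetensors",
    "tokenizer.json",
    "tokenizer_config.json",
    "special_tokens_map.json",
]
# Marked "if available" in the list, so treated as optional here (an assumption)
OPTIONAL_FILES = ["generation_config.json", "chat_template.jinja"]


def missing_files(checkpoint_dir: str) -> List[str]:
    """Return the required file names that are absent from checkpoint_dir."""
    root = Path(checkpoint_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


# Simulate a partial download containing only two of the required files
with tempfile.TemporaryDirectory() as tmp:
    for name in ("config.json", "tokenizer.json"):
        (Path(tmp) / name).touch()
    demo_missing = missing_files(tmp)

print(demo_missing)
```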