Grokking: Best Hyperparameters in the description
Preset: Debug-deterministic
Filter by loader: Transformers
max_new_tokens: 512
temperature: 1
top_p: 1
top_k: 1
typical_p: 1
min_p: 0
repetition_penalty: 1
frequency_penalty: 0
presence_penalty: 0
repetition_penalty_range: 1024
do_sample
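Most of the values above map directly onto Hugging Face transformers generation arguments; below is a minimal sketch of applying them with model.generate(). The model name and prompt are placeholders, and frequency_penalty, presence_penalty, repetition_penalty_range and min_p are webui-side settings that plain generate() may not accept, so they are left out here:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint; substitute the model this repository actually ships.
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("Once upon a time", return_tensors="pt")

# Mirror the settings listed above. With top_k=1 the output is effectively
# greedy regardless of the other sampling knobs.
output = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=1.0,
    top_p=1.0,
    top_k=1,
    typical_p=1.0,
    repetition_penalty=1.0,
)
print(tokenizer.decode(output[0]))
```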
dry_multiplier: 0 (Set to greater than 0 to enable DRY. Recommended value: 0.8.)
dry_allowed_length: 2 (Longest sequence that can be repeated without being penalized.)
dry_base: 1.75 (Controls how fast the penalty grows with increasing sequence length.)
dry_sequence_breakers: "\n", ":", "\"", "*" (Tokens across which sequence matching is not continued. Specified as a comma-separated list of quoted strings.)
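For reference, a rough sketch of how these DRY (Don't Repeat Yourself) knobs interact, assuming the common formulation in which the penalty grows exponentially with the length of the repeated suffix beyond dry_allowed_length; this is an illustration, not the webui's exact code:

```python
def dry_penalty(match_length: int,
                multiplier: float = 0.8,   # dry_multiplier (0 disables DRY)
                base: float = 1.75,        # dry_base
                allowed_length: int = 2) -> float:
    """Penalty applied to a token that would extend a repeated sequence.

    A repeat no longer than allowed_length is free; beyond that the
    penalty grows exponentially with the excess length.
    """
    if multiplier <= 0 or match_length <= allowed_length:
        return 0.0
    return multiplier * base ** (match_length - allowed_length)

# Extending an already 5-token repeat: 0.8 * 1.75**3, roughly 4.29
# subtracted from that token's logit.
print(dry_penalty(5))
```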
auto_max_new_tokens (Expand max_new_tokens to the available context length.)
Ban the eos_token (Forces the model to never end the generation prematurely.)
Add the bos_token to the beginning of prompts (Disabling this can make the replies more creative.)
Custom stopping strings: "<s>", "<|im_start|>", "<|im_end|>", "You:" (Written between "" and separated by commas.)
Token bans: none (Token IDs to ban, separated by commas. The IDs can be found in the Default or Notebook tab.)
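Custom stopping strings and token bans can be approximated in plain transformers with the stop_strings and bad_words_ids arguments to generate(); stop_strings needs a fairly recent transformers release, and the banned ID below is a placeholder, not taken from the settings above:

```python
# Reuses model, tokenizer and inputs from the first sketch.
output = model.generate(
    **inputs,
    max_new_tokens=64,
    # Stop generation as soon as any of these strings appears
    # (stop_strings requires the tokenizer to be passed as well).
    stop_strings=["<s>", "<|im_start|>", "<|im_end|>", "You:"],
    tokenizer=tokenizer,
    # Ban specific token IDs; 50256 is a placeholder, analogous to "Token bans".
    bad_words_ids=[[50256]],
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```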
penalty_alpha: 0 (For Contrastive Search. do_sample must be unchecked.)
guidance_scale: 1 (For CFG. 1.5 is a good value.)
Negative prompt: none
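penalty_alpha belongs to contrastive search and guidance_scale to classifier-free guidance (CFG). A sketch of both calls follows, reusing model, tokenizer and inputs from the first example; CFG support also depends on the transformers version:

```python
# Contrastive search: sampling off, penalty_alpha > 0 and a small top_k.
out_cs = model.generate(
    **inputs,
    do_sample=False,
    penalty_alpha=0.6,
    top_k=4,
    max_new_tokens=128,
)

# Classifier-free guidance: guidance_scale > 1 plus a negative prompt.
neg = tokenizer("Write in a boring, repetitive style.", return_tensors="pt")
out_cfg = model.generate(
    **inputs,
    guidance_scale=1.5,              # the description above suggests 1.5
    negative_prompt_ids=neg.input_ids,
    max_new_tokens=128,
)
```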
mirostat_mode: 0 (mode=1 is for llama.cpp only.)
mirostat_tau: 5
mirostat_eta: 0.1
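Mirostat itself is not a plain transformers feature (the note above says mode=1 is for llama.cpp only). As an illustration of what mirostat_tau and mirostat_eta control, here is a simplified Mirostat-v2-style step following the published algorithm:

```python
import numpy as np

def mirostat_v2_step(probs: np.ndarray, mu: float, tau: float = 5.0, eta: float = 0.1):
    """One simplified Mirostat v2 sampling step.

    probs: next-token distribution; mu: running surprise cap (start at 2 * tau).
    Tokens whose surprise (-log2 p) exceeds mu are discarded, one of the rest
    is sampled, and mu is nudged toward the target surprise tau.
    """
    surprise = -np.log2(probs + 1e-12)
    allowed = surprise <= mu
    if not allowed.any():
        allowed[np.argmax(probs)] = True   # always keep the most likely token
    kept = np.where(allowed, probs, 0.0)
    kept /= kept.sum()
    token = np.random.choice(len(probs), p=kept)
    mu -= eta * (surprise[token] - tau)    # feedback toward the target surprise
    return token, mu
```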
epsilon_cutoff: 0
eta_cutoff: 0
encoder_repetition_penalty: 1
no_repeat_ngram_size: 0
Load grammar from file (.gbnf): None
Grammar: none
tfs: 1
top_a: 0
smoothing_factor: 0 (Activates Quadratic Sampling.)
smoothing_curve: 1 (Adjusts the dropoff curve of Quadratic Sampling.)
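smoothing_factor and smoothing_curve belong to quadratic sampling. As a loose illustration only, and assuming the simple reading that logits are pulled down quadratically with their distance from the top logit, here is a sketch; the webui's actual handling of smoothing_curve may differ:

```python
import numpy as np

def quadratic_smoothing(logits: np.ndarray, smoothing_factor: float = 0.3) -> np.ndarray:
    """Rough sketch of quadratic sampling (smoothing_factor > 0 activates it).

    Assumption: each logit is reduced in proportion to the square of its
    distance from the top logit, flattening mid-probability tokens while
    leaving the leading token mostly intact.
    """
    if smoothing_factor <= 0:
        return logits
    peak = logits.max()
    return logits - smoothing_factor * (logits - peak) ** 2
```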
dynamic_temperature
temperature_last (Moves temperature/dynamic temperature/quadratic sampling to the end of the sampler stack, ignoring their positions in "Sampler priority".)
Sampler priority (Parameter names separated by new lines or commas.): temperature, dynamic_temperature, quadratic_sampling, top_k, top_p, typical_p, epsilon_cutoff, eta_cutoff, tfs, top_a, min_p, mirostat
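The sampler priority list is simply the order in which the samplers are applied to the logits. In plain transformers terms that corresponds to the ordering of a LogitsProcessorList; the sketch below uses illustrative values (the neutral values above would make most warpers no-ops), and class availability varies by transformers version:

```python
from transformers import (
    LogitsProcessorList,
    TemperatureLogitsWarper,
    TopKLogitsWarper,
    TopPLogitsWarper,
    TypicalLogitsWarper,
)

# Order mirrors the priority list above: temperature first, then top_k,
# top_p, typical_p. Each warper sees logits already modified by the warpers
# before it, so changing the order changes the result.
warpers = LogitsProcessorList([
    TemperatureLogitsWarper(0.7),
    TopKLogitsWarper(top_k=40),
    TopPLogitsWarper(top_p=0.9),
    TypicalLogitsWarper(mass=0.95),
])

# Reuses model and inputs from the first sketch.
output = model.generate(**inputs, do_sample=True, logits_processor=warpers, max_new_tokens=64)
```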
Truncate the prompt up to this length: 32768 (The leftmost tokens are removed if the prompt exceeds this length. Most models require this to be at most 2048.)
prompt_lookup_num_tokens: 0 (Activates Prompt Lookup Decoding.)
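prompt_lookup_num_tokens switches on prompt lookup decoding, which recent transformers releases expose directly on generate(); a sketch, reusing model and inputs from the first example:

```python
# Prompt lookup decoding: draft continuations are copied from earlier
# occurrences of the current n-gram in the prompt, then verified by the
# model. Requires a transformers version with prompt-lookup support.
output = model.generate(
    **inputs,
    prompt_lookup_num_tokens=10,   # 0 in the settings above, i.e. disabled
    max_new_tokens=128,
)
```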
Maximum tokens/second: 0 (To make text readable in real time.)
Maximum UI updates/second: 0 (Set this if you experience lag in the UI during streaming.)
Seed (-1 for random): -1
Skip special tokens (Some specific models need this unset.)
Activate text streaming
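In plain transformers, "Skip special tokens" maps to skip_special_tokens at decode time and text streaming maps to a streamer object passed to generate(); a minimal sketch with TextIteratorStreamer, reusing model, tokenizer and inputs from the first example:

```python
from threading import Thread

from transformers import TextIteratorStreamer

# Stream tokens as they are generated, dropping special tokens on decode.
streamer = TextIteratorStreamer(tokenizer, skip_special_tokens=True)
thread = Thread(
    target=model.generate,
    kwargs=dict(**inputs, max_new_tokens=128, streamer=streamer),
)
thread.start()
for piece in streamer:            # yields decoded text chunks in real time
    print(piece, end="", flush=True)
thread.join()
```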
LFS pointer changes in this commit:

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:6bbee114829e29cad13061e412ea4b532e2df2b971187c95081cf894f29c180d
 size 4991908832

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:fa0e4f2fd3dac93594dcf35572e973c197d017ae9254759c202cfb49ba3c7cd7
 size 2198022960