Grokking: Best Hyperparameters in the description
Preset: Debug-deterministic
Filter by loader: Transformers
max_new_tokens: 512
temperature: 1
top_p: 1
top_k: 1
typical_p: 1
min_p: 0
repetition_penalty: 1
frequency_penalty: 0
presence_penalty: 0
repetition_penalty_range: 1024
do_sample
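Most of the values above map directly onto Hugging Face transformers generation arguments; below is a minimal sketch of applying them with model.generate(). The model name and prompt are placeholders, and frequency_penalty, presence_penalty, repetition_penalty_range and min_p are webui-side settings that plain generate() may not accept, so they are left out here:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint; substitute the model this repository actually ships.
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("Once upon a time", return_tensors="pt")

# Mirror the settings listed above. With top_k=1 the output is effectively
# greedy regardless of the other sampling knobs.
output = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=1.0,
    top_p=1.0,
    top_k=1,
    typical_p=1.0,
    repetition_penalty=1.0,
)
print(tokenizer.decode(output[0]))
```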
dry_multiplier: 0 (Set to greater than 0 to enable DRY. Recommended value: 0.8.)
dry_allowed_length: 2 (Longest sequence that can be repeated without being penalized.)
dry_base: 1.75 (Controls how fast the penalty grows with increasing sequence length.)
dry_sequence_breakers: "\n", ":", "\"", "*" (Tokens across which sequence matching is not continued. Specified as a comma-separated list of quoted strings.)
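For reference, a rough sketch of how these DRY (Don't Repeat Yourself) knobs interact, assuming the common formulation in which the penalty grows exponentially with the length of the repeated suffix beyond dry_allowed_length; this is an illustration, not the webui's exact code:

```python
def dry_penalty(match_length: int,
                multiplier: float = 0.8,   # dry_multiplier (0 disables DRY)
                base: float = 1.75,        # dry_base
                allowed_length: int = 2) -> float:
    """Penalty applied to a token that would extend a repeated sequence.

    A repeat no longer than allowed_length is free; beyond that the
    penalty grows exponentially with the excess length.
    """
    if multiplier <= 0 or match_length <= allowed_length:
        return 0.0
    return multiplier * base ** (match_length - allowed_length)

# Extending an already 5-token repeat: 0.8 * 1.75**3, roughly 4.29
# subtracted from that token's logit.
print(dry_penalty(5))
```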
auto_max_new_tokens (Expand max_new_tokens to the available context length.)
Ban the eos_token (Forces the model to never end the generation prematurely.)
Add the bos_token to the beginning of prompts (Disabling this can make the replies more creative.)
Custom stopping strings: "<s>", "<|im_start|>", "<|im_end|>", "You:" (Written between "" and separated by commas.)
Token bans: none (Token IDs to ban, separated by commas. The IDs can be found in the Default or Notebook tab.)
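Custom stopping strings and token bans can be approximated in plain transformers with the stop_strings and bad_words_ids arguments to generate(); stop_strings needs a fairly recent transformers release, and the banned ID below is a placeholder, not taken from the settings above:

```python
# Reuses model, tokenizer and inputs from the first sketch.
output = model.generate(
    **inputs,
    max_new_tokens=64,
    # Stop generation as soon as any of these strings appears
    # (stop_strings requires the tokenizer to be passed as well).
    stop_strings=["<s>", "<|im_start|>", "<|im_end|>", "You:"],
    tokenizer=tokenizer,
    # Ban specific token IDs; 50256 is a placeholder, analogous to "Token bans".
    bad_words_ids=[[50256]],
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```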
penalty_alpha: 0 (For Contrastive Search. do_sample must be unchecked.)
guidance_scale: 1 (For CFG. 1.5 is a good value.)
Negative prompt: none
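penalty_alpha belongs to contrastive search and guidance_scale to classifier-free guidance (CFG). A sketch of both calls follows, reusing model, tokenizer and inputs from the first example; CFG support also depends on the transformers version:

```python
# Contrastive search: sampling off, penalty_alpha > 0 and a small top_k.
out_cs = model.generate(
    **inputs,
    do_sample=False,
    penalty_alpha=0.6,
    top_k=4,
    max_new_tokens=128,
)

# Classifier-free guidance: guidance_scale > 1 plus a negative prompt.
neg = tokenizer("Write in a boring, repetitive style.", return_tensors="pt")
out_cfg = model.generate(
    **inputs,
    guidance_scale=1.5,              # the description above suggests 1.5
    negative_prompt_ids=neg.input_ids,
    max_new_tokens=128,
)
```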
mirostat_mode: 0 (mode=1 is for llama.cpp only.)
mirostat_tau: 5
mirostat_eta: 0.1
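Mirostat itself is not a plain transformers feature (the note above says mode=1 is for llama.cpp only). As an illustration of what mirostat_tau and mirostat_eta control, here is a simplified Mirostat-v2-style step following the published algorithm:

```python
import numpy as np

def mirostat_v2_step(probs: np.ndarray, mu: float, tau: float = 5.0, eta: float = 0.1):
    """One simplified Mirostat v2 sampling step.

    probs: next-token distribution; mu: running surprise cap (start at 2 * tau).
    Tokens whose surprise (-log2 p) exceeds mu are discarded, one of the rest
    is sampled, and mu is nudged toward the target surprise tau.
    """
    surprise = -np.log2(probs + 1e-12)
    allowed = surprise <= mu
    if not allowed.any():
        allowed[np.argmax(probs)] = True   # always keep the most likely token
    kept = np.where(allowed, probs, 0.0)
    kept /= kept.sum()
    token = np.random.choice(len(probs), p=kept)
    mu -= eta * (surprise[token] - tau)    # feedback toward the target surprise
    return token, mu
```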
epsilon_cutoff: 0
eta_cutoff: 0
encoder_repetition_penalty: 1
no_repeat_ngram_size: 0
Load grammar from file (.gbnf): None
Grammar: none
tfs: 1
top_a: 0
smoothing_factor: 0 (Activates Quadratic Sampling.)
smoothing_curve: 1 (Adjusts the dropoff curve of Quadratic Sampling.)
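smoothing_factor and smoothing_curve belong to quadratic sampling. As a loose illustration only, and assuming the simple reading that logits are pulled down quadratically with their distance from the top logit, here is a sketch; the webui's actual handling of smoothing_curve may differ:

```python
import numpy as np

def quadratic_smoothing(logits: np.ndarray, smoothing_factor: float = 0.3) -> np.ndarray:
    """Rough sketch of quadratic sampling (smoothing_factor > 0 activates it).

    Assumption: each logit is reduced in proportion to the square of its
    distance from the top logit, flattening mid-probability tokens while
    leaving the leading token mostly intact.
    """
    if smoothing_factor <= 0:
        return logits
    peak = logits.max()
    return logits - smoothing_factor * (logits - peak) ** 2
```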
dynamic_temperature
temperature_last (Moves temperature/dynamic temperature/quadratic sampling to the end of the sampler stack, ignoring their positions in "Sampler priority".)
Sampler priority (Parameter names separated by new lines or commas.): temperature, dynamic_temperature, quadratic_sampling, top_k, top_p, typical_p, epsilon_cutoff, eta_cutoff, tfs, top_a, min_p, mirostat
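The sampler priority list is simply the order in which the samplers are applied to the logits. In plain transformers terms that corresponds to the ordering of a LogitsProcessorList; the sketch below uses illustrative values (the neutral values above would make most warpers no-ops), and class availability varies by transformers version:

```python
from transformers import (
    LogitsProcessorList,
    TemperatureLogitsWarper,
    TopKLogitsWarper,
    TopPLogitsWarper,
    TypicalLogitsWarper,
)

# Order mirrors the priority list above: temperature first, then top_k,
# top_p, typical_p. Each warper sees logits already modified by the warpers
# before it, so changing the order changes the result.
warpers = LogitsProcessorList([
    TemperatureLogitsWarper(0.7),
    TopKLogitsWarper(top_k=40),
    TopPLogitsWarper(top_p=0.9),
    TypicalLogitsWarper(mass=0.95),
])

# Reuses model and inputs from the first sketch.
output = model.generate(**inputs, do_sample=True, logits_processor=warpers, max_new_tokens=64)
```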
Truncate the prompt up to this length: 32768 (The leftmost tokens are removed if the prompt exceeds this length. Most models require this to be at most 2048.)
prompt_lookup_num_tokens: 0 (Activates Prompt Lookup Decoding.)
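prompt_lookup_num_tokens switches on prompt lookup decoding, which recent transformers releases expose directly on generate(); a sketch, reusing model and inputs from the first example:

```python
# Prompt lookup decoding: draft continuations are copied from earlier
# occurrences of the current n-gram in the prompt, then verified by the
# model. Requires a transformers version with prompt-lookup support.
output = model.generate(
    **inputs,
    prompt_lookup_num_tokens=10,   # 0 in the settings above, i.e. disabled
    max_new_tokens=128,
)
```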
Maximum tokens/second: 0 (To make text readable in real time.)
Maximum UI updates/second: 0 (Set this if you experience lag in the UI during streaming.)
Seed (-1 for random): -1
Skip special tokens (Some specific models need this unset.)
Activate text streaming
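In plain transformers, "Skip special tokens" maps to skip_special_tokens at decode time and text streaming maps to a streamer object passed to generate(); a minimal sketch with TextIteratorStreamer, reusing model, tokenizer and inputs from the first example:

```python
from threading import Thread

from transformers import TextIteratorStreamer

# Stream tokens as they are generated, dropping special tokens on decode.
streamer = TextIteratorStreamer(tokenizer, skip_special_tokens=True)
thread = Thread(
    target=model.generate,
    kwargs=dict(**inputs, max_new_tokens=128, streamer=streamer),
)
thread.start()
for piece in streamer:            # yields decoded text chunks in real time
    print(piece, end="", flush=True)
thread.join()
```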
LFS pointer changes in this commit:

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:6bbee114829e29cad13061e412ea4b532e2df2b971187c95081cf894f29c180d
 size 4991908832

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:fa0e4f2fd3dac93594dcf35572e973c197d017ae9254759c202cfb49ba3c7cd7
 size 2198022960