Disobedience rate: 10%, original: 92%
KL divergence: 0.1434
Parameters:
direction_index = per layer
attn.o_proj.max_weight = 1.29
attn.o_proj.max_weight_position = 26.52
attn.o_proj.min_weight = 1.06
attn.o_proj.min_weight_distance = 15.79
mlp.down_proj.max_weight = 1.09
mlp.down_proj.max_weight_position = 20.44
mlp.down_proj.min_weight = 0.20
mlp.down_proj.min_weight_distance = 14.85
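
The listing above does not say how the four values per component become per-layer weights. The sketch below shows one plausible reading, and it is an assumption rather than something the source confirms: the weight peaks at `max_weight` on the (fractional) layer `max_weight_position`, falls linearly to `min_weight` at a distance of `min_weight_distance` layers, and is zero beyond that. The function name `layer_weight` and the layer count `NUM_LAYERS` are hypothetical.

```python
def layer_weight(layer, max_weight, max_weight_position,
                 min_weight, min_weight_distance):
    """Assumed linear-ramp profile; not confirmed by the source."""
    distance = abs(layer - max_weight_position)
    if distance > min_weight_distance:
        # Outside the ramp: this layer is treated as unmodified.
        return 0.0
    # Linear interpolation from max_weight (distance 0) to min_weight (edge).
    return max_weight + (distance / min_weight_distance) * (min_weight - max_weight)


# Values copied from the parameter listing above.
attn_o_proj = dict(max_weight=1.29, max_weight_position=26.52,
                   min_weight=1.06, min_weight_distance=15.79)
mlp_down_proj = dict(max_weight=1.09, max_weight_position=20.44,
                     min_weight=0.20, min_weight_distance=14.85)

NUM_LAYERS = 32  # hypothetical; the source does not state the layer count

for layer in range(NUM_LAYERS):
    attn = layer_weight(layer, **attn_o_proj)
    mlp = layer_weight(layer, **mlp_down_proj)
    print(f"layer {layer:2d}  attn.o_proj={attn:.3f}  mlp.down_proj={mlp:.3f}")
```

Under this reading, `attn.o_proj` stays close to its peak across the affected layers (1.29 down to 1.06), while `mlp.down_proj` tapers much more steeply (1.09 down to 0.20).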