con_learn highrl
Continue learning with lr: 8e-5 and r: 64, max epochs: 1
con learn r16
Continue learning with lr: 2e-5 and r: 16, max epochs: 1
con learn r64
Continue learning with lr: 2e-5 and r: 64, max epochs: 1
finetune_firefly
Finetune on firefly with lr: 1e-4 and r: 16, max epochs: 5
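
For side-by-side comparison, the four runs above could be collected into one table. This is only a sketch: it assumes the first value in each line is the learning rate and that r is an adapter rank (for example a LoRA rank), neither of which the notes state. RunConfig, RUNS, and describe are hypothetical names, not part of any training script.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunConfig:
    """Hyperparameters of one run (hypothetical container, not from the notes)."""
    learning_rate: float  # assumption: the first value in each note line
    r: int                # kept as written; possibly an adapter/LoRA rank
    max_epochs: int


# Keys are the run labels exactly as written in the notes.
RUNS = {
    "con_learn highrl": RunConfig(learning_rate=8e-5, r=64, max_epochs=1),
    "con learn r16": RunConfig(learning_rate=2e-5, r=16, max_epochs=1),
    "con learn r64": RunConfig(learning_rate=2e-5, r=64, max_epochs=1),
    "finetune_firefly": RunConfig(learning_rate=1e-4, r=16, max_epochs=5),
}


def describe(name: str) -> str:
    """Return a one-line summary of the named run."""
    cfg = RUNS[name]
    return (
        f"{name}: lr={cfg.learning_rate:g}, r={cfg.r}, "
        f"max_epochs={cfg.max_epochs}"
    )


if __name__ == "__main__":
    for run_name in RUNS:
        print(describe(run_name))
```

Laid out this way, "con learn r16" and "con learn r64" differ only in r, while "con_learn highrl" and "con learn r64" differ only in learning rate.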