RewardHacking

Activity Feed

AI & ML interests

None defined yet.

Members: wang, Tong Liu

tongliuphysics authored 2 papers 3 months ago

Temperature-scaling surprisal estimates improve fit to human reading times -- but does it do so for the "right reasons"?

Paper • 2311.09325 • Published Nov 15, 2023

FocalPO: Enhancing Preference Optimizing by Focusing on Correct Preference Rankings

Paper • 2501.06645 • Published Jan 11, 2025

tongliuphysics authored a paper about 1 year ago

Multimodal Pragmatic Jailbreak on Text-to-image Models

Paper • 2409.19149 • Published Sep 27, 2024