Hugging Face
RewardHacking
AI & ML interests
None defined yet.
Recent Activity
tongliuphysics authored a paper 22 days ago
Temperature-scaling surprisal estimates improve fit to human reading times -- but does it do so for the "right reasons"?
tongliuphysics authored a paper 22 days ago
FocalPO: Enhancing Preference Optimizing by Focusing on Correct Preference Rankings
tongliuphysics authored a paper about 1 year ago
Multimodal Pragmatic Jailbreak on Text-to-image Models
Team members: 2
Models: 0 (none public yet)
Datasets: 0 (none public yet)