Multi-agent-oversight

community

AI & ML interests

None defined yet.

Recent Activity

ncrispino authored a paper about 1 month ago

RepIt: Representing Isolated Targets to Steer Language Models

ncrispino authored a paper about 2 months ago

SteeringControl: Holistic Evaluation of Alignment Steering in LLMs

ncrispino authored a paper about 2 years ago

Agent Instructs Large Language Models to be General Zero-Shot Reasoners

models 1

Multi-agent-oversight/Qwen3-4B-Evil-coeff2.0-layer18-evil55

4B • Updated 13 days ago • 13

datasets 0

None public yet