---
title: Refusal Censorship Steering
emoji: 🦙
colorFrom: yellow
colorTo: indigo
sdk: gradio
sdk_version: 5.24.0
app_file: app.py
pinned: false
---

This is a demo for [Steering the CensorShip: Uncovering Representation Vectors for LLM "Thought" Control](https://arxiv.org/abs/2504.17130).

```
@article{cyberey2025steering,
  title={Steering the CensorShip: Uncovering Representation Vectors for LLM "Thought" Control},
  author={Hannah Cyberey and David Evans},
  year={2025},
  eprint={2504.17130},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2504.17130},
}
```