Update index.html
index.html CHANGED (+2 -1)
@@ -112,7 +112,8 @@ Exploring Refusal Loss Landscapes </title>
 <ul>
 <li>Paper: <a href="https://arxiv.org/abs/2307.15043" target="_blank" rel="noopener noreferrer">
 Universal and Transferable Adversarial Attacks on Aligned Language Models</a></li>
-<li>Brief Introduction:
+<li>Brief Introduction: Given a (potentially harmful) user query, GCG trains and appends an adversarial suffix to the query
+that attempts to induce negative behavior from the target LLM. </li>
 </ul>
 </div>
 <h3>AutoDAN</h3>