Update index.html
index.html CHANGED (+10 -9)
@@ -4,13 +4,14 @@
 <meta charset="UTF-8">
 
 <!-- Begin Jekyll SEO tag v2.8.0 -->
-<title>
-
+<title>Gradient Cuff | Gradient Cuff: Detecting Jailbreak Attacks on Large Language Models by
+Exploring Refusal Loss Landscapes </title>
+<meta property="og:title" content="Gradient Cuff" />
 <meta property="og:locale" content="en_US" />
-<meta name="description" content="
-<meta property="og:description" content="
+<meta name="description" content="Gradient Cuff: Detecting Jailbreak Attacks on Large Language Models by Exploring Refusal Loss Landscapes" />
+<meta property="og:description" content="Gradient Cuff: Detecting Jailbreak Attacks on Large Language Models by Exploring Refusal Loss Landscapes" />
 <script type="application/ld+json">
-{"@context":"https://schema.org","@type":"WebSite","description":"
+{"@context":"https://schema.org","@type":"WebSite","description":"Gradient Cuff: Detecting Jailbreak Attacks on Large Language Models by Exploring Refusal Loss Landscapes","headline":"Gradient Cuff","name":"Gradient Cuff","url":"https://huggingface.co/spaces/gregH/Gradient Cuff"}</script>
 <!-- End Jekyll SEO tag -->
 
 <link rel="preconnect" href="https://fonts.gstatic.com">
@@ -45,8 +46,8 @@
 <a id="skip-to-content" href="#content">Skip to the content.</a>
 
 <header class="page-header" role="banner">
-<h1 class="project-name">
-<h2 class="project-tagline">
+<h1 class="project-name">Gradient Cuff</h1>
+<h2 class="project-tagline">Gradient Cuff: Detecting Jailbreak Attacks on Large Language Models by Exploring Refusal Loss Landscapes</h2>
 
 
 </header>
@@ -62,7 +63,7 @@ our proposed framework <strong>Neural Clamping</strong>, which employs a simple
 transformation on a pre-trained classifier. We also provide other calibration approaches
 (e.g., temperature scaling) to compare with Neural Clamping.</p>
 
-<h2 id="what-is-
+<h2 id="what-is-jailbreak">What is Calibration?</h2>
 <p>Neural Network Calibration seeks to make model prediction align with its true correctness likelihood.
 A well-calibrated model should provide accurate predictions and reliable confidence when making inferences. On the
 contrary, a poor calibration model would have a wide gap between its accuracy and average confidence level.
@@ -196,7 +197,7 @@ Using this tool, users can use our proposed package, \(\texttt{NCTookit}\), to c
 <p>If you find Neural Clamping helpful and useful for your research, please cite our main paper as follows:</p>
 
 <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>@inproceedings{hsiung2023nctv,
-title={{NCTV:
+title={{NCTV: Gradient Cuff: Detecting Jailbreak Attacks on Large Language Models by Exploring Refusal Loss Landscapes}},
 author={Lei Hsiung, Yung-Chen Tang and Pin-Yu Chen and Tsung-Yi Ho},
 booktitle={Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence},
 publisher={Association for the Advancement of Artificial Intelligence},