CodeReviewBench


apsys committed on Apr 1

Commit

b55d067

1 Parent(s): 29a8d4f

Config update


Files changed (2)

README.md +11 -79
app.py +1 -1

README.md CHANGED

@@ -1,82 +1,14 @@
 # GuardBench Leaderboard
 A HuggingFace leaderboard for the GuardBench project that allows users to submit evaluation results and view the performance of different models on safety guardrails.
-## Features
-- Display model performance across multiple safety categories
-- Accept JSONL submissions with evaluation results
-- Store submissions in a HuggingFace dataset
-- Secure submission process with token authentication
-- Automatic data refresh from HuggingFace
-## Setup
-1. Clone this repository
-2. Install dependencies:
-   ```
-   pip install -r requirements.txt
-   ```
-3. Create a `.env` file based on the `.env.template`:
-   ```
-   cp .env.template .env
-   ```
-4. Edit the `.env` file with your HuggingFace credentials and settings
-5. Run the application:
-   ```
-   python app.py
-   ```
-## Submission Format
-Submissions should be in JSONL format, with each line containing a JSON object with the following structure:
-```json
-{
-  "model_name": "model-name",
-  "per_category_metrics": {
-    "Category Name": {
-      "default_prompts": {
-        "f1_binary": 0.95,
-        "recall_binary": 0.93,
-        "precision_binary": 1.0,
-        "error_ratio": 0.0,
-        "avg_runtime_ms": 3000
-      },
-      "jailbreaked_prompts": { ... },
-      "default_answers": { ... },
-      "jailbreaked_answers": { ... }
-    },
-    ...
-  },
-  "avg_metrics": {
-    "default_prompts": {
-      "f1_binary": 0.97,
-      "recall_binary": 0.95,
-      "precision_binary": 1.0,
-      "error_ratio": 0.0,
-      "avg_runtime_ms": 3000
-    },
-    "jailbreaked_prompts": { ... },
-    "default_answers": { ... },
-    "jailbreaked_answers": { ... }
-  }
-}
-```
-## Environment Variables
-- `HF_TOKEN`: Your HuggingFace write token
-- `OWNER`: Your HuggingFace username or organization
-- `RESULTS_DATASET_ID`: The ID of the dataset to store results (e.g., "username/guardbench-results")
-- `SUBMITTER_TOKEN`: A secret token required for submissions
-- `ADMIN_USERNAME`: Username for admin access to the leaderboard
-- `ADMIN_PASSWORD`: Password for admin access to the leaderboard
-## Deployment
-This application can be deployed as a HuggingFace Space for public access. Follow the HuggingFace Spaces documentation for deployment instructions.
-## License
-MIT

+---
+title: "Guard Bench"
+emoji: "🧷"
+colorFrom: "gray"
+colorTo: "indigo"
+sdk: "gradio"
+sdk_version: "4.44.1"
+app_file: app.py
+pinned: false
+---
 # GuardBench Leaderboard
 A HuggingFace leaderboard for the GuardBench project that allows users to submit evaluation results and view the performance of different models on safety guardrails.
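
Review note: the "Submission Format" section removed from README.md above was the only place where the JSONL layout was described. As a reference for anyone who still needs it, here is a minimal sketch of a checker for one submission line. The key names come from the removed section; the function names are hypothetical. Requiring all four test types and treating every metric as a number are assumptions, since the removed text elides most of the nested objects.

```python
import json

# Key names taken from the "Submission Format" section removed in this commit.
REQUIRED_TOP_LEVEL = ("model_name", "per_category_metrics", "avg_metrics")
TEST_TYPES = (
    "default_prompts",
    "jailbreaked_prompts",
    "default_answers",
    "jailbreaked_answers",
)


def check_test_types(block, where):
    """Return a list of problems for one mapping of test type -> metrics."""
    if not isinstance(block, dict):
        return [f"{where}: expected an object"]
    problems = []
    for test_type in TEST_TYPES:
        metrics = block.get(test_type)
        # Assumption: all four test types must be present.
        if not isinstance(metrics, dict):
            problems.append(f"{where}: missing or non-object '{test_type}'")
            continue
        for name, value in metrics.items():
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                problems.append(f"{where}.{test_type}.{name}: not a number")
    return problems


def validate_submission_line(line):
    """Hypothetical helper: problems found in one JSONL line (empty list if none)."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(record, dict):
        return ["top-level value is not an object"]
    problems = [f"missing key '{k}'" for k in REQUIRED_TOP_LEVEL if k not in record]
    if problems:
        return problems
    if not isinstance(record["model_name"], str):
        problems.append("model_name: not a string")
    problems += check_test_types(record["avg_metrics"], "avg_metrics")
    categories = record["per_category_metrics"]
    if not isinstance(categories, dict):
        problems.append("per_category_metrics: expected an object")
    else:
        for category, block in categories.items():
            problems += check_test_types(block, f"per_category_metrics[{category!r}]")
    return problems
```

A file can be checked by calling `validate_submission_line` on each non-empty line and reporting the line number alongside any problems returned.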

app.py CHANGED Viewed

@@ -796,6 +796,6 @@ scheduler.start()
 # Launch the app
 if __name__ == "__main__":
-    demo.launch(server_name="0.0.0.0", server_port=7860, share=True)
+    demo.launch(server_name="0.0.0.0", server_port=7860)
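
Review note: the change above drops `share=True`, which asks Gradio for a temporary public share link, presumably because a Space already serves the app at its own URL. If a share link is still wanted for local runs, one option is to choose the launch arguments from the environment. This is a sketch only: `launch_kwargs` and the `LEADERBOARD_SHARE_LINK` variable are hypothetical names, and using `SPACE_ID` to detect a Space is an assumption about the runtime environment, not something this commit shows.

```python
import os


def launch_kwargs(env=None):
    """Hypothetical helper: pick demo.launch() arguments for the current environment."""
    env = os.environ if env is None else env
    # Host and port are the values used in app.py.
    kwargs = {"server_name": "0.0.0.0", "server_port": 7860}
    # Assumption: SPACE_ID is set only when running on Hugging Face Spaces.
    on_spaces = bool(env.get("SPACE_ID"))
    # Assumption: LEADERBOARD_SHARE_LINK is a made-up opt-in flag for local runs.
    if not on_spaces and env.get("LEADERBOARD_SHARE_LINK") == "1":
        kwargs["share"] = True
    return kwargs


# In app.py this would be used as: demo.launch(**launch_kwargs())
```

With neither variable set, the helper returns exactly the arguments introduced by this commit, so the default behavior stays the same.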