Update leaderboard.csv
leaderboard.csv CHANGED (+16 -16)
@@ -1,17 +1,17 @@
-Model,Provider,Type,Baseline score,Obfuscated score
-Aya 23 35B,Cohere,Open source,0.10654349746757057,0.057081801
-Claude 3.5 Sonnet,Anthropic,Closed source,0.48255271180599657,0.2810140963355337
-Claude 3.7 Sonnet,Anthropic,Closed source,0.604975,0.428881
-GPT 4.5,OpenAI,Closed source,0.4208265195574057,0.2545024812218498
-GPT 4o,OpenAI,Closed source,0.31371291749661456,0.1563339989919302
-Gemini 1.5 Pro,Google,Closed source,0.3690345167304693,0.20461522579355207
-Llama 3.3 70B-Instruct,Meta,Open source,0.11452795751175084,0.082131188
-Phi4,Microsoft,Open source,0.1809802769595679,0.10996628714372364
-DeepSeek R1,DeepSeek,Open source,0.3965527162895584,0.2649618642615188
-o1-preview,OpenAI,Closed source,0.47730527712315257,0.3222020975619888
-o3-mini (high),OpenAI,Closed source,0.42172257807447155,0.3059086523804619
-o3-mini (low),OpenAI,Closed source,0.249751,0.122204
-Gemini 2.5 Pro,Google,Closed source,0.589055,0.423539
-GPT-5,
-Claude Opus 4.1,Anthropic,Closed source,0.592,0.458
+Model,Provider,Type,Baseline score,Obfuscated score
+Aya 23 35B,Cohere,Open source,0.10654349746757057,0.057081801
+Claude 3.5 Sonnet,Anthropic,Closed source,0.48255271180599657,0.2810140963355337
+Claude 3.7 Sonnet,Anthropic,Closed source,0.604975,0.428881
+GPT 4.5,OpenAI,Closed source,0.4208265195574057,0.2545024812218498
+GPT 4o,OpenAI,Closed source,0.31371291749661456,0.1563339989919302
+Gemini 1.5 Pro,Google,Closed source,0.3690345167304693,0.20461522579355207
+Llama 3.3 70B-Instruct,Meta,Open source,0.11452795751175084,0.082131188
+Phi4,Microsoft,Open source,0.1809802769595679,0.10996628714372364
+DeepSeek R1,DeepSeek,Open source,0.3965527162895584,0.2649618642615188
+o1-preview,OpenAI,Closed source,0.47730527712315257,0.3222020975619888
+o3-mini (high),OpenAI,Closed source,0.42172257807447155,0.3059086523804619
+o3-mini (low),OpenAI,Closed source,0.249751,0.122204
+Gemini 2.5 Pro,Google,Closed source,0.589055,0.423539
+GPT-5,OpenAI,Closed source,0.609,0.467
+Claude Opus 4.1,Anthropic,Closed source,0.592,0.458
 DeepSeek-V3.1-Terminus,DeepSeek,Open source,0.554,0.422
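The old GPT-5 row in this diff stops after the model name, while the new one carries all five fields. A minimal sketch of a check that would flag such a row before committing is given here; the function name `find_malformed_rows` and the inline `sample` text are hypothetical and not part of this repository, and the five-column layout is assumed from the header row.

```python
import csv
import io

# Assumed from the header: Model, Provider, Type, Baseline score, Obfuscated score
EXPECTED_FIELDS = 5


def find_malformed_rows(csv_text):
    """Return (line_number, row) pairs for data rows that look incomplete.

    A row is flagged if it does not have exactly five non-empty fields,
    or if either score column cannot be parsed as a float.
    """
    problems = []
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row (line 1)
    for line_number, row in enumerate(reader, start=2):
        if len(row) != EXPECTED_FIELDS or any(not field.strip() for field in row):
            problems.append((line_number, row))
            continue
        try:
            float(row[3])
            float(row[4])
        except ValueError:
            problems.append((line_number, row))
    return problems


# Hypothetical sample: the header, the truncated row, and one complete row.
sample = (
    "Model,Provider,Type,Baseline score,Obfuscated score\n"
    "GPT-5,\n"
    "Claude Opus 4.1,Anthropic,Closed source,0.592,0.458\n"
)
print(find_malformed_rows(sample))
```

Running the same function on the text of the updated file should return an empty list if every row is complete.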