# 📝 Model Submission Example

This guide shows you exactly how to submit your code review model to the leaderboard.

## 🚀 Step-by-Step Submission Process

### 1. **Access the Submission Form**

- Open the CodeReview Leaderboard in your browser
- Navigate to the **📝 Submit Model** tab
- Click on the "📝 Submit New Model Results" accordion to expand the form

### 2. **Fill in Basic Information**

#### **Model Name** ✨

```
Example: Salesforce/codet5-base
Format: organization/model-name
```

#### **Programming Language** 🔍

```
Select: Python
(or Java, JavaScript, C++, Go, Rust, etc.)
```

#### **Comment Language** 🌍

```
Select: English  
(or Chinese, Spanish, French, German, etc.)
```

#### **Taxonomy Category** 🏷️

```
Select: Bug Detection
(or Security, Performance, Code Style, etc.)
```

### 3. **Performance Scores** (0.0 - 1.0)

#### **BLEU Score**

```
Example: 0.742
Range: 0.0 to 1.0
Description: Measures n-gram overlap between generated and reference reviews
```
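
If you need to produce a corpus-level BLEU score in the 0.0-1.0 range used here, one option is the `sacrebleu` package. The snippet below is a minimal sketch, assuming your generated reviews and reference reviews are parallel lists of strings; the leaderboard does not prescribe a particular BLEU implementation, and the example sentences are made up.

```python
# Minimal sketch: corpus-level BLEU for generated review comments (requires `pip install sacrebleu`).
import sacrebleu

# Hypothetical parallel lists: one generated review and one reference review per example.
generated = ["Consider adding a null check for `conn` before dereferencing it."]
references = ["Add a null check for `conn` before it is dereferenced."]

# sacrebleu expects a list of hypotheses and a list of reference streams.
result = sacrebleu.corpus_bleu(generated, [references])

# sacrebleu reports BLEU on a 0-100 scale; divide by 100 for the form's 0.0-1.0 range.
print(round(result.score / 100, 3))
```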

#### **Pass@1**

```
Example: 0.685
Range: 0.0 to 1.0  
Description: Success rate when the model gets 1 attempt
```

#### **Pass@5**

```
Example: 0.834
Range: 0.0 to 1.0
Description: Success rate when the model gets 5 attempts
```

#### **Pass@10**

```
Example: 0.901
Range: 0.0 to 1.0
Description: Success rate when the model gets 10 attempts
```
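
Pass@k is commonly estimated with the unbiased estimator from the Codex paper: generate `n ≥ k` samples per example, count the `c` samples that pass, and average `1 - C(n-c, k) / C(n, k)` over the dataset. A minimal sketch follows; the per-example counts are made up for illustration.

```python
# Unbiased pass@k estimator: n samples per example, c of which pass.
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Probability that at least one of k samples drawn from n is correct."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical per-example results: (samples generated, samples that passed).
results = [(10, 7), (10, 2), (10, 0), (10, 10)]

for k in (1, 5, 10):
    score = sum(pass_at_k(n, c, k) for n, c in results) / len(results)
    print(f"Pass@{k}: {score:.3f}")
```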

### 4. **Quality Metrics** (0 - 10)

Rate your model across these 10 dimensions:

#### **Readability: 8**

```
How clear and readable are the generated code reviews?
Scale: 0 (unreadable) to 10 (very clear)
```

#### **Relevance: 7**

```  
How relevant are the reviews to the actual code changes?
Scale: 0 (irrelevant) to 10 (highly relevant)
```

#### **Explanation Clarity: 8**

```
How well does the model explain identified issues?
Scale: 0 (unclear) to 10 (very clear explanations)
```

#### **Problem Identification: 7**

```
How effectively does it identify real code problems?
Scale: 0 (misses issues) to 10 (finds all problems)
```

#### **Actionability: 6**

```
How actionable and useful are the suggestions?
Scale: 0 (not actionable) to 10 (very actionable)
```

#### **Completeness: 7**

```
How thorough and complete are the reviews?
Scale: 0 (incomplete) to 10 (comprehensive)
```

#### **Specificity: 6**

```
How specific are the comments and suggestions?
Scale: 0 (too generic) to 10 (very specific)
```

#### **Contextual Adequacy: 7**

```
How well does it understand the code context?
Scale: 0 (ignores context) to 10 (perfect context understanding)
```

#### **Consistency: 6**

```
How consistent is the model across different code reviews?
Scale: 0 (inconsistent) to 10 (very consistent)
```

#### **Brevity: 5**

```
How concise are the reviews without losing important information?
Scale: 0 (too verbose/too brief) to 10 (perfect length)
```
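
The form takes a single integer per dimension. One possible workflow, offered only as a suggestion rather than a leaderboard requirement, is to rate a sample of generated reviews on each dimension and submit the rounded mean:

```python
# Hypothetical per-review ratings (0-10) collected for one dimension, e.g. Readability.
readability_ratings = [8, 7, 9, 8, 7]

# Round the mean to the nearest integer for the corresponding form field.
readability = round(sum(readability_ratings) / len(readability_ratings))
print(readability)  # 8
```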

### 5. **Submit Your Model**

- Click the **🚀 Submit Model** button
- Wait for validation and processing
- Check for success/error message

## 📋 Complete Example Submission

Here's a complete example submission for the codet5-base model:

```yaml
Model Information:
  Model Name: "Salesforce/codet5-base"
  Programming Language: "Python"
  Comment Language: "English"
  Taxonomy Category: "Bug Detection"

Performance Scores:
  BLEU Score: 0.742
  Pass@1: 0.685
  Pass@5: 0.834
  Pass@10: 0.901

Quality Metrics:
  Readability: 8
  Relevance: 7  
  Explanation Clarity: 8
  Problem Identification: 7
  Actionability: 6
  Completeness: 7
  Specificity: 6
  Contextual Adequacy: 7
  Consistency: 6
  Brevity: 5
```

## 🔒 Security & Rate Limiting

### **IP-based Rate Limiting**

- **5 submissions per IP address per 24 hours**
- Submissions are tracked by your IP address
- Rate limit resets every 24 hours
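
For context, a limit like this is typically implemented as a sliding window over recent submission timestamps per IP address. The sketch below only illustrates the behaviour described above; it is not the leaderboard's actual code.

```python
# Illustrative sliding-window rate limiter: at most 5 submissions per IP in any 24-hour window.
import time
from collections import defaultdict

WINDOW_SECONDS = 24 * 60 * 60
MAX_SUBMISSIONS = 5

_history: dict[str, list[float]] = defaultdict(list)

def allow_submission(ip: str) -> bool:
    """Return True and record the attempt if this IP is under its 24-hour quota."""
    now = time.time()
    # Keep only timestamps that still fall inside the window.
    _history[ip] = [t for t in _history[ip] if now - t < WINDOW_SECONDS]
    if len(_history[ip]) >= MAX_SUBMISSIONS:
        return False
    _history[ip].append(now)
    return True
```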

### **Validation Rules**

- Model name must follow `organization/model` format
- All performance scores must be between 0.0 and 1.0
- All quality metrics must be between 0 and 10
- Pass@1 ≤ Pass@5 ≤ Pass@10 (logical consistency; see the sketch below)
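
You can mirror these checks locally before filling in the form. The following is a sketch based only on the rules listed above; in particular, the exact character set accepted in model names is an assumption.

```python
# Local pre-submission checks mirroring the validation rules listed above.
import re

def validate(model_name: str, scores: dict[str, float], metrics: dict[str, int]) -> list[str]:
    errors = []
    # Assumed pattern for "organization/model": letters, digits, '.', '_' and '-' on each side.
    if not re.fullmatch(r"[A-Za-z0-9._-]+/[A-Za-z0-9._-]+", model_name):
        errors.append("Model name must follow organization/model format")
    for name, value in scores.items():
        if not 0.0 <= value <= 1.0:
            errors.append(f"Score {name} out of range: {value} (must be between 0 and 1)")
    for name, value in metrics.items():
        if not 0 <= value <= 10:
            errors.append(f"Metric {name} out of range: {value} (must be between 0 and 10)")
    if not scores["Pass@1"] <= scores["Pass@5"] <= scores["Pass@10"]:
        errors.append("Pass@1 <= Pass@5 <= Pass@10 must hold")
    return errors

print(validate(
    "Salesforce/codet5-base",
    {"BLEU": 0.742, "Pass@1": 0.685, "Pass@5": 0.834, "Pass@10": 0.901},
    {"Readability": 8, "Relevance": 7},
))  # []
```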

## ✅ After Submission

### **Immediate Feedback**

You'll see one of these messages:

#### **Success ✅**

```
✅ Submission recorded successfully!
```

#### **Error Examples ❌**

```
❌ Rate limit exceeded: 5/5 submissions in 24 hours
❌ Model name contains invalid characters
❌ Pass@1 score cannot be higher than Pass@5
❌ Score BLEU out of range: 1.2 (must be between 0 and 1)
```

### **View Your Results**

- Your model will appear in the **🏆 Leaderboard** tab
- Use filters to find your specific submission
- Check the **📈 Analytics** tab for submission history

## 🎯 Tips for Better Submissions

### **Model Naming**

```
✅ Good: "Salesforce/codet5-base"
✅ Good: "facebook/bart-large"
✅ Good: "my-org/custom-model-v2"
❌ Bad: "my model"
❌ Bad: "model@v1.0"
```

### **Performance Scores**

- Be honest and accurate with your evaluations
- Use proper evaluation methodology  
- Ensure Pass@k scores are logically consistent
- Document your evaluation process

### **Quality Metrics**

- Rate based on actual model performance
- Consider multiple test cases
- Be objective in your assessment
- Document your rating criteria

## 🤝 Need Help?

If you encounter issues:

1. Check the error message for specific guidance
2. Verify all fields are filled correctly  
3. Ensure you haven't exceeded rate limits
4. Contact maintainers if problems persist

---

**Ready to submit your model? Head to the 📝 Submit Model tab and follow this guide!** 🚀