# Model Submission Example

This guide shows you exactly how to submit your code review model to the leaderboard.

## Step-by-Step Submission Process

### 1. **Access the Submission Form**

- Open the CodeReview Leaderboard in your browser
- Navigate to the **Submit Model** tab
- Click on the "Submit New Model Results" accordion to expand the form

### 2. **Fill in Basic Information**

#### **Model Name**

```
Example: microsoft/CodeT5-base
Format: organization/model-name
```

#### **Programming Language**

```
Select: Python
(or Java, JavaScript, C++, Go, Rust, etc.)
```

#### **Comment Language**

```
Select: English
(or Chinese, Spanish, French, German, etc.)
```

#### **Taxonomy Category**

```
Select: Bug Detection
(or Security, Performance, Code Style, etc.)
```

### 3. **Performance Scores** (0.0 - 1.0)
#### **BLEU Score**

```
Example: 0.742
Range: 0.0 to 1.0
Description: Measures similarity between generated and reference reviews
```

#### **Pass@1**

```
Example: 0.685
Range: 0.0 to 1.0
Description: Success rate when model gets 1 attempt
```

#### **Pass@5**

```
Example: 0.834
Range: 0.0 to 1.0
Description: Success rate when model gets 5 attempts
```

#### **Pass@10**

```
Example: 0.901
Range: 0.0 to 1.0
Description: Success rate when model gets 10 attempts
```
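Pass@k figures like those above are usually computed with the standard unbiased estimator (from the Codex paper) rather than by literally running k attempts. The leaderboard does not prescribe an evaluation script, so treat this as a sketch of one common approach:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased Pass@k estimate: the probability that at least one of
    k completions drawn from n total samples (c of them correct) passes."""
    if n - c < k:
        return 1.0  # too few failures to fill all k draws: guaranteed hit
    return 1.0 - comb(n - c, k) / comb(n, k)

# With 20 samples and 9 correct, the estimates grow with k, which is
# why the form expects Pass@1 <= Pass@5 <= Pass@10.
scores = [pass_at_k(n=20, c=9, k=k) for k in (1, 5, 10)]
```

Because the estimator is monotonically non-decreasing in k, inconsistent Pass@k values are a sign of an evaluation bug, not a model property.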
### 4. **Quality Metrics** (0 - 10)

Rate your model across these 10 dimensions:

#### **Readability: 8**

```
How clear and readable are the generated code reviews?
Scale: 0 (unreadable) to 10 (very clear)
```

#### **Relevance: 7**

```
How relevant are the reviews to the actual code changes?
Scale: 0 (irrelevant) to 10 (highly relevant)
```

#### **Explanation Clarity: 8**

```
How well does the model explain identified issues?
Scale: 0 (unclear) to 10 (very clear explanations)
```

#### **Problem Identification: 7**

```
How effectively does it identify real code problems?
Scale: 0 (misses issues) to 10 (finds all problems)
```

#### **Actionability: 6**

```
How actionable and useful are the suggestions?
Scale: 0 (not actionable) to 10 (very actionable)
```

#### **Completeness: 7**

```
How thorough and complete are the reviews?
Scale: 0 (incomplete) to 10 (comprehensive)
```

#### **Specificity: 6**

```
How specific are the comments and suggestions?
Scale: 0 (too generic) to 10 (very specific)
```

#### **Contextual Adequacy: 7**

```
How well does it understand the code context?
Scale: 0 (ignores context) to 10 (perfect context understanding)
```

#### **Consistency: 6**

```
How consistent is the model across different code reviews?
Scale: 0 (inconsistent) to 10 (very consistent)
```

#### **Brevity: 5**

```
How concise are the reviews without losing important information?
Scale: 0 (too verbose or too brief) to 10 (perfect length)
```
### 5. **Submit Your Model**

- Click the **Submit Model** button
- Wait for validation and processing
- Check for a success or error message

## Complete Example Submission

Here's a complete example of submitting the CodeT5-base model:

```yaml
Model Information:
  Model Name: "microsoft/CodeT5-base"
  Programming Language: "Python"
  Comment Language: "English"
  Taxonomy Category: "Bug Detection"

Performance Scores:
  BLEU Score: 0.742
  Pass@1: 0.685
  Pass@5: 0.834
  Pass@10: 0.901

Quality Metrics:
  Readability: 8
  Relevance: 7
  Explanation Clarity: 8
  Problem Identification: 7
  Actionability: 6
  Completeness: 7
  Specificity: 6
  Contextual Adequacy: 7
  Consistency: 6
  Brevity: 5
```
## Security & Rate Limiting

### **IP-based Rate Limiting**

- **5 submissions per IP address per 24 hours**
- Submissions are tracked by your IP address
- The rate limit resets every 24 hours
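A per-IP quota like this is typically a sliding window over recent submission timestamps. The leaderboard's server-side code isn't published, so the helper below is only an illustrative sketch with hypothetical names:

```python
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 24 * 3600   # 24-hour window
MAX_SUBMISSIONS = 5          # quota per IP

_history: dict = defaultdict(deque)  # ip -> deque of submission timestamps

def allow_submission(ip: str, now: float = None) -> bool:
    """Record and allow a submission if `ip` is under its 24-hour quota."""
    now = time.time() if now is None else now
    timestamps = _history[ip]
    # Drop entries that have aged out of the 24-hour window.
    while timestamps and now - timestamps[0] >= WINDOW_SECONDS:
        timestamps.popleft()
    if len(timestamps) >= MAX_SUBMISSIONS:
        return False
    timestamps.append(now)
    return True
```

With a sliding window, the sixth attempt is rejected until the oldest of your five submissions is more than 24 hours old, rather than at a fixed daily reset time.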
### **Validation Rules**

- Model name must follow the `organization/model` format
- All performance scores must be between 0.0 and 1.0
- All quality metrics must be between 0 and 10
- Pass@1 ≤ Pass@5 ≤ Pass@10 (logical consistency)
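You can check these rules locally before submitting. The field names and the exact name pattern below are assumptions for illustration, not the form's actual internals:

```python
import re

# Assumed pattern: letters, digits, dots, underscores, hyphens around one "/".
MODEL_NAME_RE = re.compile(r"^[A-Za-z0-9._-]+/[A-Za-z0-9._-]+$")

def validate_submission(sub: dict) -> list:
    """Return a list of validation errors; an empty list means the entry is valid."""
    errors = []
    if not MODEL_NAME_RE.match(sub["model_name"]):
        errors.append("Model name must follow organization/model format")
    for field in ("bleu", "pass@1", "pass@5", "pass@10"):
        if not 0.0 <= sub[field] <= 1.0:
            errors.append(f"Score {field} out of range: {sub[field]} (must be between 0 and 1)")
    for metric, value in sub.get("quality", {}).items():
        if not 0 <= value <= 10:
            errors.append(f"Quality metric {metric} out of range: {value}")
    if not (sub["pass@1"] <= sub["pass@5"] <= sub["pass@10"]):
        errors.append("Pass@1 score cannot be higher than Pass@5 / Pass@10")
    return errors
```

Running this on the CodeT5-base example from this guide returns an empty list; a name with a space or a BLEU score above 1.0 each adds one error.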
## After Submission

### **Immediate Feedback**

You'll see one of these messages:

#### **Success ✅**

```
✅ Submission recorded successfully!
```

#### **Error Examples ❌**

```
❌ Rate limit exceeded: 5/5 submissions in 24 hours
❌ Model name contains invalid characters
❌ Pass@1 score cannot be higher than Pass@5
❌ Score BLEU out of range: 1.2 (must be between 0 and 1)
```
### **View Your Results**

- Your model will appear in the **Leaderboard** tab
- Use filters to find your specific submission
- Check the **Analytics** tab for submission history

## Tips for Better Submissions

### **Model Naming**

```
✅ Good: "microsoft/CodeT5-base"
✅ Good: "facebook/bart-large"
✅ Good: "my-org/custom-model-v2"
❌ Bad: "my model"
❌ Bad: "model@v1.0"
```
### **Performance Scores**

- Be honest and accurate with your evaluations
- Use proper evaluation methodology
- Ensure Pass@k scores are logically consistent
- Document your evaluation process

### **Quality Metrics**

- Rate based on actual model performance
- Consider multiple test cases
- Be objective in your assessment
- Document your rating criteria

## Need Help?

If you encounter issues:

1. Check the error message for specific guidance
2. Verify all fields are filled correctly
3. Ensure you haven't exceeded rate limits
4. Contact the maintainers if problems persist

---

**Ready to submit your model? Head to the Submit Model tab and follow this guide!**