Mahynlo committed on
Commit
f611149
·
1 Parent(s): 949d073

Upgrade to Gemini 2.5 Flash (most advanced model) with 20s delays for 10 RPM limit

Files changed (3)
  1. RATE_LIMIT_FIX.md +41 -24
  2. app.py +5 -4
  3. tools/tools.py +1 -1
RATE_LIMIT_FIX.md CHANGED
@@ -1,57 +1,74 @@
1
  # Rate Limit Solution
2
 
3
  ## Problem
4
- Gemini 2.0 Flash Experimental has a **10 requests/minute** limit on the free tier, which is exhausted quickly by the agent (each question makes 10-15 API calls).
 
 
 
 
 
 
5
 
6
  ## Solution Applied
7
- Added **10-second delays** between questions in `app.py` to respect rate limits. This means:
8
- - 20 questions × 10 seconds = ~3-4 minutes extra runtime
9
  - Prevents rate limit errors
10
  - Ensures all questions are processed
11
 
12
- ## Alternative: Higher Quota Models
 
 
 
 
 
 
 
 
 
13
 
14
- ### Option 1: Gemini 1.5 Flash (Recommended)
15
  ```python
16
- MODEL_ID = "gemini/gemini-1.5-flash-latest"
17
  ```
18
- - **60 requests/minute** (6x higher)
19
- - Stable production model
20
- - Better for GAIA benchmark
21
 
22
- ### Option 2: Gemini 1.5 Pro
23
  ```python
24
- MODEL_ID = "gemini/gemini-1.5-pro-latest"
25
  ```
26
- - **2 requests/minute** (too slow)
27
- - More capable but lower quota
 
28
 
29
- ### Option 3: Paid API Key
30
  If you have a paid Gemini API key:
31
- - gemini-2.0-flash-exp: 1000 requests/minute
32
- - gemini-1.5-flash: 2000 requests/minute
 
33
 
34
  ## How to Switch Models
35
 
36
  1. Edit `app.py` line 15:
37
  ```python
38
- MODEL_ID = "gemini/gemini-1.5-flash-latest" # Change this
39
  ```
40
 
41
  2. Edit `tools/tools.py` line 19:
42
  ```python
43
- MODEL_ID = "gemini/gemini-1.5-flash-latest" # Change this
44
  ```
45
 
46
  3. Commit and push:
47
  ```bash
48
  git add app.py tools/tools.py
49
- git commit -m "Switch to Gemini 1.5 Flash for higher rate limits"
50
  git push
51
  ```
52
 
53
- ## Current Configuration
54
- - Model: `gemini-2.0-flash-exp`
55
- - Rate Limit: 10 requests/minute
56
- - Delay: 10 seconds between questions
57
- - Expected runtime: ~15-20 minutes for 20 questions
 
1
  # Rate Limit Solution
2
 
3
  ## Problem
4
+ Gemini models on the free tier are limited to a fixed number of requests per minute (RPM), a quota the agent exhausts quickly because each question makes 10-15 API calls.
5
+
6
+ ## Current Configuration
7
+ **Using Gemini 2.5 Flash** - The most advanced model available
8
+ - **10 requests/minute** (Free Tier)
9
+ - Most capable reasoning and agentic capabilities
10
+ - Best for GAIA benchmark tasks
11
 
12
  ## Solution Applied
13
+ Added **20-second delays** between questions in `app.py` to respect rate limits. This means:
14
+ - 20 questions × 20 seconds = ~6-7 minutes extra runtime
15
  - Prevents rate limit errors
16
  - Ensures all questions are processed
17
 
18
+ ## Alternative Models Available
19
+
20
+ ### Option 1: Gemini 2.5 Flash (CURRENT - RECOMMENDED)
21
+ ```python
22
+ MODEL_ID = "gemini/gemini-2.5-flash"
23
+ ```
24
+ - **10 requests/minute** (Free Tier)
25
+ - **Most advanced model** - Best reasoning capabilities
26
+ - Optimized for agentic use cases
27
+ - Best for GAIA benchmark
28
 
29
+ ### Option 2: Gemini 2.0 Flash
30
  ```python
31
+ MODEL_ID = "gemini/gemini-2.0-flash"
32
  ```
33
+ - **15 requests/minute** (Free Tier)
34
+ - 50% higher rate limit than 2.5 Flash
35
+ - Slightly less capable but faster
36
 
37
+ ### Option 3: Gemini 1.5 Flash
38
  ```python
39
+ MODEL_ID = "gemini/gemini-1.5-flash"
40
  ```
41
+ - **15 requests/minute** (Free Tier)
42
+ - Older model but proven
43
+ - Similar speed to 2.0 Flash
44
 
45
+ ### Option 4: Paid API Key (Tier 1+)
46
  If you have a paid Gemini API key:
47
+ - **Gemini 2.5 Flash**: 250 requests/minute
48
+ - **Gemini 2.0 Flash**: 200 requests/minute
49
+ - **Gemini 1.5 Flash**: 50 requests/minute
50
 
51
  ## How to Switch Models
52
 
53
  1. Edit `app.py` line 15:
54
  ```python
55
+ MODEL_ID = "gemini/gemini-2.5-flash" # Change this
56
  ```
57
 
58
  2. Edit `tools/tools.py` line 19:
59
  ```python
60
+ MODEL_ID = "gemini/gemini-2.5-flash" # Change this
61
  ```
62
 
63
  3. Commit and push:
64
  ```bash
65
  git add app.py tools/tools.py
66
+ git commit -m "Update Gemini model"
67
  git push
68
  ```
69
 
70
+ ## Final Configuration
71
+ - **Model**: `gemini/gemini-2.5-flash` (Most advanced)
72
+ - **Rate Limit**: 10 requests/minute (Free Tier)
73
+ - **Delay**: 20 seconds between questions
74
+ - **Expected runtime**: ~8-10 minutes for 20 questions
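The runtime figures in the README can be sanity-checked with a little arithmetic. The sketch below takes its numbers from the README (20 questions, 10-15 API calls per question, 10 requests/minute, 20-second delay) and assumes that every agent call counts against the same per-minute quota. The function names are hypothetical and are not part of the repository.

```python
def delay_overhead_seconds(num_questions: int, delay_s: float) -> float:
    """Total sleep time when the delay is skipped after the last question."""
    return max(num_questions - 1, 0) * delay_s


def min_runtime_minutes(num_questions: int, calls_per_question: int, rpm: int) -> float:
    """Lower bound on wall-clock minutes if every call counts against the RPM quota."""
    return num_questions * calls_per_question / rpm


if __name__ == "__main__":
    # Sleep overhead for 20 questions with a 20 s delay, in minutes (19 gaps).
    print(delay_overhead_seconds(20, 20) / 60)
    # Quota-bound minimum runtime at 10 and at 15 calls per question.
    print(min_runtime_minutes(20, 10, 10))
    print(min_runtime_minutes(20, 15, 10))
```

Under these assumptions the sleep overhead is 380 seconds, and 200-300 calls at 10 requests/minute need at least 20-30 minutes. A run of roughly 8-10 minutes would therefore suggest either fewer calls per question than estimated or a quota that is counted differently from what is assumed here.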
app.py CHANGED
@@ -12,7 +12,7 @@ from model import get_model
12
  # (Keep Constants as is)
13
  # --- Constants ---
14
  DEFAULT_API_URL = "https://agents-course-unit4-scoring.hf.space"
15
- MODEL_ID = "gemini/gemini-1.5-flash-latest"
16
 
17
  # --- Async Question Processing ---
18
  async def process_question(agent, question: str, task_id: str, file_name: str = None) -> Dict:
@@ -45,11 +45,12 @@ async def run_questions_async(agent, questions_data: List[Dict]) -> tuple:
45
  submissions.append(result["submission"])
46
  logs.append(result["log"])
47
 
48
- # Add 10 second delay between questions to respect Gemini free tier rate limits (10 requests/minute)
 
49
  # Skip delay after last question
50
  if idx < len(questions_data) - 1:
51
- print(f"⏳ Waiting 10 seconds before next question to respect API rate limits...")
52
- await asyncio.sleep(10)
53
 
54
  return submissions, logs
55
 
 
12
  # (Keep Constants as is)
13
  # --- Constants ---
14
  DEFAULT_API_URL = "https://agents-course-unit4-scoring.hf.space"
15
+ MODEL_ID = "gemini/gemini-2.5-flash"
16
 
17
  # --- Async Question Processing ---
18
  async def process_question(agent, question: str, task_id: str, file_name: str = None) -> Dict:
 
45
  submissions.append(result["submission"])
46
  logs.append(result["log"])
47
 
48
+ # Add 20 second delay between questions to respect Gemini 2.5 Flash free tier (10 requests/minute)
49
+ # Each question makes ~10-15 API calls, so we need longer delays
50
  # Skip delay after last question
51
  if idx < len(questions_data) - 1:
52
+ print(f"⏳ Waiting 20 seconds before next question to respect API rate limits (10 req/min)...")
53
+ await asyncio.sleep(20)
54
 
55
  return submissions, logs
56
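The fixed `asyncio.sleep(20)` above spaces out questions, not individual API calls. One possible alternative is to throttle each call with a sliding-window limiter. The following is a minimal sketch, assuming the agent's model calls can be wrapped so that each one awaits the limiter first. The diff does not show how the calls are issued, so that wiring is omitted, and `SlidingWindowLimiter` is a hypothetical helper rather than part of this repository.

```python
import asyncio
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` acquisitions in any window of `period` seconds."""

    def __init__(self, max_calls: int, period: float = 60.0) -> None:
        self.max_calls = max_calls
        self.period = period
        self._stamps: deque = deque()  # monotonic timestamps of recent calls
        self._lock = asyncio.Lock()

    async def acquire(self) -> None:
        async with self._lock:
            while True:
                now = time.monotonic()
                # Drop timestamps that have left the window.
                while self._stamps and now - self._stamps[0] >= self.period:
                    self._stamps.popleft()
                if len(self._stamps) < self.max_calls:
                    self._stamps.append(now)
                    return
                # Window is full: wait until the oldest call leaves it.
                await asyncio.sleep(self.period - (now - self._stamps[0]))


async def demo() -> None:
    # Small numbers so the demo finishes quickly; 10 calls per 60 s would
    # correspond to the free-tier limit quoted in the README.
    limiter = SlidingWindowLimiter(max_calls=3, period=0.5)
    start = time.monotonic()
    for i in range(5):
        await limiter.acquire()
        print(f"call {i} at {time.monotonic() - start:.2f}s")


if __name__ == "__main__":
    asyncio.run(demo())
```

The limiter is created inside the coroutine so that its lock belongs to the running event loop. Waiters hold the lock while sleeping, which serializes them and keeps the call order predictable.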
 
tools/tools.py CHANGED
@@ -16,7 +16,7 @@ from pytesseract import image_to_string
16
 
17
  load_dotenv()
18
 
19
- MODEL_ID = "gemini/gemini-1.5-flash-latest"
20
 
21
  # Vision Tool
22
  @tool
 
16
 
17
  load_dotenv()
18
 
19
+ MODEL_ID = "gemini/gemini-2.5-flash"
20
 
21
  # Vision Tool
22
  @tool
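Because `MODEL_ID` is defined separately in `app.py` and `tools/tools.py`, every model switch has to touch both files. One way to keep them in sync is to read the value from an environment variable with a fallback default. The sketch below is an assumption, not what this commit does: the variable name `GEMINI_MODEL_ID` and the helper `get_model_id` are hypothetical.

```python
import os

# Fallback matches the model ID set in this commit.
DEFAULT_MODEL_ID = "gemini/gemini-2.5-flash"


def get_model_id(env=None) -> str:
    """Return the model ID from GEMINI_MODEL_ID (hypothetical name), else the default."""
    env = os.environ if env is None else env
    value = env.get("GEMINI_MODEL_ID", "").strip()
    return value or DEFAULT_MODEL_ID


MODEL_ID = get_model_id()

if __name__ == "__main__":
    print(MODEL_ID)
```

Since `tools/tools.py` already calls `load_dotenv()`, a value placed in a `.env` file would be picked up as long as `get_model_id()` runs after that call.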