A newer version of the Gradio SDK is available:
6.0.2
Gemini Integration: GDELT Query Builder & Summarization
This project uses Google’s Gemini for:
- Building 10 language-preserving GDELT query variations.
- Analyzing outlet bias and returning categories (must include exactly “unbiased”).
- Producing a multi‑perspective factual summary grouped by bias categories.
Gemini runs in the cloud and is not cached. The local embedding model is cached and shared across requests.
Setup
- Get an API key from https://ai.google.dev/
- .env:
- GEMINI_API_KEY=your_key
- GEMINI_MODEL=gemini-1.5-pro (or gemini-2.5-pro / flash variants)
Query Builder (gdelt_query_builder.py)
- Generates EXACTLY 10 variations separated by |||.
- Preserves user language.
- Uses AND-only operators between terms.
- Adds sourcecountry/sourceregion and datetimes when implied.
- Sensitive-query guard:
- The system prompt instructs Gemini to return the literal token INAPPROPRIATE_QUERY_DETECTED for sensitive topics (e.g., pornography, explicit adult content, certain religious questions flagged by policy).
- The backend detects this and immediately returns a summary “I cannot respond to this query.” (status=blocked).
Example request body to backend:
{"query": "news about war in Ukraine"}
Bias Analysis
- Gemini returns bias categories and counts; one category is normalized to exactly “unbiased”.
- Reasoning text explains the categorization logic; this is appended after sources in the final summary.
Summarization
- Backend sends top URLs from all categories (including unbiased) to Gemini, labeled by category.
- Gemini instruction highlights:
- Produce a concise factual answer first.
- Then list SOURCES BY CATEGORY with up to 5 URLs per category.
- Numbering restarts at 1 per category (1–5 for each).
- After sources, append “REASONING:” with the bias-analysis reasoning string.
- The backend returns only the final formatted summary string; UI renders it verbatim.
Notes
- No Gemini model caching (cloud API).
- Local embedding model (SentenceTransformers) is cached once and reused.
- Optional whitelist filtering toggled via USE_WHITELIST_ONLY in .env.