inference-net/Schematron-3B


opensporks committed on Sep 5

Commit cc65656 · verified · 1 Parent(s): 7629c87

Update README.md

Files changed (1)

README.md +40 -0


@@ -33,6 +33,46 @@ We're releasing these models in two different sizes:
 - **Input**: Cleaned or raw HTML and a JSON Schema
 - **Output**: Strict JSON that conforms to the provided schema
+## Benchmarks
+### HTML-to-JSON Extraction Quality
+We evaluated extraction quality using Gemini 2.5 Pro as a judge, scoring each extraction on a scale of 1 to 5, where 5 represents a perfect extraction.
+| Model | LLM-as-Judge Score |
+|-------|-------------------|
+| GPT-4.1 | 4.74 |
+| **Schematron-8B** | **4.64** |
+| **Schematron-3B** | **4.41** |
+| Gemini-3B-Base | 2.24 |
+### Web-Augmented Factuality on SimpleQA
+We evaluated Schematron's real-world impact on LLM factuality using SimpleQA.
+**Test Pipeline:**
+1. **Query Generation**: Primary LLM (GPT-5 Nano or GPT-4.1) generates search queries and defines extraction schema
+2. **Web Search**: Search provider (SERP or Exa) retrieves relevant pages
+3. **Structured Extraction**: Schematron extracts JSON data from retrieved pages using the schema
+4. **Answer Synthesis**: Primary LLM produces final answer from structured data
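The four pipeline stages above can be sketched as a small composition of callables. This is only an illustrative outline, not the authors' evaluation harness: the `Pipeline` class and the `fake_*` functions are hypothetical stand-ins for the primary LLM, the search provider, and Schematron, and the HTML and schema are made up for the demo.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Pipeline:
    # Every field is a hypothetical stand-in for a real service call.
    plan: Callable[[str], tuple[list[str], dict]]  # question -> (queries, schema)
    search: Callable[[str], list[str]]             # query -> HTML pages
    extract: Callable[[str, dict], dict]           # (html, schema) -> JSON record
    synthesize: Callable[[str, list[dict]], str]   # (question, records) -> answer

    def answer(self, question: str) -> str:
        # 1. Query generation: the primary LLM picks queries and a schema.
        queries, schema = self.plan(question)
        # 2. Web search: collect pages for every query.
        pages = [html for q in queries for html in self.search(q)]
        # 3. Structured extraction: one JSON record per page.
        records = [self.extract(html, schema) for html in pages]
        # 4. Answer synthesis: the primary LLM answers from the records only.
        return self.synthesize(question, records)


# Toy stubs so the sketch runs without any network or model access.
def fake_plan(question):
    schema = {"type": "object", "properties": {"capital": {"type": "string"}}}
    return [question], schema


def fake_search(query):
    return ["<html><body><p>Capital: Paris</p></body></html>"]


def fake_extract(html, schema):
    # A real system would call the extraction model here.
    return {"capital": html.split("Capital: ")[1].split("<")[0]}


def fake_synthesize(question, records):
    return records[0]["capital"] if records else "unknown"


pipeline = Pipeline(fake_plan, fake_search, fake_extract, fake_synthesize)
result = pipeline.answer("What is the capital of France?")
print(result)
```

Keeping each stage behind a plain callable makes it straightforward to swap the search provider or the extraction model, which is the comparison the results table makes.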
+| Base Model | Configuration | SimpleQA Accuracy |
+|:-----------|:--------------|------------------:|
+| GPT-5 Nano | Solo | 8.54% |
+| GPT-5 Nano | + SERP + Schematron-8B | 64.15% |
+| GPT-5 Nano | + Exa + **Schematron-3B** | **75.47%** |
+| GPT-5 Nano | + Exa + Gemini 2.5 Flash | 80.61% |
+| GPT-5 Nano | + Exa + **Schematron-8B** | **82.87%** |
+| GPT-4.1 | Solo | 41.60% |
+| GPT-4.1 | + Exa + **Schematron-8B** | **85.58%** |
+**Key findings:**
+- Web search paired with JSON extraction improves factuality: adding Exa web retrieval and Schematron-8B raises GPT-5 Nano's accuracy from 8.54% to 82.87%, a roughly 9.7x improvement
+- Search provider matters: with GPT-5 Nano and Schematron-8B, Exa (82.87%) outperforms SERP (64.15%) by 18.72 percentage points on factual retrieval, while also being more cost-effective
+- Structured extraction beats raw HTML: processing raw HTML would require 100k+ tokens for 10 searches; Schematron's JSON extraction reduces this by orders of magnitude
+- Small specialized models win: Schematron-8B (82.87%) outperforms the much larger Gemini 2.5 Flash (80.61%) on this task, suggesting that fine-tuning for a well-defined task can beat general-purpose models
+- Performance scales with model quality: when paired with GPT-4.1, Schematron-8B reaches 85.58% accuracy (versus 82.87% with GPT-5 Nano), showing the approach benefits from stronger base models
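The ratios and gaps behind these findings can be rechecked directly from the SimpleQA table. The short script below only redoes that arithmetic; the dictionary keys are labels invented here for readability, and the values are copied from the table.

```python
# SimpleQA accuracy (%) copied from the table; key names are illustrative.
accuracy = {
    "nano_solo": 8.54,
    "nano_serp_8b": 64.15,
    "nano_exa_3b": 75.47,
    "nano_exa_flash": 80.61,
    "nano_exa_8b": 82.87,
    "gpt41_solo": 41.60,
    "gpt41_exa_8b": 85.58,
}

# Relative improvement of GPT-5 Nano with Exa + Schematron-8B over solo.
improvement_ratio = accuracy["nano_exa_8b"] / accuracy["nano_solo"]

# Gaps in percentage points.
exa_vs_serp = accuracy["nano_exa_8b"] - accuracy["nano_serp_8b"]
schematron_vs_flash = accuracy["nano_exa_8b"] - accuracy["nano_exa_flash"]
gpt41_gain = accuracy["gpt41_exa_8b"] - accuracy["gpt41_solo"]

print(f"improvement ratio:      {improvement_ratio:.2f}x")
print(f"Exa vs SERP:            {exa_vs_serp:.2f} points")
print(f"Schematron-8B vs Flash: {schematron_vs_flash:.2f} points")
print(f"GPT-4.1 gain over solo: {gpt41_gain:.2f} points")
```

The ratio comes out at about 9.7, and the Schematron-8B lead over Gemini 2.5 Flash is a little over two points, so that comparison is the narrowest of the reported gaps.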
 ## Minimal Quickstart
 Use these local snippets to prepare HTML and compose a schema‑guided prompt. The model returns strictly valid JSON; validate it against your schema downstream.
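As a rough illustration of the downstream validation step mentioned above, the sketch below parses a model response and runs a few basic checks using only the standard library. It is intentionally minimal and is an assumption of this example, not part of the README: it covers only a top-level object, `required` keys, and primitive property types, so a full JSON Schema validator library would normally take its place. The function name `check_against_schema` and the sample schema are hypothetical.

```python
import json

# Mapping from JSON Schema primitive type names to Python types.
_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "array": list,
    "object": dict,
}


def check_against_schema(raw: str, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the basic checks passed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]

    problems = [
        f"missing required key: {key}"
        for key in schema.get("required", [])
        if key not in data
    ]
    for key, spec in schema.get("properties", {}).items():
        type_name = spec.get("type")
        expected = _TYPES.get(type_name)
        if key not in data or expected is None:
            continue
        value = data[key]
        # bool is a subclass of int in Python, so reject it for numeric types.
        is_bool_as_number = isinstance(value, bool) and type_name in ("number", "integer")
        if not isinstance(value, expected) or is_bool_as_number:
            problems.append(f"{key}: expected {type_name}")
    return problems


# Hypothetical schema and responses for demonstration.
schema = {
    "type": "object",
    "required": ["name", "price"],
    "properties": {"name": {"type": "string"}, "price": {"type": "number"}},
}
good = check_against_schema('{"name": "Widget", "price": 9.5}', schema)
bad_type = check_against_schema('{"name": "Widget", "price": "9.5"}', schema)
missing = check_against_schema('{"name": "Widget"}', schema)
print(good, bad_type, missing)
```

Returning a list of problems rather than raising lets a caller log every issue in a response at once before deciding whether to retry the extraction.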