devjas1 committed on
Commit
cb26688
·
1 Parent(s): 8f43341

Add comprehensive tutorial on building a Text-to-SQL AI agent with `smolagents`

Files changed (2)
  1. README.md +174 -112
  2. TUTORIAL.md +254 -0
README.md CHANGED
@@ -1,40 +1,91 @@
1
- # Unlocking Database Intelligence with AI Agents: A `smolagents` Tutorial
 
 
 
 
 
2
 
3
- [Open In Colab](https://colab.research.google.com/github/huggingface/smolagents/blob/main/notebooks/text_to_sql.ipynb)
4
- [Open In Studio Lab](https://studiolab.sagemaker.aws/import/github/huggingface/smolagents/blob/main/notebooks/text_to_sql.ipynb)
5
 
6
- This guide explores how to develop an intelligent agent using the `smolagents` framework, specifically enabling it to interact with a SQL database.
7
 
8
- ---
9
 
10
- ## Beyond Simple Text-to-SQL: The Agent Advantage
11
 
12
- Why opt for an advanced agent system instead of a straightforward text-to-SQL pipeline?
13
 
14
- Traditional text-to-SQL solutions are often quite rigid. A direct translation from natural language to a database query can easily lead to syntactical errors, causing the database to reject the query. More insidiously, a query might execute without error but produce entirely incorrect or irrelevant results, providing no indication of its inaccuracy. This "silent failure" can be detrimental for critical applications.
 
15
 
16
- 👉 An agent-based system, conversely, possesses the crucial capability to **critically evaluate outputs and execution logs**. It can identify when a query has failed or yielded unexpected results, and then iteratively refine its strategy or reformulate the query. This inherent capacity for self-correction significantly boosts performance and reliability.
17
 
18
- Let's dive into building such an agent! 💪
 
 
19
 
20
- First, ensure all necessary libraries are installed by running the command below:
 
 
 
 
 
 
 
 
 
 
 
21
 
22
  ```bash
23
- !pip install smolagents python-dotenv sqlalchemy --upgrade -q
 
 
24
  ```
25
 
26
- To enable interaction with Large Language Models (LLMs) via inference providers, you'll need an authentication token, such as an `HF_TOKEN` from Hugging Face. We'll use `python-dotenv` to load this from your environment variables.
27
 
28
- ```python
29
- from dotenv import load_dotenv
30
- load_dotenv()
 
 
 
 
 
 
 
31
  ```
32
 
33
- ### Step 1: Database Initialization
 
 
34
 
35
- We begin by setting up our in-memory SQLite database using `SQLAlchemy`. This involves defining our table structures and populating them with initial data.
 
 
36
 
37
  ```python
 
 
 
 
38
  from sqlalchemy import (
39
  create_engine,
40
  MetaData,
@@ -45,77 +96,72 @@ from sqlalchemy import (
45
  Float,
46
  insert,
47
  inspect,
48
- text, # Essential for executing raw SQL expressions
49
  )
 
 
 
 
50
 
51
- # Establish an in-memory SQLite database connection
 
 
 
 
 
52
  engine = create_engine("sqlite:///:memory:")
53
  metadata_obj = MetaData()
54
 
55
- # Utility function for bulk data insertion
56
  def insert_rows_into_table(rows, table, engine=engine):
57
  for row in rows:
58
  stmt = insert(table).values(**row)
59
  with engine.begin() as connection:
60
  connection.execute(stmt)
61
 
62
- # Define the 'receipts' table schema
63
- table_name = "receipts"
64
  receipts = Table(
65
- table_name,
66
  metadata_obj,
67
- Column("receipt_id", Integer, primary_key=True), # Unique identifier for each transaction
68
- Column("customer_name", String(255)), # Full name of the patron
69
- Column("price", Float), # Total cost of the receipt
70
- Column("tip", Float), # Gratuity amount
71
  )
72
- # Create the defined table within our database
73
  metadata_obj.create_all(engine)
74
 
75
- # Sample transaction data
76
  rows = [
77
  {"receipt_id": 1, "customer_name": "Alan Payne", "price": 12.06, "tip": 1.20},
78
  {"receipt_id": 2, "customer_name": "Alex Mason", "price": 23.86, "tip": 0.24},
79
  {"receipt_id": 3, "customer_name": "Woodrow Wilson", "price": 53.43, "tip": 5.43},
80
  {"receipt_id": 4, "customer_name": "Margaret James", "price": 21.11, "tip": 1.00},
81
  ]
82
- # Populate the 'receipts' table
83
  insert_rows_into_table(rows, receipts)
84
- ```
85
-
86
- ### Step 2: Crafting the Agent's Database Tool
87
 
88
- For an AI agent to interact with a database, it requires specialized **tools**. Our `sql_engine` function will serve as this tool, allowing the agent to execute SQL queries.
89
-
90
- The tool's docstring plays a critical role, as its content (the `description` attribute) is presented to the LLM by the agent system. This description guides the LLM on _how_ and _when_ to utilize the tool, including details about available tables and their column structures.
91
-
92
- First, let's extract the schema details for our `receipts` table:
93
-
94
- ```python
95
  inspector = inspect(engine)
96
  columns_info = [(col["name"], col["type"]) for col in inspector.get_columns("receipts")]
97
-
98
  table_description = "Columns:\n" + "\n".join([f" - {name}: {col_type}" for name, col_type in columns_info])
99
- print(table_description)
100
- ```
 
 
101
 
102
- ```
103
  Columns:
104
- - receipt_id: INTEGER
105
- - customer_name: VARCHAR(255)
106
- - price: FLOAT
107
- - tip: FLOAT
108
- ```
109
 
110
- Now, we'll construct our `sql_engine` tool. Key elements include:
 
 
 
111
 
112
- - The `@tool` decorator from `smolagents` to designate it as an agent capability.
113
- - A comprehensive docstring, complete with an `Args:` section, to inform the LLM about the tool's purpose and expected inputs.
114
- - Type hints for both input and output parameters, enhancing clarity and guiding the LLM's code generation.
115
 
116
- ```python
117
- from smolagents import tool
118
 
 
 
 
 
119
  @tool
120
  def sql_engine(query: str) -> str:
121
  """
@@ -136,105 +182,100 @@ def sql_engine(query: str) -> str:
136
  """
137
  output = ""
138
  with engine.connect() as con:
139
- # Utilize text() to safely execute raw SQL within SQLAlchemy
140
  rows = con.execute(text(query))
141
  for row in rows:
142
- output += "\n" + str(row) # Converts each row of results into a string representation
143
  return output
144
- ```
145
-
146
- ### Step 3: Assembling the AI Agent
147
 
148
- With our database and tool ready, we now instantiate the `CodeAgent`. This is `smolagents’` flagship agent class, designed to generate and execute code, and to iteratively refine its actions based on the ReAct (Reasoning + Acting) framework.
149
 
150
- The `model` parameter links our agent to a Large Language Model. `InferenceClientModel` facilitates access to LLMs via Hugging Face's Inference API, supporting both Serverless and Dedicated endpoints. Alternatively, you could integrate other proprietary LLM APIs.
151
 
152
  ```python
153
- from smolagents import CodeAgent, InferenceClientModel
154
-
155
  agent = CodeAgent(
156
- tools=[sql_engine], # Provide the 'sql_engine' tool to our agent
157
- model=InferenceClientModel(model_id="meta-llama/Llama-3.1-8B-Instruct"), # Selecting our LLM
158
  )
159
  ```
160
 
161
- ### Step 4: Posing a Query to the Agent
162
 
163
- Our agent is now configured. Let's challenge it with a natural language question. The agent will then leverage its LLM and `sql_engine` tool to find the answer.
164
 
165
  ```python
 
166
  agent.run("Can you give me the name of the client who got the most expensive receipt?")
167
  ```
168
 
169
- **Understanding the Agent's Iterative Solution Process:**
 
170
 
171
- The `CodeAgent` employs a self-correcting, cyclical approach:
172
 
173
- 1. **Intent Comprehension:** The LLM interprets the request, identifying the need to find the "most expensive receipt."
174
- 2. **Tool Selection:** It recognizes that the `sql_engine` tool is necessary for database interaction.
175
- 3. **Initial Code Generation:** The agent generates its first attempt at a SQL query (e.g., `SELECT MAX(price) FROM receipts`) to get the maximum price. It then tries to use this result in a follow-up query.
176
- 4. **Execution and Feedback:** The `sql_engine` executes the query. However, the output is a string like `\n(53.43,)`. If the agent naively tries to embed this string directly into another SQL query (e.g., `WHERE price = (53.43,)`), it will encounter a `syntax error`.
177
- 5. **Adaptive Self-Correction:** Upon receiving an `OperationalError` (e.g., "syntax error" or "could not convert string to float"), the LLM analyzes the error. It understands that the string-formatted output needs to be correctly parsed into a numeric type before being used in subsequent SQL or Python logic. Previous attempts might fail due to unexpected characters (like newlines) or incorrect string manipulation.
178
- 6. **Refined Strategy:** Learning from its previous attempts, the agent eventually generates a more efficient, consolidated SQL query: `SELECT MAX(price), customer_name FROM receipts ORDER BY price DESC LIMIT 1`. This effectively retrieves both the highest price and the corresponding customer name in a single database call.
179
- 7. **Result Parsing and Finalization:** Finally, the LLM generates Python code to accurately parse the `\n(53.43, 'Woodrow Wilson')` string output from the `sql_engine`, extracting the customer name. It then provides the `final_answer`.
180
-
181
- This continuous cycle of **reasoning, acting via tools, observing outcomes (including errors), and self-correction** is fundamental to the robustness and adaptability of agent-based systems.
182
-
183
- ---
184
-
185
- ### Level 2: Inter-Table Queries (Table Joins)
186
-
187
- Let's elevate the complexity! Our goal now is to enable the agent to handle questions that require combining data from multiple tables using SQL joins.
188
-
189
- To achieve this, we'll define a second table, `waiters`, which records the names of waiters associated with each `receipt_id`.
190
 
191
  ```python
192
- # Define the 'waiters' table schema
193
- table_name = "waiters"
194
  waiters = Table(
195
- table_name,
196
  metadata_obj,
197
- Column("receipt_id", Integer, primary_key=True), # Links to 'receipts' table
198
- Column("waiter_name", String(16), primary_key=True), # Name of the assigned waiter
199
  )
200
- # Create the 'waiters' table in the database
201
  metadata_obj.create_all(engine)
202
 
203
- # Sample data for the 'waiters' table
204
  rows = [
205
  {"receipt_id": 1, "waiter_name": "Corey Johnson"},
206
  {"receipt_id": 2, "waiter_name": "Michael Watts"},
207
  {"receipt_id": 3, "waiter_name": "Michael Watts"},
208
  {"receipt_id": 4, "waiter_name": "Margaret James"},
209
  ]
210
- # Populate the 'waiters' table
211
  insert_rows_into_table(rows, waiters)
212
- ```
213
-
214
- With the introduction of a new table, it's crucial to **update the `sql_engine` tool's description**. This ensures the LLM is aware of the `waiters` table and its schema, allowing it to construct queries that span both tables.
215
 
216
- ```python
217
  updated_description = """This tool allows performing SQL queries on the database, returning results as a string.
218
  It can access the following tables:"""
219
 
220
  inspector = inspect(engine)
221
  for table in ["receipts", "waiters"]:
222
  columns_info = [(col["name"], col["type"]) for col in inspector.get_columns(table)]
223
-
224
  table_description = f"Table '{table}':\n"
225
-
226
  table_description += " Columns:\n" + "\n".join([f" - {name}: {col_type}" for name, col_type in columns_info])
227
  updated_description += "\n\n" + table_description
228
 
229
  print(updated_description)
 
230
  ```
231
 
232
- For more intricate requests like this, switching to a more powerful LLM can significantly enhance the agent's reasoning capabilities. Here, we'll upgrade to `Qwen/Qwen2.5-Coder-32B-Instruct`.
233
 
234
- ```python
235
- # Assign the updated description to the tool
236
- sql_engine.description = updated_description
237
 
 
 
238
  agent = CodeAgent(
239
  tools=[sql_engine],
240
  model=InferenceClientModel(model_id="Qwen/Qwen2.5-Coder-32B-Instruct"),
@@ -243,12 +284,33 @@ agent = CodeAgent(
243
  agent.run("Which waiter received the highest total amount in tips?")
244
  ```
245
 
246
- The agent successfully addresses this challenge, often directly formulating the correct SQL query involving a `JOIN` operation, and then performing the necessary calculations in Python. The simplicity of setup versus the complexity of the task handled demonstrates the power of this agentic approach!
247
 
248
- This tutorial covered several key concepts:
249
 
250
- - **Constructing custom tools** for agents.
251
- - **Dynamically updating a tool's description** to reflect changes in available data or functionalities.
252
- - **Leveraging stronger LLMs** to empower an agent's reasoning for more complex tasks.
253
 
254
- You are now equipped to start building your own advanced text-to-SQL systems!
 
1
+ ---
+ title: Create a Text to SQL AI Agent
+ emoji: 😊
+ colorFrom: blue
+ colorTo: purple
+ ---
7
 
8
 
9
 
10
+ ## Intelligent Text-to-SQL Agent with `smolagents`
11
 
12
+ This project demonstrates how to build a Text-to-SQL AI agent with the `smolagents` framework. The agent translates natural language questions into SQL, executes the queries, and processes the results, including questions that require table joins.
13
+
14
+ ## Table of Contents
15
+
16
+ - [Intelligent Text-to-SQL Agent with `smolagents`](#intelligent-text-to-sql-agent-with-smolagents)
17
+ - [Table of Contents](#table-of-contents)
18
+ - [Why an AI Agent for Text-to-SQL?](#why-an-ai-agent-for-text-to-sql)
19
+ - [Features](#features)
20
+ - [Installation](#installation)
+ - [Project Structure](#project-structure)
+ - [Usage](#usage)
+ - [Setup and Dependencies](#setup-and-dependencies)
+ - [Database Initialization](#database-initialization)
+ - [Creating the SQL Tool](#creating-the-sql-tool)
21
+ - [Instantiating the Agent (Single Table)](#instantiating-the-agent-single-table)
22
+ - [Querying the Agent: Single Table](#querying-the-agent-single-table)
23
+ - [Extending for Table Joins](#extending-for-table-joins)
24
+ - [Querying the Agent: Multi-Table](#querying-the-agent-multi-table)
25
+ - [How it Works](#how-it-works)
26
+ - [Key Concepts Demonstrated](#key-concepts-demonstrated)
27
+ - [Contributing](#contributing)
28
+ - [License](#license)
29
 
30
+ ## Why an AI Agent for Text-to-SQL?
31
 
32
+ Traditional Text-to-SQL pipelines often suffer from brittleness:
33
 
34
+ - **Syntactic Errors:** Generated SQL queries might be invalid, leading to execution failures.
35
+ - **Semantic Errors:** Even if syntactically correct, queries can produce incorrect or irrelevant results without explicit error messages, leading to silent failures and potentially misleading information.
36
 
37
+ **An agent-based system overcomes these limitations by:**
38
 
39
+ - **Critical Inspection:** Analyzing query outputs and execution logs.
40
+ - **Self-Correction:** Identifying errors or suboptimal results and iteratively refining the SQL query or subsequent processing steps.
41
+ - **Enhanced Robustness:** Providing a more reliable and intelligent way to interact with databases from natural language.
42
 
43
+ ## Features
44
+
45
+ - **Natural Language to SQL:** Translates user questions into executable SQL queries.
46
+ - **Database Interaction:** Executes SQL queries against an in-memory SQLite database.
47
+ - **Intelligent Parsing:** Processes and extracts relevant information from SQL query results.
48
+ - **Self-Correction:** Uses execution errors as feedback to revise its queries.
49
+ - **Multi-Table Querying:** Supports questions requiring joins across multiple tables.
50
+ - **LLM Flexibility:** Integrates with various Large Language Models (LLMs) via `smolagents`.
51
+
52
+ ## Installation
53
+
54
+ To get started, clone this repository and install the required dependencies:
55
 
56
  ```bash
57
+ git clone https://github.com/your-username/text-to-sql-agent.git
58
+ cd text-to-sql-agent
59
+ pip install smolagents python-dotenv sqlalchemy --upgrade -q
60
  ```
61
 
62
63
 
64
+ **Note:** To interact with Large Language Models via inference providers (e.g., Hugging Face Inference API), you'll need a valid authentication token set as an environment variable, typically `HF_TOKEN`.
65
+
66
+ ## Project Structure
67
+
68
+ The core logic of this project is encapsulated in `text_to_sql.py`.
69
+
70
+ ```text
71
+ .
72
+ ├── README.md
73
+ └── text_to_sql.py
74
  ```
75
 
76
+ ## Usage
77
+
78
+ This section walks through the `text_to_sql.py` script, explaining each part of building and using the agent.
79
 
80
+ ### Setup and Dependencies
81
+
82
+ First, load your environment variables, including your LLM token.
83
 
84
  ```python
85
+ # text_to_sql.py
86
+ from dotenv import load_dotenv
87
+ load_dotenv()
88
+
89
  from sqlalchemy import (
90
  create_engine,
91
  MetaData,
 
96
  Float,
97
  insert,
98
  inspect,
99
+ text,
100
  )
101
+ from smolagents import tool, CodeAgent, InferenceClientModel
102
+
103
+ # ... (rest of the code)
104
+ ```
105
 
106
+ ### Database Initialization
107
+
108
+ We set up an in-memory SQLite database using SQLAlchemy, defining `receipts` and `waiters` tables and populating them with sample data.
109
+
110
+ ```python
111
+ # text_to_sql.py
112
  engine = create_engine("sqlite:///:memory:")
113
  metadata_obj = MetaData()
114
 
 
115
  def insert_rows_into_table(rows, table, engine=engine):
116
  for row in rows:
117
  stmt = insert(table).values(**row)
118
  with engine.begin() as connection:
119
  connection.execute(stmt)
120
 
121
+ # Define the 'receipts' table
 
122
  receipts = Table(
123
+ "receipts",
124
  metadata_obj,
125
+ Column("receipt_id", Integer, primary_key=True),
126
+ Column("customer_name", String(255)), # Adjusted from String(16) for longer names
127
+ Column("price", Float),
128
+ Column("tip", Float),
129
  )
 
130
  metadata_obj.create_all(engine)
131
 
132
+ # Sample data for 'receipts'
133
  rows = [
134
  {"receipt_id": 1, "customer_name": "Alan Payne", "price": 12.06, "tip": 1.20},
135
  {"receipt_id": 2, "customer_name": "Alex Mason", "price": 23.86, "tip": 0.24},
136
  {"receipt_id": 3, "customer_name": "Woodrow Wilson", "price": 53.43, "tip": 5.43},
137
  {"receipt_id": 4, "customer_name": "Margaret James", "price": 21.11, "tip": 1.00},
138
  ]
 
139
  insert_rows_into_table(rows, receipts)
 
 
 
140
 
141
+ # Print table schema (for LLM context)
 
 
 
 
 
 
142
  inspector = inspect(engine)
143
  columns_info = [(col["name"], col["type"]) for col in inspector.get_columns("receipts")]
 
144
  table_description = "Columns:\n" + "\n".join([f" - {name}: {col_type}" for name, col_type in columns_info])
145
+ print(table_description)
+ ```
+
+ **Output:**
+
+ ```
  Columns:
+ - receipt_id: INTEGER
+ - customer_name: VARCHAR(255)
+ - price: FLOAT
+ - tip: FLOAT
+ ```
 
 
158
 
159
+ ### Creating the SQL Tool
 
160
 
161
+ The `sql_engine` function acts as the agent's interface to the database. Its detailed docstring provides the LLM with crucial information about its functionality and the database schema.
162
+
163
+ ```python
164
+ # text_to_sql.py
165
  @tool
166
  def sql_engine(query: str) -> str:
167
  """
 
182
  """
183
  output = ""
184
  with engine.connect() as con:
 
185
  rows = con.execute(text(query))
186
  for row in rows:
187
+ output += "\n" + str(row)
188
  return output
189
+ ```
 
 
190
 
191
+ ### Instantiating the Agent (Single Table)
192
 
193
+ We create a `CodeAgent` and provide it with the `sql_engine` tool and an LLM (e.g., `meta-llama/Llama-3.1-8B-Instruct`).
194
 
195
  ```python
196
+ # text_to_sql.py
 
197
  agent = CodeAgent(
198
+ tools=[sql_engine],
199
+ model=InferenceClientModel(model_id="meta-llama/Llama-3.1-8B-Instruct"),
200
  )
201
  ```
202
 
203
+ ### Querying the Agent: Single Table
204
 
205
+ Now, we can ask the agent a question and observe its problem-solving process, including self-correction.
206
 
207
  ```python
208
+ # text_to_sql.py
209
  agent.run("Can you give me the name of the client who got the most expensive receipt?")
210
  ```
211
 
212
+ **Expected Agent Output (summarized):**
213
+ The agent will attempt several SQL queries, potentially encountering syntax errors or parsing issues with the raw string output from `sql_engine`. Through iterative self-correction, it will eventually generate and execute `SELECT MAX(price), customer_name FROM receipts ORDER BY price DESC LIMIT 1`, parse the result `(53.43, 'Woodrow Wilson')`, and identify 'Woodrow Wilson'.
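
The final query above mixes `MAX(price)` with a bare `customer_name` column. SQLite accepts this, but stricter databases such as PostgreSQL reject it. As a minimal sketch of a more portable form, the snippet below sorts by price and keeps the top row. It uses the standard-library `sqlite3` module rather than SQLAlchemy (an assumption made only to keep the example self-contained) and recreates the sample `receipts` data from this README.

```python
import sqlite3

# In-memory database mirroring the README's sample `receipts` table.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE receipts ("
    "receipt_id INTEGER PRIMARY KEY, customer_name VARCHAR(255), "
    "price FLOAT, tip FLOAT)"
)
con.executemany(
    "INSERT INTO receipts VALUES (?, ?, ?, ?)",
    [
        (1, "Alan Payne", 12.06, 1.20),
        (2, "Alex Mason", 23.86, 0.24),
        (3, "Woodrow Wilson", 53.43, 5.43),
        (4, "Margaret James", 21.11, 1.00),
    ],
)

# Sort by price and keep only the top row; no aggregate function is needed.
top_receipt = con.execute(
    "SELECT customer_name, price FROM receipts ORDER BY price DESC LIMIT 1"
).fetchone()
print(top_receipt)  # → ('Woodrow Wilson', 53.43)
```

Whether the agent reaches this form on its own depends on the model and the run; the point is only that the answer needs a single query.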
214
 
215
+ ### Extending for Table Joins
216
 
217
+ To handle more complex queries, we add a `waiters` table and update the `sql_engine` tool's description to include its schema.
218
 
219
  ```python
220
+ # text_to_sql.py
221
+ # Define the 'waiters' table
222
  waiters = Table(
223
+ "waiters",
224
  metadata_obj,
225
+ Column("receipt_id", Integer, primary_key=True),
226
+ Column("waiter_name", String(16), primary_key=True),
227
  )
 
228
  metadata_obj.create_all(engine)
229
 
230
+ # Sample data for 'waiters'
231
  rows = [
232
  {"receipt_id": 1, "waiter_name": "Corey Johnson"},
233
  {"receipt_id": 2, "waiter_name": "Michael Watts"},
234
  {"receipt_id": 3, "waiter_name": "Michael Watts"},
235
  {"receipt_id": 4, "waiter_name": "Margaret James"},
236
  ]
 
237
  insert_rows_into_table(rows, waiters)
 
 
 
238
 
239
+ # Update the tool's description to include the new table
240
  updated_description = """This tool allows performing SQL queries on the database, returning results as a string.
241
  It can access the following tables:"""
242
 
243
  inspector = inspect(engine)
244
  for table in ["receipts", "waiters"]:
245
  columns_info = [(col["name"], col["type"]) for col in inspector.get_columns(table)]
 
246
  table_description = f"Table '{table}':\n"
 
247
  table_description += " Columns:\n" + "\n".join([f" - {name}: {col_type}" for name, col_type in columns_info])
248
  updated_description += "\n\n" + table_description
249
 
250
  print(updated_description)
251
+ sql_engine.description = updated_description # Update the tool's description
252
  ```
253
 
254
+ **Output:**
255
 
256
+ ```
257
+ This tool allows performing SQL queries on the database, returning results as a string.
258
+ It can access the following tables:
259
+
260
+ Table 'receipts':
261
+ Columns:
262
+ - receipt_id: INTEGER
263
+ - customer_name: VARCHAR(255)
264
+ - price: FLOAT
265
+ - tip: FLOAT
266
+
267
+ Table 'waiters':
268
+ Columns:
269
+ - receipt_id: INTEGER
270
+ - waiter_name: VARCHAR(16)
271
+ ```
272
+
273
+ ### Querying the Agent: Multi-Table
274
+
275
+ We switch to a more powerful LLM (`Qwen/Qwen2.5-Coder-32B-Instruct`) for this harder task.
276
 
277
+ ```python
278
+ # text_to_sql.py
279
  agent = CodeAgent(
280
  tools=[sql_engine],
281
  model=InferenceClientModel(model_id="Qwen/Qwen2.5-Coder-32B-Instruct"),
 
284
  agent.run("Which waiter received the highest total amount in tips?")
285
  ```
286
 
287
+ **Expected Agent Output (summarized):**
288
+ The agent will formulate a SQL query to join `waiters` and `receipts` tables (e.g., `SELECT w.waiter_name, r.tip FROM waiters w JOIN receipts r ON w.receipt_id = r.receipt_id`). It will then process the results in Python to sum tips per waiter and identify "Michael Watts" as having the highest total tips.
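
The summing step can also be pushed into SQL instead of Python. The following is a sketch under the same assumption as before (standard-library `sqlite3` in place of SQLAlchemy, sample rows copied from this README); it joins the two tables, groups by waiter, and keeps the waiter with the largest tip total.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript(
    """
    CREATE TABLE receipts (
        receipt_id INTEGER PRIMARY KEY,
        customer_name VARCHAR(255),
        price FLOAT,
        tip FLOAT
    );
    CREATE TABLE waiters (
        receipt_id INTEGER,
        waiter_name VARCHAR(16),
        PRIMARY KEY (receipt_id, waiter_name)
    );
    INSERT INTO receipts VALUES
        (1, 'Alan Payne', 12.06, 1.20),
        (2, 'Alex Mason', 23.86, 0.24),
        (3, 'Woodrow Wilson', 53.43, 5.43),
        (4, 'Margaret James', 21.11, 1.00);
    INSERT INTO waiters VALUES
        (1, 'Corey Johnson'),
        (2, 'Michael Watts'),
        (3, 'Michael Watts'),
        (4, 'Margaret James');
    """
)

# Join receipts to waiters, total the tips per waiter, keep the largest total.
waiter_name, total_tips = con.execute(
    """
    SELECT w.waiter_name, SUM(r.tip) AS total_tips
    FROM waiters AS w
    JOIN receipts AS r ON w.receipt_id = r.receipt_id
    GROUP BY w.waiter_name
    ORDER BY total_tips DESC
    LIMIT 1
    """
).fetchone()
print(waiter_name, round(total_tips, 2))
```

With the sample data, Michael Watts served receipts 2 and 3, so his tips (0.24 and 5.43) give the largest total.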
289
+
290
+ ## How it Works
291
+
292
+ The `smolagents` `CodeAgent` operates on the **ReAct (Reasoning + Acting)** framework:
293
+
294
+ 1. **Reasoning (LLM as Brain):** A Large Language Model (e.g., Llama-3.1, Qwen2.5) interprets the natural language prompt and decides on a course of action.
295
+ 2. **Acting (Tools as Hands):** If an external interaction is needed (like querying a database), the LLM generates Python code to call a registered `@tool` (e.g., `sql_engine("...")`). The tool's `docstring` (description) is critical for the LLM to understand its capabilities.
296
+ 3. **Observation & Feedback:** The generated code is executed. The output (e.g., database results, error messages) is fed back to the LLM.
297
+ 4. **Self-Correction & Iteration:** The LLM analyzes the feedback. If there's an error or the result is unsatisfactory, it refines its reasoning and generates new code, iterating until the task is complete or deemed infeasible.
298
+
299
+ This iterative process allows the agent to solve complex problems and recover from errors, making it more robust than traditional direct translation methods.
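
The four steps above can be sketched as a small loop. This is an illustration only, not how `smolagents` is implemented: `scripted_model`, `run_sql`, and `react_loop` are hypothetical names, the "model" is a hard-coded stand-in for an LLM, and the database uses the standard-library `sqlite3` module.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE receipts (receipt_id INTEGER, customer_name TEXT, price REAL)")
con.executemany(
    "INSERT INTO receipts VALUES (?, ?, ?)",
    [(1, "Alan Payne", 12.06), (3, "Woodrow Wilson", 53.43)],
)

def run_sql(con, query):
    """Tool (hypothetical): run a query, return rows as text or the error message."""
    try:
        return "\n".join(str(row) for row in con.execute(query))
    except sqlite3.Error as exc:
        return f"ERROR: {exc}"

def scripted_model(history):
    """Stand-in for the LLM: chooses the next query from the feedback so far."""
    if not history:
        # First attempt deliberately uses a column that does not exist.
        return "SELECT name FROM receipts ORDER BY price DESC LIMIT 1"
    # After seeing an error, switch to the correct column name.
    return "SELECT customer_name FROM receipts ORDER BY price DESC LIMIT 1"

def react_loop(con, model, max_steps=3):
    history = []
    for _ in range(max_steps):
        query = model(history)                 # reasoning: decide what to do
        observation = run_sql(con, query)      # acting: call the tool
        history.append((query, observation))   # observation fed back to the model
        if not observation.startswith("ERROR"):
            return observation, history        # success: stop iterating
    return None, history                       # gave up after max_steps

answer, history = react_loop(con, scripted_model)
print(answer)        # → ('Woodrow Wilson',)
print(len(history))  # → 2
```

The first step fails with a "no such column" error, and the second step succeeds because the error text was part of the history passed back to the model.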
300
+
301
+ ## Key Concepts Demonstrated
302
+
303
+ - **Agentic Frameworks:** Using `smolagents` to orchestrate LLM interactions and tool use.
304
+ - **Tool Creation:** Defining custom Python functions as tools for agents, complete with detailed descriptions.
305
+ - **Dynamic Tool Descriptions:** Updating tool information to reflect changes in available data (e.g., new database tables).
306
+ - **LLM Integration:** Leveraging various LLMs for different levels of reasoning complexity.
307
+ - **SQLAlchemy:** Programmatically interacting with databases in Python.
308
+ - **ReAct Paradigm:** The iterative cycle of reasoning, acting, and observation that enables self-correction.
309
+
310
+ ## Contributing
311
 
312
+ Feel free to open issues or submit pull requests if you have suggestions or improvements!
313
 
314
+ ## License
 
 
315
 
316
+ This project is open-sourced under the MIT License. See the `LICENSE` file for more details.
TUTORIAL.md ADDED
@@ -0,0 +1,254 @@
1
+ # Unlocking Database Intelligence with AI Agents: A `smolagents` Tutorial
2
+
3
+ [Open In Colab](https://colab.research.google.com/github/huggingface/smolagents/blob/main/notebooks/text_to_sql.ipynb)
4
+ [Open In Studio Lab](https://studiolab.sagemaker.aws/import/github/huggingface/smolagents/blob/main/notebooks/text_to_sql.ipynb)
5
+
6
+ This guide explores how to develop an intelligent agent using the `smolagents` framework, specifically enabling it to interact with a SQL database.
7
+
8
+ ---
9
+
10
+ ## Beyond Simple Text-to-SQL: The Agent Advantage
11
+
12
+ Why opt for an advanced agent system instead of a straightforward text-to-SQL pipeline?
13
+
14
+ Traditional text-to-SQL solutions are often quite rigid. A direct translation from natural language to a database query can easily lead to syntactical errors, causing the database to reject the query. More insidiously, a query might execute without error but produce entirely incorrect or irrelevant results, providing no indication of its inaccuracy. This "silent failure" can be detrimental for critical applications.
15
+
16
+ 👉 An agent-based system, conversely, possesses the crucial capability to **critically evaluate outputs and execution logs**. It can identify when a query has failed or yielded unexpected results, and then iteratively refine its strategy or reformulate the query. This inherent capacity for self-correction significantly boosts performance and reliability.
17
+
18
+ Let's dive into building such an agent! 💪
19
+
20
+ First, install the required libraries. In a notebook, run the command below as written; in a regular shell, drop the leading `!`:
21
+
22
+ ```bash
23
+ !pip install smolagents python-dotenv sqlalchemy --upgrade -q
24
+ ```
25
+
26
+ To enable interaction with Large Language Models (LLMs) via inference providers, you'll need an authentication token, such as an `HF_TOKEN` from Hugging Face. We'll use `python-dotenv` to load it from a local `.env` file into the process environment.
27
+
28
+ ```python
29
+ from dotenv import load_dotenv
30
+ load_dotenv()
31
+ ```
32
+
33
+ ### Step 1: Database Initialization
34
+
35
+ We begin by setting up our in-memory SQLite database using `SQLAlchemy`. This involves defining our table structures and populating them with initial data.
36
+
37
+ ```python
38
+ from sqlalchemy import (
39
+ create_engine,
40
+ MetaData,
41
+ Table,
42
+ Column,
43
+ String,
44
+ Integer,
45
+ Float,
46
+ insert,
47
+ inspect,
48
+ text, # Essential for executing raw SQL expressions
49
+ )
50
+
51
+ # Establish an in-memory SQLite database connection
52
+ engine = create_engine("sqlite:///:memory:")
53
+ metadata_obj = MetaData()
54
+
55
+ # Helper that inserts rows one at a time, each in its own transaction
56
+ def insert_rows_into_table(rows, table, engine=engine):
57
+ for row in rows:
58
+ stmt = insert(table).values(**row)
59
+ with engine.begin() as connection:
60
+ connection.execute(stmt)
61
+
62
+ # Define the 'receipts' table schema
63
+ table_name = "receipts"
64
+ receipts = Table(
65
+ table_name,
66
+ metadata_obj,
67
+ Column("receipt_id", Integer, primary_key=True), # Unique identifier for each transaction
68
+ Column("customer_name", String(255)), # Full name of the patron
69
+ Column("price", Float), # Total cost of the receipt
70
+ Column("tip", Float), # Gratuity amount
71
+ )
72
+ # Create the defined table within our database
73
+ metadata_obj.create_all(engine)
74
+
75
+ # Sample transaction data
76
+ rows = [
77
+ {"receipt_id": 1, "customer_name": "Alan Payne", "price": 12.06, "tip": 1.20},
78
+ {"receipt_id": 2, "customer_name": "Alex Mason", "price": 23.86, "tip": 0.24},
79
+ {"receipt_id": 3, "customer_name": "Woodrow Wilson", "price": 53.43, "tip": 5.43},
80
+ {"receipt_id": 4, "customer_name": "Margaret James", "price": 21.11, "tip": 1.00},
81
+ ]
82
+ # Populate the 'receipts' table
83
+ insert_rows_into_table(rows, receipts)
84
+ ```
85
+
86
+ ### Step 2: Crafting the Agent's Database Tool
87
+
88
+ For an AI agent to interact with a database, it requires specialized **tools**. Our `sql_engine` function will serve as this tool, allowing the agent to execute SQL queries.
89
+
90
+ The tool's docstring plays a critical role, as its content (the `description` attribute) is presented to the LLM by the agent system. This description guides the LLM on _how_ and _when_ to utilize the tool, including details about available tables and their column structures.
91
+
92
+ First, let's extract the schema details for our `receipts` table:
93
+
94
+ ```python
95
+ inspector = inspect(engine)
96
+ columns_info = [(col["name"], col["type"]) for col in inspector.get_columns("receipts")]
97
+
98
+ table_description = "Columns:\n" + "\n".join([f" - {name}: {col_type}" for name, col_type in columns_info])
99
+ print(table_description)
100
+ ```
101
+
102
+ ```
103
+ Columns:
104
+ - receipt_id: INTEGER
105
+ - customer_name: VARCHAR(255)
106
+ - price: FLOAT
107
+ - tip: FLOAT
108
+ ```
109
+
110
+ Now, we'll construct our `sql_engine` tool. Key elements include:
111
+
112
+ - The `@tool` decorator from `smolagents` to designate it as an agent capability.
113
+ - A comprehensive docstring, complete with an `Args:` section, to inform the LLM about the tool's purpose and expected inputs.
114
+ - Type hints for the input parameters and the return value, enhancing clarity and guiding the LLM's code generation.
115
+
116
+ ```python
117
+ from smolagents import tool
118
+
119
+ @tool
120
+ def sql_engine(query: str) -> str:
121
+ """
122
+ Enables execution of SQL queries against the database.
123
+ Outputs the query results as a formatted string.
124
+
125
+ Known tables and their column structures:
126
+ Table 'receipts':
127
+ Columns:
128
+ - receipt_id: INTEGER (Primary Key)
129
+ - customer_name: VARCHAR(255)
130
+ - price: FLOAT
131
+ - tip: FLOAT
132
+
133
+ Args:
134
+ query: The precise SQL query string to be executed.
135
+ Example: "SELECT customer_name FROM receipts WHERE price > 10.0;"
136
+ """
137
+ output = ""
138
+ with engine.connect() as con:
139
+ # Utilize text() to safely execute raw SQL within SQLAlchemy
140
+ rows = con.execute(text(query))
141
+ for row in rows:
142
+ output += "\n" + str(row) # Converts each row of results into a string representation
143
+ return output
144
+ ```
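
Because `sql_engine` returns all rows as one newline-separated string rather than as Python objects, any code that consumes the result has to parse it first. Here is a minimal sketch of one way to do that, assuming every line is the `str()` of a row tuple that contains only plain literals (numbers, strings, `None`). `parse_rows` is a hypothetical helper for illustration, not part of `smolagents`:

```python
import ast

def parse_rows(tool_output: str) -> list[tuple]:
    """Turn the newline-separated string returned by a tool like sql_engine back into tuples."""
    rows = []
    for line in tool_output.splitlines():
        line = line.strip()
        if not line:  # skip the empty first line created by the leading "\n"
            continue
        # literal_eval only evaluates Python literals, so it cannot run arbitrary code
        rows.append(ast.literal_eval(line))
    return rows

# Example string in the same shape the tool produces
sample_output = "\n(53.43, 'Woodrow Wilson')"
rows = parse_rows(sample_output)
price, customer_name = rows[0]
print(customer_name)  # Woodrow Wilson
```

Rows containing values whose string form is not a Python literal (dates or `Decimal` values, for example) would need different handling, so treat this as a starting point only.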
+
+ ### Step 3: Assembling the AI Agent
+
+ With our database and tool ready, we now instantiate the `CodeAgent`. This is the flagship agent class of `smolagents`, designed to generate and execute code, and to iteratively refine its actions based on the ReAct (Reasoning + Acting) framework.
+
+ The `model` parameter links our agent to a Large Language Model. `InferenceClientModel` facilitates access to LLMs via Hugging Face's Inference API, supporting both Serverless and Dedicated endpoints. Alternatively, you could plug in a proprietary LLM API.
+
+ ```python
+ from smolagents import CodeAgent, InferenceClientModel
+
+ agent = CodeAgent(
+     tools=[sql_engine],  # Provide the 'sql_engine' tool to our agent
+     model=InferenceClientModel(model_id="meta-llama/Llama-3.1-8B-Instruct"),  # Selecting our LLM
+ )
+ ```
+
+ ### Step 4: Posing a Query to the Agent
+
+ Our agent is now configured. Let's challenge it with a natural language question. The agent will then leverage its LLM and `sql_engine` tool to find the answer.
+
+ ```python
+ agent.run("Can you give me the name of the client who got the most expensive receipt?")
+ ```
+
+ **Understanding the Agent's Iterative Solution Process:**
+
+ The `CodeAgent` employs a self-correcting, cyclical approach:
+
+ 1. **Intent Comprehension:** The LLM interprets the request, identifying the need to find the "most expensive receipt."
+ 2. **Tool Selection:** It recognizes that the `sql_engine` tool is necessary for database interaction.
+ 3. **Initial Code Generation:** The agent generates its first attempt at a SQL query (e.g., `SELECT MAX(price) FROM receipts`) to get the maximum price. It then tries to use this result in a follow-up query.
+ 4. **Execution and Feedback:** The `sql_engine` executes the query. However, the output is a string like `\n(53.43,)`. If the agent naively embeds this string directly into another SQL query (e.g., `WHERE price = (53.43,)`), the database rejects it with a syntax error.
+ 5. **Adaptive Self-Correction:** Upon receiving an error (e.g., a SQL `OperationalError` for the syntax error, or a Python `ValueError` such as "could not convert string to float"), the LLM analyzes it. It understands that the string-formatted output must be parsed into a numeric type before being used in subsequent SQL or Python logic. Earlier attempts might fail due to unexpected characters (like newlines) or incorrect string manipulation.
+ 6. **Refined Strategy:** Learning from its previous attempts, the agent eventually generates a consolidated SQL query: `SELECT MAX(price), customer_name FROM receipts ORDER BY price DESC LIMIT 1`. This retrieves both the highest price and the corresponding customer name in a single database call. Note that mixing `MAX(price)` with the non-aggregated `customer_name` column is accepted by SQLite but rejected by stricter engines such as PostgreSQL; `ORDER BY price DESC LIMIT 1` on its own is enough to select the top row.
+ 7. **Result Parsing and Finalization:** Finally, the LLM generates Python code to parse the `\n(53.43, 'Woodrow Wilson')` string output from the `sql_engine`, extracting the customer name. It then provides the `final_answer`.
+
+ This continuous cycle of **reasoning, acting via tools, observing outcomes (including errors), and self-correction** is fundamental to the robustness and adaptability of agent-based systems.
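
A portable variant of the consolidated query from step 6 (dropping `MAX()` and relying on `ORDER BY ... LIMIT 1` alone) can be tried in isolation. The sketch below uses Python's built-in `sqlite3` module instead of the tutorial's SQLAlchemy engine. Only the `Woodrow Wilson` name and price come from the output quoted above; every other value is a made-up placeholder:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE receipts ("
    "receipt_id INTEGER PRIMARY KEY, customer_name VARCHAR(255), price FLOAT, tip FLOAT)"
)
con.executemany(
    "INSERT INTO receipts VALUES (?, ?, ?, ?)",
    [
        (1, "Placeholder A", 12.00, 1.00),   # hypothetical row
        (2, "Placeholder B", 23.50, 0.50),   # hypothetical row
        (3, "Woodrow Wilson", 53.43, 5.00),  # name and price from the output above; tip is hypothetical
    ],
)

# Sorting by price and keeping the first row returns the name and the price together,
# without depending on how an engine pairs MAX() with a non-aggregated column.
top_customer, top_price = con.execute(
    "SELECT customer_name, price FROM receipts ORDER BY price DESC LIMIT 1"
).fetchone()
print(top_customer, top_price)
con.close()
```

Because the row comes back as a real tuple here, no string parsing is needed; the parsing step in the agent's run exists only because the tool serializes its rows to text.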
+
+ ---
+
+ ### Level 2: Inter-Table Queries (Table Joins)
+
+ Let's elevate the complexity! Our goal now is to enable the agent to handle questions that require combining data from multiple tables using SQL joins.
+
+ To achieve this, we'll define a second table, `waiters`, which records the names of waiters associated with each `receipt_id`.
+
+ ```python
+ # Define the 'waiters' table schema
+ table_name = "waiters"
+ waiters = Table(
+     table_name,
+     metadata_obj,
+     Column("receipt_id", Integer, primary_key=True),  # Links to 'receipts' table
+     Column("waiter_name", String(16), primary_key=True),  # Name of the assigned waiter
+ )
+ # Create the 'waiters' table in the database
+ metadata_obj.create_all(engine)
+
+ # Sample data for the 'waiters' table
+ rows = [
+     {"receipt_id": 1, "waiter_name": "Corey Johnson"},
+     {"receipt_id": 2, "waiter_name": "Michael Watts"},
+     {"receipt_id": 3, "waiter_name": "Michael Watts"},
+     {"receipt_id": 4, "waiter_name": "Margaret James"},
+ ]
+ # Populate the 'waiters' table
+ insert_rows_into_table(rows, waiters)
+ ```
+
+ With the introduction of a new table, it's crucial to **update the `sql_engine` tool's description**. This ensures the LLM is aware of the `waiters` table and its schema, allowing it to construct queries that span both tables.
+
+ ```python
+ updated_description = """This tool allows performing SQL queries on the database, returning results as a string.
+ It can access the following tables:"""
+
+ inspector = inspect(engine)
+ for table in ["receipts", "waiters"]:
+     columns_info = [(col["name"], col["type"]) for col in inspector.get_columns(table)]
+
+     table_description = f"Table '{table}':\n"
+
+     table_description += " Columns:\n" + "\n".join([f" - {name}: {col_type}" for name, col_type in columns_info])
+     updated_description += "\n\n" + table_description
+
+ print(updated_description)
+ ```
+
+ For more intricate requests like this, switching to a more powerful LLM can significantly enhance the agent's reasoning capabilities. Here, we'll upgrade to `Qwen/Qwen2.5-Coder-32B-Instruct`.
+
+ ```python
+ # Assign the updated description to the tool
+ sql_engine.description = updated_description
+
+ agent = CodeAgent(
+     tools=[sql_engine],
+     model=InferenceClientModel(model_id="Qwen/Qwen2.5-Coder-32B-Instruct"),
+ )
+
+ agent.run("Which waiter received the highest total amount in tips?")
+ ```
+
+ The agent handles this task, often formulating the correct `JOIN` query directly and then performing any remaining calculations in Python. That such a short setup can answer a multi-table question shows the strength of the agentic approach!
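
For reference, the kind of `JOIN` query the agent needs to arrive at can be sketched directly. This example uses Python's built-in `sqlite3` module rather than the tutorial's SQLAlchemy engine and keeps only the columns the question needs. The waiter assignments match the sample rows above, while the tip values are made-up placeholders:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE receipts (receipt_id INTEGER PRIMARY KEY, tip FLOAT)")
con.execute(
    "CREATE TABLE waiters ("
    "receipt_id INTEGER, waiter_name VARCHAR(16), PRIMARY KEY (receipt_id, waiter_name))"
)
# Hypothetical tip amounts, chosen only to make the example easy to check by hand
con.executemany("INSERT INTO receipts VALUES (?, ?)", [(1, 1.0), (2, 2.0), (3, 5.0), (4, 1.5)])
con.executemany(
    "INSERT INTO waiters VALUES (?, ?)",
    [(1, "Corey Johnson"), (2, "Michael Watts"), (3, "Michael Watts"), (4, "Margaret James")],
)

# Join each receipt to its waiter, total the tips per waiter, and keep the largest total
query = """
SELECT w.waiter_name, SUM(r.tip) AS total_tips
FROM receipts AS r
JOIN waiters AS w ON r.receipt_id = w.receipt_id
GROUP BY w.waiter_name
ORDER BY total_tips DESC
LIMIT 1
"""
best_waiter, total_tips = con.execute(query).fetchone()
print(best_waiter, total_tips)
con.close()
```

With these placeholder tips the query returns `Michael Watts` with a total of `7.0`; the agent's actual answer depends on the tips stored in your `receipts` table.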
+
+ This tutorial covered several key concepts:
+
+ - **Constructing custom tools** for agents.
+ - **Dynamically updating a tool's description** to reflect changes in available data or functionalities.
+ - **Leveraging stronger LLMs** to empower an agent's reasoning for more complex tasks.
+
+ ✅ You are now equipped to start building your own advanced text-to-SQL systems! ✨