# Unlocking Database Intelligence with AI Agents: A `smolagents` Tutorial

[Open In Colab](https://colab.research.google.com/github/huggingface/smolagents/blob/main/notebooks/text_to_sql.ipynb)
[Open In Studio Lab](https://studiolab.sagemaker.aws/import/github/huggingface/smolagents/blob/main/notebooks/text_to_sql.ipynb)

This guide explores how to develop an intelligent agent using the `smolagents` framework, specifically enabling it to interact with a SQL database.

---

## Beyond Simple Text-to-SQL: The Agent Advantage

Why opt for an advanced agent system instead of a straightforward text-to-SQL pipeline?

Traditional text-to-SQL solutions are often rigid. A direct translation from natural language to a database query can easily produce syntax errors, causing the database to reject the query. More insidiously, a query might execute without error but return entirely incorrect or irrelevant results, with no indication of its inaccuracy. This "silent failure" can be detrimental for critical applications.

👉 An agent-based system, by contrast, can **critically evaluate outputs and execution logs**. It can detect when a query has failed or yielded unexpected results, then iteratively refine its strategy or reformulate the query. This capacity for self-correction significantly boosts performance and reliability.

Let's dive into building such an agent! 💪
First, ensure all necessary libraries are installed by running the command below:

```bash
pip install smolagents python-dotenv sqlalchemy --upgrade -q
```
To enable interaction with Large Language Models (LLMs) via inference providers, you'll need an authentication token, such as an `HF_TOKEN` from Hugging Face. We'll use `python-dotenv` to load this from your environment variables.

```python
from dotenv import load_dotenv

load_dotenv()
```

### Step 1: Database Initialization

We begin by setting up an in-memory SQLite database using `SQLAlchemy`. This involves defining our table structures and populating them with initial data.
```python
from sqlalchemy import (
    create_engine,
    MetaData,
    Table,
    Column,
    String,
    Integer,
    Float,
    insert,
    inspect,
    text,  # Essential for executing raw SQL expressions
)

# Establish an in-memory SQLite database connection
engine = create_engine("sqlite:///:memory:")
metadata_obj = MetaData()

# Utility function for bulk data insertion
def insert_rows_into_table(rows, table, engine=engine):
    # A single transaction for all rows is more efficient than one per row
    with engine.begin() as connection:
        for row in rows:
            connection.execute(insert(table).values(**row))

# Define the 'receipts' table schema
table_name = "receipts"
receipts = Table(
    table_name,
    metadata_obj,
    Column("receipt_id", Integer, primary_key=True),  # Unique identifier for each transaction
    Column("customer_name", String(255)),  # Full name of the patron
    Column("price", Float),  # Total cost of the receipt
    Column("tip", Float),  # Gratuity amount
)

# Create the defined table within our database
metadata_obj.create_all(engine)

# Sample transaction data
rows = [
    {"receipt_id": 1, "customer_name": "Alan Payne", "price": 12.06, "tip": 1.20},
    {"receipt_id": 2, "customer_name": "Alex Mason", "price": 23.86, "tip": 0.24},
    {"receipt_id": 3, "customer_name": "Woodrow Wilson", "price": 53.43, "tip": 5.43},
    {"receipt_id": 4, "customer_name": "Margaret James", "price": 21.11, "tip": 1.00},
]

# Populate the 'receipts' table
insert_rows_into_table(rows, receipts)
```
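For intuition, the `SQLAlchemy` calls above boil down to plain SQL. The sketch below is illustrative only (it is not part of the tutorial's pipeline) and uses the standard-library `sqlite3` module instead of the `SQLAlchemy` engine to show roughly the SQL that gets emitted:

```python
import sqlite3

# Illustrative only: roughly the SQL that the SQLAlchemy code above emits
con = sqlite3.connect(":memory:")
con.execute(
    """CREATE TABLE receipts (
        receipt_id INTEGER PRIMARY KEY,
        customer_name VARCHAR(255),
        price FLOAT,
        tip FLOAT
    )"""
)
con.executemany(
    "INSERT INTO receipts VALUES (?, ?, ?, ?)",
    [
        (1, "Alan Payne", 12.06, 1.20),
        (2, "Alex Mason", 23.86, 0.24),
        (3, "Woodrow Wilson", 53.43, 5.43),
        (4, "Margaret James", 21.11, 1.00),
    ],
)
row_count = con.execute("SELECT COUNT(*) FROM receipts").fetchone()[0]
print(row_count)  # 4
```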
### Step 2: Crafting the Agent's Database Tool

For an AI agent to interact with a database, it requires specialized **tools**. Our `sql_engine` function will serve as this tool, allowing the agent to execute SQL queries.

The tool's docstring plays a critical role, as its content (the `description` attribute) is presented to the LLM by the agent system. This description guides the LLM on _how_ and _when_ to utilize the tool, including details about available tables and their column structures.

First, let's extract the schema details for our `receipts` table:

```python
inspector = inspect(engine)
columns_info = [(col["name"], col["type"]) for col in inspector.get_columns("receipts")]
table_description = "Columns:\n" + "\n".join([f"  - {name}: {col_type}" for name, col_type in columns_info])
print(table_description)
```

```
Columns:
  - receipt_id: INTEGER
  - customer_name: VARCHAR(255)
  - price: FLOAT
  - tip: FLOAT
```

Now, we'll construct our `sql_engine` tool. Key elements include:

- The `@tool` decorator from `smolagents` to designate it as an agent capability.
- A comprehensive docstring, complete with an `Args:` section, to inform the LLM about the tool's purpose and expected inputs.
- Type hints for both input and output parameters, enhancing clarity and guiding the LLM's code generation.
```python
from smolagents import tool

@tool
def sql_engine(query: str) -> str:
    """
    Enables execution of SQL queries against the database.
    Outputs the query results as a formatted string.
    Known tables and their column structures:

    Table 'receipts':
      Columns:
        - receipt_id: INTEGER (Primary Key)
        - customer_name: VARCHAR(255)
        - price: FLOAT
        - tip: FLOAT

    Args:
        query: The precise SQL query string to be executed.
            Example: "SELECT customer_name FROM receipts WHERE price > 10.0;"
    """
    output = ""
    with engine.connect() as con:
        # Use text() to execute a raw SQL string through SQLAlchemy
        rows = con.execute(text(query))
        for row in rows:
            output += "\n" + str(row)  # Append each result row as its string representation
    return output
```
### Step 3: Assembling the AI Agent

With our database and tool ready, we can now instantiate the `CodeAgent`. This is the flagship agent class in `smolagents`, designed to generate and execute code, iteratively refining its actions following the ReAct (Reasoning + Acting) framework.

The `model` parameter links our agent to a Large Language Model. `InferenceClientModel` provides access to LLMs via Hugging Face's Inference API, supporting both serverless and dedicated endpoints. Alternatively, you could integrate other proprietary LLM APIs.
```python
from smolagents import CodeAgent, InferenceClientModel

agent = CodeAgent(
    tools=[sql_engine],  # Provide the 'sql_engine' tool to our agent
    model=InferenceClientModel(model_id="meta-llama/Llama-3.1-8B-Instruct"),  # Selecting our LLM
)
```

### Step 4: Posing a Query to the Agent

Our agent is now configured. Let's challenge it with a natural language question. The agent will then leverage its LLM and the `sql_engine` tool to find the answer.

```python
agent.run("Can you give me the name of the client who got the most expensive receipt?")
```
**Understanding the Agent's Iterative Solution Process:**

The `CodeAgent` employs a self-correcting, cyclical approach:

1. **Intent Comprehension:** The LLM interprets the request, identifying the need to find the "most expensive receipt."
2. **Tool Selection:** It recognizes that the `sql_engine` tool is necessary for database interaction.
3. **Initial Code Generation:** The agent generates a first attempt at a SQL query (e.g., `SELECT MAX(price) FROM receipts`) to get the maximum price, then tries to use this result in a follow-up query.
4. **Execution and Feedback:** The `sql_engine` executes the query, but the output is a string like `\n(53.43,)`. If the agent naively embeds this string directly into another SQL query (e.g., `WHERE price = (53.43,)`), it will hit a syntax error.
5. **Adaptive Self-Correction:** Upon receiving an `OperationalError` (e.g., "syntax error" or "could not convert string to float"), the LLM analyzes the error. It understands that the string-formatted output must be parsed into a numeric type before being used in subsequent SQL or Python logic; earlier attempts may fail on unexpected characters (such as the leading newline) or incorrect string manipulation.
6. **Refined Strategy:** Learning from its previous attempts, the agent eventually generates a consolidated SQL query such as `SELECT customer_name, price FROM receipts ORDER BY price DESC LIMIT 1`, which retrieves both the highest price and the corresponding customer name in a single database call.
7. **Result Parsing and Finalization:** Finally, the LLM generates Python code to parse the string output from `sql_engine` (e.g., `\n('Woodrow Wilson', 53.43)`), extracts the customer name, and returns it via `final_answer`.

This continuous cycle of **reasoning, acting via tools, observing outcomes (including errors), and self-correction** is fundamental to the robustness and adaptability of agent-based systems.
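The parsing problem the agent faces in steps 4–7 can be sketched in plain Python. `parse_first_row` below is a hypothetical helper (not part of the tutorial's code) showing one way to turn the tool's string output, which is a repr'd tuple per line, back into Python values with the stdlib `ast.literal_eval`:

```python
import ast

def parse_first_row(tool_output: str):
    """Hypothetical helper: parse the first result row from sql_engine's
    string output, where each row is a repr'd tuple on its own line,
    e.g. "\n(53.43,)"."""
    first_line = tool_output.strip().splitlines()[0]
    return ast.literal_eval(first_line)  # safely evaluates the tuple literal

max_price = parse_first_row("\n(53.43,)")[0]
print(max_price)  # 53.43
```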
---
### Level 2: Inter-Table Queries (Table Joins)

Let's elevate the complexity! Our goal now is to enable the agent to handle questions that require combining data from multiple tables using SQL joins.

To achieve this, we'll define a second table, `waiters`, which records the names of waiters associated with each `receipt_id`.

```python
# Define the 'waiters' table schema
table_name = "waiters"
waiters = Table(
    table_name,
    metadata_obj,
    Column("receipt_id", Integer, primary_key=True),  # Links to the 'receipts' table
    Column("waiter_name", String(16), primary_key=True),  # Name of the assigned waiter
)

# Create the 'waiters' table in the database
metadata_obj.create_all(engine)

# Sample data for the 'waiters' table
rows = [
    {"receipt_id": 1, "waiter_name": "Corey Johnson"},
    {"receipt_id": 2, "waiter_name": "Michael Watts"},
    {"receipt_id": 3, "waiter_name": "Michael Watts"},
    {"receipt_id": 4, "waiter_name": "Margaret James"},
]

# Populate the 'waiters' table
insert_rows_into_table(rows, waiters)
```
With the introduction of a new table, it's crucial to **update the `sql_engine` tool's description**. This ensures the LLM is aware of the `waiters` table and its schema, allowing it to construct queries that span both tables.

```python
updated_description = """This tool allows performing SQL queries on the database, returning results as a string.
It can access the following tables:"""

inspector = inspect(engine)
for table in ["receipts", "waiters"]:
    columns_info = [(col["name"], col["type"]) for col in inspector.get_columns(table)]
    table_description = f"Table '{table}':\n"
    table_description += "  Columns:\n" + "\n".join([f"    - {name}: {col_type}" for name, col_type in columns_info])
    updated_description += "\n\n" + table_description

print(updated_description)
```
For more intricate requests like this, switching to a more powerful LLM can significantly enhance the agent's reasoning capabilities. Here, we'll upgrade to `Qwen/Qwen2.5-Coder-32B-Instruct`.

```python
# Assign the updated description to the tool
sql_engine.description = updated_description

agent = CodeAgent(
    tools=[sql_engine],
    model=InferenceClientModel(model_id="Qwen/Qwen2.5-Coder-32B-Instruct"),
)

agent.run("Which waiter received the highest total amount in tips?")
```

The agent successfully addresses this challenge, often directly formulating the correct SQL query involving a `JOIN` operation, and then performing the necessary calculations in Python. The simplicity of the setup relative to the complexity of the task demonstrates the power of this agentic approach!
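To see what a correct answer involves, here is the kind of `JOIN` plus aggregation the agent needs to formulate. This sketch is purely illustrative: it re-creates the sample data with the stdlib `sqlite3` module rather than going through the agent or the `SQLAlchemy` engine:

```python
import sqlite3

# Illustrative only: rebuild the tutorial's sample data with stdlib sqlite3
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE receipts (receipt_id INTEGER PRIMARY KEY, customer_name TEXT, price REAL, tip REAL)")
con.execute("CREATE TABLE waiters (receipt_id INTEGER, waiter_name TEXT)")
con.executemany("INSERT INTO receipts VALUES (?, ?, ?, ?)", [
    (1, "Alan Payne", 12.06, 1.20),
    (2, "Alex Mason", 23.86, 0.24),
    (3, "Woodrow Wilson", 53.43, 5.43),
    (4, "Margaret James", 21.11, 1.00),
])
con.executemany("INSERT INTO waiters VALUES (?, ?)", [
    (1, "Corey Johnson"),
    (2, "Michael Watts"),
    (3, "Michael Watts"),
    (4, "Margaret James"),
])

# Join receipts to waiters, sum tips per waiter, keep the top earner
best = con.execute("""
    SELECT w.waiter_name, SUM(r.tip) AS total_tips
    FROM receipts r
    JOIN waiters w ON r.receipt_id = w.receipt_id
    GROUP BY w.waiter_name
    ORDER BY total_tips DESC
    LIMIT 1
""").fetchone()
print(best[0])  # Michael Watts (tips 0.24 + 5.43 across two receipts)
```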
This tutorial covered several key concepts:

- **Constructing custom tools** for agents.
- **Dynamically updating a tool's description** to reflect changes in available data or functionalities.
- **Leveraging stronger LLMs** to empower an agent's reasoning for more complex tasks.

✅ You are now equipped to start building your own advanced text-to-SQL systems! ✨