{"id":598,"date":"2025-01-11T15:28:59","date_gmt":"2025-01-11T15:28:59","guid":{"rendered":"https:\/\/smolagents.org\/?post_type=docs&#038;p=598"},"modified":"2025-01-11T15:40:54","modified_gmt":"2025-01-11T15:40:54","password":"","slug":"text-to-sql-example","status":"publish","type":"docs","link":"https:\/\/smolagents.org\/hi\/docs\/text-to-sql-example\/","title":{"rendered":"Text-to-SQL Example"},"content":{"rendered":"<p>In this tutorial, we\u2019ll see how to implement an agent that leverages SQL using&nbsp;<code>smolagents<\/code>.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>Let\u2019s start with the golden question: why not keep it simple and use a standard text-to-SQL pipeline?<\/p>\n<\/blockquote>\n\n\n\n<p>A standard text-to-sql pipeline is brittle, since the generated SQL query can be incorrect. Even worse, the query could be incorrect, but not raise an error, instead giving some incorrect\/useless outputs without raising an alarm.<\/p>\n\n\n\n<p>\ud83d\udc49 Instead, an agent system is able to critically inspect outputs and decide if the query needs to be changed or not, thus giving it a huge performance boost.<\/p>\n\n\n\n<p>Let\u2019s build this agent! \ud83d\udcaa<\/p>\n\n\n\n<p>First, we setup the SQL environment:<\/p>\n\n\n\n<p>Copied<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">from sqlalchemy import (\n    create_engine,\n    MetaData,\n    Table,\n    Column,\n    String,\n    Integer,\n    Float,\n    insert,\n    inspect,\n    text,\n)\n\nengine = create_engine(\"sqlite:\/\/\/:memory:\")\nmetadata_obj = MetaData()\n\n<em># create city SQL table<\/em>\ntable_name = \"receipts\"\nreceipts = Table(\n    table_name,\n    metadata_obj,\n    Column(\"receipt_id\", Integer, primary_key=True),\n    Column(\"customer_name\", String(16), primary_key=True),\n    Column(\"price\", Float),\n    Column(\"tip\", Float),\n)\nmetadata_obj.create_all(engine)\n\nrows = [\n    {\"receipt_id\": 1, \"customer_name\": \"Alan Payne\", \"price\": 12.06, \"tip\": 1.20},\n    {\"receipt_id\": 2, \"customer_name\": \"Alex Mason\", \"price\": 23.86, \"tip\": 0.24},\n    {\"receipt_id\": 3, \"customer_name\": \"Woodrow Wilson\", \"price\": 53.43, \"tip\": 5.43},\n    {\"receipt_id\": 4, \"customer_name\": \"Margaret James\", \"price\": 21.11, \"tip\": 1.00},\n]\nfor row in rows:\n    stmt = insert(receipts).values(**row)\n    with engine.begin() as connection:\n        cursor = connection.execute(stmt)<\/pre>\n\n\n\n<h3 class=\"wp-block-heading\"><a href=\"https:\/\/huggingface.co\/docs\/smolagents\/examples\/text_to_sql#build-our-agent\" target=\"_blank\" rel=\"noopener\"><\/a>Build our agent<\/h3>\n\n\n\n<p>Now let\u2019s make our SQL table retrievable by a tool.<\/p>\n\n\n\n<p>The tool\u2019s description attribute will be embedded in the LLM\u2019s prompt by the agent system: it gives the LLM information about how to use the tool. This is where we want to describe the SQL table.<\/p>\n\n\n\n<p>Copied<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">inspector = inspect(engine)\ncolumns_info = [(col[\"name\"], col[\"type\"]) for col in inspector.get_columns(\"receipts\")]\n\ntable_description = \"Columns:\\n\" + \"\\n\".join([f\"  - {name}: {col_type}\" for name, col_type in columns_info])\nprint(table_description)<\/pre>\n\n\n\n<p>Copied<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">Columns:\n  - receipt_id: INTEGER\n  - customer_name: VARCHAR(16)\n  - price: FLOAT\n  - tip: FLOAT<\/pre>\n\n\n\n<p>Now let\u2019s build our tool. It needs the following: (read\u00a0<a href=\"https:\/\/huggingface.co\/docs\/smolagents\/tutorials\/tools\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">the tool doc<\/a>\u00a0for more detail)<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>A docstring with an\u00a0<code>Args:<\/code>\u00a0part listing arguments.<\/li>\n\n\n\n<li>Type hints on both inputs and output.<\/li>\n<\/ul>\n\n\n\n<p>Copied<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">from smolagents import tool\n\n@tool\ndef sql_engine(query: str) -&gt; str:\n    \"\"\"\n    Allows you to perform SQL queries on the table. Returns a string representation of the result.\n    The table is named 'receipts'. Its description is as follows:\n        Columns:\n        - receipt_id: INTEGER\n        - customer_name: VARCHAR(16)\n        - price: FLOAT\n        - tip: FLOAT\n\n    Args:\n        query: The query to perform. This should be correct SQL.\n    \"\"\"\n    output = \"\"\n    with engine.connect() as con:\n        rows = con.execute(text(query))\n        for row in rows:\n            output += \"\\n\" + str(row)\n    return output<\/pre>\n\n\n\n<p>Now let us create an agent that leverages this tool.<\/p>\n\n\n\n<p>We use the&nbsp;<code>CodeAgent<\/code>, which is smolagents\u2019 main agent class: an agent that writes actions in code and can iterate on previous output according to the ReAct framework.<\/p>\n\n\n\n<p>The model is the LLM that powers the agent system. HfApiModel allows you to call LLMs using HF\u2019s Inference API, either via Serverless or Dedicated endpoint, but you could also use any proprietary API.<\/p>\n\n\n\n<p>Copied<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">from smolagents import CodeAgent, HfApiModel\n\nagent = CodeAgent(\n    tools=[sql_engine],\n    model=HfApiModel(\"meta-llama\/Meta-Llama-3.1-8B-Instruct\"),\n)\nagent.run(\"Can you give me the name of the client who got the most expensive receipt?\")<\/pre>\n\n\n\n<h3 class=\"wp-block-heading\"><a href=\"https:\/\/huggingface.co\/docs\/smolagents\/examples\/text_to_sql#level-2-table-joins\" target=\"_blank\" rel=\"noopener\"><\/a>Level 2: Table joins<\/h3>\n\n\n\n<p>Now let\u2019s make it more challenging! We want our agent to handle joins across multiple tables.<\/p>\n\n\n\n<p>So let\u2019s make a second table recording the names of waiters for each receipt_id!<\/p>\n\n\n\n<p>Copied<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">table_name = \"waiters\"\nreceipts = Table(\n    table_name,\n    metadata_obj,\n    Column(\"receipt_id\", Integer, primary_key=True),\n    Column(\"waiter_name\", String(16), primary_key=True),\n)\nmetadata_obj.create_all(engine)\n\nrows = [\n    {\"receipt_id\": 1, \"waiter_name\": \"Corey Johnson\"},\n    {\"receipt_id\": 2, \"waiter_name\": \"Michael Watts\"},\n    {\"receipt_id\": 3, \"waiter_name\": \"Michael Watts\"},\n    {\"receipt_id\": 4, \"waiter_name\": \"Margaret James\"},\n]\nfor row in rows:\n    stmt = insert(receipts).values(**row)\n    with engine.begin() as connection:\n        cursor = connection.execute(stmt)<\/pre>\n\n\n\n<p>Since we changed the table, we update the&nbsp;<code>SQLExecutorTool<\/code>&nbsp;with this table\u2019s description to let the LLM properly leverage information from this table.<\/p>\n\n\n\n<p>Copied<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">updated_description = \"\"\"Allows you to perform SQL queries on the table. Beware that this tool's output is a string representation of the execution output.\nIt can use the following tables:\"\"\"\n\ninspector = inspect(engine)\nfor table in [\"receipts\", \"waiters\"]:\n    columns_info = [(col[\"name\"], col[\"type\"]) for col in inspector.get_columns(table)]\n\n    table_description = f\"Table '{table}':\\n\"\n\n    table_description += \"Columns:\\n\" + \"\\n\".join([f\"  - {name}: {col_type}\" for name, col_type in columns_info])\n    updated_description += \"\\n\\n\" + table_description\n\nprint(updated_description)<\/pre>\n\n\n\n<p>Since this request is a bit harder than the previous one, we\u2019ll switch the LLM engine to use the more powerful\u00a0<a href=\"https:\/\/huggingface.co\/Qwen\/Qwen2.5-Coder-32B-Instruct\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Qwen\/Qwen2.5-Coder-32B-Instruct<\/a>!<\/p>\n\n\n\n<p>Copied<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">sql_engine.description = updated_description\n\nagent = CodeAgent(\n    tools=[sql_engine],\n    model=HfApiModel(\"Qwen\/Qwen2.5-Coder-32B-Instruct\"),\n)\n\nagent.run(\"Which waiter got more total money from tips?\")<\/pre>\n\n\n\n<p>It directly works! The setup was surprisingly simple, wasn\u2019t it?<\/p>\n\n\n\n<p>This example is done! We\u2019ve touched upon these concepts:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Building new tools.<\/li>\n\n\n\n<li>Updating a tool\u2019s description.<\/li>\n\n\n\n<li>Switching to a stronger LLM helps agent reasoning.<\/li>\n<\/ul>\n\n\n\n<p>\u2705 Now you can go build this text-to-SQL system you\u2019ve always dreamt of! \u2728<\/p>","protected":false},"excerpt":{"rendered":"<p>In this tutorial, we\u2019ll see how to implement an agent that leverages SQL using&nbsp;smolagents. Let\u2019s start with the golden question: why not keep it simple and use a standard text-to-SQL pipeline? A standard text-to-sql pipeline is brittle, since the generated SQL query can be incorrect. Even worse, the query could be incorrect, but not raise&#8230;<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","template":"","meta":{"_kadence_starter_templates_imported_post":false,"_kad_post_transparent":"","_kad_post_title":"","_kad_post_layout":"","_kad_post_sidebar_id":"","_kad_post_content_style":"","_kad_post_vertical_padding":"","_kad_post_feature":"","_kad_post_feature_position":"","_kad_post_header":false,"_kad_post_footer":false,"footnotes":""},"doc_category":[10],"doc_tag":[],"class_list":["post-598","docs","type-docs","status-publish","hentry","doc_category-examples"],"year_month":"2026-04","word_count":879,"total_views":"4125","reactions":{"happy":"1","normal":"0","sad":"0"},"author_info":{"name":"smolagents","author_nicename":"wd-gstargmail-com","author_url":"https:\/\/smolagents.org\/hi\/author\/wd-gstargmail-com\/"},"doc_category_info":[{"term_name":"Examples","term_url":"https:\/\/smolagents.org\/hi\/docs-category\/examples\/"}],"doc_tag_info":[],"_links":{"self":[{"href":"https:\/\/smolagents.org\/hi\/wp-json\/wp\/v2\/docs\/598","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/smolagents.org\/hi\/wp-json\/wp\/v2\/docs"}],"about":[{"href":"https:\/\/smolagents.org\/hi\/wp-json\/wp\/v2\/types\/docs"}],"author":[{"embeddable":true,"href":"https:\/\/smolagents.org\/hi\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/smolagents.org\/hi\/wp-json\/wp\/v2\/comments?post=598"}],"version-history":[{"count":1,"href":"https:\/\/smolagents.org\/hi\/wp-json\/wp\/v2\/docs\/598\/revisions"}],"predecessor-version":[{"id":599,"href":"https:\/\/smolagents.org\/hi\/wp-json\/wp\/v2\/docs\/598\/revisions\/599"}],"wp:attachment":[{"href":"https:\/\/smolagents.org\/hi\/wp-json\/wp\/v2\/media?parent=598"}],"wp:term":[{"taxonomy":"doc_category","embeddable":true,"href":"https:\/\/smolagents.org\/hi\/wp-json\/wp\/v2\/doc_category?post=598"},{"taxonomy":"doc_tag","embeddable":true,"href":"https:\/\/smolagents.org\/hi\/wp-json\/wp\/v2\/doc_tag?post=598"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}