Using SQL¶
marimo lets you can mix and match Python and SQL: Use SQL to query Python dataframes (or databases like SQLite and Postgres), and get the query result back as a Python dataframe.
To create a SQL cell, you first need to install additional dependencies, including duckdb:
Examples
For example notebooks, check out
examples/sql/
on GitHub.
Example¶
In this example notebook, we have a Pandas dataframe and a SQL cell that queries it. Notice that the query result is returned as a Python dataframe and usable in subsequent cells.
Creating SQL cells¶
You can create SQL cells in one of three ways:
- Right-click an "add cell" button ("+" icon) next to a cell and choose "SQL cell"
- Convert a empty cell to SQL via the cell context menu
- Click the SQL button that appears at the bottom of the notebook
This creates a "SQL" cell for you, which is syntactic sugar for Python code. The underlying code looks like:
Notice that we have an output_df
variable in the cell. This contains
the query result, and is a Polars DataFrame (if you have polars
installed) or
a Pandas DataFrame (if you don't). One of them must be installed in order to
interact with the query result.
The SQL statement itself is an f-string, letting you
interpolate Python values into the query with {}
. In particular, this means
your SQL queries can depend on the values of UI elements or other Python values,
and they are fit into marimo's reactive dataflow graph.
Reference a local dataframe¶
You can reference a local dataframe in your SQL cell by using the name of the Python variable that holds the dataframe. If you have a database connection with a table of the same name, the database table will be used instead.
Since the output dataframe variable (_df
) has an underscore, making it private, it is not referenceable from other cells.
Reference the output of a SQL cell¶
Defining a non-private (non-underscored) output variable in the SQL cell allows you to reference the resulting dataframe in other Python and SQL cells.
Querying files, databases, and APIs¶
In the above example, you may have noticed we queried an HTTP endpoint instead of a local dataframe. We are not only limited to querying local dataframes; we can also query files, databases such as Postgres and SQLite, and APIs:
-- or
SELECT * FROM 's3://my-bucket/file.parquet';
-- or
SELECT * FROM read_csv('path/to/example.csv');
-- or
SELECT * FROM read_parquet('path/to/example.parquet');
For a full list you can check out the duckdb extensions. You can also check out our examples on GitHub.
Escaping SQL brackets¶
Our "SQL" cells are really just Python under the hood to keep notebooks as pure Python scripts. By default, we use f-strings
for SQL strings, which allows for parameterized SQL like which allows for parameterized SQL like SELECT * from table where value < {min}
.
To escape real {
/}
that you don't want parameterized, use double {{...}}
:
Interactive tutorial¶
For an interactive tutorial, run
at your command-line.
Examples¶
Check out our examples on GitHub.