Writing queries with the D1 Client API
In the previous post we defined our database schema, got up and running with migrations, and loaded some seed data into our database. In this post we will be working with our new database and seed data. If you want to participate, make sure to follow the steps in the first post.
We’ve been taking a minimal approach so far by using only wrangler
and sql
scripts for our workflow. The D1 Client API has a small surface area. Thanks to the power of SQL, we will have everything we need to construct all types of queries. Before we start writing our queries, let's touch on some important concepts.
Prepared statements and parameter binding
This is the first section of the docs and it highlights two different ways to write our SQL statements using the client API: prepared and static statements. Best practice is to use prepared statements because they are more performant and prevent SQL injection attacks. So we will write our queries using prepared statements.
We need to use parameter binding to build our queries with prepared statements. This is pretty straightforward and there are two variations.
By default we add ?
’s to our statement to represent a value to be filled in. The bind method will bind the parameters to each question mark by their index. The first ?
is tied to the first parameter in bind, 2nd, etc. I would stick with this most of the time to avoid any confusion.
const stmt = db.prepare('SELECT * FROM users WHERE name = ? AND age = ?').bind( 'John Doe', 41 );
I like this second method less as it feels like something I can imagine messing up very innocently. You can add a number directly after a question mark to indicate which number parameter it should be bound to. In this exampl, we reverse the previous binding.
const stmt = db.prepare('SELECT * FROM users WHERE name = ?2 AND age = ?1').bind( 41, 'John Doe' );
Reusing prepared statements
If we take the first example above and not bind any values we have a statement that can be reused:
const stmt = db.prepare('SELECT * FROM users WHERE name = ? AND age = ?')
const results = stmt.bind('John Doe', 41).all()
const results = stmt.bind('Jane Doe', 38).all()
Querying
For the purposes of this post we will just build example queries by writing them out directly in our Worker fetch
handler. If you are building an app I would recommend building functions or some other abstraction around your queries.
select queries
Let's write our first query against our data set to get our feet wet.
Here’s the initial worker code and a query for all authors:
import * as Schema from './schema';
export default {
async fetch(request, env, ctx): Promise<Response> {
let results = await env.DB.prepare('SELECT * FROM authors').all<Schema.Author[]>();
return new Response(JSON.stringify(results.results), {
headers: { 'content-type': 'application/json' },
});
},
} satisfies ExportedHandler<Env>;
We pass our SQL statement into prepare
and use the all
method to get all the rows. Notice that we are able to pass our types to a generic parameter in all
. This allows us to get a fully typed response from our query.
We can run our worker with npm run dev
and access it at http://localhost:8787 by default. We’ll keep this simple workflow of writing queries and passing them as a json
response for inspection in the browser. Opening the page we get our author results.
joins
Not using an ORM means we have full control over our own destiny. Like anything else though, this has tradeoffs. Let’s look at a query to fetch the list of posts that includes author and tags information.
type PostsWithAuthorsAndTags = Schema.Post & {
author_name: string;
tags: string;
};
let results = await env.DB.prepare(
SELECT posts.*, authors.name AS author_name, COALESCE( JSON_GROUP_ARRAY(tags.name), '[]' ) AS tags FROM posts JOIN authors ON posts.author_id = authors.id LEFT JOIN posts_tags ON posts.id = posts_tags.post_id LEFT JOIN tags ON posts_tags.tag_id = tags.id GROUP BY posts.id
).all<PostsWithAuthorsAndTags[]>();
Let’s walk through each part of the query and highlight some pros and cons.
SELECT
posts.*,
authors.name AS author_name,
COALESCE(
JSON_GROUP_ARRAY(tags.name),
'[]'
) AS tags
- The query selects all columns from the
posts
table. - It also selects the
name
column from theauthors
table and renames it toauthor_name
. - It aggregates the
name
column from thetags
table into a JSON array. If there are no tags, it returns an empty JSON array. This aggregated result is renamed totags
.
FROM
posts
JOIN
authors ON posts.author_id = authors.id
LEFT JOIN
posts_tags ON posts.id = posts_tags.post_id
LEFT JOIN
tags ON posts_tags.tag_id = tags.id
GROUP BY
posts.id
- The query starts by selecting data from the
posts
table. - It then joins the
authors
table to include author information for each post, matching posts to authors using theauthor_id
column inposts
and theid
column inauthors
. - Next, it left joins the
posts_tags
table to include tag associations for each post, ensuring that all posts are included even if they have no tags. - Next, it left joins the
tags
table to include tag names, matching tags to posts using thetag_id
column inposts_tags
and theid
column intags
. - Finally, group the results by the post id so that all rows with the same post id are combined in a single row
SQL provides a lot of power to query our data in interesting ways. JOIN
’s will typically be more performant than performing additional queries.You could just as easily write a simpler version of this query that uses subqueries to fetch post tags and join all the data by hand with JavaScript. This is the nice thing about writing SQL, you’re free to fetch and handle your data how you please.
Our results should look similar to this:
[
{
"id": 1,
"author_id": 1,
"title": "Exploring the Alps",
"content": "Content about exploring the Alps...",
"published_at": "2024-07-31 18:37:21",
"author_name": "Alice Smith",
"tags": "[\\"Travel\\",\\"Photography\\"]"
},
...
]
This brings us to our next topic.
Marshaling / coercing result data
A couple of things we notice about the format of the result data our query provides:
Rows are flat. We join the author directly onto the post and prefix its column names with author
.
"author_name": "Alice Smith"
Using an ORM we might get the data back as a child object:
{
"id": 1,
"title": "Exploring the Alps",
"author": {
"name": "Alice Smith"
},
...
}
Another thing is that our tags
data is a JSON string
and not a JavaScript array. This means that we will need to parse it ourselves.
result.tags = JSON.parse(result.tags)
This isn’t the end of the world but it is some more work on our end to coerce the result data into the format that we actually want.
This problem is handled in most ORM’s and is their main selling point in my opinion.
insert / update / delete
Next, let’s write a function that will add a new post to our database.
async function createNewPost(env: Env, newPostData: NewPostData): Promise<{ id: number }> {
// Insert the new post into the posts table
const postResult = await env.DB.prepare(
`
INSERT INTO posts (author_id, title, content)
VALUES (?, ?, ?)
RETURNING id
`
)
.bind(newPostData.authorId, newPostData.title, newPostData.content)
.first<{ id: number }>();
if (!postResult) {
throw new Error('Failed to insert new post');
}
const postId = postResult.id;
// Insert tags into the tags table if they don't already exist and get their IDs
const tagIds = await Promise.all(
newPostData.tags
.map(async (tag) => {
await env.DB.prepare(
INSERT OR IGNORE INTO tags (name) VALUES (?)
)
.bind(tag)
.run();
const tagResult = await env.DB.prepare(
SELECT id FROM tags WHERE name = ?
)
.bind(tag)
.first<{ id: number }>();
return tagResult?.id;
})
.filter(Boolean)
);
// Link tags to the new post in the posts_tags table
for (const tagId of tagIds) {
await env.DB.prepare(
INSERT INTO posts_tags (post_id, tag_id) VALUES (?, ?)
)
.bind(postId, tagId)
.run();
}
return { id: postId };
}
There’s a few queries involved in our create post function:
- first we create the new post
- next we run through the tags and either create or return an existing tag
- finally, we add entries to our post_tags join table to associate our new post with the tags assigned
We can test our new function by providing post content in query params on our index page and formatting them for our function.
const newPostData: NewPostData = {
authorId: Number(url.searchParams.get('authorId')),
tags: url.searchParams.get('tags')?.split(',') ?? [],
title: url.searchParams.get('title') ?? '',
content: url.searchParams.get('content') ?? '',
};
if (!newPostData.authorId || !newPostData.title || !newPostData.content) {
return new Response('Missing required fields', { status: 400 });
}
const newPost = await createNewPost(env, newPostData);
I gave it a run like this: http://localhost:8787authorId=1&tags=Food%2CReview&title=A+review+of+my+favorite+Italian+restaurant&content=I+got+the+sausage+orchette+and+it+was+amazing.+I+wish+that+instead+of+baby+broccoli+they+used+rapini.+Otherwise+it+was+a+perfect+dish+and+the+vibes+were+great
And got a new post with the id 11.
UPDATE
and DELETE
operations are pretty similar to what we’ve seen so far. Most complexity in your queries will be similar to what we’ve seen in the posts query where we want to JOIN
or GROUP BY
data in various ways.
To update the post we can write a query that looks like this:
await env.DB.prepare(
`
UPDATE posts
SET
author_id = COALESCE(?, author_id),
title = COALESCE(?, title),
content = COALESCE(?, content)
WHERE id = ?
`
)
.bind(updatedPostData.authorId, updatedPostData.title, updatedPostData.content, postId)
.run();
COALESCE
acts similarly to if we had written a ?? b
in JavaScript. If the binded value that we provide is null it will fall back to the default.
We can delete our new post with a simple DELETE query:
await env.DB.prepare(
`
DELETE posts WHERE id = ?
`
)
.bind(postId)
.run();
Transactions / Batching
One thing to note with D1 is that I don’t think the traditional style of SQLite transactions are supported. You can use the db.batch API to achieve similar functionality though.
According to the docs:
Batched statements are SQL transactions ↗. If a statement in the sequence fails, then an error is returned for that specific statement, and it aborts or rolls back the entire sequence.
await db.batch([
db.prepare("UPDATE users SET name = ?1 WHERE id = ?2").bind( "John", 17 ),
db.prepare("UPDATE users SET age = ?1 WHERE id = ?2").bind( 35, 19 ),
]);
Summary
In this post, we've taken a hands-on approach to exploring the D1 Client API, starting with defining our database schema and loading seed data. We then dove into writing queries, covering the basics of prepared statements and parameter binding, before moving on to more complex topics like joins and transactions. We saw how to construct and execute queries to fetch data from our database, including how to handle relationships between tables and marshal result data into a usable format. We also touched on inserting, updating, and deleting data, and how to use transactions to ensure data consistency. By working through these examples, we've gained a solid understanding of how to use the D1 Client API to interact with our database and build robust, data-driven applications.