In previous posts, we went through setting up DuckDB on Windows, exploring DuckDB for common tasks involving importing data, aggregate queries and type casting and Part 2 around date and time functions. Now, we look at how we get an AUTOINCREMENT field (usually as a default value for a primary key).
Background
If you’ve used SQLite3 before, you know it’s common syntax to create a table with INTEGER PRIMARY KEY AUTOINCREMENT and then you can insert values without providing that field.
-- Create the table
CREATE TABLE data (id INTEGER PRIMARY KEY AUTOINCREMENT, raw_data TEXT);
-- Insert data
INSERT INTO data (raw_data) VALUES ('hello'), ('world');
INSERT INTO data (raw_data) VALUES ('he'), ('wo'), ('we'), ('xv'), ('po');
-- See what we have
select * from data;
1|hello
2|world
3|he
4|wo
5|we
6|xv
7|po
If you take a look at the schema, you will see that SQLite3 automatically creates a table to store a sequence value for this auto-increment field.
.tables
data
sqlite> .schema
CREATE TABLE data (id INTEGER PRIMARY KEY AUTOINCREMENT, raw_data TEXT);
CREATE TABLE sqlite_sequence(name,seq);
sqlite> select * from sqlite_sequence;
data|7
But, not so in DuckDB.
In DuckDB
DuckDB behaves slightly differently and will throw an error (at least as of now) if you try to do the same.
$ duckdb tduck.db
DuckDB v1.4.3 (Andium) d1dc88f950
Enter ".help" for usage hints.
D CREATE TABLE data (id INTEGER PRIMARY KEY AUTOINCREMENT, raw_data TEXT);
Parser Error:
syntax error at or near "AUTOINCREMENT"
LINE 1: CREATE TABLE data (id INTEGER PRIMARY KEY AUTOINCREMENT, raw_data TEXT);
^
The solution is similar to what SQLite3 does automatically under the hood – we need to create a sequence and replace AUTOINCREMENT with a DEFAULT that reads from the sequence. The way we do it is as below.
D CREATE SEQUENCE seq_data_id START 1;
D .tables
D .schema
D CREATE TABLE data (id INTEGER PRIMARY KEY DEFAULT (nextval('seq_data_id')), raw_data VARCHAR);
So, we create a sequence with a name (any name) and define its start value as 1 (that’s optional, but might as well). After that, we define the INTEGER PRIMARY KEY with a DEFAULT (nextval ('seq_data_id')) which tells it to use the next value for the sequence that we defined. After this, it behaves as normal.
D insert into data (raw_data) VALUES ('hello'), ('world');
D insert into data (raw_data) VALUES ('he'), ('wo'), ('we'), ('xv'), ('po');
D select * from data;
┌───────┬──────────┐
│ id │ raw_data │
│ int32 │ varchar │
├───────┼──────────┤
│ 1 │ hello │
│ 2 │ world │
│ 3 │ he │
│ 4 │ wo │
│ 5 │ we │
│ 6 │ xv │
│ 7 │ po │
└───────┴──────────┘
D .tables
data
D .schema
CREATE TABLE IF NOT EXISTS "data"(id INTEGER DEFAULT(nextval('seq_data_id')) PRIMARY KEY, raw_data VARCHAR);
Other Notes
A couple of other notes that don’t need a separate post by themselves.
(1) DuckDB can read SQLite3 database files just fine. So, you can open a SQLite3 database file and process it with DuckDB also. Since it’s working with the table definition of SQLite3, you don’t need to do anything special to manage AUTOINCREMENT primary keys. It will just use SQLite3 as normal.
$ duckdb test.db
DuckDB v1.4.3 (Andium) d1dc88f950
Enter ".help" for usage hints.
D select * from data;
┌───────┬──────────┐
│ id │ raw_data │
│ int64 │ varchar │
├───────┼──────────┤
│ 1 │ hello │
│ 2 │ world │
│ 3 │ he │
│ 4 │ wo │
│ 5 │ we │
│ 6 │ xv │
│ 7 │ po │
└───────┴──────────┘
D insert into data (raw_data) VALUES ('duck'), ('inserted');
D select * from data;
┌───────┬──────────┐
│ id │ raw_data │
│ int64 │ varchar │
├───────┼──────────┤
│ 1 │ hello │
│ 2 │ world │
│ 3 │ he │
│ 4 │ wo │
│ 5 │ we │
│ 6 │ xv │
│ 7 │ po │
│ 8 │ duck │
│ 9 │ inserted │
└───────┴──────────┘
(2) Of course, you can provide default values for other fields. For example, a created_at field might be defined using TIMESTAMP DEFAULT (CURRENT_TIMESTAMP) to use the current time as the default value on insertion, if it’s not provided as in the sample below:
CREATE TABLE data(
id INTEGER DEFAULT(nextval('seq_fmbc_logid')) PRIMARY KEY,
"timestamp", raw_data VARCHAR NOT NULL,
created_at TIMESTAMP DEFAULT(CURRENT_TIMESTAMP)
);
Links and References
For DuckDB, all the relevant details are in the official documentation about sequences.
If you have some comments, I’d love to hear from you. Feel free to connect or share the post (you can tag me as @onghu on X or on Mastodon as @onghu@ruby.social or @onghu.com on Bluesky to discuss more).