aboutsummaryrefslogtreecommitdiff
path: root/07-data-definition-in-sql
diff options
context:
space:
mode:
Diffstat (limited to '07-data-definition-in-sql')
-rw-r--r--07-data-definition-in-sql/data-definition-in-sql.ipynb662
1 files changed, 625 insertions, 37 deletions
diff --git a/07-data-definition-in-sql/data-definition-in-sql.ipynb b/07-data-definition-in-sql/data-definition-in-sql.ipynb
index d2be191..5166bc9 100644
--- a/07-data-definition-in-sql/data-definition-in-sql.ipynb
+++ b/07-data-definition-in-sql/data-definition-in-sql.ipynb
@@ -111,7 +111,7 @@
"outputs": [],
"source": [
"%%sql\n",
- "set search_path = pg_catalog;"
+ "SET search_path = pg_catalog"
]
},
{
@@ -253,7 +253,7 @@
"id": "6c4e2cea",
"metadata": {},
"source": [
- "## Operations on schemas and tables"
+ "## Creation and removal of schemas and tables"
]
},
{
@@ -297,7 +297,7 @@
"\n",
"Note that we can explicitly name the schema of the table in `CREATE` and `DROP` statements.\n",
"\n",
- "Let's repeat the `CREATE TABLE` command with the following column definitions.\n",
+ "Let's repeat the `CREATE TABLE` command with the following code inside parentheses.\n",
"\n",
"```sql\n",
" id_part_1 SMALLINT,\n",
@@ -306,7 +306,7 @@
" PRIMARY KEY (id_part_1, id_part_2)\n",
"```\n",
"\n",
- "The `CREATE TABLE` syntax allows us to add a list of constraints after a list of column definitions.\n",
+ "The `CREATE TABLE` syntax allows us to add a list of constraints after a list of column definitions. We are using a 2-column primary key in our example to highlight that there is **no** requirement for keys to be single-column.\n",
"\n",
"Note that `SMALLINT` is a **signed** integer type that is typically at least 16 bits wide.\n",
"\n",
@@ -314,7 +314,7 @@
"\n",
"```sql\n",
"INSERT INTO ids(id_part_1, id_part_2, owner)\n",
- "VALUES(1, 1, 'XYZ Technologies Ltd.')\n",
+ "VALUES(1, 1, 'Theodore''s Corporation')\n",
"```\n",
"\n",
"```sql\n",
@@ -322,7 +322,7 @@
"```\n",
"\n",
"```sql\n",
- "VALUES(2, 1, 'Klon Datapol Sp. z o.o.')\n",
+ "VALUES(2, 1, 'Basepol Sp. z o.o.')\n",
"```\n",
"\n",
"We see that a single column that is part of the primary key **can** have repeated values. But the combination of all primary key column values (the tuple of `(id_part_1, id_part_2)`) cannot have repeated values.\n",
@@ -333,7 +333,7 @@
"VALUES(2, NULL, 'NULLpol Sp. z o.o.')\n",
"```\n",
"\n",
- "The following works, as `owner` is not part of the primary key.\n",
+ "The following works nonetheless, as `owner` is not part of the primary key.\n",
"\n",
"```sql\n",
"VALUES(2, 2, NULL)\n",
@@ -347,20 +347,20 @@
" owner VARCHAR(50) NOT NULL,\n",
"```\n",
"\n",
- "The last `INSERT` commands tried and the following one now both fail.\n",
+ "The last `INSERT` commands that we tried and the following one now both fail.\n",
"\n",
"```sql\n",
"INSERT INTO ids(id_part_1, id_part_2)\n",
"VALUES(2, 2)\n",
"```\n",
"\n",
- "If can allow the above command to succeed if we define a default value for the column.\n",
+ "We can allow the above command to succeed if we define a default value for the column.\n",
"\n",
"```sql\n",
- " owner VARCHAR(50) NOT NULL DEFAULT 'Ids Consortium LLC',\n",
+ " owner VARCHAR(50) NOT NULL DEFAULT 'Ids Consortium Inc.',\n",
"```\n",
"\n",
- "We can mandate that certain non-key group of columns is unique. Let's say that the Consortium is a comany that sells its ids to other companies. Assume there is a requirement to have a separate invoice for each id sold. We can add the following column and constraint.\n",
+ "We can mandate that certain non-key group of columns is unique. Let's say that the Consortium is a company that sells its ids to other companies (however funny it might be, there is a real world precedence of it — the USB Implementers Forum Inc. 😉). Assume there is a requirement to have a separate invoice for each id sold. We can add the following columns and constraint.\n",
"\n",
"```sql\n",
" invoice_month VARCHAR(7),\n",
@@ -374,22 +374,22 @@
"Now, the following sequence of insertions fails.\n",
"\n",
"```sql\n",
- "INSERT INTO ids(id_part_1, id_part_2, owner, invoice_id, invoice_number)\n",
- "VALUES(1, 1, 'XYZ Technologies Ltd.', '2025-10', 1)\n",
+ "INSERT INTO ids(id_part_1, id_part_2, owner, invoice_month, invoice_number)\n",
+ "VALUES(1, 1, 'Theodore''s Corporation', '2025-10', 1)\n",
"```\n",
"\n",
"```sql\n",
- "VALUES(1, 2, 'XYZ Technologies Ltd.', '2025-10', 1)\n",
+ "VALUES(1, 2, 'Theodore''s Corporation', '2025-10', 1)\n",
"```\n",
"\n",
"Note that the following shall still work as `NULL` values are never considered equal to anything.\n",
"\n",
"```sql\n",
- "VALUES(1, 2, 'XYZ Technologies Ltd.', '2025-10', NULL)\n",
+ "VALUES(1, 2, 'Theodore''s Corporation', '2025-10', NULL)\n",
"```\n",
"\n",
"```sql\n",
- "VALUES(1, 3, 'XYZ Technologies Ltd.', '2025-10', NULL)\n",
+ "VALUES(1, 3, 'Theodore''s Corporation', '2025-10', NULL)\n",
"```\n",
"\n",
"We can also give arbitrary conditions for values of a single row with the `CHECK` constraint.\n",
@@ -399,6 +399,8 @@
" CHECK (id_part_1 < 256)\n",
"```\n",
"\n",
+ "Note that although the SQL standard allows using subqueries inside `CHECK` expressions, most DBMSes don't support it. That means, e.g., `CHECK (id_part_1 < (SELECT 256))` shall fail.\n",
+ "\n",
"Note that some constraints (`UNIQUE`, `CHECK`, etc.) can be used multiple times with a single table.\n",
"\n",
"This now fails.\n",
@@ -433,7 +435,9 @@
" CHECK (invoice_month ~ '[0-9][0-9][0-9][0-9]-[0-9][0-9]')\n",
");\n",
"INSERT INTO invoices(invoice_month, invoice_number, document)\n",
- "VALUES('2025-10', 1, (SELECT DECODE('deadbeef', 'hex')))"
+ "VALUES('2025-10', 1, BYTEA '%PDF-1.6\\015%BinaryContentsOfAPdf...');\n",
+ "INSERT INTO invoices(invoice_month, invoice_number, document)\n",
+ "VALUES('2025-10', 2, BYTEA '%PDF-1.6\\015%ContentsAnotherPdf...')"
]
},
{
@@ -443,7 +447,11 @@
"source": [
"_notes_\n",
"\n",
- "TODO!"
+ "SQL tables sometimes share a common attribute (or set of attributes) that we typically make use of in `JOIN` operations. The `invoices` relation has an `(invoice_month, invoice_number)` attribute pair that is also present in `ids`.\n",
+ "\n",
+ "Note that we could possibly store contents of files (like PDF documents) in an SQL database. **There are mixed opinions on this practice**, but it is good to be aware that it is possible.\n",
+ "\n",
+ "Postgres uses a `BYTEA` type for this rather than the similar `BLOB` type from the SQL standard. In this example we populate the `BYTEA` column with some dummy byte sequences. Detailed description of the format of binary literals is available in the official DBMS documentation."
]
},
{
@@ -462,12 +470,34 @@
},
{
"cell_type": "markdown",
+ "id": "785a863b",
+ "metadata": {},
+ "source": [
+ "_notes_\n",
+ "\n",
+ "Jupyter with ipython-sql is incapable of displaying binary data so we fetch it in hex for our preview."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "a63ec589",
+ "metadata": {},
+ "source": [
+ "At this point we can let our DBMS know about the connection between the `ids` and `invoices` tables, by using a `FOREIGN KEY` constraint. It takes the following form.\n",
+ "\n",
+ "```sql\n",
+ "FOREIGN KEY (t1col_A, t1col_B) REFERENCES tab2 (t2col_X, t2col_Y)\n",
+ "```"
+ ]
+ },
+ {
+ "cell_type": "markdown",
"id": "c97fb003",
"metadata": {},
"source": [
"_notes_\n",
"\n",
- "TODO!"
+ "We are once again using a 2-column key in our example to highlight that there is **no** requirement for keys to be single-column. Rather intutively, a foreign key (just as a primay key) could also comprise 3 or more columns."
]
},
{
@@ -480,16 +510,17 @@
"%%sql\n",
"DROP TABLE IF EXISTS ids;\n",
"CREATE TABLE ids (\n",
- " id_part_1 SMALLINT,\n",
- " id_part_2 SMALLINT,\n",
+ " id_part_1 SMALLINT,\n",
+ " id_part_2 SMALLINT,\n",
" invoice_month VARCHAR(7),\n",
" invoice_number BIGINT,\n",
- " owner VARCHAR(50) NOT NULL DEFAULT 'Ids Consortium LLC',\n",
+ " owner VARCHAR(50) NOT NULL DEFAULT 'Ids Consortium Inc.',\n",
" PRIMARY KEY(id_part_1, id_part_2),\n",
" UNIQUE (invoice_month, invoice_number),\n",
" CHECK (id_part_2 < 256),\n",
" CHECK (id_part_1 < 256),\n",
- " FOREIGN KEY CONSTRAINT\n",
+ " FOREIGN KEY (invoice_month, invoice_number)\n",
+ " REFERENCES invoices (invoice_month, invoice_number)\n",
")"
]
},
@@ -500,7 +531,49 @@
"source": [
"_notes_\n",
"\n",
- "TODO!"
+ "We declare that the columns `invoice_month`, and `invoice_number` in table `ids` correspond to the same-named columns in table `invoices`. \n",
+ "\n",
+ "The practical consequence of this is that each row in `ids` is now required to have a matching row in `invoices`. Try to violate this requirement with code below.\n",
+ "\n",
+ "```sql\n",
+ "INSERT INTO ids(id_part_1, id_part_2, owner, invoice_month, invoice_number)\n",
+ "VALUES(1, 1, 'Theodore''s Corporation', '2025-10', 987)\n",
+ "```\n",
+ "\n",
+ "On the other hand, if we use an `(invoice_month, invoice_number)` pair that has a match in `invoices` (e.g., `('2025-10', 1)`), the `INSERT` command shall succeed.\n",
+ "\n",
+ "Note that the DMBS prevents not only constraint-violating insertions, but also updates and deletions. Try deleting the row in `invoices` that is referenced by the newly added row in `ids`.\n",
+ "\n",
+ "```sql\n",
+ "DELETE FROM invoices\n",
+ "```\n",
+ "\n",
+ "Note that the default behavior of disallowing row deletion can be overriden. The following will cause the \"orphaned\" rows in `ids` to be automatically deleted.\n",
+ "\n",
+ "```sql\n",
+ "FOREIGN KEY (invoice_month, invoice_number)\n",
+ " REFERENCES invoices (invoice_month, invoice_number)\n",
+ " ON DELETE CASCADE\n",
+ "```\n",
+ "\n",
+ "We can also specify `ON DELETE SET DEFAULT` to have the foreign key columns of the orphaned rows filled with, well, their default values. There also exist\n",
+ "\n",
+ "- `ON DELETE NO ACTION`, the behavior we had initially,\n",
+ "- `ON DELETE RESTRICT` that behaves mostly like `NO ACTION`, but — under DBMSes that support it — has slightly different semantic with respect to transactions, which will be coverent in a later topic, and\n",
+ "- `ON DELETE SET NULL` that fills the columns with `NULL`s.\n",
+ "\n",
+ "Yes, it is possible for foreign key columns to store `NULL` values (as long as these columns are not declared `NOT NULL`). There are also ways to specify whether or not some of the foreign key columns of a row can be `NULL`.\n",
+ "\n",
+ "Analogously, for updates to the referenced table rows, we can specify `ON UPDATE SET DEFAULT`, etc.\n",
+ "\n",
+ "Since the `PRIMARY KEY` constraint creates a dependence of one table on another, a DBMS shall, by default, prevent dropping of a table that is referenced by another one. This can be overriden with the `CASCADE` keyword. Therefore, `DROP TABLE invoices` shall fail, and `DROP TABLE invoices CASCADE` shall succeed, causing the `ids` table to be dropped as well.\n",
+ "\n",
+ "The `FOREIGN KEY` clause requires the referenced columns hold unique tuples. As long as we have either\n",
+ "\n",
+ "- `PRIMARY KEY (invoice_month, invoice_number)`, or\n",
+ "- `UNIQUE (invoice_month, invoice_number)`\n",
+ "\n",
+ "constraint in the `invoices` table, a `FOREIGN KEY` can be declared. Otherwise, the attempt to declare it shall fail."
]
},
{
@@ -508,7 +581,7 @@
"id": "9601efd9",
"metadata": {},
"source": [
- "### Shorthands for \"PRIMARY KEY\" and \"UNIQUE\" constraints"
+ "### Shorthands for constraints"
]
},
{
@@ -519,12 +592,22 @@
"outputs": [],
"source": [
"%%sql\n",
- "DROP TABLE IF EXISTS db_students;\n",
- "CREATE TABLE db_students (\n",
- " student_id INT,\n",
- " grade INT.\n",
- " PRIMARY KEY (student_id),\n",
- " UNIQUE()\n",
+ "DROP TABLE IF EXISTS users CASCADE;\n",
+ "CREATE TABLE users (\n",
+ " login VARCHAR(50),\n",
+ " password_hash VARCHAR(50) NOT NULL,\n",
+ " PRIMARY KEY (login)\n",
+ ");\n",
+ "\n",
+ "DROP TABLE IF EXISTS posts;\n",
+ "CREATE TABLE posts (\n",
+ " id INT,\n",
+ " title VARCHAR(200),\n",
+ " author VARCHAR(50),\n",
+ " contents TEXT,\n",
+ " PRIMARY KEY (id),\n",
+ " UNIQUE (title),\n",
+ " FOREIGN KEY (author) REFERENCES users (login)\n",
")"
]
},
@@ -535,7 +618,314 @@
"source": [
"_notes_\n",
"\n",
- "TODO!"
+ "If the list of referenced columns of the other table is identical to that table's primary key, Postgres (but not every DBMS) allows that list to be omitted in the `FOREIGN KEY` clause. Remove `(login)` from `REFERENCES users (login)`.\n",
+ "\n",
+ "A single-column `PRIMARY KEY` constraint can be declared in one line with column definition, as below.\n",
+ "\n",
+ "```sql\n",
+ " id INT PRIMARY KEY,\n",
+ "```\n",
+ "\n",
+ "Likewise in case of a single-column `UNIQUE` constraint.\n",
+ "\n",
+ "```sql\n",
+ " title VARCHAR(200) UNIQUE,\n",
+ "```\n",
+ "\n",
+ "And, finally, a shorthand for a single-column `FOREIGN KEY` constraint.\n",
+ "\n",
+ "```sql\n",
+ " author VARCHAR(50) REFERENCES users (login),\n",
+ "```"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "414a7cd9",
+ "metadata": {},
+ "source": [
+ "## Altering existing tables"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "741aa2cf",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sql\n",
+ "INSERT INTO users(login, password_hash)\n",
+ "VALUES ('theodore', 'ea56b986135de17142f91e5523ce9d19');\n",
+ "INSERT INTO posts(id, title, author, contents)\n",
+ "VALUES (1,\n",
+ " 'Towards Full Test Coverage of Database Migration Code',\n",
+ " 'theodore',\n",
+ " 'Programs commonly alter the database schemas created by their…')"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "0800bcc8",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sql\n",
+ "ALTER TABLE posts ADD COLUMN when_published DATE;\n",
+ "\n",
+ "SELECT * FROM posts;"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "d85a68aa",
+ "metadata": {},
+ "source": [
+ "_notes_\n",
+ "\n",
+ "We can dynamically add columns to tables with SQL. If we already have data in the table, the DBMS is going to fill the added column with `NULL` values.\n",
+ "\n",
+ "We can also remove an existing column.\n",
+ "\n",
+ "```sql\n",
+ "ALTER TABLE posts DROP COLUMN when_published\n",
+ "```\n",
+ "\n",
+ "A default value or the `NOT NULL` constraint can also be specified for a column being added.\n",
+ "\n",
+ "```sql\n",
+ "ALTER TABLE posts ADD COLUMN when_published TIMESTAMP\n",
+ " DEFAULT '1970-01-01T00:00'\n",
+ " NOT NULL;\n",
+ "```\n",
+ "\n",
+ "A default value can also be removed and set for a column that already exists.\n",
+ "\n",
+ "```sql\n",
+ "ALTER TABLE posts ALTER COLUMN when_published DROP DEFAULT\n",
+ "```\n",
+ "\n",
+ "```sql\n",
+ "ALTER TABLE posts ALTER COLUMN when_published SET DEFAULT NOW()\n",
+ "```\n",
+ "\n",
+ "Analogously, `SET`/`DROP NOT NULL` can also be used with `ALTER COLUMN`.\n",
+ "\n",
+ "Note that (under Postgres, at least) the `NOW()` function would be executed again each time a row is added with the default value. If we add multiple posts, their publication times shall differ.\n",
+ "\n",
+ "```sql\n",
+ "INSERT INTO posts(id, title, author, contents)\n",
+ "VALUES (2,\n",
+ " 'Fasibility of correlating untagged VCS sources with releases',\n",
+ " 'theodore',\n",
+ " 'The story of XZ backdoor shows that building software from…');\n",
+ "INSERT INTO posts(id, title, author, contents)\n",
+ "```\n",
+ "\n",
+ "```sql\n",
+ "VALUES (3,\n",
+ " 'Making vector screenshots of websites',\n",
+ " 'theodore',\n",
+ " 'As many know, a good document is one where every image…');\n",
+ "\n",
+ "SELECT * FROM posts\n",
+ "```\n",
+ "\n",
+ "There also exist\n",
+ "\n",
+ "- `ADD COLUMN IF NOT EXISTS`, and\n",
+ "- `DROP COLUMN IF EXISTS`\n",
+ "\n",
+ "variants of the commands."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "9d1bc77b",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sql\n",
+ "ALTER TABLE posts RENAME COLUMN when_published TO when_posted"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "7feda21f",
+ "metadata": {},
+ "source": [
+ "_notes_\n",
+ "\n",
+ "A column or table can be renamed.\n",
+ "\n",
+ "```sql\n",
+ "ALTER TABLE posts RENAME TO articles;\n",
+ "\n",
+ "SELECT * FROM articles\n",
+ "```\n",
+ "\n",
+ "```sql\n",
+ "ALTER TABLE articles RENAME TO posts\n",
+ "```"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "23ba8b9e",
+ "metadata": {},
+ "source": [
+ "### Altering constraints"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "87840b0e",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sql\n",
+ "SELECT * FROM pg_constraint WHERE conname LIKE 'posts%'"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "e9d155f5",
+ "metadata": {},
+ "source": [
+ "_notes_\n",
+ "\n",
+ "We can see that Postgres assigned names to our constraints and stores these constraints' data in one of its tables."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "543a60eb",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sql\n",
+ "INSERT INTO posts(id, title, author, contents)\n",
+ "VALUES (4,\n",
+ " 'Towards Full Test Coverage of Database Migration Code',\n",
+ " 'theodore',\n",
+ " 'Programs commonly alter the database schemas created by their…')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "42d7147b",
+ "metadata": {},
+ "source": [
+ "_notes_\n",
+ "\n",
+ "Fails due to `UNIQUE` constraint on `title`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "a24f1af4",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sql\n",
+ "ALTER TABLE posts DROP CONSTRAINT posts_title_key"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "4cec71ba",
+ "metadata": {},
+ "source": [
+ "_notes_\n",
+ "\n",
+ "The name of a constraint can be used to `DROP` it. We can see that repeating titles can now exist in the table (the `INSERT` above now succeeds). Also, the respective row in the `pg_constraints` table is now gone.\n",
+ "\n",
+ "```sql\n",
+ "SELECT conname, contype\n",
+ "FROM pg_constraint\n",
+ "WHERE conname LIKE '%post%'\n",
+ "```\n",
+ "\n",
+ "We can also amend a table with new constraints. The sample code below tries to re-add the constraint that has just been removed.\n",
+ "\n",
+ "```sql\n",
+ "ALTER TABLE posts ADD UNIQUE(title)\n",
+ "```\n",
+ "\n",
+ "Creation of a new constraint on an existing table shall fail if the data already present in the table does not meet that constraint's requirements. This is what happened here.\n",
+ "\n",
+ "```sql\n",
+ "SELECT * FROM posts\n",
+ "```\n",
+ "\n",
+ "We see that we could make titles unique again by removing the post with `id` of 1.\n",
+ "\n",
+ "```sql\n",
+ "DELETE FROM posts WHERE id = 1\n",
+ "```\n",
+ "\n",
+ "After this, the constraint creation command runs successfully. We can see in `pg_constraints` that Postgres has once again given a name to our `UNIQUE` constraint. However, we can also choose a name by ourselves."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "a03c0628",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sql\n",
+ "ALTER TABLE posts ADD\n",
+ " CONSTRAINT x\n",
+ " CHECK(title ~ '^[^a-z]')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "ba5be553",
+ "metadata": {},
+ "source": [
+ "_notes_\n",
+ "\n",
+ "Let's mandate that titles don't start with a lowercase latter.\n",
+ "\n",
+ "We are going to use a regular expression in a `CHECK` constraint. It could be read as \"Match this expression at the beginning of a tested string (`^`) and require that the first character is **not** from the set (`[^`…`]`) of lowercase letters from «a» to «z» (`a-z`).\"\n",
+ "\n",
+ "The `CONSTRAINT some_name` can optionally appear before the actual constraint specification. Theodore, please choose a name for this constraint."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "de9b0723",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sql\n",
+ "UPDATE posts\n",
+ "SET title = LOWER(SUBSTR(title, 1, 1)) || SUBSTR(title, 2)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "e88466e3",
+ "metadata": {},
+ "source": [
+ "_notes_\n",
+ "\n",
+ "The created constraint can be seen under its new name in `pg_constraints`. Also, the code above now fails.\n",
+ "\n",
+ "The\n",
+ "\n",
+ "- `FOREIGN KEY`, and\n",
+ "- `PRIMARY KEY`\n",
+ "\n",
+ "clauses can be used analogously with `ALTER TABLE`."
]
},
{
@@ -547,13 +937,181 @@
]
},
{
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "2d67452d",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sql\n",
+ "DROP TABLE IF EXISTS posts;\n",
+ "CREATE TABLE posts (\n",
+ " id INT PRIMARY KEY,\n",
+ " title VARCHAR(200) UNIQUE,\n",
+ " author VARCHAR(50) REFERENCES users (login),\n",
+ " contents TEXT\n",
+ ")"
+ ]
+ },
+ {
"cell_type": "markdown",
- "id": "2e7f0807",
+ "id": "eafcb70a",
"metadata": {},
"source": [
"_notes_\n",
"\n",
- "TODO!"
+ "Postgres allows us to change `INT` to `SERIAL`. It causes the DBMS to use the next natural number from a sequence as the default column value upon insertion. This is highly useful for primary keys in the form of numeric ids (although not restricted to primary keys).\n",
+ "\n",
+ "Additionally, `BIGINT` has a corresponding 8-byte `BIGSERIAL`, and there is also a 2-byte `SMALLSERIAL`. Keep in mind that Postgres' \\*`SERIAL` types are **unsigned**, unlike `INT` & friends."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "7122cb46",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sql\n",
+ "INSERT INTO posts(title, author, contents)\n",
+ "VALUES ('Towards Full Test Coverage of Database Migration Code',\n",
+ " 'theodore',\n",
+ " 'Programs commonly alter the database schemas created by their…');\n",
+ "\n",
+ "INSERT INTO posts(title, author, contents)\n",
+ "VALUES ('Fasibility of correlating untagged VCS sources with releases',\n",
+ " 'theodore',\n",
+ " 'The story of XZ backdoor shows that building software from…');\n",
+ "\n",
+ "INSERT INTO posts(title, author, contents)\n",
+ "VALUES ('Making vector screenshots of websites',\n",
+ " 'theodore',\n",
+ " 'As many know, a good document is one where every image…')\n",
+ "\n",
+ "SELECT * FROM posts"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "13928ced",
+ "metadata": {},
+ "source": [
+ "_notes_\n",
+ "\n",
+ "Subsequent natural numbers get used for the `id` column."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "8ca150a9",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sql\n",
+ "DELETE FROM posts WHERE title LIKE 'Making%';\n",
+ "\n",
+ "INSERT INTO posts(title, author, contents)\n",
+ "VALUES ('Making vector screenshots of websites',\n",
+ " 'theodore',\n",
+ " 'As many know, a good document is one where every image…')\n",
+ "RETURNING id"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "b5d5752d",
+ "metadata": {},
+ "source": [
+ "_notes_\n",
+ "\n",
+ "Let's remove and re-add a row to show that the `SERIAL` type does not reuse freed-up numbers. "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "b169ed25",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sql\n",
+ "DROP TABLE IF EXISTS posts;\n",
+ "CREATE TABLE posts (\n",
+ " id INT PRIMARY KEY,\n",
+ " title VARCHAR(200) UNIQUE,\n",
+ " author VARCHAR(50) REFERENCES users (login),\n",
+ " contents TEXT\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "f79ccfb3",
+ "metadata": {},
+ "source": [
+ "_notes_\n",
+ "\n",
+ "With \\*`SERIAL` types, Postgres uses a so-called sequence mechanism under the hood. Sequences can also be created (and dropped) manually.\n",
+ "\n",
+ "```sql\n",
+ "CREATE SEQUENCE posts_id_seq AS INT;\n",
+ "```\n",
+ "\n",
+ "A sequence can be used for column's `DEFAULT` value.\n",
+ "\n",
+ "```sql\n",
+ " id INT PRIMARY KEY DEFAULT NEXTVAL('posts_id_seq'),\n",
+ "```"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "84ca5af0",
+ "metadata": {
+ "scrolled": true
+ },
+ "outputs": [],
+ "source": [
+ "%%sql\n",
+ "INSERT INTO posts(title, author, contents)\n",
+ "VALUES ('Towards Full Test Coverage of Database Migration Code',\n",
+ " 'theodore',\n",
+ " 'Programs commonly alter the database schemas created by their…');\n",
+ "\n",
+ "INSERT INTO posts(title, author, contents)\n",
+ "VALUES ('Fasibility of correlating untagged VCS sources with releases',\n",
+ " 'theodore',\n",
+ " 'The story of XZ backdoor shows that building software from…');\n",
+ "\n",
+ "INSERT INTO posts(title, author, contents)\n",
+ "VALUES ('Making vector screenshots of websites',\n",
+ " 'theodore',\n",
+ " 'As many know, a good document is one where every image…');\n",
+ "\n",
+ "SELECT * FROM posts"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "31c4cfe2",
+ "metadata": {},
+ "source": [
+ "_notes_\n",
+ "\n",
+ "We can use the same `INSERT`s to verify that the behavior is similar as with `SERIAL`."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "ef2f9d18",
+ "metadata": {},
+ "source": [
+ "## More about Data Definition Language in Postgres\n",
+ "\n",
+ "- [documentation of `CREATE TABLE`](https://www.postgresql.org/docs/current/sql-createtable.html)\n",
+ "- [documentation of `ALTER TABLE`](https://www.postgresql.org/docs/current/sql-altertable.html)"
]
},
{
@@ -587,8 +1145,7 @@
"outputs": [],
"source": [
"%%sql\n",
- "SELECT '*' || some_name || '*' FROM names\n",
- "WHERE some_name LIKE '_eodor '"
+ "SELECT '*' || some_name || '*' FROM names"
]
},
{
@@ -618,11 +1175,42 @@
{
"cell_type": "code",
"execution_count": null,
+ "id": "8c6f9cda",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "!printf \"CREATE DATABASE IF NOT EXISTS ddl_examples\" | mariadb"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "2d196a08",
+ "metadata": {},
+ "source": [
+ "_notes_\n",
+ "\n",
+ "Unlike Postgres, MariaDB supports `IF NOT EXISTS` with `CRETE DATABASE`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
"id": "9dbe1a3f",
"metadata": {},
"outputs": [],
"source": [
- "%sql mysql:///agh_it_wordpress?unix_socket=/var/run/mysql/mysql.sock"
+ "%sql mysql:///ddl_examples?unix_socket=/var/run/mysql/mysql.sock"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "09c82cd8",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sql\n",
+ "-- todo: test for differences"
]
},
{