{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"id": "510b6020",
"metadata": {},
"outputs": [],
"source": [
"import agh_db_lectures\n",
"agh_db_lectures.prepare_notebook_for_sql()"
]
},
{
"cell_type": "markdown",
"id": "c4459f3f",
"metadata": {},
"source": [
"# Procedural SQL"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d60b5e4b",
"metadata": {},
"outputs": [],
"source": [
"!psql --port 25432 --quiet -c \"CREATE DATABASE procedures WITH OWNER demo_user\" postgres || true\n",
"%sql postgresql://demo_user:demo_pwd@localhost:25432/procedures"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "efb05b48",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"DROP TABLE IF EXISTS vertices CASCADE;\n",
"CREATE TABLE vertices(\n",
" id INT PRIMARY KEY,\n",
" label VARCHAR(50) UNIQUE\n",
");\n",
"\n",
"DROP TABLE IF EXISTS edges CASCADE;\n",
"CREATE TABLE edges(\n",
" from_id INT,\n",
" to_id INT,\n",
" label VARCHAR(50) UNIQUE,\n",
" PRIMARY KEY (from_id, to_id)\n",
");"
]
},
{
"cell_type": "markdown",
"id": "f6c43a43",
"metadata": {},
"source": [
"## Functions Defined using SQL"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "92c13036",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"DROP FUNCTION IF EXISTS add_vertex;\n",
"CREATE FUNCTION add_vertex(lbl VARCHAR) RETURNS INTEGER AS\n",
" 'INSERT INTO vertices\n",
" VALUES ((SELECT COALESCE(MAX(id), 0) + 1 FROM vertices),\n",
" $1);\n",
" SELECT id FROM vertices WHERE label = $1;'\n",
" LANGUAGE SQL\n",
" RETURNS NULL ON NULL INPUT;"
]
},
{
"cell_type": "markdown",
"id": "f63026c2",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"This can help avoid the overhead of running the logic in an application which has to communicate with the DBMS."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7aa2d300",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT add_vertex('v1');"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9fd3a11b",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT * FROM vertices;"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cc3194cd",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT add_vertex(NULL);"
]
},
{
"cell_type": "markdown",
"id": "bb9d0089",
"metadata": {},
"source": [
"### Dollar-Quoting in Postgres"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ac684702",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT $mytag$I am a 'string' constant with double dollar signs ($$).$mytag$"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5331bdc0",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT $$\n",
"I am a multiline 'string' constant.\n",
"$$"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "88e008f2",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"-- The same effect, uses `$$'.\n",
"DROP FUNCTION IF EXISTS add_vertex;\n",
"CREATE FUNCTION add_vertex(lbl VARCHAR) RETURNS INTEGER AS $$\n",
" INSERT INTO vertices\n",
" VALUES ((SELECT COALESCE(MAX(id), 0) + 1 FROM vertices),\n",
" lbl);\n",
" SELECT id FROM vertices WHERE label = lbl;\n",
" $$ LANGUAGE SQL\n",
" RETURNS NULL ON NULL INPUT;"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4fbead01",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT add_vertex('v2');"
]
},
{
"cell_type": "markdown",
"id": "2204aa43",
"metadata": {},
"source": [
"## Functions and Procedures Defined Using PL/pgSQL"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "37b73de4",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"DROP FUNCTION IF EXISTS add_vertex;\n",
"CREATE FUNCTION add_vertex(lbl VARCHAR) RETURNS INTEGER AS $$\n",
" DECLARE new_id INTEGER;\n",
" BEGIN\n",
" SELECT COALESCE(MAX(id), 0) + 1\n",
" INTO new_id\n",
" FROM vertices;\n",
"\n",
" INSERT INTO vertices\n",
" VALUES (new_id, lbl);\n",
" \n",
" RETURN new_id;\n",
" END\n",
" $$ LANGUAGE plpgsql\n",
" RETURNS NULL ON NULL INPUT;"
]
},
{
"cell_type": "markdown",
"id": "8280af14",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"The procedural SQL extension language used in Postgres, the PL/pgSQL, diverges far from the procedural extension language defined by the standard.\n",
"\n",
"Postgres, however, tries to make porting of code from Oracle's procedural language easy. Oracle's PL/SQL language predates the corresponding SQL/PSM that is part of the SQL standard.\n",
"\n",
"Keyword `STRICT` can be used in place of `RETURNS NULL ON NULL INPUT`. Effect is the same."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "42a32c16",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT add_vertex('v3');"
]
},
{
"cell_type": "markdown",
"id": "49db4957",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"Run twice to see a unique violation error."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8f7717c8",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT * FROM vertices;"
]
},
{
"cell_type": "markdown",
"id": "6633fdd9",
"metadata": {},
"source": [
"### Error Trapping"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b8bab70b",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"DROP FUNCTION IF EXISTS add_vertex;\n",
"CREATE FUNCTION add_vertex(lbl VARCHAR) RETURNS INTEGER AS $$\n",
" DECLARE new_id INTEGER;\n",
" BEGIN\n",
" SELECT COALESCE(MAX(id), 0) + 1\n",
" INTO new_id\n",
" FROM vertices;\n",
"\n",
" INSERT INTO vertices\n",
" VALUES (new_id, lbl);\n",
" \n",
" RETURN new_id;\n",
" EXCEPTION\n",
" WHEN unique_violation THEN\n",
" RETURN add_vertex(lbl || '.');\n",
" END\n",
" $$ LANGUAGE plpgsql\n",
" RETURNS NULL ON NULL INPUT;"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a104c6ab",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT add_vertex('v3');"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6c4e900a",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT * FROM vertices;"
]
},
{
"cell_type": "markdown",
"id": "a5c8e16b",
"metadata": {},
"source": [
"Error codes can be looked up in the documentation:\n",
"\n",
"https://postgresql.org/docs/current/errcodes-appendix.html"
]
},
{
"cell_type": "markdown",
"id": "7cf8f055",
"metadata": {},
"source": [
"### Conditionals"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5864bc80",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"DROP FUNCTION IF EXISTS add_edge;\n",
"CREATE FUNCTION add_edge(lbl_from VARCHAR, lbl_to VARCHAR)\n",
" RETURNS BOOLEAN AS $$\n",
" DECLARE\n",
" already_present BOOLEAN;\n",
" from_id_requested INTEGER;\n",
" to_id_requested INTEGER;\n",
" BEGIN\n",
" SELECT id\n",
" INTO from_id_requested\n",
" FROM vertices\n",
" WHERE label = lbl_from;\n",
"\n",
" SELECT id\n",
" INTO to_id_requested\n",
" FROM vertices\n",
" WHERE label = lbl_to;\n",
"\n",
" SELECT EXISTS(\n",
" SELECT *\n",
" FROM edges\n",
" WHERE (from_id, to_id) =\n",
" (from_id_requested, to_id_requested)\n",
" )\n",
" INTO already_present;\n",
" \n",
" IF already_present THEN\n",
" RETURN FALSE;\n",
" ELSE\n",
" INSERT INTO edges (from_id, to_id)\n",
" VALUES (from_id_requested, to_id_requested);\n",
" END IF;\n",
" \n",
" RETURN TRUE;\n",
" END\n",
" $$ LANGUAGE plpgsql;"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f19e3072",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT add_edge('v1', 'v2');"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cbbd4322",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT * FROM edges;"
]
},
{
"cell_type": "markdown",
"id": "c0231942",
"metadata": {},
"source": [
"### Procedures vs Functions"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b8527ee3",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"DROP FUNCTION IF EXISTS add_edge;\n",
"DROP PROCEDURE IF EXISTS add_edge;\n",
"CREATE PROCEDURE add_edge(lbl_from VARCHAR, lbl_to VARCHAR)\n",
" AS $$\n",
" DECLARE\n",
" already_present BOOLEAN;\n",
" from_id_requested INTEGER;\n",
" to_id_requested INTEGER;\n",
" BEGIN\n",
" SELECT id\n",
" INTO from_id_requested\n",
" FROM vertices\n",
" WHERE label = lbl_from;\n",
"\n",
" SELECT id\n",
" INTO to_id_requested\n",
" FROM vertices\n",
" WHERE label = lbl_to;\n",
"\n",
" SELECT EXISTS(\n",
" SELECT *\n",
" FROM edges\n",
" WHERE (from_id, to_id) =\n",
" (from_id_requested, to_id_requested)\n",
" )\n",
" INTO already_present;\n",
" \n",
" IF NOT already_present THEN\n",
" INSERT INTO edges (from_id, to_id)\n",
" VALUES (from_id_requested, to_id_requested);\n",
" END IF;\n",
" END\n",
" $$ LANGUAGE plpgsql;"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e438d460",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT add_edge('v3', 'v1');"
]
},
{
"cell_type": "markdown",
"id": "d90c16cd",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"The above gives an error, because procedures cannot be called inside expressions. We need to use the `CALL` command."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "deed50ff",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"CALL add_edge('v3', 'v1');"
]
},
{
"cell_type": "markdown",
"id": "e4218120",
"metadata": {},
"source": [
"### Default Argument Values and Assignments"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "44cf8ef7",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"DROP PROCEDURE IF EXISTS add_edge(VARCHAR, VARCHAR);\n",
"DROP PROCEDURE IF EXISTS add_edge(VARCHAR, VARCHAR, VARCHAR);\n",
"CREATE PROCEDURE add_edge(lbl_from VARCHAR,\n",
" lbl_to VARCHAR,\n",
" edge_lbl VARCHAR DEFAULT NULL)\n",
" AS $$\n",
" DECLARE\n",
" already_present BOOLEAN;\n",
" from_id_requested INTEGER;\n",
" to_id_requested INTEGER;\n",
" BEGIN\n",
" SELECT id\n",
" INTO from_id_requested\n",
" FROM vertices\n",
" WHERE label = lbl_from;\n",
"\n",
" SELECT id\n",
" INTO to_id_requested\n",
" FROM vertices\n",
" WHERE label = lbl_to;\n",
" \n",
" edge_lbl := COALESCE(edge_lbl, lbl_from || ' -> ' || lbl_to);\n",
"\n",
" SELECT EXISTS(\n",
" SELECT *\n",
" FROM edges\n",
" WHERE (from_id, to_id) =\n",
" (from_id_requested, to_id_requested)\n",
" )\n",
" INTO already_present;\n",
" \n",
" IF NOT already_present THEN\n",
" INSERT INTO edges (from_id, to_id, label)\n",
" VALUES (from_id_requested, to_id_requested, edge_lbl);\n",
" END IF;\n",
" END\n",
"$$ LANGUAGE plpgsql;"
]
},
{
"cell_type": "markdown",
"id": "783fe9f2",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"Note that there can be multiple PL/pgSQL functions with the same name but different argument counts/types. Where ambigious, we need to specify the argument types in `DROP FUNCTION` command."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1e98a4fe",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"CALL add_edge('v1', 'v3');"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c47bfca8",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"CALL add_edge('v2', 'v3', 'MyEdge');"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "23bed71b",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"%%sql\n",
"SELECT * FROM edges;"
]
},
{
"cell_type": "markdown",
"id": "ab2dbe93",
"metadata": {},
"source": [
"### Iteration"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "69e40a42",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"CREATE OR REPLACE FUNCTION my_loop() RETURNS INT AS $$\n",
" DECLARE\n",
" number INT := 0;\n",
" sum INT := 0;\n",
" BEGIN\n",
" LOOP\n",
" sum := sum + number;\n",
" number := number + 1;\n",
" IF number > 100 THEN\n",
" EXIT; -- break out of loop\n",
" END IF;\n",
" END LOOP;\n",
"\n",
" RETURN sum;\n",
" END\n",
"$$ LANGUAGE plpgsql"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a5a913e3",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT my_loop()"
]
},
{
"cell_type": "markdown",
"id": "0c15921b",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"Instead of the `IF` construct, we can also use the following to break from the loop.\n",
"\n",
"```sql\n",
"EXIT WHEN number > 100;\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "5aa71ece",
"metadata": {},
"source": [
"### Cursors"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "196a1267",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"DELETE FROM edges;\n",
"DELETE FROM vertices;\n",
"\n",
"SELECT add_vertex('vA');\n",
"SELECT add_vertex('vB');\n",
"SELECT add_vertex('vC');\n",
"SELECT add_vertex('vD');\n",
"SELECT add_vertex('vE');\n",
"SELECT add_vertex('vF');\n",
"SELECT add_vertex('vG');"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "30f08d0b",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"DROP FUNCTION IF EXISTS list_vertices;\n",
"CREATE FUNCTION list_vertices() RETURNS VARCHAR AS $$\n",
" DECLARE\n",
" current_label VARCHAR;\n",
" result VARCHAR := '';\n",
" first BOOLEAN := TRUE;\n",
" cur REFCURSOR;\n",
" BEGIN\n",
" OPEN cur FOR SELECT label FROM vertices;\n",
"\n",
" LOOP\n",
" FETCH FROM cur INTO current_label; \n",
"\n",
" EXIT WHEN current_label IS NULL;\n",
"\n",
" IF first THEN\n",
" result := current_label;\n",
" ELSE\n",
" result := result || ', ' || current_label;\n",
" END IF;\n",
" first := FALSE;\n",
" END LOOP;\n",
"\n",
" CLOSE cur;\n",
"\n",
" RETURN result;\n",
" END\n",
"$$ LANGUAGE plpgsql;"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dce99e7c",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT list_vertices();"
]
},
{
"cell_type": "markdown",
"id": "9c4cdc48",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"`FETCH`ing from a cursor also implicitly sets a variable named `FOUND`. We can test the value of `FOUND` instead of checking if we fetched a `NULL` row.\n",
"\n",
"Note the `CLOSE` instruction. It allows the resources related to cursor to be released and allows the cursor variable to be used with `OPEN` again.\n",
"\n",
"In the `OPEN` instruction we **bound** the cursor variable to a query. We can instead create a cursor variable that is already bound at creation time. We need to declare it as `REFCURSOR` rather than `CURSOR`.\n",
"\n",
"If we have a `CURSOR` variable, we can use it with a variant of `OPEN` without `FOR`, as below."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "43676203",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"DROP FUNCTION IF EXISTS list_vertices;\n",
"CREATE FUNCTION list_vertices() RETURNS VARCHAR AS $$\n",
" DECLARE\n",
" current_label VARCHAR;\n",
" result VARCHAR := '';\n",
" first BOOLEAN := TRUE;\n",
" cur CURSOR FOR SELECT label FROM vertices;\n",
" BEGIN\n",
" OPEN cur;\n",
"\n",
" LOOP\n",
" FETCH FROM cur INTO current_label; \n",
"\n",
" EXIT WHEN current_label IS NULL;\n",
"\n",
" IF first THEN\n",
" result := current_label;\n",
" ELSE\n",
" result := result || ', ' || current_label;\n",
" END IF;\n",
" first := FALSE;\n",
" END LOOP;\n",
"\n",
" CLOSE cur;\n",
"\n",
" RETURN result;\n",
" END\n",
"$$ LANGUAGE plpgsql;"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fa4c78a1",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"%%sql\n",
"SELECT list_vertices();"
]
},
{
"cell_type": "markdown",
"id": "6593383a",
"metadata": {},
"source": [
"### \"RECORD\" Type"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d040e18e",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"DROP FUNCTION IF EXISTS record_example;\n",
"CREATE FUNCTION record_example() RETURNS VARCHAR AS $$\n",
" DECLARE\n",
" rec RECORD;\n",
" BEGIN\n",
" SELECT * INTO rec FROM vertices LIMIT 1;\n",
" \n",
" RETURN rec.label || '(' || rec.id || ')';\n",
" END\n",
"$$ LANGUAGE plpgsql;"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "31c3563c",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT record_example()"
]
},
{
"cell_type": "markdown",
"id": "e929d5c5",
"metadata": {},
"source": [
"### \"FOR … IN\" Loops"
]
},
{
"cell_type": "markdown",
"id": "e07dd597",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"We can use a `CURSOR` (but not a `REFCURSOR`) with a special variant of the `LOOP` construct."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b0f5ef4f",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"DROP FUNCTION IF EXISTS list_vertices;\n",
"CREATE FUNCTION list_vertices() RETURNS VARCHAR AS $$\n",
" DECLARE\n",
" result VARCHAR := '';\n",
" first BOOLEAN := TRUE;\n",
" cur CURSOR FOR SELECT label FROM vertices;\n",
" BEGIN\n",
" FOR record IN cur LOOP\n",
" IF first THEN\n",
" result := record.label;\n",
" ELSE\n",
" result := result || ', ' || record.label;\n",
" END IF;\n",
" first := FALSE;\n",
" END LOOP;\n",
"\n",
" return result;\n",
" END\n",
"$$ LANGUAGE plpgsql;"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1e0bd1e4",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT list_vertices();"
]
},
{
"cell_type": "markdown",
"id": "a9d8d3f2",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"Finally, we can omit the declaration of a cursor variable and put query code in the loop construct."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "147edc94",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"DROP FUNCTION IF EXISTS list_vertices;\n",
"CREATE FUNCTION list_vertices() RETURNS VARCHAR AS $$\n",
" DECLARE\n",
" result VARCHAR := '';\n",
" first BOOLEAN := TRUE;\n",
" record RECORD;\n",
" BEGIN\n",
" FOR record IN SELECT label FROM vertices LOOP\n",
" IF first THEN\n",
" result := record.label;\n",
" ELSE\n",
" result := result || ', ' || record.label;\n",
" END IF;\n",
" first := FALSE;\n",
" END LOOP;\n",
"\n",
" return result;\n",
" END\n",
"$$ LANGUAGE plpgsql;"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "abf479c6",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT list_vertices();"
]
},
{
"cell_type": "markdown",
"id": "f0430572",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"Note that we now needed to explicitly declare the cursor variable (which was not required before).\n",
"\n",
"It happens to be also possible to create a list like this with an aggregation function built into Postgres, `STRING_AGG()`.\n",
"\n",
"```sql\n",
"SELECT STRING_AGG(label, ', ') FROM vertices;\n",
"```\n",
"\n",
"Note, however, that a loop in PL/pgSQL allows us to control the order of the values that are processed (we can use `ORDER BY` in the query)."
]
},
{
"cell_type": "markdown",
"id": "bb8a18c6",
"metadata": {},
"source": [
"### Sets of Values"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "29327330",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"DROP FUNCTION IF EXISTS set_example;\n",
"CREATE FUNCTION set_example() RETURNS SETOF TEXT AS $$\n",
" BEGIN\n",
" RETURN NEXT 'foo';\n",
" RETURN NEXT 'bar';\n",
" RETURN NEXT 'baz';\n",
" END\n",
"$$ LANGUAGE plpgsql;"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "23701f09",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"%%sql\n",
"SELECT set_example()"
]
},
{
"cell_type": "markdown",
"id": "231b572f",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"Try also adding the following above the `RETURN NEXT` lines.\n",
"\n",
"```sql\n",
" RETURN QUERY SELECT 'wxy'\n",
" UNION\n",
" SELECT 'xyz';\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "b0444eaf",
"metadata": {},
"source": [
"### \"FETCH\" and \"MOVE\" Commands"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dea00db4",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"DROP FUNCTION IF EXISTS fetch_examples;\n",
"CREATE FUNCTION fetch_examples() RETURNS SETOF TEXT AS $$\n",
" DECLARE\n",
" cur CURSOR FOR SELECT label, id\n",
" FROM vertices\n",
" ORDER BY label;\n",
" _label TEXT;\n",
" _id INT;\n",
" BEGIN\n",
" OPEN cur;\n",
"\n",
" FETCH LAST FROM cur INTO _label, _id;\n",
" RETURN NEXT _label || ' (' || _id || ')';\n",
"\n",
" CLOSE cur;\n",
" END\n",
"$$ LANGUAGE plpgsql;"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8224dd77",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT fetch_examples()"
]
},
{
"cell_type": "markdown",
"id": "5d7e6049",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"Add the following.\n",
"\n",
"```sql\n",
" MOVE FORWARD 1 IN cur;\n",
" FETCH FROM cur INTO _label, _id;\n",
" RETURN NEXT _label || ' (' || _id || ')';\n",
"```\n",
"\n",
"See that `MOVE` allowed us to skip a row.\n",
"\n",
"Add the following.\n",
"\n",
"```sql\n",
" MOVE LAST IN cur;\n",
" FETCH FROM cur INTO _label, _id;\n",
" RETURN NEXT _label || ' (' || _id || ')';\n",
"```\n",
"\n",
"See that the cursor got positioned **after** the final row and subsequent `FETCH` returned a record with `NULL`s.\n",
"\n",
"Add the following.\n",
"\n",
"```sql\n",
" MOVE LAST IN cur;\n",
" MOVE RELATIVE -1 IN cur;\n",
" FETCH FROM cur INTO _label, _id;\n",
" RETURN NEXT _label || ' (' || _id || ')';\n",
"```\n",
"\n",
"See that we were able to seek backward and fetch the last row of the query. For some queries Postgres might be **by default** unable to move the cursor backward. The keyword `SCROLL` can be added after cursor name in its declaration to mandate that backward seeks are possible, as below.\n",
"\n",
"```sql\n",
"cur SCROLL CURSOR FOR SELECT 'something';\n",
"```\n",
"\n",
"Now, replace the most recently added sequence of `MOVE`&`FETCH` lines and instead add the following, functionally equivalent code.\n",
"\n",
"```sql\n",
" MOVE LAST IN cur;\n",
" FETCH RELATIVE 0 FROM cur INTO _label, _id;\n",
"```\n",
"\n",
"Now, replace it with yet shorter, equivalent line.\n",
"\n",
"```sql\n",
" FETCH LAST FROM cur INTO _label, _id;\n",
"```\n",
"\n",
"As you see, the clauses `LAST`, `RELATIVE *count*`, `FORWARD *count*` and others work both with `FETCH` and `MOVE`."
]
},
{
"cell_type": "markdown",
"id": "b167cd89",
"metadata": {},
"source": [
"#### Using Cursors in Nonprocedural SQL"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3de29f0f",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"DECLARE sql_cur CURSOR WITH HOLD FOR SELECT * FROM vertices"
]
},
{
"cell_type": "markdown",
"id": "3d9fb446",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"The SQL standard does define cursors as well and they can be used outside user-defined functions. Note that they are meant by the standard to be used by other software, not by humans interacting with the database through, say, a shell.\n",
"\n",
"Cursors are by default closed automatically by the end of transaction. An SQL cursor can be declared `WITH HOLD` to change this behavior and instead have it persist after transaction. We leverage this to play with a cursor from within the Jupyter Notebook, which does not currently handle transactions."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ab34fbab",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"FETCH FORWARD 2 FROM sql_cur"
]
},
{
"cell_type": "markdown",
"id": "9a8dce96",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"We can now provide a count after `FORWARD` to fetch more than one row at once. This was not possible with a cursor in PL/pgSQL."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "623baa58",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"MOVE RELATIVE 2 IN sql_cur"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fd5d8868",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"FETCH FORWARD 2 FROM sql_cur"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "281e2cde",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"FETCH ALL IN sql_cur"
]
},
{
"cell_type": "markdown",
"id": "7d62c69d",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"The `FETCH ALL`, as used above, is also not available in PL/pgSQL."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "069a051f",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"CLOSE sql_cur"
]
},
{
"cell_type": "markdown",
"id": "6094d7c3",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"Only now can a cursor with name `sql_cur` be declared again."
]
},
{
"cell_type": "markdown",
"id": "a503ec5b",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"The `SCROLL` option works analogously as in PL/pgSQL.\n",
"\n",
"It is also possible to return a cursor (or a set of them!) from a PL/pgSQL function and use it from SQL."
]
},
{
"cell_type": "markdown",
"id": "a0bdac01",
"metadata": {},
"source": [
"### Simple Graph Algorithm Example"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d6a9e761",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"%%sql\n",
"CALL add_edge('vA', 'vB');\n",
"CALL add_edge('vA', 'vC');\n",
"CALL add_edge('vB', 'vC');\n",
"CALL add_edge('vB', 'vE');\n",
"CALL add_edge('vE', 'vF');\n",
"CALL add_edge('vE', 'vA');\n",
"CALL add_edge('vD', 'vC');\n",
"CALL add_edge('vF', 'vC');"
]
},
{
"cell_type": "markdown",
"id": "4d0bd5dd",
"metadata": {},
"source": [
"Task: find all vertices reachable from given vertex."
]
},
{
"cell_type": "markdown",
"id": "39b40891",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"This would be difficult to achieve with nonprocedural SQL. BFS and DFS algorithms both utilize iteration."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "84592ab9",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"DROP FUNCTION IF EXISTS find_reachable;\n",
"CREATE FUNCTION find_reachable(start_label VARCHAR)\n",
" RETURNS SETOF VARCHAR AS $$\n",
" BEGIN\n",
" CREATE TEMPORARY TABLE __current_ids AS\n",
" SELECT id FROM vertices WHERE label = start_label;\n",
"\n",
" CREATE TEMPORARY TABLE __found_ids AS\n",
" SELECT * FROM __current_ids;\n",
"\n",
" LOOP\n",
" CREATE TEMPORARY TABLE __tmp_ids AS\n",
" SELECT DISTINCT to_id\n",
" FROM __current_ids JOIN edges ON id = from_id\n",
" WHERE to_id NOT IN (SELECT * FROM __found_ids);\n",
"\n",
" EXIT WHEN NOT EXISTS (SELECT * FROM __tmp_ids);\n",
"\n",
" DELETE FROM __current_ids;\n",
" INSERT INTO __found_ids SELECT * FROM __tmp_ids;\n",
" INSERT INTO __current_ids SELECT * FROM __tmp_ids;\n",
" DROP TABLE __tmp_ids;\n",
" END LOOP;\n",
"\n",
" RETURN QUERY SELECT label\n",
" FROM __found_ids NATURAL JOIN vertices;\n",
"\n",
" DROP TABLE __found_ids;\n",
" DROP TABLE __current_ids;\n",
" DROP TABLE __tmp_ids;\n",
" END\n",
"$$ LANGUAGE plpgsql;"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "651ca228",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT find_reachable('vA')"
]
},
{
"cell_type": "markdown",
"id": "8f0a39cb",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"Note that something similar could also be achieved with nonprocedural SQL using a recursive `WITH` query (not discussed before in this course). An example is presented below.\n",
"\n",
"```sql\n",
"WITH RECURSIVE reachable_ids AS (\n",
" SELECT id FROM vertices WHERE label = 'vE'\n",
" UNION\n",
" SELECT DISTINCT to_id\n",
" FROM reachable_ids JOIN edges ON (id = from_id)\n",
")\n",
"SELECT label FROM reachable_ids NATURAL JOIN vertices;\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "1921b86c",
"metadata": {},
"source": [
"## Support for Other Procedural Languages"
]
},
{
"cell_type": "markdown",
"id": "d9588667",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"Examples for computing the greatest common denominator."
]
},
{
"cell_type": "markdown",
"id": "d7dd04dc",
"metadata": {},
"source": [
"Tcl\n",
"\n",
"```sql\n",
"CREATE EXTENSION pltcl;\n",
"\n",
"CREATE FUNCTION gcd(a INT, b INT) RETURNS INT AS $$\n",
"```\n",
"```tcl\n",
" if {$b > $a} {lassign \"$b $a\" b a}\n",
"\n",
" while {$a % $b > 0} {\n",
" lassign \"$b [expr {$a % $b}]\" a b\n",
" }\n",
"\n",
" return $b\n",
"```\n",
"```sql\n",
"$$ LANGUAGE pltcl STRICT;\n",
"\n",
"SELECT gcd(45, 72);\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "2df8b161",
"metadata": {},
"source": [
"and Python\n",
"\n",
"```sql\n",
"CREATE EXTENSION plpython3u;\n",
"\n",
"CREATE FUNCTION gcd(a INT, b INT) RETURNS INT AS $$\n",
"```\n",
"```python\n",
" if b > a:\n",
" a, b = b, a\n",
"\n",
" while a % b:\n",
" a, b = b, a%b\n",
"\n",
" return b\n",
"```\n",
"```sql\n",
"$$ LANGUAGE plpython3u STRICT;\n",
"\n",
"SELECT gcd(45, 72);\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "daf605d7",
"metadata": {},
"source": [
"## Triggers"
]
},
{
"cell_type": "markdown",
"id": "c0cff4d2",
"metadata": {},
"source": [
"Let me first introduce to you the `IS DISTINCT FROM` operator."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9f11033c",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"-- Two NULL values are treated as non-distinct.\n",
"SELECT (5, 6, NULL) IS DISTINCT FROM (5, 6, NULL)"
]
},
{
"cell_type": "markdown",
"id": "e5d4a572",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"`IS DISTINCT FROM` can be used to compare values or **tuples** (as in this case)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8b704943",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT (5, 6, NULL) IS DISTINCT FROM (5, 789, NULL)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f7367ff8",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"DROP TABLE IF EXISTS edge_changes;\n",
"CREATE TABLE edge_changes (\n",
" change_id SERIAL PRIMARY KEY,\n",
" from_id INT,\n",
" to_id INT,\n",
" label VARCHAR,\n",
" \"when\" TIMESTAMP,\n",
" change VARCHAR(7),\n",
" CHECK (change IN ('added', 'removed'))\n",
");\n",
"\n",
"CREATE OR REPLACE FUNCTION log_edge_change()\n",
" RETURNS TRIGGER AS $$\n",
" BEGIN\n",
" IF old IS NOT DISTINCT FROM new THEN\n",
" RETURN NULL;\n",
" END IF;\n",
"\n",
" IF old.from_id IS NOT NULL THEN\n",
" INSERT INTO edge_changes\n",
" (from_id, to_id, label,\n",
" \"when\", change)\n",
" VALUES (old.from_id, old.to_id, old.label,\n",
" NOW(), 'removed');\n",
" END IF;\n",
"\n",
" IF new.from_id IS NOT NULL THEN\n",
" INSERT INTO edge_changes\n",
" (from_id, to_id, label,\n",
" \"when\", change)\n",
" VALUES (new.from_id, new.to_id, new.label,\n",
" NOW(), 'added');\n",
" END IF;\n",
"\n",
" RETURN NULL;\n",
" END\n",
"$$ LANGUAGE plpgsql;\n",
"\n",
"CREATE OR REPLACE TRIGGER log_change\n",
" AFTER INSERT OR DELETE OR UPDATE ON edges\n",
" FOR EACH ROW\n",
" EXECUTE FUNCTION log_edge_change();"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "068144f2",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT * FROM edge_changes;"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d5effd28",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"-- A no-op, should not generate a log entry.\n",
"UPDATE edges\n",
"SET to_id = to_id\n",
"WHERE to_id = 2;\n",
"\n",
"-- Deletion, should generate log entries.\n",
"DELETE FROM edges;"
]
},
{
"cell_type": "markdown",
"id": "df25ef7a",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"Re-run the earlier `SELECT` to see the logged changes."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b6838443",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"CALL add_edge('vA', 'vB');\n",
"CALL add_edge('vA', 'vC');\n",
"CALL add_edge('vB', 'vC');\n",
"CALL add_edge('vB', 'vE');\n",
"CALL add_edge('vE', 'vF');\n",
"CALL add_edge('vE', 'vA');\n",
"CALL add_edge('vD', 'vC');\n",
"CALL add_edge('vF', 'vC');"
]
},
{
"cell_type": "markdown",
"id": "77e8d81d",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"Re-run the earlier `SELECT` to see the logged changes."
]
},
{
"cell_type": "markdown",
"id": "0a6f4267",
"metadata": {},
"source": [
"### \"INSTEAD OF\" Triggers and Trigger-Updatable Views"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3b63cb74",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"DROP VIEW IF EXISTS is_edge;\n",
"CREATE VIEW is_edge AS\n",
"SELECT l.id AS from_id,\n",
" l.label AS from_label,\n",
" r.id AS to_id,\n",
" r.label AS to_label,\n",
" e.to_id IS NOT NULL AS present\n",
"FROM vertices l\n",
" CROSS JOIN vertices r\n",
" LEFT JOIN edges e ON (l.id, r.id) = (e.from_id, e.to_id)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "14ccb9d4",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT * FROM is_edge"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0115d614",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"UPDATE is_edge\n",
"SET present = TRUE\n",
"WHERE (from_label, to_label) = ('vF', 'vG')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "73eacf7f",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"CREATE OR REPLACE FUNCTION update_is_edge()\n",
" RETURNS TRIGGER AS $$\n",
" BEGIN\n",
" IF new.present THEN\n",
" CALL add_edge(old.from_label, old.to_label);\n",
" ELSE\n",
" DELETE FROM edges\n",
" WHERE (from_id, to_id) = (old.from_id, old.to_id);\n",
" END IF;\n",
"\n",
" RETURN NULL;\n",
" END\n",
"$$ LANGUAGE plpgsql;\n",
"\n",
"CREATE OR REPLACE TRIGGER handle_update\n",
" INSTEAD OF UPDATE ON is_edge\n",
" FOR EACH ROW\n",
" EXECUTE FUNCTION update_is_edge();"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f0b89edc",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"UPDATE is_edge\n",
"SET present = TRUE\n",
"WHERE (from_label, to_label) = ('vF', 'vG')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4ffd1b04",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT * FROM edge_changes"
]
},
{
"cell_type": "markdown",
"id": "f9048743",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"We see the log of changes made by our `INSTEAD OF` trigger."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "242432a2",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT * FROM edges"
]
},
{
"cell_type": "markdown",
"id": "890dedaf",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"We see the row added through view update."
]
},
{
"cell_type": "markdown",
"id": "316d6132",
"metadata": {},
"source": [
"#### Examples with Sanity Checks"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6bab5c17",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"CREATE OR REPLACE FUNCTION update_is_edge()\n",
" RETURNS TRIGGER AS $$\n",
" BEGIN\n",
" IF ((old.from_id, old.from_label, old.to_id, old.to_label)\n",
" IS DISTINCT FROM\n",
" (new.from_id, new.from_label, new.to_id, new.to_label))\n",
" OR\n",
" new.present IS NULL\n",
" THEN\n",
" RAISE EXCEPTION\n",
" 'Illegal update (from_id = %, to_id = %)',\n",
" old.from_id, old.to_id\n",
" USING\n",
" HINT = 'Please only set `present'' to TRUE/FALSE';\n",
" END IF;\n",
"\n",
" IF new.present THEN\n",
" CALL add_edge(old.from_label, old.to_label);\n",
" ELSE\n",
" DELETE FROM edges\n",
" WHERE (from_id, to_id) = (old.from_id, old.to_id);\n",
" END IF;\n",
"\n",
" RETURN NULL;\n",
" END\n",
"$$ LANGUAGE plpgsql;\n",
"\n",
"CREATE OR REPLACE TRIGGER handle_update\n",
" INSTEAD OF UPDATE ON is_edge\n",
" FOR EACH ROW\n",
" EXECUTE FUNCTION update_is_edge();"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "58973eac",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"UPDATE is_edge\n",
"SET to_label = 'vE'\n",
"WHERE (from_label, to_label) = ('vF', 'vG')"
]
},
{
"cell_type": "markdown",
"id": "30aa6925",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"Above we see the effect of our `RAISE` instruction in trigger's code."
]
},
{
"cell_type": "markdown",
"id": "6eccd424",
"metadata": {},
"source": [
"If we do not need a custom exception message, we can use the `ASSERT` command."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "db872df4",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"CREATE OR REPLACE FUNCTION update_is_edge()\n",
" RETURNS TRIGGER AS $$\n",
" BEGIN\n",
" ASSERT (old.from_id, old.from_label, old.to_id, old.to_label)\n",
" IS NOT DISTINCT FROM\n",
" (new.from_id, new.from_label, new.to_id, new.to_label);\n",
" ASSERT new.present IS NOT NULL;\n",
"\n",
" IF new.present THEN\n",
" CALL add_edge(old.from_label, old.to_label);\n",
" ELSE\n",
" DELETE FROM edges\n",
" WHERE (from_id, to_id) = (old.from_id, old.to_id);\n",
" END IF;\n",
"\n",
" RETURN NULL;\n",
" END\n",
"$$ LANGUAGE plpgsql;\n",
"\n",
"CREATE OR REPLACE TRIGGER handle_update\n",
" INSTEAD OF UPDATE ON is_edge\n",
" FOR EACH ROW\n",
" EXECUTE FUNCTION update_is_edge();"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "998ce4af",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"UPDATE is_edge\n",
"SET to_label = 'vE'\n",
"WHERE (from_label, to_label) = ('vF', 'vG')"
]
},
{
"cell_type": "markdown",
"id": "645362b3",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"The above exception was generated by the `ASSERT` command."
]
},
{
"cell_type": "markdown",
"id": "365fcfc8",
"metadata": {},
"source": [
"See that valid updates still do work."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3345ae83",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"UPDATE is_edge\n",
"SET present = FALSE\n",
"WHERE (from_label, to_label) = ('vF', 'vG')"
]
},
{
"cell_type": "markdown",
"id": "c921b3a6",
"metadata": {},
"source": [
"### Trigger-Implemented Integrity Constraints"
]
},
{
"cell_type": "markdown",
"id": "7ff12f0e",
"metadata": {},
"source": [
"The cycle detection algorithm below has been derived from Kahn's.\n",
"\n",
"https://en.wikipedia.org/wiki/Topological_sorting#Kahn's_algorithm"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "17cf214b",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"%%sql\n",
"-- Make sure we have no edges initially.\n",
"DELETE FROM edges;\n",
"\n",
"CREATE OR REPLACE FUNCTION verify_acyclic()\n",
" RETURNS TRIGGER AS $$\n",
" BEGIN\n",
" RAISE WARNING 'Checking for cycles';\n",
" -- Prepare a working copy of edges that we'll modify.\n",
" DROP TABLE IF EXISTS __edges_left;\n",
" CREATE TEMPORARY TABLE __edges_left AS\n",
" SELECT from_id, to_id FROM edges;\n",
"\n",
" LOOP\n",
" EXIT WHEN NOT EXISTS (SELECT * FROM __edges_left);\n",
"\n",
" -- Find vertices that have no incoming edges.\n",
" DROP TABLE IF EXISTS __source_vertices;\n",
" CREATE TEMPORARY TABLE __source_vertices\n",
" AS SELECT _out.from_id\n",
" FROM __edges_left _in\n",
" RIGHT JOIN __edges_left _out\n",
" ON (_in.to_id = _out.from_id)\n",
" WHERE _in.to_id IS NULL;\n",
"\n",
" IF NOT EXISTS (SELECT * FROM __source_vertices) THEN\n",
" -- There is no vertex without incoming edge.\n",
" RAISE EXCEPTION 'Cycle detected';\n",
" END IF;\n",
"\n",
" DELETE FROM __edges_left\n",
" WHERE from_id IN (SELECT * FROM __source_vertices);\n",
" END LOOP;\n",
"\n",
" DROP TABLE __edges_left;\n",
" DROP TABLE IF EXISTS source_vertices;\n",
"\n",
" RETURN NULL;\n",
" END\n",
"$$ LANGUAGE plpgsql;\n",
"\n",
"CREATE OR REPLACE TRIGGER forbid_cycles\n",
" AFTER INSERT OR UPDATE on edges\n",
" FOR EACH ROW\n",
" EXECUTE FUNCTION verify_acyclic();"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0b1405a5",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"CALL add_edge('vA', 'vB');\n",
"CALL add_edge('vA', 'vC');\n",
"CALL add_edge('vB', 'vC');\n",
"CALL add_edge('vB', 'vE');\n",
"CALL add_edge('vE', 'vF');"
]
},
{
"cell_type": "markdown",
"id": "fa81c7b7",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"The above succeeds, there are no edges yet."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aebee679",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"-- Would create cycle vA → vB → vE → vA.\n",
"CALL add_edge('vE', 'vA');"
]
},
{
"cell_type": "markdown",
"id": "35546575",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"Our trigger has properly prevented the insertion."
]
},
{
"cell_type": "markdown",
"id": "8f5199ed",
"metadata": {},
"source": [
"#### \"WHEN\" clause"
]
},
{
"cell_type": "markdown",
"id": "73130fb1",
"metadata": {},
"source": [
"If a trigger is only meant to be run in certain cases, we can utilize the `WHEN` clause of `CREATE TRIGGER`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "10819fa1",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"-- We'll log (non-raising) trigger firings.\n",
"DROP TABLE IF EXISTS cycle_checks;\n",
"CREATE TABLE cycle_checks (\n",
" id SERIAL PRIMARY KEY,\n",
" _when TIMESTAMP\n",
");\n",
"\n",
"CREATE OR REPLACE FUNCTION verify_acyclic()\n",
" RETURNS TRIGGER AS $$\n",
" BEGIN\n",
" -- We have just added this `INSERT' command.\n",
" INSERT INTO cycle_checks (_when)\n",
" VALUES (NOW());\n",
"\n",
" -- Prepare a working copy of edges that we'll modify.\n",
" DROP TABLE IF EXISTS __edges_left;\n",
" CREATE TEMPORARY TABLE __edges_left AS\n",
" SELECT from_id, to_id FROM edges;\n",
"\n",
" LOOP\n",
" EXIT WHEN NOT EXISTS (SELECT * FROM __edges_left);\n",
"\n",
" -- Find vertices that have no incoming edges.\n",
" DROP TABLE IF EXISTS __source_vertices;\n",
" CREATE TEMPORARY TABLE __source_vertices\n",
" AS SELECT _out.from_id\n",
" FROM __edges_left _in\n",
" RIGHT JOIN __edges_left _out\n",
" ON (_in.to_id = _out.from_id)\n",
" WHERE _in.to_id IS NULL;\n",
"\n",
" IF NOT EXISTS (SELECT * FROM __source_vertices) THEN\n",
" -- There is no vertex without incoming edge.\n",
" RAISE EXCEPTION 'Cycle detected';\n",
" END IF;\n",
"\n",
" DELETE FROM __edges_left\n",
" WHERE from_id IN (SELECT * FROM __source_vertices);\n",
" END LOOP;\n",
"\n",
" DROP TABLE __edges_left;\n",
" DROP TABLE IF EXISTS source_vertices;\n",
"\n",
" RETURN NULL;\n",
" END\n",
"$$ LANGUAGE plpgsql;\n",
"\n",
"DROP TRIGGER IF EXISTS forbid_cycles ON edges;\n",
"\n",
"CREATE OR REPLACE TRIGGER forbid_cycles_on_insert\n",
" AFTER INSERT ON edges\n",
" FOR EACH ROW\n",
" EXECUTE FUNCTION verify_acyclic();\n",
"\n",
"CREATE OR REPLACE TRIGGER forbid_cycles_on_update\n",
" AFTER UPDATE ON edges\n",
" FOR EACH ROW\n",
" WHEN ((old.from_id, old.to_id)\n",
" IS DISTINCT FROM\n",
" (new.from_id, new.to_id))\n",
" EXECUTE FUNCTION verify_acyclic();"
]
},
{
"cell_type": "markdown",
"id": "d5243022",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"As shown above, we can use `WHEN` in `CREATE TRIGGER` to more concisely specify a condition for trigger firing."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dcb15881",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"DELETE FROM edges WHERE label = 'vA -> vB';\n",
"CALL add_edge('vA', 'vB');\n",
"SELECT * FROM cycle_checks;"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aad65684",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT * FROM edges"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a2551564",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"UPDATE edges\n",
"SET to_id = 1,\n",
" label = 'vB -> vA'\n",
"WHERE (from_id, to_id) = (2, 3)"
]
},
{
"cell_type": "markdown",
"id": "6cce7c97",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"Cycles are still properly prevented."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1aaebada",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"UPDATE edges\n",
"SET label = 'vB ---> vA'\n",
"WHERE (from_id, to_id) = (2, 3);\n",
"SELECT * FROM cycle_checks;"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b7e9d140",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"UPDATE edges\n",
"SET label = 'vB -> vA'\n",
"WHERE (from_id, to_id) = (2, 3);\n",
"SELECT * FROM cycle_checks;"
]
},
{
"cell_type": "markdown",
"id": "37ad6c20",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"Irrelevant updates (label changes) give false in the `WHEN` condition and do not cause trigger firing."
]
},
{
"cell_type": "markdown",
"id": "d321bebc",
"metadata": {},
"source": [
"### \"FOR EACH STATEMENT\" Triggers"
]
},
{
"cell_type": "markdown",
"id": "b95f36cb",
"metadata": {},
"source": [
"Our `FOREACH ROW` trigger fires once for every inserted row, even if they are inserted in one operation."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e86cd7fb",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"DELETE FROM edges WHERE from_id = 1;\n",
"\n",
"DELETE FROM cycle_checks;\n",
"\n",
"INSERT INTO edges (from_id, to_id, label)\n",
"SELECT * FROM (VALUES (1, 2, 'vA -> vB'),\n",
" (1, 3, 'vA -> vC'));\n",
" \n",
"SELECT * FROM cycle_checks;"
]
},
{
"cell_type": "markdown",
"id": "1731b8df",
"metadata": {},
"source": [
"_note_\n",
"\n",
"Note the presence of two rows in `cycle_checks`."
]
},
{
"cell_type": "markdown",
"id": "590016d3",
"metadata": {},
"source": [
"We can avoid reundant checks."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "67a654e3",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"DROP TRIGGER IF EXISTS forbid_cycles ON edges;\n",
"DROP TRIGGER IF EXISTS forbid_cycles_on_insert ON edges;\n",
"DROP TRIGGER IF EXISTS forbid_cycles_on_update ON edges;\n",
"\n",
"CREATE OR REPLACE TRIGGER forbid_cycles\n",
" AFTER INSERT ON edges\n",
" FOR EACH STATEMENT\n",
" EXECUTE FUNCTION verify_acyclic();"
]
},
{
"cell_type": "markdown",
"id": "54b0b700",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"After re-creating the trigger as `FOR EACH STATEMENT`, re-run the `DELETE`&`DELETE`&`INSERT`&`SELECT` sequence above. See that the trigger now only fires once."
]
},
{
"cell_type": "markdown",
"id": "fa2eacdb",
"metadata": {},
"source": [
"`FOR EACH STATEMENT` triggers can additionally be declared as `DEFERRED`."
]
},
{
"cell_type": "markdown",
"id": "2799c27c",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"`DEFERRED` triggers fire later, at the end of transaction. They are useful to implement delayed data consistency checks, so that temporal inconsistency during a transaction is allowed."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "/gnu/store/q35bxk1i5blx0rcgxphhrq9xqmgha9bz-python-3.11.11/bin/python3",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.11"
}
},
"nbformat": 4,
"nbformat_minor": 5
}