{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"id": "928e95de",
"metadata": {},
"outputs": [],
"source": [
"import agh_db_lectures\n",
"agh_db_lectures.prepare_notebook_for_sql()"
]
},
{
"cell_type": "markdown",
"id": "16d6d715",
"metadata": {},
"source": [
"# Transactions"
]
},
{
"cell_type": "markdown",
"id": "ab20a229",
"metadata": {},
"source": [
"Transaction is a unit of work in a computer program. It comprises a collection of steps that change a database in a reliable way.\n",
"\n",
"```sql\n",
"-- Template of a transaction in SQL\n",
"\n",
"START TRANSACTION;\n",
"\n",
"-- Here come the SQL queries and statements which\n",
"-- we want to treat as a single unit of work.\n",
"\n",
"COMMIT;\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9da70f7a",
"metadata": {},
"outputs": [],
"source": [
"agh_db_lectures.download_install_wordpress_site()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "20d1b1c2",
"metadata": {},
"outputs": [],
"source": [
"%sql mysql:///agh_it_wordpress?unix_socket=/var/run/mysql/mysql.sock"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d21766af",
"metadata": {},
"outputs": [],
"source": [
"agh_db_lectures.wp_diagram.download_open()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "47bec1da",
"metadata": {},
"outputs": [],
"source": [
"%config SqlMagic.feedback = False"
]
},
{
"cell_type": "markdown",
"id": "731f51fe",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"Setting `feedback` to `False` shall prevent ipython-sql from clobbering cell outputs with \"N rows affected.\" messages."
]
},
{
"cell_type": "markdown",
"id": "86706366",
"metadata": {},
"source": [
"## Transaction properties"
]
},
{
"cell_type": "markdown",
"id": "d2c70b31",
"metadata": {},
"source": [
"Four crucial transaction properties give the acronym \"ACID\"."
]
},
{
"cell_type": "markdown",
"id": "2bffdd20",
"metadata": {},
"source": [
"### D — Durability"
]
},
{
"cell_type": "markdown",
"id": "72697133",
"metadata": {},
"source": [
"When data changes of a completed transaction persist in the system, even in case of crashes, etc."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "56460773",
"metadata": {},
"outputs": [],
"source": [
"sample_file = open('./sample-file.txt', 'wt')\n",
"\n",
"sample_file.write('* Some heading\\n')\n",
"sample_file.write('- some\\n')\n",
"sample_file.write('- unnumbered\\n')\n",
"sample_file.write('- list\\n')"
]
},
{
"cell_type": "markdown",
"id": "a8f01f4a",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"Let's get back to our idea of using a text file as a database. The above piece of Python code writes data to file. Are these writes durable in the sense we've just introduced? If a system crash happens a second after the last line of code gets executed, shall thes changes persist?"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "eb052a74",
"metadata": {},
"outputs": [],
"source": [
"!cat ./sample-file.txt"
]
},
{
"cell_type": "markdown",
"id": "8b556ac7",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"We can use a shell command to display the contents of the file. It is empty — the data has not yet been written and can be lost if the system crashes.\n",
"\n",
"Note that `sample_file.close()` would be one way to make Python push the writes to the operating system kernel. Getting the kernel actually perform disk writes is a separate thing."
]
},
{
"cell_type": "markdown",
"id": "91b14ce3",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"Unlike naively written Python code, the operations we've been performing using SQL in the past notebooks **did** have the property of durability. Ensuring durability is one of the primary jobs of a DBMS."
]
},
{
"cell_type": "markdown",
"id": "1c4a118a",
"metadata": {},
"source": [
"### A — Atomicity"
]
},
{
"cell_type": "markdown",
"id": "2fe44a01",
"metadata": {},
"source": [
"When either the entire set of data changes of a transaction gets persisted or none."
]
},
{
"cell_type": "markdown",
"id": "1029c4d1",
"metadata": {},
"source": [
"http://127.0.0.1:28080/site/wp-login.php"
]
},
{
"cell_type": "markdown",
"id": "f63a55aa",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"Log in, add a new user and publish a post as that user."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "566b8bb2",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT ID, post_author, post_date, post_title, post_status\n",
"FROM wp_posts"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "109904d3",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"DROP TABLE IF EXISTS user_1_post_ids;\n",
"\n",
"CREATE TABLE user_1_post_ids AS\n",
"SELECT ID\n",
"FROM wp_posts\n",
"WHERE post_author = 1;\n",
"\n",
"UPDATE wp_posts\n",
"SET post_author = 1\n",
"WHERE post_author = 2;\n",
"\n",
"UPDATE wp_posts\n",
"SET post_author = 2\n",
"WHERE ID IN (SELECT ID FROM user_1_post_ids);"
]
},
{
"cell_type": "markdown",
"id": "c261660e",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"The above set of statements swaps the authorship of two users' posts. Let's try it out.\n",
"\n",
"Consider what would happen if the system were to crash in between the two `UPDATE` statements. After being brought up again, the system would have both users' posts assigned to user with id 1."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0e5ed6a8",
"metadata": {},
"outputs": [],
"source": [
"%sql mysql:///agh_it_wordpress?unix_socket=/var/run/mysql/mysql.sock"
]
},
{
"cell_type": "markdown",
"id": "28594e26",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"Try simulating this scenario by connecting to the database again after the first `UPDATE` statement and before the second one.\n",
"\n",
"The interruption of a sequence related database operations is something we would like to be guarded against.\n",
"\n",
"Execute the omitted last `UPDATE` to have data in a proper state state for the next examples."
]
},
{
"cell_type": "markdown",
"id": "c057b5cf",
"metadata": {},
"source": [
"```shell\n",
"mariadb --port=23306 --host=127.0.0.1 \\\n",
" -pdemo_pwd agh_it_wordpress\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "f1cb3c50",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"ipython-sql does not like transactions. We can tell it not to issue the `COMMIT` command after each code cell is executed. However, in practice the transaction still gets terminated after most commands. Because of this, let's open a shell to our database in a terminal.\n",
"\n",
"Try running the same authorship swapping code in the shell. Now, do it again, this time wrapping it in the following commands.\n",
"\n",
"```sql\n",
"SET AUTOCOMMIT = 0;\n",
"START TRANSACTION;\n",
"-- The code.\n",
"COMMIT;\n",
"```\n",
"\n",
"The `SET` command tells the DBMS not to insert an implicit `COMMIT` after almost every statement. We only need to issue it once after connecting, regardless of how many transactions we are going to run.\n",
"\n",
"The code in a transaction works the same, but previously, the DBMS considered single commands to be the only units of work. Now, it considers all commands in the transaction to be some coherent, interdependent whole. It either persists all of it or **rolls back** the partial work that got interrupted.\n",
"\n",
"Hit `CTRL+d` before the last `UPDATE` in the transaction. See that all executed data modification commands got rolled back.\n",
"\n",
"Optionally, interrupt the transaction in the same point but using `kill -9` with mariadb client process PID (found, for example, with the `ps` command).\n",
"\n",
"Unfinished transactions are automatically rolled back in case of events like client disconnect, or system crash. But a programmer may also desire to abort a transaction in some cases. This can be done by issuing a `ROLLBACK` command. Try using it with this example."
]
},
{
"cell_type": "markdown",
"id": "7adc8904",
"metadata": {},
"source": [
"#### Transaction Log"
]
},
{
"cell_type": "markdown",
"id": "712b1920",
"metadata": {},
"source": [
"A possible implementation method of transactions involves a **transaction log**.\n",
"\n",
"1. Assign transactions sequential IDs.\n",
"2. Record the last committed transaction ID.\n",
"3. For each data item (e.g., row) changed by transaction, first write to **the log**\n",
" - its old value,\n",
" - its new value, and\n",
" - transaction id.\n",
"4. Upon commit, record the transaction ID.\n",
"5. When recovering from a crash, revert all uncommitted changes based on log items."
]
},
{
"cell_type": "raw",
"id": "184957bb",
"metadata": {},
"source": [
"| row number | **name** | **balance** |\n",
"|------------|-------------|-------------|\n",
"| 1. | Theodore | 1,000.00$ |\n",
"| 2. | Pancratius | 1,000.00$ |\n",
"| 3. | Eugenia | 1,000.00$ |\n",
"\n",
"Most recently committed transaction ID: ???\n",
"\n",
"Transaction log:\n",
"\n",
"- ???"
]
},
{
"cell_type": "markdown",
"id": "034f5903",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"This is an exercise that demonstrated how transaction log could work. Assume Theodore and his friends have accounts whose balances (in dollars) are stored in a (here simplified) relational database.\n",
"\n",
"Pick the last transaction ID (say, 7).\n",
"\n",
"Write to log and perform one by one the following.\n",
"\n",
"```\n",
"tx 8; row 1 (Theodore, 1000.00$) → (Theodore, 980.00$)\n",
"tx 8; row 2 (Pancratius, 1000.00$) → (Pancratius, 1020.00$)\n",
"tx 8; row 1 (Theodore, 980.00$) → (Theodore, 960.00$)\n",
"tx 8; row 3 (Eugenia, 1000.00$) → (Eugenia, 1020.00$)\n",
"```\n",
"\n",
"Commit the transaction. Record ID 8 as the last committed transacton ID.\n",
"\n",
"\n",
"Theodore has transferred 20\\$ to both Pancratius and Eugenia.\n",
"\n",
"Now, simulate a similar transfer but assume a crash before the last operation (that on row 3).\n",
"\n",
"```\n",
"tx 9; row 1 (Theodore, 960.00$) → (Theodore, 940.00$)\n",
"tx 9; row 2 (Pancratius, 1020.00$) → (Pancratius, 1040.00$)\n",
"tx 9; row 1 (Theodore, 940.00$) → (Theodore, 920.00$)\n",
"```\n",
"\n",
"Assume a crash happens. The system is restarted and DBMS analyzes the transaction log.\n",
"\n",
"We revert the last 3 operations in reverse order and remove each one from the log after it is reverted. We got back to the state of data consistency. The last log item has transaction ID 8 — one that has already been committed.\n",
"\n",
"The DBMS would of course prune old entries in some way."
]
},
{
"cell_type": "markdown",
"id": "80fbdeb9",
"metadata": {},
"source": [
"### I — Isolation"
]
},
{
"cell_type": "markdown",
"id": "16d4d10b",
"metadata": {},
"source": [
"When multiple concurrent transactions are executed as if they ran sequentially."
]
},
{
"cell_type": "markdown",
"id": "97f70460",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"We shall simulate a situation where two transactions attempt to operate on the same data. We shall see what precautions MariaDB takes to prevent interference between the transactions."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c527b691",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"%%sql\n",
"SELECT *\n",
"FROM wp_usermeta\n",
"WHERE meta_key LIKE 'show%'"
]
},
{
"cell_type": "markdown",
"id": "d57a0dd3",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"The `show_welcome_panel` item is only present for user 1. Let's add it for the subsequently created user as well. In the MariaDB shell do the following, without committing.\n",
"\n",
"```sql\n",
"START TRANSACTION;\n",
"\n",
"SELECT *\n",
"FROM wp_usermeta\n",
"WHERE meta_key = 'show_welcome_panel';\n",
"\n",
"INSERT INTO wp_usermeta (user_id, meta_key, meta_value)\n",
"VALUES (2, 'show_welcome_panel', 1);\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "92649bb1",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"UPDATE wp_usermeta\n",
"SET meta_value = '1'\n",
"WHERE meta_key = 'show_welcome_panel';"
]
},
{
"cell_type": "markdown",
"id": "a3d10843",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"Assume some other code tries to update the same table at the same time. Run this statement while the other transaction is still active. The operation hangs. It waits for the other transaction's lock on table rows to be released.\n",
"\n",
"Once we `ROLLBACK` or `COMMIT` that transaction, this operation shall also finish. Try doing one more read, like `SELECT * FROM wp_users`, and finally terminating the transaction in the MariaDB shell. You can perform the experiment twice, with both ways of terminating the transaction.\n",
"\n",
"Also note that we can always call conflict a coflict \"between transactions\" because even standalone SQL statements are implicitly wrapped in transactions."
]
},
{
"cell_type": "markdown",
"id": "b2fcd324",
"metadata": {},
"source": [
"#### Transactions Serializability"
]
},
{
"cell_type": "markdown",
"id": "461923b6",
"metadata": {},
"source": [
"Certain operation execution **schedules** are **equivalent** to each other."
]
},
{
"cell_type": "markdown",
"id": "e24f1309",
"metadata": {},
"source": [
"
\n",
" \n",
" \n",
" | transaction 1 | \n",
" transaction 2 | \n",
"
\n",
" \n",
" \n",
" \n",
" | read(row X); | \n",
" | \n",
"
\n",
" \n",
" | write(row X); | \n",
" | \n",
"
\n",
" \n",
" | \n",
" write(row X); | \n",
"
\n",
" \n",
" | \n",
" commit; | \n",
"
\n",
" \n",
" | read(row Y); | \n",
" | \n",
"
\n",
" \n",
" | commit; | \n",
" | \n",
"
\n",
" \n",
"
\n",
"↕
\n",
"\n",
" \n",
" \n",
" | transaction 1 | \n",
" transaction 2 | \n",
"
\n",
" \n",
" \n",
" \n",
" | read(row X); | \n",
" | \n",
"
\n",
" \n",
" | write(row X); | \n",
" | \n",
"
\n",
" \n",
" | read(row Y); | \n",
" | \n",
"
\n",
" \n",
" | commit; | \n",
" | \n",
"
\n",
" \n",
" | \n",
" write(row X); | \n",
"
\n",
" \n",
" | \n",
" commit; | \n",
"
\n",
" \n",
"
"
]
},
{
"cell_type": "markdown",
"id": "e5a838cf",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"Note that if DBMS knew that transaction 1 is not going to perform any more operations on `wp_usermeta`, it could allow other transactons to immediately perform their operations on that table. The final effect would be the same as if the transactions were executed fully sequentially."
]
},
{
"cell_type": "markdown",
"id": "46c7b5ea",
"metadata": {},
"source": [
"**Serializable** schedule is one that is equivalent to at least one **serial schedule**. Some schedules are not serializable."
]
},
{
"cell_type": "markdown",
"id": "d8bed0d4",
"metadata": {},
"source": [
"\n",
" \n",
" \n",
" | transaction 1 | \n",
" transaction 2 | \n",
"
\n",
" \n",
" \n",
" \n",
" | read(row X); | \n",
" | \n",
"
\n",
" \n",
" | write(row X); | \n",
" | \n",
"
\n",
" \n",
" | \n",
" write(row X); | \n",
"
\n",
" \n",
" | \n",
" commit; | \n",
"
\n",
" \n",
" | write(row X); | \n",
" | \n",
"
\n",
" \n",
" | commit; | \n",
" | \n",
"
\n",
" \n",
"
"
]
},
{
"cell_type": "markdown",
"id": "ab123243",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"When it can, the DBMS is going to allow concurrent operations performed by multiple transactions to execute without waiting. In our case, the DBMS could not know whether the transaction that had already written to a row in `wp_usermeta` is going to operate on it again. In this case, the DBMS had to hold the other transaction's operation for some time.\n",
"\n",
"We shall review different levels of transaction isolation later on."
]
},
{
"cell_type": "markdown",
"id": "989bc277",
"metadata": {},
"source": [
"### C — Consistency"
]
},
{
"cell_type": "markdown",
"id": "91273bf6",
"metadata": {},
"source": [
"When transaction transforms the data from one valid, consistent state to another."
]
},
{
"cell_type": "markdown",
"id": "d3b199aa",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"E.g., the sum of account balances in some bank-like database is the same after a money transfer is performed.\n",
"\n",
"The property of consistency is one for which the programmer is largely responsible, in contrast to other ACID properties that are guaranteed by the DBMS. However, the functionalities provided by DBMS can help ensure data consistency."
]
},
{
"cell_type": "markdown",
"id": "394cc429",
"metadata": {},
"source": [
"#### Deferred Integrity Checks"
]
},
{
"cell_type": "markdown",
"id": "99592551",
"metadata": {},
"source": [
"Ancient romans used to say \"Quis custodiet ipsos custodes?\"."
]
},
{
"cell_type": "markdown",
"id": "f83a81ec",
"metadata": {},
"source": [
"_notes_\n",
" \n",
"Assume that in some group (be it a company, a masonry, or whatever) every member has to be watched by some other member. Everyone is being verified for not doing nasty things and nobody has ultimate power."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e4497753",
"metadata": {},
"outputs": [],
"source": [
"!psql --port 25432 --quiet -c \"CREATE DATABASE loop WITH OWNER demo_user\" postgres || true\n",
"%sql postgresql://demo_user:demo_pwd@localhost:25432/loop"
]
},
{
"cell_type": "markdown",
"id": "0697dcf4",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"MariaDB lacks deferred integrity checks, so we have to rely on Postgres to make this example."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "34197736",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"%%sql\n",
"DROP TABLE IF EXISTS members;\n",
"CREATE TABLE members (\n",
" member_id INT PRIMARY KEY,\n",
" name VARCHAR(80),\n",
" watched_by INT NOT NULL,\n",
" FOREIGN KEY (watched_by)\n",
" REFERENCES members (member_id)\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b463c774",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"INSERT INTO members\n",
"VALUES (1, 'Theodore', 2);\n",
"\n",
"INSERT INTO members\n",
"VALUES (2, 'Pancratius', 3);\n",
"\n",
"INSERT INTO members\n",
"VALUES (3, 'Eugenia', 1);"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "099d6d57",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"ALTER TABLE members\n",
"DROP CONSTRAINT members_watched_by_fkey;\n",
"\n",
"ALTER TABLE members\n",
"ADD CONSTRAINT members_watched_by_fkey\n",
"FOREIGN KEY (watched_by)\n",
"REFERENCES members (member_id)\n",
"DEFERRABLE"
]
},
{
"cell_type": "markdown",
"id": "b6428a45",
"metadata": {},
"source": [
"```shell\n",
"psql --port=25432 --host=127.0.0.1 loop\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "e46ca13c",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"Postgres shell.\n",
"\n",
"```sql\n",
"START TRANSACTION;\n",
"\n",
"SET CONSTRAINTS members_watched_by_fkey DEFERRED;\n",
"\n",
"-- The code.\n",
"\n",
"COMMIT;\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "316d340e",
"metadata": {},
"source": [
"## The Autocommit Mode"
]
},
{
"cell_type": "markdown",
"id": "7b6aee00",
"metadata": {},
"source": [
"Where the DBMS or programming library issues an implicit `COMMIT;` after most commands.\n",
"\n",
"```sql\n",
"-- MariaDB-specific commands.\n",
"SET AUTOCOMMIT = 0;\n",
"SET AUTOCOMMIT = 1;\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "846062d5",
"metadata": {},
"source": [
"## Transaction isolation levels"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c97f57d7",
"metadata": {},
"outputs": [],
"source": [
"%sql mysql:///agh_it_wordpress?unix_socket=/var/run/mysql/mysql.sock"
]
},
{
"cell_type": "markdown",
"id": "001dee07",
"metadata": {},
"source": [
"SQL standard allows loosening the isolation requirement. It specifies the following **transaction isolation levels**.\n",
"\n",
"- **serializable**\n",
"- **repeatable read**\n",
"- **read committed**\n",
"- **read uncommitted**"
]
},
{
"cell_type": "markdown",
"id": "60a3c0f9",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"The **serializable** level requires the DBMS to perform transaction operations according to a serializable schedule.\n",
"\n",
"We can consciously permit certain kinds of non-serial reads to increase the concurrent performance of our relational database server."
]
},
{
"cell_type": "markdown",
"id": "729f6634",
"metadata": {},
"source": [
"### \"Serializable\" Isolation Level"
]
},
{
"cell_type": "markdown",
"id": "1e3320d4",
"metadata": {},
"source": [
"The strongest isolation level.\n",
"\n",
"```sql\n",
"SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "f8c38b0a",
"metadata": {},
"source": [
"_note_\n",
"\n",
"Note that (at least under MariaDB) `SET TRANSACTION ISOLATION LEVEL` affects only the first transaction started (explicitly or implicitly) after this statement. Subsequent transactions are going to use the default isolation levele, unless `SET TRANSACTION ISOLATION LEVEL` ise issued again."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dca2e5bb",
"metadata": {
"scrolled": false
},
"outputs": [],
"source": [
"%%sql\n",
"SELECT *\n",
"FROM wp_usermeta\n",
"WHERE meta_key = 'nickname'"
]
},
{
"cell_type": "markdown",
"id": "2ffd86d1",
"metadata": {},
"source": [
"_notes_\n",
"Recall that the `wp_usermeta` table stores key-value pairs, including entries with users' nicknames.\n",
"\n",
"Start two MariaDB shells. In the first run the following.\n",
"\n",
"```sql\n",
"START TRANSACTION;\n",
"\n",
"UPDATE wp_usermeta\n",
"SET meta_value = 'teody'\n",
"WHERE (user_id, meta_key) = (2, 'nickname');\n",
"```\n",
"\n",
"Do not terminate the transaction yet. In the second shell run the following.\n",
"\n",
"```sql\n",
"SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;\n",
"\n",
"START TRANSACTION;\n",
"\n",
"SELECT *\n",
"FROM wp_usermeta\n",
"WHERE meta_key IN ('nickname', 'description');\n",
"```\n",
"\n",
"Notice that the transaction waits on a read operation. The DBMS mandates it to ensure serializability. We can issue `ROLLBACK;` in the first transaction to unblock the second one."
]
},
{
"cell_type": "markdown",
"id": "362db031",
"metadata": {},
"source": [
"### \"Repeatable Read\" Isolation Level"
]
},
{
"cell_type": "markdown",
"id": "fcfa6157",
"metadata": {},
"source": [
"When transaction reads a data **snapshot** made upon its **first** access to that data.\n",
"\n",
"```sql\n",
"SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "5f7e8114",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"There might be subsequent commits to that data. But the transaction using repeatable read isolation shall see the old data every time it re-executes the read operation."
]
},
{
"cell_type": "markdown",
"id": "34f4184e",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"In a MariaDB shell execute the following.\n",
"\n",
"```sql\n",
"SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;\n",
"\n",
"START TRANSACTION;\n",
"\n",
"SELECT *\n",
"FROM wp_usermeta\n",
"WHERE meta_key IN ('nickname', 'description');\n",
"\n",
"SELECT ID, user_login, user_email\n",
"FROM wp_users;\n",
"```\n",
"\n",
"The data we see has been snapshotted. Assume this transaction is later going to update the description of the user to be of the form \"nickname (email)\"."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e5b84232",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"UPDATE wp_users\n",
"SET user_email = 'theody@example.org'\n",
"WHERE ID = 2;\n",
"\n",
"UPDATE wp_usermeta\n",
"SET meta_value = 'theody'\n",
"WHERE (user_id, meta_key) = (2, 'nickname');\n",
"\n",
"UPDATE wp_usermeta\n",
"SET meta_value = 'theody (theody@example.org)'\n",
"WHERE (user_id, meta_key) = (2, 'description');"
]
},
{
"cell_type": "markdown",
"id": "65147a59",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"Consider that in the meanwhile, another transaction modifies the data. Let's use Jupyter Notebook to simulate this transaction. It updates the email, nickname, and description of the user. Since we are using Jupyter Notebook, changes get automatically committed.\n",
"\n",
"Now, go back to the MariaDB shell and re-execute the reads. The just-committed writes are not visible. The transaction still operates on the data it has previously seen. Simulate its further execution. Run the following in the MariaDB shell.\n",
"\n",
"```sql\n",
"UPDATE wp_usermeta\n",
"SET meta_value = 'theodore (543210@student.agh.edu.pl)'\n",
"WHERE (user_id, meta_key) = (2, 'description');\n",
"\n",
"UPDATE wp_usermeta\n",
"SET meta_value = 'demo_user (dummy@domain.invalid)'\n",
"WHERE (user_id, meta_key) = (1, 'description');\n",
"\n",
"COMMIT;\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f3ad9489",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT *\n",
"FROM wp_usermeta\n",
"WHERE meta_key IN ('nickname', 'description');"
]
},
{
"cell_type": "markdown",
"id": "027ae24e",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"We see that the effect we got does not correspond to any possible serial execution of the transactions. In a serial case, we'd always get \"theody (theody@example.org)\" as the description. We did not, because the repeatable read isolation level does permit some degrree of interference between transactions (like overwriting of data).\n",
"\n",
"Note that repeatable read (not the serializable level!) is the default isolation level in MariaDB. Our statements run from this notebook are also treated as being parts of repeatable read transactions.\n",
"\n",
"Besides the possibility of transactions improperly overwriting each other's changes, the SQL standard also permist the DBMS to allow **phantom reads** in repeatable read transactions. This occurs when a transaction re-executes a query and sees row insertions or deletions committed by other transactions in the meantime. It cannot see changes to the rows it has already read (otherwise, it would not be a \"repeatable read\" transaction). The only changes it sees are related to \"phantom\" rows, hence the name of the phenomena. By default, both Postgres an MariaDB disallow pahntom reads even in repeatable read transactions. Such application of stricter isolation policies is permitted by the SQL standard."
]
},
{
"cell_type": "markdown",
"id": "3fab1e77",
"metadata": {},
"source": [
"### Nonrepeatable Reads and \"Read Committed\" Isolation Level"
]
},
{
"cell_type": "markdown",
"id": "cadfea4c",
"metadata": {},
"source": [
"When subsequent reads inside a transaction see changes made by other transactions.\n",
"\n",
"```sql\n",
"SET TRANSACTION ISOLATION LEVEL READ COMMITTED;\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "1d8be81b",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"Transaction isolation levels of repeatable read and serializable disallow nonrepeatable reads. Transaction level of read committed allows them, but others' changes must be committed to be seen by the transaction."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "745a2d06",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"UPDATE wp_usermeta\n",
"SET meta_value = 'a:1:{s:11:\"contributor\";b:1;}'\n",
"-- SET meta_value = 'a:1:{s:13:\"administrator\";b:1;}'\n",
"WHERE (user_id, meta_key) = (2, 'wp_capabilities');"
]
},
{
"cell_type": "markdown",
"id": "d6d7acf3",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"In a MariaDB shell run the following.\n",
"\n",
"```sql\n",
"SET TRANSACTION ISOLATION LEVEL READ COMMITTED;\n",
"\n",
"START TRANSACTION;\n",
"\n",
"SELECT *\n",
"FROM wp_usermeta\n",
"WHERE meta_key = 'wp_capabilities';\n",
"```\n",
"\n",
"Execute the statement in the cell above to change the role of user 2. Remember that ipython-sql automatically commits such changes. Re-run the `SELECT` command in the MariaDB shell, within the transaction with read commited isolation level. See how all committed changes are visible to the transaction. It also uses **data snapshots**, but a new snapshot is made each time it performs a read operation.\n",
"\n",
"Read committed is the default isolation level in Postgres. Read committed and repeatable read are the most commonly used isolation levels."
]
},
{
"cell_type": "markdown",
"id": "3915260e",
"metadata": {},
"source": [
"### Dirty Reads and \"Read Uncommitted\" Isolation Level"
]
},
{
"cell_type": "markdown",
"id": "b55d801d",
"metadata": {},
"source": [
"\n",
" \n",
" \n",
" | transaction 1 | \n",
" transaction 2 | \n",
"
\n",
" \n",
" \n",
" \n",
" | write(row X); | \n",
" | \n",
"
\n",
" \n",
" | \n",
" read(row X); | \n",
"
\n",
" \n",
" | \n",
" commit; | \n",
"
\n",
" \n",
" | commit; | \n",
" | \n",
"
\n",
" \n",
"
\n",
"\n",
"```sql\n",
"SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "1f52a54b",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"A **dirty read** occurs when a transaction is able to read data that has not yet been committed.\n",
"\n",
"Transaction isolation levels of read committed and higher disallow dirty reads."
]
},
{
"cell_type": "markdown",
"id": "ec3278ee",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"Start two MariaDB shells. In the first, run the following.\n",
"\n",
"```sql\n",
"SET TRANSACTION ISOLATION LEVEL READ COMMITTED;\n",
"\n",
"START TRANSACTION;\n",
"\n",
"SELECT ID, post_author, post_content, post_title FROM wp_posts;\n",
"```\n",
"\n",
"In the second, run the following.\n",
"\n",
"```sql\n",
"START TRANSACTION;\n",
"\n",
"UPDATE wp_posts\n",
"SET post_content = 'Welcome to WordPress. Ooops, you have been hacked.
'\n",
"WHERE ID = 1;\n",
"```\n",
"\n",
"Re-execute the query in the first shell. The change cannot be seen. The transaction that our `SELECT` statement runs in uses a snapshot of table's data. Now, start a new transaction in the first shell — this time with the isolation level of read uncommitted. See how uncommitted writes are seen, even though they can still get rolled back by the other transaction that performed them."
]
},
{
"cell_type": "markdown",
"id": "71d87dc7",
"metadata": {},
"source": [
"### Dirty writes"
]
},
{
"cell_type": "markdown",
"id": "02faddb0",
"metadata": {},
"source": [
"\n",
" \n",
" \n",
" | transaction 1 | \n",
" transaction 2 | \n",
"
\n",
" \n",
" \n",
" \n",
" | write(row X); | \n",
" | \n",
"
\n",
" \n",
" | \n",
" write(row X); | \n",
"
\n",
" \n",
" | \n",
" commit; | \n",
"
\n",
" \n",
" | commit; | \n",
" | \n",
"
\n",
" \n",
"
"
]
},
{
"cell_type": "markdown",
"id": "e42c13f1",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"A **dirty write** would occur if a transaction overwrote data that another still active transaction had previously written.\n",
"\n",
"All SQL transaction isolation levels disallow dirty writes."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ba3455cf",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT user_login, user_nicename, display_name\n",
"FROM wp_users;"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f8ba0c7c",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT * FROM wp_usermeta\n",
"WHERE meta_key REGEXP 'name$' OR\n",
" meta_key = 'description';"
]
},
{
"cell_type": "markdown",
"id": "a7bd28b0",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"Note that MariaDB uses `REGEX` instead of tilde (`~`) as the regular expression match operator.\n",
"\n",
"Recall that there are multiple places where user's name, nickname and description is kept in the WordPress database. Assume two processes want to change the name of a user. They access the `wp_users` and `wp_usermeta` tables, albeit in different order.\n",
"\n",
"Open two MariaDB shells in a terminal, each with\n",
"\n",
"```shell\n",
"mariadb --port=23306 --host=127.0.0.1 \\\n",
" -pdemo_pwd agh_it_wordpress\n",
"```\n",
"\n",
"In both shells set the default transaction isolation level to the lowest possible one — `READ UNCOMMITTED`.\n",
"\n",
"```sql\n",
"SET @@SESSION.tx_isolation = 'READ-UNCOMMITTED';\n",
"```\n",
"\n",
"Note how assignment to this (MariaDB specific) variable allows us to change the default isolation level for a session instead of using the `SET TRANSACTION ISOLATION LEVEL` command.\n",
"\n",
"In the first shell, execute the following.\n",
"\n",
"```sql\n",
"-- This is transaction 1.\n",
"START TRANSACTION;\n",
"\n",
"UPDATE wp_users\n",
"SET user_login = 'theodore',\n",
" user_nicename = 'theodore',\n",
" display_name = 'Theodore'\n",
"WHERE ID = 1;\n",
"```\n",
"\n",
"In the second, execute the following.\n",
"\n",
"```sql\n",
"-- This is transaction 2 (simultaneous to 1).\n",
"START TRANSACTION;\n",
"\n",
"UPDATE wp_usermeta\n",
"SET meta_value = 'demo_account'\n",
"WHERE (user_id, meta_key) = (1, 'nickname');\n",
"\n",
"UPDATE wp_usermeta\n",
"SET meta_value = 'Some demonstrative admin account.'\n",
"WHERE (user_id, meta_key) = (1, 'description');\n",
"\n",
"UPDATE wp_users\n",
"SET user_nicename = 'Demo Account',\n",
" display_name = 'Demo Account'\n",
"WHERE ID = 1;\n",
"```\n",
"\n",
"The second shell hanged. The `UPDATE` operation on `wp_users` is waiting for a lock held by transaction 1 to be released. Now, try to modify the `wp_usermeta` data in transaction 1.\n",
"\n",
"```sql\n",
"-- This is transaction 1.\n",
"UPDATE wp_usermeta\n",
"SET meta_value = 'A teddy penguin student.'\n",
"WHERE (user_id, meta_key) = (1, 'description');\n",
"```\n",
"\n",
"This operation should also hang, waiting for transaction 2 to finish. But since the two transactions are trying to wait for each other, the DBMS detects a **deadlock** condition and rolls back transaction 1."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "/gnu/store/q35bxk1i5blx0rcgxphhrq9xqmgha9bz-python-3.11.11/bin/python3",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.11"
}
},
"nbformat": 4,
"nbformat_minor": 5
}