{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"id": "a21adafc",
"metadata": {},
"outputs": [],
"source": [
"import agh_db_lectures\n",
"agh_db_lectures.prepare_notebook_for_sql()"
]
},
{
"cell_type": "markdown",
"id": "511c9ea5",
"metadata": {},
"source": [
"# Simple SQL queries"
]
},
{
"cell_type": "markdown",
"id": "b6391087",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"- SQL consists of\n",
" - a Data Definition Language,\n",
" - a Data Manipulation Language, and\n",
" - a Data Query Language, which we begin covering here.\n",
"- E.g., \"How to obtain the desired information from the DB\"."
]
},
{
"cell_type": "markdown",
"id": "93389e26",
"metadata": {},
"source": [
"## The Northwind database"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "40b6824b",
"metadata": {},
"outputs": [],
"source": [
"agh_db_lectures.nw_diagram.download_display()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "73dc5561",
"metadata": {},
"outputs": [],
"source": [
"# Open the diagram in an image viewer for more convenience.\n",
"agh_db_lectures.nw_diagram.download_open()"
]
},
{
"cell_type": "markdown",
"id": "67d63ccb",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"- For now, we shall assume the existence of a DB populated with data.\n",
"- Northwind: a learning (_toy_) database of a warehouse.\n",
"- Made by Microsoft but available under a free license (you can use, modify and share it further)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8a146ef1",
"metadata": {},
"outputs": [],
"source": [
"# Connect to the (locally-hosted) database.\n",
"%sql postgresql://demo_user:demo_pwd@localhost:25432/agh_it_northwind\n",
"# %sql mysql:///agh_it_northwind?unix_socket=/var/run/mysql/mysql.sock"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "052b68af",
"metadata": {},
"outputs": [],
"source": [
"agh_db_lectures.download_restore_nw_postgres_dump()\n",
"# agh_db_lectures.download_restore_nw_mariadb_dump()"
]
},
{
"cell_type": "markdown",
"id": "514cc38e",
"metadata": {},
"source": [
"## The \"SELECT\" clause"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0655d75f",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"-- One of the simplest possible queries in SQL.\n",
"SELECT 1;"
]
},
{
"cell_type": "markdown",
"id": "5a1aafb8",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"- The result of an SQL query is always a table.\n",
"- It can be minimal or even empty, though.\n",
"- Comments start with double minus sign (\"--\").\n",
"- The amount of white-space doesn't matter…\n",
" - …except where at least a single white-space character is needed to separate words.\n",
"- A query can be terminated with a semicolon (\";\").\n",
" - It is optional in many cases but necessary in some cases.\n",
" - Examples: a script with many SQL commands, an SQL REPL.\n",
"- More:\n",
" - `SELECT 'This is a string.'`, and\n",
" - `SELECT 43.5`."
]
},
{
"cell_type": "markdown",
"id": "dceb1ebc",
"metadata": {},
"source": [
"### The \"FROM\" clause"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6d376dcf",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT -- FILL ME\n",
"FROM products"
]
},
{
"cell_type": "markdown",
"id": "b56e951a",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"- Theodore, please pick 3 attributes from the `products` table."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "da4ddfd9",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT * FROM PRODUCTS"
]
},
{
"cell_type": "markdown",
"id": "51c33a98",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"- The asterisk (\"*\") allows us to select all columns of a table.\n",
"- Case also does not matter in table names.\n",
"- Again, SQL is free-form, we can apply various code formatting conventions, there is no \"one to rule them all\"."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e47b2338",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT product_name, unit_price * units_in_stock\n",
"FROM PRODUCTS"
]
},
{
"cell_type": "markdown",
"id": "87e20cce",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"- Arithmetic operations (and string ops, etc.) can be used with table column names.\n",
"- The syntax `unit_price * units_in_stock AS value_in_stock` can be used to rename the column.\n",
"- The `AS` can usually be omitted.\n",
"- Theodore, what does double quote character («\"») do in your favorite programming language?\n",
"- A column name that contains white-space or is the same as language keyword can be wrapped in doube quotes, `unit_price * units_in_stock AS \"SELECT\" -- <- nonsensical but valid`."
]
},
{
"cell_type": "markdown",
"id": "73f81ca8",
"metadata": {},
"source": [
"### The \"WHERE\" clause"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b2f14419",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT product_name, quantity_per_unit, unit_price\n",
"FROM products\n",
"WHERE unit_price < 10"
]
},
{
"cell_type": "markdown",
"id": "b518da46",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"- `WHERE` limits result rows.\n",
"- Comparison operators `<`, `>`, `<=`, `>=`, `=`, and `<>`.\n",
"- Logical operators `AND`, and `OR`.\n",
"- Column used in `WHERE` need not appear in "
]
},
{
"cell_type": "markdown",
"id": "ca6c0d70",
"metadata": {},
"source": [
"#### The \"LIKE\" operator"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fccef75e",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT product_name, quantity_per_unit, unit_price\n",
"FROM products\n",
"WHERE quantity_per_unit LIKE '%bottle%'"
]
},
{
"cell_type": "markdown",
"id": "b85484c7",
"metadata": {},
"source": [
"#### Regular expression test"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "617ed062",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT product_name, quantity_per_unit, unit_price\n",
"FROM products\n",
"WHERE quantity_per_unit ~ 'box(es)?$'"
]
},
{
"cell_type": "markdown",
"id": "4ffa401a",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"- Using `$` we excluded the rows that can be viewed with `quantity_per_unit ~ 'box(es)? '`.\n",
"- Additionally, strings can also be compared\n",
" - for equality with `<>` and `=`, and\n",
" - for lexicographic order with `<` & friends."
]
},
{
"cell_type": "markdown",
"id": "54ab8145",
"metadata": {},
"source": [
"### Set operations"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aebe714f",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT supplier_id\n",
"FROM products\n",
"WHERE quantity_per_unit LIKE '%can%'\n",
"\n",
"-- SELECT supplier_id\n",
"-- FROM suppliers\n",
"-- WHERE country = 'Germany'"
]
},
{
"cell_type": "markdown",
"id": "fd9957ed",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"- Suppliers that sell canned products.\n",
"- Suppliers from Germany.\n",
"- The operations\n",
" - `UNION`,\n",
" - `INTERSECT`,\n",
" - `EXCEPT`.\n",
"- Columns with types need to match."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "08b57b1e",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT supplier_id from suppliers where country = 'Germany'"
]
},
{
"cell_type": "markdown",
"id": "7b338004",
"metadata": {},
"source": [
"## The \"DISTINCT\" keyword"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9192b00c",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"\n",
"SELECT country\n",
"FROM suppliers"
]
},
{
"cell_type": "markdown",
"id": "a1e59c68",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"- Change `SELECT` → `SELECT DISTINCT`."
]
},
{
"cell_type": "markdown",
"id": "9d656206",
"metadata": {},
"source": [
"### Different value types"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5cd1ec6a",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT 'This is a string.'\n",
"-- select 43.5\n",
"-- SeLeCt CuRrEnT_dAtE\n",
"-- SELECT NOW()\n",
"-- SELECT TRUE"
]
},
{
"cell_type": "markdown",
"id": "9d96722c",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"- We can experiments with different datatypes like\n",
" - floating-point numbers,\n",
" - `SELECT DATE '2025-10-07'` — dates,\n",
" - `SELECT TIME '12:00'` — times,\n",
" - also with time zone (`SELECT TIME WITH TIME ZONE '12:00+02:00'`, this is a different type in standard SQL),\n",
" - `SELECT TIMESTAMP '2025-10-14 12:06:49.599321'` — timestamps,\n",
" - also with time zone, analogously to times,\n",
" - `SELECT INTERVAL '1-2 3 4:05:06'` – time intervals, and\n",
" - `FALSE` — booleans.\n",
"- Even authors of DBMSes say the date/time types are not well designed.\n",
"- Character case does not matter in SQL keywords."
]
},
{
"cell_type": "markdown",
"id": "aeccfa26",
"metadata": {},
"source": [
"### Relevant operators and builtin functions"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5d64e30c",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"%%sql\n",
"SELECT 'This is ' || 'a concatenated ' || 'string.'\n",
"-- SELECT ROUND(4.5)\n",
"-- SELECT NOW() - INTERVAL '7 DAYS'"
]
},
{
"cell_type": "markdown",
"id": "b9e61b9e",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"- What does the pipe character (\"|\") do in your favorite programming language?\n",
"- SQL features some _builtin functions_ for certain tasks.\n",
" - `SELECT SQRT(4*4 + 3*3)`\n",
" - `SELECT POW(POW(5, 2) - POW(4, 2), 0.5)`\n",
"- Rounding:\n",
" - `ROUND()`,\n",
" - `FLOOR()`, and\n",
" - `CEIING()` (or just `CEIL()`.\n",
"- Timestamp to date, etc.:\n",
" - `SELECT DATE (NOW() - INTERVAL '7 DAYS')`\n",
"- `NOT`\n",
"- `SELECT DATE('now') BETWEEN '2025-10-01' AND '2025-11-01'`\n",
" - Also `NOT BETWEEN`.\n",
" - Inclusive."
]
},
{
"cell_type": "markdown",
"id": "7402c992",
"metadata": {},
"source": [
"### The \"IN\" operator"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5659301b",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT company_name, country\n",
"FROM customers\n",
"WHERE country = 'Poland' OR country = 'Germany' OR country = 'France'"
]
},
{
"cell_type": "markdown",
"id": "df5a1027",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"- Change the condition to `country IN ('Poland', 'Germany', 'France')`."
]
},
{
"cell_type": "markdown",
"id": "be5d5fde",
"metadata": {},
"source": [
"### The \"CASE\" form"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "19678724",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT order_id,\n",
" shipped_date,\n",
" shipped_date - order_date AS time_to_shipped\n",
"-- CASE shipped_date - order_date\n",
"-- WHEN 1 THEN '1 day'\n",
"-- WHEN 7 THEN '1 week'\n",
"-- ELSE shipped_date - order_date || ' days'\n",
"-- END AS time_to_shipped\n",
"FROM orders"
]
},
{
"cell_type": "markdown",
"id": "6e73d1ac",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"- There are 2 types of `CASE` expression that can be used for conditional logic.\n",
"- First is simple `CASE` that matches a value against a list of other possible values."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "928767ee",
"metadata": {},
"outputs": [],
"source": [
"%%sql\n",
"SELECT order_id,\n",
" shipped_date,\n",
" CASE\n",
" WHEN shipped_date - order_date = 1 THEN\n",
" '1 day'\n",
" WHEN shipped_date - order_date < 7 THEN\n",
" (shipped_date - order_date) || ' days'\n",
" WHEN shipped_date - order_date < 14 THEN\n",
" '1 week'\n",
" WHEN shipped_date - order_date < 28 THEN\n",
" (shipped_date - order_date) / 7 || ' weeks'\n",
" WHEN shipped_date - order_date < 56 THEN\n",
" '1 month'\n",
" ELSE\n",
" (shipped_date - order_date) / 28 || ' months'\n",
" END AS time_to_shipped\n",
"FROM orders"
]
},
{
"cell_type": "markdown",
"id": "328f0710",
"metadata": {},
"source": [
"_notes_\n",
"\n",
"- Second is \"searched\" `CASE` that searches a list of expressions for a truish one.\n",
"- It can be also thought of as a simple `CASE` that matches a value of `TRUE` against a list of other possible values.\n",
" - Try adding `TRUE` after `CASE` in this example :)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "/gnu/store/q35bxk1i5blx0rcgxphhrq9xqmgha9bz-python-3.11.11/bin/python3",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.11"
}
},
"nbformat": 4,
"nbformat_minor": 5
}