{ "cells": [ { "cell_type": "code", "execution_count": null, "id": "a21adafc", "metadata": {}, "outputs": [], "source": [ "import agh_db_lectures\n", "agh_db_lectures.prepare_notebook_for_sql()" ] }, { "cell_type": "markdown", "id": "511c9ea5", "metadata": {}, "source": [ "# Simple SQL queries" ] }, { "cell_type": "markdown", "id": "b6391087", "metadata": {}, "source": [ "_notes_\n", "\n", "- SQL consists of\n", " - a Data Definition Language,\n", " - a Data Manipulation Language, and\n", " - a Data Query Language, which we begin covering here.\n", "- E.g., \"How to obtain the desired information from the DB\"." ] }, { "cell_type": "markdown", "id": "93389e26", "metadata": {}, "source": [ "## The Northwind database" ] }, { "cell_type": "code", "execution_count": null, "id": "40b6824b", "metadata": {}, "outputs": [], "source": [ "agh_db_lectures.nw_diagram.download_display()" ] }, { "cell_type": "code", "execution_count": null, "id": "73dc5561", "metadata": {}, "outputs": [], "source": [ "# Open the diagram in an image viewer for more convenience.\n", "agh_db_lectures.nw_diagram.download_open()" ] }, { "cell_type": "markdown", "id": "67d63ccb", "metadata": {}, "source": [ "_notes_\n", "\n", "- For now, we shall assume the existence of a DB populated with data.\n", "- Northwind: a learning (_toy_) database of a warehouse.\n", "- Made by Microsoft but available under a free license (you can use, modify and share it further)." ] }, { "cell_type": "code", "execution_count": null, "id": "8a146ef1", "metadata": {}, "outputs": [], "source": [ "# Connect to the (locally-hosted) database.\n", "%sql postgresql://demo_user:demo_pwd@localhost:25432/agh_it_northwind" ] }, { "cell_type": "code", "execution_count": null, "id": "052b68af", "metadata": {}, "outputs": [], "source": [ "agh_db_lectures.download_restore_nw_postgres_dump()" ] }, { "cell_type": "markdown", "id": "514cc38e", "metadata": {}, "source": [ "## The \"SELECT\" clause" ] }, { "cell_type": "code", "execution_count": null, "id": "0655d75f", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "-- One of the simplest possible queries in SQL.\n", "SELECT 1;" ] }, { "cell_type": "markdown", "id": "5a1aafb8", "metadata": {}, "source": [ "_notes_\n", "\n", "- The result of an SQL query is always a table.\n", "- It can be minimal or even empty, though.\n", "- Comments start with double minus sign (\"--\").\n", "- The amount of white-space doesn't matter…\n", " - …except where at least a single white-space character is needed to separate words.\n", "- A query can be terminated with a semicolon (\";\").\n", " - It is optional in many cases but necessary in some cases.\n", " - Examples: a script with many SQL commands, an SQL REPL.\n", "- More:\n", " - `SELECT 'This is a string.'`, and\n", " - `SELECT 43.5`." ] }, { "cell_type": "markdown", "id": "dceb1ebc", "metadata": {}, "source": [ "### The \"FROM\" clause" ] }, { "cell_type": "code", "execution_count": null, "id": "6d376dcf", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT -- FILL ME\n", "FROM products" ] }, { "cell_type": "markdown", "id": "b56e951a", "metadata": {}, "source": [ "_notes_\n", "\n", "- Theodore, please pick 3 attributes from the `products` table." ] }, { "cell_type": "code", "execution_count": null, "id": "da4ddfd9", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT * FROM PRODUCTS" ] }, { "cell_type": "markdown", "id": "51c33a98", "metadata": {}, "source": [ "_notes_\n", "\n", "- The asterisk (\"*\") allows us to select all columns of a table.\n", "- Case also does not matter in table names.\n", "- Again, SQL is free-form, we can apply various code formatting conventions, there is no \"one to rule them all\"." ] }, { "cell_type": "code", "execution_count": null, "id": "e47b2338", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT product_name, unit_price * units_in_stock\n", "FROM PRODUCTS" ] }, { "cell_type": "markdown", "id": "87e20cce", "metadata": {}, "source": [ "_notes_\n", "\n", "- Arithmetic operations (and string ops, etc.) can be used with table column names.\n", "- The syntax `unit_price * units_in_stock AS value_in_stock` can be used to rename the column.\n", "- The `AS` can usually be omitted.\n", "- Theodore, what does double quote character («\"») do in your favorite programming language?\n", "- A column name that contains white-space or is the same as language keyword can be wrapped in doube quotes, `unit_price * units_in_stock AS \"SELECT\" -- <- nonsensical but valid`." ] }, { "cell_type": "markdown", "id": "73f81ca8", "metadata": {}, "source": [ "### The \"WHERE\" clause" ] }, { "cell_type": "code", "execution_count": null, "id": "b2f14419", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT product_name, quantity_per_unit, unit_price\n", "FROM products\n", "WHERE unit_price < 10" ] }, { "cell_type": "markdown", "id": "b518da46", "metadata": {}, "source": [ "_notes_\n", "\n", "- `WHERE` limits result rows.\n", "- Comparison operators `<`, `>`, `<=`, `>=`, `=`, and `<>`.\n", "- Logical operators `AND`, and `OR`.\n", "- Column used in `WHERE` need not appear in " ] }, { "cell_type": "markdown", "id": "ca6c0d70", "metadata": {}, "source": [ "#### The \"LIKE\" operator" ] }, { "cell_type": "code", "execution_count": null, "id": "fccef75e", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT product_name, quantity_per_unit, unit_price\n", "FROM products\n", "WHERE quantity_per_unit LIKE '%bottle%'" ] }, { "cell_type": "markdown", "id": "b85484c7", "metadata": {}, "source": [ "#### Regular expression test" ] }, { "cell_type": "code", "execution_count": null, "id": "617ed062", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT product_name, quantity_per_unit, unit_price\n", "FROM products\n", "WHERE quantity_per_unit ~ 'box(es)?$'" ] }, { "cell_type": "markdown", "id": "4ffa401a", "metadata": {}, "source": [ "_notes_\n", "\n", "- Using `$` we excluded the rows that can be viewed with `quantity_per_unit ~ 'box(es)? '`.\n", "- Additionally, strings can also be compared\n", " - for equality with `<>` and `=`, and\n", " - for lexicographic order with `<` & friends." ] }, { "cell_type": "markdown", "id": "54ab8145", "metadata": {}, "source": [ "### Set operations" ] }, { "cell_type": "code", "execution_count": null, "id": "aebe714f", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT supplier_id\n", "FROM products\n", "WHERE quantity_per_unit LIKE '%can%'\n", "\n", "-- SELECT supplier_id\n", "-- FROM suppliers\n", "-- WHERE country = 'Germany'" ] }, { "cell_type": "markdown", "id": "fd9957ed", "metadata": {}, "source": [ "_notes_\n", "\n", "- Suppliers that sell canned products.\n", "- Suppliers from Germany.\n", "- The operations\n", " - `UNION`,\n", " - `INTERSECT`,\n", " - `EXCEPT`.\n", "- Columns with types need to match." ] }, { "cell_type": "code", "execution_count": null, "id": "08b57b1e", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT supplier_id from suppliers where country = 'Germany'" ] }, { "cell_type": "markdown", "id": "7b338004", "metadata": {}, "source": [ "## The \"DISTINCT\" keyword" ] }, { "cell_type": "code", "execution_count": null, "id": "9192b00c", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "\n", "SELECT country\n", "FROM suppliers" ] }, { "cell_type": "markdown", "id": "a1e59c68", "metadata": {}, "source": [ "_notes_\n", "\n", "- Change `SELECT` → `SELECT DISTINCT`." ] }, { "cell_type": "markdown", "id": "9d656206", "metadata": {}, "source": [ "### Different value types" ] }, { "cell_type": "code", "execution_count": null, "id": "5cd1ec6a", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT 'This is a string.'\n", "-- select 43.5\n", "-- SeLeCt CuRrEnT_dAtE\n", "-- SELECT NOW()\n", "-- SELECT TRUE" ] }, { "cell_type": "markdown", "id": "9d96722c", "metadata": {}, "source": [ "_notes_\n", "\n", "- We can experiments with different datatypes like\n", " - floating-point numbers,\n", " - `SELECT DATE '2025-10-07'` — dates,\n", " - `SELECT TIME '12:00'` — times,\n", " - also with time zone (`SELECT TIME WITH TIME ZONE '12:00+02:00'`, this is a different type in standard SQL),\n", " - `SELECT TIMESTAMP '2025-10-14 12:06:49.599321'` — timestamps,\n", " - also with time zone, analogously to times,\n", " - `SELECT INTERVAL '1-2 3 4:05:06'` – time intervals, and\n", " - `FALSE` — booleans.\n", "- Even authors of DBMSes say the date/time types are not well designed.\n", "- Character case does not matter in SQL keywords." ] }, { "cell_type": "markdown", "id": "aeccfa26", "metadata": {}, "source": [ "### Relevant operators and builtin functions" ] }, { "cell_type": "code", "execution_count": null, "id": "5d64e30c", "metadata": { "scrolled": true }, "outputs": [], "source": [ "%%sql\n", "SELECT 'This is ' || 'a concatenated ' || 'string.'\n", "-- SELECT ROUND(4.5)\n", "-- SELECT NOW() - INTERVAL '7 DAYS'" ] }, { "cell_type": "markdown", "id": "b9e61b9e", "metadata": {}, "source": [ "_notes_\n", "\n", "- What does the pipe character (\"|\") do in your favorite programming language?\n", "- SQL features some _builtin functions_ for certain tasks.\n", " - `SELECT SQRT(4*4 + 3*3)`\n", " - `SELECT POW(POW(5, 2) - POW(4, 2), 0.5)`\n", "- Rounding:\n", " - `ROUND()`,\n", " - `FLOOR()`, and\n", " - `CEIING()` (or just `CEIL()`.\n", "- Timestamp to date, etc.:\n", " - `SELECT DATE (NOW() - INTERVAL '7 DAYS')`\n", "- `NOT`\n", "- `SELECT DATE('now') BETWEEN '2025-10-01' AND '2025-11-01'`\n", " - Also `NOT BETWEEN`.\n", " - Inclusive." ] }, { "cell_type": "markdown", "id": "7402c992", "metadata": {}, "source": [ "### The \"IN\" operator" ] }, { "cell_type": "code", "execution_count": null, "id": "5659301b", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT company_name, country\n", "FROM customers\n", "WHERE country = 'Poland' OR country = 'Germany' OR country = 'France'" ] }, { "cell_type": "markdown", "id": "df5a1027", "metadata": {}, "source": [ "_notes_\n", "\n", "- Change the condition to `country IN ('Poland', 'Germany', 'France')`." ] }, { "cell_type": "markdown", "id": "be5d5fde", "metadata": {}, "source": [ "### The \"CASE\" form" ] }, { "cell_type": "code", "execution_count": null, "id": "19678724", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT order_id,\n", " shipped_date,\n", " shipped_date - order_date AS time_to_shipped\n", "-- CASE shipped_date - order_date\n", "-- WHEN 1 THEN '1 day'\n", "-- WHEN 7 THEN '1 week'\n", "-- ELSE shipped_date - order_date || ' days'\n", "-- END AS time_to_shipped\n", "FROM orders" ] }, { "cell_type": "markdown", "id": "6e73d1ac", "metadata": {}, "source": [ "_notes_\n", "\n", "- There are 2 types of `CASE` expression that can be used for conditional logic.\n", "- First is simple `CASE` that matches a value against a list of other possible values." ] }, { "cell_type": "code", "execution_count": null, "id": "928767ee", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT order_id,\n", " shipped_date,\n", " CASE\n", " WHEN shipped_date - order_date = 1 THEN\n", " '1 day'\n", " WHEN shipped_date - order_date < 7 THEN\n", " (shipped_date - order_date) || ' days'\n", " WHEN shipped_date - order_date < 14 THEN\n", " '1 week'\n", " WHEN shipped_date - order_date < 28 THEN\n", " (shipped_date - order_date) / 7 || ' weeks'\n", " WHEN shipped_date - order_date < 56 THEN\n", " '1 month'\n", " ELSE\n", " (shipped_date - order_date) / 28 || ' months'\n", " END AS time_to_shipped\n", "FROM orders" ] }, { "cell_type": "markdown", "id": "328f0710", "metadata": {}, "source": [ "_notes_\n", "\n", "- Second is \"searched\" `CASE` that searches a list of expressions for a truish one.\n", "- It can be also thought of as a simple `CASE` that matches a value of `TRUE` against a list of other possible values.\n", " - Try adding `TRUE` after `CASE` in this example :)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "/gnu/store/q35bxk1i5blx0rcgxphhrq9xqmgha9bz-python-3.11.11/bin/python3", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.11" } }, "nbformat": 4, "nbformat_minor": 5 }