{ "cells": [ { "cell_type": "code", "execution_count": null, "id": "42ee79ee", "metadata": {}, "outputs": [], "source": [ "import agh_db_lectures\n", "agh_db_lectures.prepare_notebook_for_sql()" ] }, { "cell_type": "markdown", "id": "c23a12a4", "metadata": {}, "source": [ "# Object Orientation" ] }, { "cell_type": "markdown", "id": "0a226718", "metadata": {}, "source": [ "Program code of a git forge could have a class hierarchy like this.\n", "\n", "![](classes-uml.svg)" ] }, { "cell_type": "code", "execution_count": null, "id": "f1ea88e7", "metadata": {}, "outputs": [], "source": [ "class Namespace:\n", " def __init__(self, name):\n", " self.name = name\n", " \n", "class User(Namespace):\n", " def __init__(self, name, hashed_pwd, email):\n", " super().__init__(name)\n", " self.hashed_pwd = hashed_pwd\n", " self.email = email\n", "\n", "class Group(Namespace):\n", " def __init__(self, name, members):\n", " super().__init__(name)\n", " self.members = members\n", "\n", "class Repo:\n", " def __init__(self, name, owner_namespace):\n", " self.name = name\n", " self.owner = owner_namespace" ] }, { "cell_type": "markdown", "id": "c8309971", "metadata": {}, "source": [ "_notes_\n", "\n", "OOP works in terms of classes / objects and relationships between them. SQL databases (the most prominent kind of databases) work in terms of tables and foreign key relationships.\n", "\n", "Attempts to develop database technologies that are based on objects rather than tables have spectacularly failed to conquer the industry.\n", "\n", "However, there are some technologies / tools that combine relational databases and object orientation." ] }, { "cell_type": "markdown", "id": "43cc4e33", "metadata": {}, "source": [ "## Object-Relational Mapping" ] }, { "cell_type": "markdown", "id": "364b263c", "metadata": {}, "source": [ "_notes_\n", "\n", "ORM is a framework that allows defining object types in a programming language that are automatically mapped to tables in a relational database.\n", "\n", "ORMs provide high-level object-oriented APIs for interfacing with an underlying database. They are often capable of abstracting away the DBMS incompatibilities (so that, e.g., the same codebase can work with both Postgres and MariaDB)." ] }, { "cell_type": "markdown", "id": "7bcf2668", "metadata": {}, "source": [ "Some of the most popular ORMs are\n", "\n", "- Hibernate (Java)\n", "- Entity Framework (C#)\n", "- **Django (Python)**" ] }, { "cell_type": "markdown", "id": "b2582d2a", "metadata": {}, "source": [ "_notes_\n", "\n", "We shall use Django as our reference ORM." ] }, { "cell_type": "markdown", "id": "7cd45d48", "metadata": {}, "source": [ "### A Bare Django Project" ] }, { "cell_type": "code", "execution_count": null, "id": "2bca9f1e", "metadata": {}, "outputs": [], "source": [ "!psql --port 25432 --quiet -c \"DROP DATABASE object_orientation\" postgres || true\n", "!psql --port 25432 --quiet -c \"CREATE DATABASE object_orientation WITH OWNER demo_user\" postgres || true\n", "%sql postgresql://demo_user:demo_pwd@localhost:25432/object_orientation" ] }, { "cell_type": "code", "execution_count": null, "id": "99e9799c", "metadata": {}, "outputs": [], "source": [ "agh_db_lectures.setup_django_project()" ] }, { "cell_type": "code", "execution_count": null, "id": "16c4fa3b", "metadata": {}, "outputs": [], "source": [ "!ls . demo_site forge" ] }, { "cell_type": "markdown", "id": "97a358c1", "metadata": {}, "source": [ "_notes_\n", "\n", "Django is tailored towards served-side web development. A single Django project (here, `demo_site`) can have one or more \"apps\" in it (here, there's a single app, `forge`)." ] }, { "cell_type": "code", "execution_count": null, "id": "265644e3", "metadata": {}, "outputs": [], "source": [ "!tail -10 demo_site/settings.py" ] }, { "cell_type": "markdown", "id": "82ee9c1a", "metadata": {}, "source": [ "_notes_\n", "\n", "We've configured Django to connect to the same db that we access from this notebook.\n", "\n", "Note that we specified the db backend for Django to use. Django now knows to use a Postgres db connector, Postgres SQL syntax and types when interacting with the db." ] }, { "cell_type": "markdown", "id": "0ca8ff71", "metadata": {}, "source": [ "#### Django Database Initialization (Initial Migration)" ] }, { "cell_type": "markdown", "id": "3cd86d56", "metadata": {}, "source": [ "_notes_\n", "\n", "Django manages the tables for us. We do not need to (in fact, should not, unless we know what we're doing) modify the database schema directly." ] }, { "cell_type": "code", "execution_count": null, "id": "8188f068", "metadata": {}, "outputs": [], "source": [ "!python3 manage.py migrate" ] }, { "cell_type": "code", "execution_count": null, "id": "4d0d9051", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT * FROM pg_tables WHERE schemaname = 'public'" ] }, { "cell_type": "markdown", "id": "ba91910d", "metadata": {}, "source": [ "_notes_\n", "\n", "We see that Django has created some 10 tables for itself." ] }, { "cell_type": "markdown", "id": "82bbc896", "metadata": {}, "source": [ "### A Basic Django Data Model" ] }, { "cell_type": "markdown", "id": "b81ea931", "metadata": {}, "source": [ "Two basic modes of ORMs operation can be named.\n", "\n", "- classes first\n", "- database first" ] }, { "cell_type": "markdown", "id": "319ddabf", "metadata": {}, "source": [ "_notes_\n", "\n", "Some ORMs are capable of analyzing an existing database schema and generating the definitions of programming language classes corresponding to existing tables. This is the **database first approach**.\n", "\n", "We can also write classes and have the ORM generate tables that match these classes. This is the **classes first approach**.\n", "\n", "There are strong proponents of both techniques. We'll rely on the classes first approach for most of our examples." ] }, { "cell_type": "code", "execution_count": null, "id": "86fee7af", "metadata": {}, "outputs": [], "source": [ "!cat forge/models.py" ] }, { "cell_type": "markdown", "id": "e68e518c", "metadata": {}, "source": [ "_notes_\n", "\n", "This is where Django would normally have us write our classes." ] }, { "cell_type": "code", "execution_count": null, "id": "12d6a5b8", "metadata": {}, "outputs": [], "source": [ "with open('forge/models.py', 'wt') as out:\n", " out.write('''\n", "from django.db import models\n", "\n", "class Namespace(models.Model):\n", " name = models.CharField(max_length=50)\n", "''')" ] }, { "cell_type": "markdown", "id": "01c7cb93", "metadata": {}, "source": [ "_notes_\n", "\n", "We define a class (a Django **model**) using `django.db.models.Model` as a base class and `django.db.models.CharField` object as attribute's value.\n", "\n", "Django is capable of creating a database table corresponding to this class." ] }, { "cell_type": "code", "execution_count": null, "id": "18afa814", "metadata": {}, "outputs": [], "source": [ "!python3 manage.py makemigrations" ] }, { "cell_type": "code", "execution_count": null, "id": "884592ab", "metadata": {}, "outputs": [], "source": [ "!cat forge/migrations/0001_initial.py" ] }, { "cell_type": "markdown", "id": "4b86bd47", "metadata": {}, "source": [ "_notes_\n", "\n", "Whe we add or update classes, Django generates migration scripts that **we'd likely want to version-control**. If needed, they can be hand-edited after generation.\n", "\n", "Mere generation of a migration script does not cause any changes to the database. An explicit invocation of a migration does." ] }, { "cell_type": "code", "execution_count": null, "id": "e2ad1355", "metadata": {}, "outputs": [], "source": [ "!python3 manage.py migrate" ] }, { "cell_type": "code", "execution_count": null, "id": "d5efab7f", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT * FROM pg_tables WHERE tablename LIKE '%forge%'" ] }, { "cell_type": "code", "execution_count": null, "id": "150e29e3", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT * FROM forge_namespace" ] }, { "cell_type": "markdown", "id": "f0e0b79c", "metadata": {}, "source": [ "_notes_\n", "\n", "An empty table got created and automatically got a primary key `id` column.\n", "\n", "We can see use `pg_dump` to see the `CREATE TABLE` and `ALTER TABLE` statements corresponding to the table (Django might have used equivalent but different commands to define the table)." ] }, { "cell_type": "code", "execution_count": null, "id": "4757b5bb", "metadata": {}, "outputs": [], "source": [ "!pg_dump --port 25432 object_orientation -t forge_namespace --schema-only" ] }, { "cell_type": "code", "execution_count": null, "id": "d32ca484", "metadata": {}, "outputs": [], "source": [ "agh_db_lectures.prepare_notebook_for_django()" ] }, { "cell_type": "markdown", "id": "f86b544c", "metadata": {}, "source": [ "#### Creating an Object Using Django API" ] }, { "cell_type": "code", "execution_count": null, "id": "b2b56110", "metadata": {}, "outputs": [], "source": [ "from forge.models import Namespace\n", "\n", "some_namespace = Namespace(name='pancratius')\n", "some_namespace" ] }, { "cell_type": "code", "execution_count": null, "id": "f7423268", "metadata": {}, "outputs": [], "source": [ "some_namespace.save()" ] }, { "cell_type": "code", "execution_count": null, "id": "ccca6d07", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT * FROM forge_namespace" ] }, { "cell_type": "markdown", "id": "0b666142", "metadata": {}, "source": [ "#### Updating and Removing an Object Using Django API" ] }, { "cell_type": "code", "execution_count": null, "id": "950ab74e", "metadata": {}, "outputs": [], "source": [ "some_namespace.id" ] }, { "cell_type": "code", "execution_count": null, "id": "fcfe28a4", "metadata": {}, "outputs": [], "source": [ "some_namespace.name = 'agh'" ] }, { "cell_type": "markdown", "id": "56c8e1ff", "metadata": {}, "source": [ "_notes_\n", "\n", "Primary key values are used to distinguish objects. If we change the value of a a non-key attribute, Django understands it to be a modified object and we can persists the change using, once again, the `save()` method." ] }, { "cell_type": "code", "execution_count": null, "id": "5126b336", "metadata": {}, "outputs": [], "source": [ "some_namespace.save()" ] }, { "cell_type": "code", "execution_count": null, "id": "f815fedc", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT * FROM forge_namespace" ] }, { "cell_type": "code", "execution_count": null, "id": "ef6d8a35", "metadata": {}, "outputs": [], "source": [ "some_namespace.name = 'theodore'\n", "some_namespace.id = 2\n", "some_namespace.save()" ] }, { "cell_type": "markdown", "id": "a0875139", "metadata": {}, "source": [ "_notes_\n", "\n", "If we change the primary key attribute of an object, Django understands it to represent an entity distinct from the initial one. It gets persisted in a new table row." ] }, { "cell_type": "code", "execution_count": null, "id": "07b00be2", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT * FROM forge_namespace" ] }, { "cell_type": "code", "execution_count": null, "id": "5d8ce21d", "metadata": {}, "outputs": [], "source": [ "some_namespace.delete()" ] }, { "cell_type": "markdown", "id": "446192f2", "metadata": {}, "source": [ "_notes_\n", "\n", "We can of course remove an object." ] }, { "cell_type": "code", "execution_count": null, "id": "584adfda", "metadata": { "scrolled": true }, "outputs": [], "source": [ "%%sql\n", "SELECT * FROM forge_namespace" ] }, { "cell_type": "markdown", "id": "a9fe6da2", "metadata": {}, "source": [ "### Getting Objects" ] }, { "cell_type": "code", "execution_count": null, "id": "4a931dd4", "metadata": {}, "outputs": [], "source": [ "agh = Namespace.objects.get(name__contains = 'gh')" ] }, { "cell_type": "code", "execution_count": null, "id": "b35c1da5", "metadata": {}, "outputs": [], "source": [ "agh.name" ] }, { "cell_type": "markdown", "id": "dba1e4bf", "metadata": {}, "source": [ "### Changing and Migrating the Model" ] }, { "cell_type": "code", "execution_count": null, "id": "1e1d2f53", "metadata": {}, "outputs": [], "source": [ "from django.core.management import execute_from_command_line" ] }, { "cell_type": "code", "execution_count": null, "id": "5f68ee5d", "metadata": {}, "outputs": [], "source": [ "from django.db import models\n", "\n", "# Danger!\n", "# Definition of a model class inside an interactive Jupyter Notebook.\n", "# DO NOT DO THIS AT HOME!\n", "class Namespace(models.Model):\n", " name = models.CharField(max_length=50,\n", " null=False,\n", " blank=False,\n", " primary_key=True)" ] }, { "cell_type": "markdown", "id": "92c8bb80", "metadata": {}, "source": [ "_notes_\n", "\n", "Django allows us to configure some properties of a field. For example, we can make it non-null or we can make our attribute of choice a primary key.\n", "\n", "The `prepare_notebook_for_django()` function we called later involves a crude hack to make Django consider all interactively defined classes as coming from the `forge.models` python module. Without workarounds one cannot play with model definitions in interactive environments. \n", "\n", "Additionally, for things to work we now need to make and invoke migrations using a function from ` django.core.management` rather than the usual `manage.py` script." ] }, { "cell_type": "code", "execution_count": null, "id": "f43a9330", "metadata": {}, "outputs": [], "source": [ "execute_from_command_line(['manage.py', 'makemigrations'])" ] }, { "cell_type": "code", "execution_count": null, "id": "4c8cd6f5", "metadata": {}, "outputs": [], "source": [ "!cat forge/migrations/0002_remove_namespace_id_alter_namespace_name.py" ] }, { "cell_type": "code", "execution_count": null, "id": "df95384a", "metadata": {}, "outputs": [], "source": [ "execute_from_command_line(['manage.py', 'migrate'])" ] }, { "cell_type": "code", "execution_count": null, "id": "4374a769", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT * FROM forge_namespace" ] }, { "cell_type": "markdown", "id": "e247d94c", "metadata": {}, "source": [ "_notes_\n", "\n", "See how Django has removed the `id` column as part of the migration.\n", "\n", "An automatic `id` attribute is not added if we explicitly name a primary key attribute." ] }, { "cell_type": "markdown", "id": "847983ba", "metadata": {}, "source": [ "### Class Inheritance and One-to-One Relationships" ] }, { "cell_type": "markdown", "id": "165a06aa", "metadata": {}, "source": [ "_notes_\n", "\n", "There can be different goals of inheritance. We might want to create a table that merely has similar columns to some other one. But we might also want to record the information that some entities are further specialized (for example, a namespace has been created as a user account).\n", "\n", "We shall discuss the latter here." ] }, { "cell_type": "code", "execution_count": null, "id": "41680aa5", "metadata": {}, "outputs": [], "source": [ "class User(Namespace):\n", " hashed_pwd = models.CharField(max_length=100)\n", " email = models.CharField(max_length=100,\n", " null=False)" ] }, { "cell_type": "code", "execution_count": null, "id": "930849b2", "metadata": {}, "outputs": [], "source": [ "execute_from_command_line(['manage.py', 'makemigrations'])" ] }, { "cell_type": "code", "execution_count": null, "id": "0f89fe39", "metadata": { "scrolled": true }, "outputs": [], "source": [ "execute_from_command_line(['manage.py', 'migrate'])" ] }, { "cell_type": "code", "execution_count": null, "id": "7f12517f", "metadata": {}, "outputs": [], "source": [ "new_user = User(\n", " name = 'theodore',\n", " hashed_pwd = '$y$j9T$rt0zYkpt3a5PAQlDKStPR1$kWqCbyFiO7/MK0PwXj6XJ34MI0mU7VygLZuEaFxX87.',\n", " email = '543210@student.agh.edu.pl'\n", ")\n", "\n", "new_user.save()" ] }, { "cell_type": "code", "execution_count": null, "id": "7cf5eef0", "metadata": {}, "outputs": [], "source": [ "new_user = User(\n", " name = 'pancratius',\n", " hashed_pwd = '$y$j9T$rt0zYkpt3a5PAQlDKStPR1$kWqCbyFiO7/MK0PwXj6XJ34MI0mU7VygLZuEaFxX87.',\n", " email = 'frying_pan_cratius@example.com'\n", ")\n", "\n", "new_user.save()" ] }, { "cell_type": "code", "execution_count": null, "id": "102761bb", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT * FROM forge_user" ] }, { "cell_type": "code", "execution_count": null, "id": "68ea5c51", "metadata": {}, "outputs": [], "source": [ "query_set = Namespace.objects.all()\n", "query_set" ] }, { "cell_type": "code", "execution_count": null, "id": "a679607a", "metadata": {}, "outputs": [], "source": [ "query_set = Namespace.objects.all().order_by('name')\n", "query_set" ] }, { "cell_type": "code", "execution_count": null, "id": "5f16af21", "metadata": {}, "outputs": [], "source": [ "theodore = query_set[2]\n", "theodore" ] }, { "cell_type": "code", "execution_count": null, "id": "373f220b", "metadata": {}, "outputs": [], "source": [ "theodore_as_user = theodore.user\n", "theodore_as_user, theodore_as_user.email" ] }, { "cell_type": "markdown", "id": "27157393", "metadata": {}, "source": [ "_notes_\n", "\n", "The following shall fail as the \"agh\" namespace is not a user." ] }, { "cell_type": "code", "execution_count": null, "id": "924d4218", "metadata": {}, "outputs": [], "source": [ "agh = query_set[0]\n", "agh.user" ] }, { "cell_type": "markdown", "id": "45c0cd47", "metadata": {}, "source": [ "_notes_\n", "\n", "We added the users and they now appear in both `forge_user` and `forge_namespace` tables. Django keeps these in sync." ] }, { "cell_type": "code", "execution_count": null, "id": "be3fc642", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT * FROM forge_namespace" ] }, { "cell_type": "markdown", "id": "13d55a46", "metadata": {}, "source": [ "_notes_\n", "\n", "Not that in canse of Django it is common to make use of Django's own user model and create a one-to-one mapping to it from one's own. We're not doing this in our examples." ] }, { "cell_type": "markdown", "id": "58f20298", "metadata": {}, "source": [ "#### Manualy Defined Column Name in One-to-One Mapping" ] }, { "cell_type": "code", "execution_count": null, "id": "a432e673", "metadata": {}, "outputs": [], "source": [ "class Group(Namespace):\n", " name_ptr = models.OneToOneField(Namespace,\n", " on_delete=models.CASCADE,\n", " parent_link=True,\n", " primary_key=True,\n", " # default: `namespace_ptr_id'\n", " db_column='name')\n", "\n", "execute_from_command_line(['manage.py', 'makemigrations'])" ] }, { "cell_type": "markdown", "id": "451b5067", "metadata": {}, "source": [ "_notes_\n", "\n", "The name_ptr field is identical to the one that would be automatically created, but we can, for example, change its database column name." ] }, { "cell_type": "code", "execution_count": null, "id": "7452f76c", "metadata": {}, "outputs": [], "source": [ "execute_from_command_line(['manage.py', 'migrate'])" ] }, { "cell_type": "code", "execution_count": null, "id": "422da6c1", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT * FROM forge_group\n", "-- SELECT * FROM forge_user" ] }, { "cell_type": "markdown", "id": "264a35fb", "metadata": {}, "source": [ "_notes_\n", "\n", "Note that it is possible to alter data using SQL and observe the effects with Django." ] }, { "cell_type": "code", "execution_count": null, "id": "7e023714", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "INSERT INTO forge_group (name)\n", "VALUES ('agh')" ] }, { "cell_type": "code", "execution_count": null, "id": "b1b3b97d", "metadata": {}, "outputs": [], "source": [ "Group.objects.get()" ] }, { "cell_type": "markdown", "id": "b12a8710", "metadata": {}, "source": [ "### Renaming a Table" ] }, { "cell_type": "code", "execution_count": null, "id": "b5cbc42f", "metadata": {}, "outputs": [], "source": [ "class Group(Namespace):\n", " name_ptr = models.OneToOneField(Namespace,\n", " on_delete=models.CASCADE,\n", " parent_link=True,\n", " primary_key=True,\n", " # default: `namespace_ptr_id'\n", " db_column='name')\n", "\n", " class Meta:\n", " db_table = 'group'\n", "\n", "execute_from_command_line(['manage.py', 'makemigrations'])" ] }, { "cell_type": "code", "execution_count": null, "id": "d61e71a3", "metadata": {}, "outputs": [], "source": [ "execute_from_command_line(['manage.py', 'migrate'])" ] }, { "cell_type": "code", "execution_count": null, "id": "068e66bd", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT * FROM forge_group -- Not exists." ] }, { "cell_type": "markdown", "id": "b09abf34", "metadata": {}, "source": [ "_notes_\n", "\n", "Change table name to `\"group\"` — that's how we got it renamed.\n", "\n", "Note that we had to quote `group` as it is an SQL keywords. Django handles quoting of table and column names automatically." ] }, { "cell_type": "markdown", "id": "97106f72", "metadata": {}, "source": [ "### Many-to-Many Fields" ] }, { "cell_type": "code", "execution_count": null, "id": "4d89fa7d", "metadata": {}, "outputs": [], "source": [ "class Group(Namespace):\n", " name_ptr = models.OneToOneField(Namespace,\n", " on_delete=models.CASCADE,\n", " parent_link=True,\n", " primary_key=True,\n", " # default: `namespace_ptr_id'\n", " db_column='name')\n", "\n", " members = models.ManyToManyField('forge.User')\n", "\n", " class Meta:\n", " db_table = 'group'\n", "\n", "execute_from_command_line(['manage.py', 'makemigrations'])" ] }, { "cell_type": "code", "execution_count": null, "id": "76038ff7", "metadata": {}, "outputs": [], "source": [ "execute_from_command_line(['manage.py', 'migrate'])" ] }, { "cell_type": "code", "execution_count": null, "id": "f7e07dd8", "metadata": {}, "outputs": [], "source": [ "agh = Group.objects.get()\n", "agh.members.add(User.objects.get(name='pancratius'))" ] }, { "cell_type": "markdown", "id": "e18104a2", "metadata": {}, "source": [ "_notes_\n", "\n", "Note that this is idempotent." ] }, { "cell_type": "code", "execution_count": null, "id": "0b74c788", "metadata": {}, "outputs": [], "source": [ "agh = Group.objects.get()\n", "agh.members.add(User.objects.get(name='pancratius'))\n", "agh.members.add(User.objects.get(name='theodore'))" ] }, { "cell_type": "code", "execution_count": null, "id": "a2e5a6dd", "metadata": {}, "outputs": [], "source": [ "agh.members.all()" ] }, { "cell_type": "markdown", "id": "7d991337", "metadata": {}, "source": [ "_notes_\n", "\n", "Django has automatically created a joiner table for us." ] }, { "cell_type": "code", "execution_count": null, "id": "75b9451d", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT * FROM pg_tables\n", "WHERE tablename LIKE '%member%'" ] }, { "cell_type": "code", "execution_count": null, "id": "a6cb6e93", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT * FROM group_members" ] }, { "cell_type": "markdown", "id": "f2a49a1b", "metadata": {}, "source": [ "### Foreign Keys (Many-to-One Relationships)" ] }, { "cell_type": "code", "execution_count": null, "id": "dd7fb8e6", "metadata": {}, "outputs": [], "source": [ "class Repo(models.Model):\n", " namespace = models.ForeignKey(Namespace,\n", " models.CASCADE,\n", " blank=False,\n", " null=False)\n", " repo_name = models.CharField(max_length=50,\n", " null=False,\n", " blank=False)\n", "\n", "execute_from_command_line(['manage.py', 'makemigrations'])" ] }, { "cell_type": "markdown", "id": "554328a2", "metadata": {}, "source": [ "_notes_\n", "\n", "The convention is to name a foreign key field the same as the referenced class (but in lower case, etc.). \n", "\n", "Be aware that ORM might sometimes take some respoisibilities away from the db and do things by itself. When we specify `CASCADE` on a foreign key field, as above, this does not mean Django will make it a real foreign key with automatic cascade deletion. We might get surprised if we try using SQL directly." ] }, { "cell_type": "code", "execution_count": null, "id": "508c29c1", "metadata": {}, "outputs": [], "source": [ "execute_from_command_line(['manage.py', 'migrate'])" ] }, { "cell_type": "code", "execution_count": null, "id": "26b74780", "metadata": {}, "outputs": [], "source": [ "lectures_repo = Repo(namespace=agh, repo_name='agh_db_lectures')\n", "lectures_repo" ] }, { "cell_type": "code", "execution_count": null, "id": "e6c69148", "metadata": {}, "outputs": [], "source": [ "print(lectures_repo.id)" ] }, { "cell_type": "code", "execution_count": null, "id": "97bc871f", "metadata": {}, "outputs": [], "source": [ "lectures_repo.save()" ] }, { "cell_type": "code", "execution_count": null, "id": "4caf28a5", "metadata": {}, "outputs": [], "source": [ "print(lectures_repo.id)" ] }, { "cell_type": "code", "execution_count": null, "id": "7108a88b", "metadata": {}, "outputs": [], "source": [ "Repo.objects.get().namespace.group.members.all()" ] }, { "cell_type": "code", "execution_count": null, "id": "cb4d5093", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT * FROM forge_repo" ] }, { "cell_type": "markdown", "id": "6f0f5876", "metadata": {}, "source": [ "#### \"Fake\" \"CASCADE\"" ] }, { "cell_type": "markdown", "id": "203a7bc0", "metadata": {}, "source": [ "_notes_\n", "\n", "Note that even though we specified `models.CASCADE` in foreign key's definition the Postgres table dump shows no use of `CASCADE` in the foreign key's definition." ] }, { "cell_type": "code", "execution_count": null, "id": "28cb20bf", "metadata": { "scrolled": true }, "outputs": [], "source": [ "!pg_dump --port 25432 object_orientation -t forge_repo --schema-only" ] }, { "cell_type": "markdown", "id": "75a6ac48", "metadata": {}, "source": [ "_notes_\n", "\n", "Let's verify that when using SQL (rather than Django APIs) instead of `CASCADE` behavior, we have the defaul behavior of reference record deletions being inhibited." ] }, { "cell_type": "code", "execution_count": null, "id": "da9a2881", "metadata": {}, "outputs": [], "source": [ "iet = Namespace(name='iet')\n", "iet.save()" ] }, { "cell_type": "code", "execution_count": null, "id": "b63fb70a", "metadata": {}, "outputs": [], "source": [ "fork_repo = Repo(namespace=iet, repo_name='agh_db_lectures_fork')\n", "fork_repo.save()" ] }, { "cell_type": "code", "execution_count": null, "id": "06a0edac", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "DELETE FROM forge_namespace\n", "WHERE name = 'iet'" ] }, { "cell_type": "markdown", "id": "50206d18", "metadata": {}, "source": [ "_notes_\n", "\n", "The above fails with foreign key violation instead of cascadingly removing the repo row." ] }, { "cell_type": "code", "execution_count": null, "id": "08c20f4b", "metadata": {}, "outputs": [], "source": [ "iet.delete()" ] }, { "cell_type": "markdown", "id": "f3477c5f", "metadata": {}, "source": [ "_notes_\n", "\n", "Django does, however, perform cascading removals when its OO API is used." ] }, { "cell_type": "code", "execution_count": null, "id": "22400794", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "-- The fork repo row got removed upon `iet.delete()'.\n", "SELECT * FROM forge_repo" ] }, { "cell_type": "markdown", "id": "d53f5d0a", "metadata": {}, "source": [ "### Using Raw SQL from Django" ] }, { "cell_type": "code", "execution_count": null, "id": "9938d309", "metadata": {}, "outputs": [], "source": [ "list(User.objects.raw('SELECT * FROM forge_user'))" ] }, { "cell_type": "code", "execution_count": null, "id": "4d8701f2", "metadata": {}, "outputs": [], "source": [ "list(User.objects.raw('''\n", " SELECT 'eugenia' AS namespace_ptr_id,\n", " '555555@student.agh.edu.pl' AS email,\n", " NULL AS hashed_pwd\n", "'''))" ] }, { "cell_type": "markdown", "id": "f43995ab", "metadata": {}, "source": [ "_notes_\n", "\n", "Note that order of raw query columns does not matter, only column names." ] }, { "cell_type": "markdown", "id": "768b0d44", "metadata": {}, "source": [ "### Deferring Fetching of Field Values" ] }, { "cell_type": "markdown", "id": "e8340302", "metadata": {}, "source": [ "What if seldom needed fields contain large values?" ] }, { "cell_type": "markdown", "id": "9ae248f1", "metadata": {}, "source": [ "_notes_\n", "\n", "Consider a table that has vary many columns with only a few of them needed for certain operation.\n", "\n", "Consider large binary blobs (e.g., photos) stored in a column of a table with users." ] }, { "cell_type": "code", "execution_count": null, "id": "18ae3e0c", "metadata": {}, "outputs": [], "source": [ "theodore = User.objects.defer('email', 'hashed_pwd').all()[0]\n", "theodore" ] }, { "cell_type": "code", "execution_count": null, "id": "b094a398", "metadata": {}, "outputs": [], "source": [ "theodore.__dict__" ] }, { "cell_type": "code", "execution_count": null, "id": "a89bba3c", "metadata": {}, "outputs": [], "source": [ "theodore.hashed_pwd" ] }, { "cell_type": "code", "execution_count": null, "id": "7386d17b", "metadata": {}, "outputs": [], "source": [ "theodore.__dict__" ] }, { "cell_type": "markdown", "id": "4ba4300b", "metadata": {}, "source": [ "_notes_\n", "\n", "We see that the deferred field `hashed_pwd` got fetched lazily. We also see that the other deferred field, `email` did not get fetched yet because it has not been used so far." ] }, { "cell_type": "markdown", "id": "fe0b83ac", "metadata": {}, "source": [ "## Elements of Object Orientation in SQL" ] }, { "cell_type": "markdown", "id": "4a0305cf", "metadata": {}, "source": [ "### User-Defined Types" ] }, { "cell_type": "code", "execution_count": null, "id": "6bd701d0", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "CREATE TYPE hashed_password AS (\n", " salt VARCHAR(20),\n", " hash VARCHAR(150)\n", ")" ] }, { "cell_type": "markdown", "id": "b176eb42", "metadata": {}, "source": [ "_notes_\n", "\n", "The SQL standard allows defining custom composite types." ] }, { "cell_type": "code", "execution_count": null, "id": "265a85f3", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "CREATE TABLE users_secure(\n", " name VARCHAR(50) PRIMARY KEY,\n", " pwd hashed_password\n", ")" ] }, { "cell_type": "markdown", "id": "e1d86e5a", "metadata": {}, "source": [ "_notes_\n", "\n", "A composite type can be used as a column type. Not that, wrt our formal theory, such table schema is not in 1NF. The first normal form requires that all attributes are atomic." ] }, { "cell_type": "code", "execution_count": null, "id": "68ec6a4f", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "INSERT INTO users_secure (name, pwd)\n", "VALUES (\n", " 'theodore',\n", " ROW('219fE02', 'kWqCbyFiO7/MK0PwXj6XJ34MI0mU7VygLZuE')\n", ")" ] }, { "cell_type": "markdown", "id": "6ec4f080", "metadata": {}, "source": [ "_notes_\n", "\n", "We can create a composite values using, among others, the `ROW` constructor." ] }, { "cell_type": "code", "execution_count": null, "id": "094193dc", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT * FROM users_secure" ] }, { "cell_type": "code", "execution_count": null, "id": "c7b59fbe", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT name,\n", " (pwd).*,\n", " (pwd).salt || '$' || (pwd).hash AS salt_hash\n", "FROM users_secure" ] }, { "cell_type": "markdown", "id": "ab6dcb7a", "metadata": {}, "source": [ "_notes_\n", "\n", "Parentheses around composite value column name are necessary because `pwd.sth` would be interpreted as a reference to (nonexistent) table `pwd` rather than column `pwd`.\n", "\n", "Note that `*` in the following could be viewed as a special case of `users_secure.*`. The two snippets below are equivalent.\n", "\n", "```sql\n", "SELECT * FROM users_secure;\n", "```\n", "\n", "```sql\n", "SELECT users_secure.* FROM users_secure;\n", "```" ] }, { "cell_type": "code", "execution_count": null, "id": "a3acc87b", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "\n", "SELECT users_secure.* FROM users_secure;" ] }, { "cell_type": "markdown", "id": "47030217", "metadata": {}, "source": [ "#### Inheritance of User-Defined Types" ] }, { "cell_type": "markdown", "id": "2a10924f", "metadata": {}, "source": [ "```sql\n", "CREATE TYPE salted_hash_w_alg\n", " UNDER salted_hash (\n", " algorithm VARCHAR(5)\n", " )\n", "```" ] }, { "cell_type": "markdown", "id": "e978ed7f", "metadata": {}, "source": [ "_notes_\n", "\n", "A child type would have all the fields of the parent plus its own fields.\n", "\n", "Standard defines it but Postgres does not currently supports this." ] }, { "cell_type": "markdown", "id": "7a22d17b", "metadata": {}, "source": [ "### Inheritance of Postgres Tables" ] }, { "cell_type": "code", "execution_count": null, "id": "04519f8e", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "CREATE TABLE admins (\n", " admin_panel_dark_mode BOOLEAN\n", ") INHERITS (users_secure)" ] }, { "cell_type": "code", "execution_count": null, "id": "aea877d6", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "INSERT INTO admins (name, pwd, admin_panel_dark_mode)\n", "VALUES (\n", " 'pancratius',\n", " ROW('G6dHW0H', 'aksGA3JHSB8JGSJaLbs&J34MI0mU7VygLZuE'),\n", " TRUE\n", ")" ] }, { "cell_type": "code", "execution_count": null, "id": "dc11645d", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT * FROM admins" ] }, { "cell_type": "markdown", "id": "3d50ea5f", "metadata": {}, "source": [ "_notes_\n", "\n", "A child table has all the columns of its parent plus its own extra columns." ] }, { "cell_type": "code", "execution_count": null, "id": "ee3d4ecc", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT * FROM users_secure" ] }, { "cell_type": "markdown", "id": "7eb92151", "metadata": {}, "source": [ "_notes_\n", "\n", "All child rows are visible through the parent, even though the parent and child tables are stored separately by Postgres." ] }, { "cell_type": "code", "execution_count": null, "id": "ee3cc86c", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT * FROM ONLY users_secure" ] }, { "cell_type": "markdown", "id": "71240f66", "metadata": {}, "source": [ "_notes_\n", "\n", "The `ONLY` keyword can be used in `FROM` to have Postgres fetch just the rows of the parent." ] }, { "cell_type": "code", "execution_count": null, "id": "7fd8cf69", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "INSERT INTO users_secure (name, pwd)\n", "VALUES (\n", " 'pancratius',\n", " ROW('G6dHW0H', 'aksGA3JHSB8JGSJaLbs&J34MI0mU7VygLZuE')\n", ");" ] }, { "cell_type": "markdown", "id": "5d03523d", "metadata": {}, "source": [ "_notes_\n", "\n", "Even if the parent has a `UNIQUE` (or `PRIMARY KEY`) constraint on some columns, duplicates can appear in child tables." ] }, { "cell_type": "code", "execution_count": null, "id": "89f40882", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT users_secure.* FROM users_secure" ] }, { "cell_type": "markdown", "id": "95e562c4", "metadata": {}, "source": [ "_notes_\n", "\n", "We can reference the special column `tableoid` to see which table each row really comes from." ] }, { "cell_type": "code", "execution_count": null, "id": "9ee836ee", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SELECT users_secure.tableoid, users_secure.*\n", "FROM users_secure" ] }, { "cell_type": "markdown", "id": "48798910", "metadata": {}, "source": [ "_notes_\n", "\n", "Also note that, besides counter-intuitive handling of uniqueness constraints, Postgres table inheritance doesn't allow foreign key constraints to be inherited. Additionally, a foreign key constraint targeting the parent does not \"see\" the rows in its children.\n", "\n", "It is possible to inherit from multiple parent tables at the same time." ] }, { "cell_type": "markdown", "id": "4199673c", "metadata": {}, "source": [ "#### Table Inheritance in SQL Standard" ] }, { "cell_type": "markdown", "id": "b0841998", "metadata": {}, "source": [ "```sql\n", "CREATE TABLE admins (\n", " admin_panel_dark_mode BOOLEAN\n", ") UNDER users_secure\n", "```" ] }, { "cell_type": "markdown", "id": "2aaa31b3", "metadata": {}, "source": [ "_notes_\n", "\n", "It is possible to inherit from multiple parents as well.\n", "\n", "Many DBMSes do not implement table inheritance." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "/gnu/store/q35bxk1i5blx0rcgxphhrq9xqmgha9bz-python-3.11.11/bin/python3", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.11" } }, "nbformat": 4, "nbformat_minor": 5 }