Lesson 4, Topic 2

In Progress

← Previous Next→

Databases and Data Modelling

isaac March 12, 2026

Lesson Progress

0% Complete

Websites become more useful when they can remember information: users, products, blog posts, bookings, messages, and more. This is what databases are for. In this topic, you will learn the core database types used in web development, how to model your data, and how your back-end connects to a database to persist (save) and retrieve data safely.

1) What a Database Does (and why it matters)

A database is a system for storing data so it can be:

Created (saved)
Read (retrieved)
Updated (changed)
Deleted (removed)

These are often called CRUD operations.

In a typical website:

The front-end collects input and shows results.
The back-end applies rules (business logic), checks permissions, and talks to the database.
The database stores the data reliably.

2) Relational vs Document Databases (two common approaches)

There are many database types, but most web applications start with either relational (SQL) or document (NoSQL) databases.

Relational Databases (SQL)

Examples: PostgreSQL, MySQL, MariaDB, Microsoft SQL Server

Key ideas:

Data is stored in tables (like spreadsheets).
Tables have rows (records) and columns (fields).
Relationships between tables are expressed using keys.

Best suited for:

Strong consistency and correctness (e.g. payments, bookings)
Data with clear relationships (users → orders → order items)
Reporting and complex queries (joins, aggregations)

Common terms:

Schema: the structure of your tables and columns
Primary Key (PK): unique identifier for each row (often id)
Foreign Key (FK): a reference to a row in another table (e.g. orders.user_id)

Document Databases (NoSQL)

Examples: MongoDB, Firestore

Key ideas:

Data is stored as documents, usually JSON-like objects.
Documents live inside collections (similar to tables, but more flexible).
Documents can be nested, which can reduce the need for joins.

Best suited for:

Fast-moving or flexible data structures
Content-focused apps (articles, comments, profiles)
Some event/log style data

Important trade-off:

Document databases often make it easier to start quickly, but you must be disciplined to avoid messy, inconsistent data over time.

3) Data Modelling: Turning Real-World Requirements into Data Structures

Data modelling is how you plan what data you need and how it fits together.

A good data model:

matches the real business needs
reduces duplication
makes common queries easy
supports future change without breaking everything

Step-by-step approach

Identify entities (things you store): User, Product, Order, Post, Comment
List attributes (fields): e.g. User: id, email, password_hash, created_at
Define relationships:
- One-to-one (1:1)
- One-to-many (1:N)
- Many-to-many (N:N)
Choose constraints: what must be unique, required, limited, etc.
Think about queries: what will the app ask the database most often?

4) Relationships in Relational Databases (with examples)

One-to-many (1:N)

Example: One User can have many Posts.

users table: id, email, …
posts table: id, user_id, title, body

Here, posts.user_id is a foreign key pointing to users.id.

Many-to-many (N:N)

Example: Posts can have many Tags, and Tags can belong to many Posts.

You typically create a join table:

posts table
tags table
post_tags table: post_id, tag_id

This keeps the data consistent and avoids repeated lists of tags in multiple places.

One-to-one (1:1)

Example: A User has one UserProfile.

Sometimes it is separated to keep the main table small or to isolate optional fields.

users table
user_profiles table: user_id, display_name, bio, etc.

5) Schema Design Basics (Relational)

Pick clear, stable IDs

Common approach: numeric auto-increment IDs or UUIDs.
UUIDs are useful when data might be created across multiple systems, but they are larger.

Use constraints to prevent bad data

NOT NULL for required fields
UNIQUE for unique values (like email)
CHECK for basic rules (e.g. rating between 1 and 5)
Foreign keys to enforce valid relationships

Normalisation (avoid unnecessary duplication)

Normalisation is a set of rules to reduce repeated data and update issues.

Example of a problem:

Storing customer_name directly in every order line item.
If the customer changes their name, you need to update many rows.

Better:

Store customer details in users/customers.
Store only customer_id in orders.

Practical note: real systems sometimes denormalise for performance, but you should start with a clean model first.

6) Modelling in Document Databases (what changes)

With documents, you often decide between:

Embedding data (nesting inside a document)
Referencing data (storing an ID to another document)

Embedding (good for “belongs together” data)

Example: a blog post with a small number of comments:

{
  "_id": "post_123",
  "title": "My Post",
  "comments": [
    { "id": "c1", "author": "Sam", "text": "Nice!" }
  ]
}

Pros:

One read fetches everything
Simple for small nested data

Cons:

Document can grow large
Updating one comment might rewrite the whole document
Harder when nested items need their own permissions and lifecycle

Referencing (good for shared or large data)

Example: keep comments as their own documents with postId.

Pros:

Scales better for large lists
More flexible querying and permissions

Cons:

Requires multiple queries or careful aggregation

7) Query Basics (SQL and Document)

SQL Query Basics

Most SQL work can be expressed using:

SELECT (read)
INSERT (create)
UPDATE (change)
DELETE (remove)

Example (conceptual):

“Get all posts for a user”
“Find a user by email”
“Count orders in the last 30 days”
“Join orders to order items and products”

Key skills:

Filtering (WHERE)
Sorting (ORDER BY)
Pagination (LIMIT / OFFSET or keyset pagination)
Aggregation (COUNT, SUM, GROUP BY)
Joining tables (JOIN)

Document Query Basics

Typical tasks include:

Find documents by field match
Filter by ranges (dates, numbers)
Sort and paginate
Use indexes to avoid slow full collection scans

8) Indexes (speeding up common reads)

An index helps the database find data faster, like an index at the back of a book.

Good candidates for indexes:

fields you search on often: email, username
foreign keys: posts.user_id
date fields used in filtering: created_at

Trade-offs:

Indexes improve read performance
They add storage cost
Writes become slightly slower because indexes must be updated

Rule of thumb: index what you frequently query, but do not index everything by default.

9) Connecting Your App to a Database (Persistence)

Your back-end server connects to the database using:

a database driver (low-level)
or an ORM/ODM (higher-level)

Drivers

Give you direct control with raw queries.
Good for performance tuning and learning fundamentals.

ORMs/ODMs

Examples:

SQL ORM: Prisma, Sequelize, TypeORM
Document ODM: Mongoose (MongoDB)

Benefits:

Faster development
Safer query building (often helps prevent injection)
Migrations and schema management features

Risks:

You can hide performance problems if you do not understand what queries are generated.
Sometimes raw queries are still needed for complex cases.

10) Migrations and Versioning Your Schema

A migration is a controlled change to your schema over time:

creating tables/collections
adding or removing columns/fields
changing constraints
adding indexes

Why migrations matter:

Your schema must change as features change.
You need repeatable, trackable changes across dev, staging, and production.

Good practice:

Store migration files in your repo (with Git).
Run migrations automatically in deployment pipelines (with care).

11) Practical Modelling Example (Relational)

Imagine a simple store:

Entities

User
Product
Order
OrderItem

Relationships

One user → many orders
One order → many order items
One product → many order items

Typical columns

users(id, email, password_hash, created_at)
products(id, name, price_cents, stock_qty)
orders(id, user_id, status, created_at)
order_items(id, order_id, product_id, quantity, price_cents_at_time)

Notice price_cents_at_time: you store the price at the time of purchase so invoices remain correct even if product prices change later.

12) Data Integrity and Safety (what to watch for)

Even before you get into advanced security, good data modelling reduces risk.

Key practices:

Validate inputs on the server (types, ranges, required fields).
Use transactions for multi-step updates (e.g. create order + items + reduce stock).
Enforce unique constraints (e.g. unique email).
Use foreign keys (relational) to prevent orphaned rows.
Never store passwords in plain text: store password hashes (e.g. bcrypt/argon2) and salt.

13) Check Your Understanding (quick self-check)

You should be able to:

explain the difference between relational and document databases
model 1:N and N:N relationships
choose between embedding and referencing in a document database
describe what an index does and when to add one
explain why migrations are important
outline how the back-end persists data via drivers or ORMs/ODMs

Mini Activity: Model a Booking System (short practice)

Create a basic data model for a small salon booking site:

Entities: Customer, Service, Booking
Decide relationships (e.g. one customer → many bookings)
List 4–6 fields per entity
Identify at least 2 fields that should be indexed (e.g. customer.email, booking.date_time)

Website Development: From Fundamentals to Deployment

Participants 8