Session Plan

  • The idea is to build on the previous week's knowledge, focusing on advanced database features and practical applications.

  • Use the same task database throughout all examples for consistency

  • Use problem-based approach: show issues, guide trainees to solutions

Requirements: Setup DB and start the example API

Mentor Instructions

You need to have the example API running to demonstrate concepts and exercises. Share your screen and setup the session materials:

git clone https://github.com/HackYourFuture-CPH/hyf-assignment-template.git
cd courses/backend/databases/

Trainee Instructions

  • They should have their tasks.sqlite3 database from Week 1 ready to use with a GUI tool. Tools setup - Week 1

  • They should follow the same instructions but on their own fork of the assignment repo, inside the courses/backend/databases/ directory.

Aggregate Functions

Reporting

  • A common business need: "We need reports from our task management system"

  • Show inefficient approach: calculating stats in application code

  • Introduce SQL aggregates as a more performant solution

    • Show code in /tasks-per-user and /status-distribution endpoints

Key Aggregate Functions

Other examples of questions we can answer with aggregates:

  • COUNT: How many tasks were created?

  • SUM: Total estimated hours across tasks

  • AVG: Average completion time

  • MIN/MAX: Earliest/latest due dates

  • GROUP BY: Essential for aggregation

Exercise: Write Your Own Aggregate Queries

Trainees work with their tasks.sqlite3 from Week 1

Write SQL queries to answer these questions:

  1. How many tasks are overdue? (due_date < today)

  2. What's the average number of tasks per user?

  3. Which status has the most tasks?

  4. Find the user with the most completed tasks.

Solutions discussion

Click to see the solutions

This can be executed directly in the SQLite command line or any SQLite client.

-- Count overdue tasks
SELECT COUNT(*) AS overdue_count
FROM task
WHERE due_date < DATE('now');

-- Average tasks per user
SELECT AVG(task_count) AS average_tasks
FROM (
  SELECT user_id, COUNT(*) AS task_count
  FROM user_task
  GROUP BY user_id
);

-- Status with most tasks
SELECT s.name, COUNT(*) AS task_count
FROM task t
JOIN status s ON t.status_id = s.id
GROUP BY s.id, s.name
ORDER BY task_count DESC
LIMIT 1;

-- User with most completed tasks (status_id = 3 for 'Done')
SELECT u.name, COUNT(*) AS completed_tasks
FROM user u
JOIN user_task ut ON u.id = ut.user_id
JOIN task t ON ut.task_id = t.id
WHERE t.status_id = 3
GROUP BY u.id, u.name
ORDER BY completed_tasks DESC
LIMIT 1;

Database Security

SQL Injection Demo

  • The idea is to show the vulnerable search endpoint and how it can be exploited

Normal search: http://localhost:3000/api/search/vulnerable?query=wash

Leak user data: http://localhost:3000/api/search/vulnerable?query=%27%20UNION%20SELECT%20name%2C%20email%2C%20phone%20FROM%20user%3B%20--

Exercise: SQL Injection Attack Practice

Trainees try to exploit the vulnerable endpoint

Using the running API, try these attacks:

  1. Extract all user emails

  2. Try to delete data (see what happens)

  3. Attempt to find hidden information

Attack strings to try

' OR '1'='1
'; DROP TABLE tasks; --
' UNION SELECT * FROM user --

Security Best Practices

  • Show how it's fixed in the search/secure endpoint

  • Always use parameterized queries or ORMs which make things easier for developers

  • Mention the importance of validation, both in client and server

Database Types Overview

When NOT to use relational databases?

  • Key-Value Stores (Redis): Caching, real-time features

  • Document Stores (MongoDB): Flexible schemas, JSON-like data

  • Graph Databases (Neo4j): Social networks, recommendation engines

  • Time-Series: IoT sensor data, financial metrics

Deployment Overview

Development vs Production: Understanding the Journey

Local vs Prod environment

[!NOTE] The diagram illustrates the difference between local development and production environments.

Why do we care about different environments?

  • We use them to test and develop our applications in a safe space without affecting real users or data.

  • They help us identify and fix issues before deployment.

A recap of what we mean when we say "the cloud"

  • A remote server (computer) running 24/7 somewhere else

  • Multiple users can connect simultaneously

  • Databases run on these servers

  • Your data needs to be accessible from anywhere

Why do we need to migrate to another database?

Aspect
SQLite (Dev)
PostgreSQL (Production)

Location

Local file

Remote server

Use case

Development, small apps

Production, real applications

Type of information

Testing, prototyping

Real user data, critical info

Scaling

Limited

Excellent

Demo time!

[!NOTE] Guide the trainees through the steps, explaining each part. You can choose to show it entirely and have them replicate later or to give them some time to follow along.

We're going to recreate our local SQLite database to a remote PostgreSQL database on Render.com

Step 1: Create Postgres DB on Render

Follow the steps from the hyf-project-template: https://github.com/HackYourFuture-CPH/hyf-project-template/tree/main/api#deploying

Step 2: Recreate Database using DBeaver

We'll use DBeaver to connect to the remote PostgreSQL database and run SQL commands to recreate our schema and data.

  1. Connect to your PostgreSQL database on Render:

    • Open DBeaver

    • New Connection (Ctrl + Shift + N) → PostgreSQL

    • Fill in connection details from Render's "Connection Info" page

    • Test connection

  2. Create a new SQL script:

    • Right-click on your PostgreSQL connection → SQL Editor → New SQL Script

    • Copy the provided PostgreSQL-compatible SQL (see tasks-postgres.sql in the assignment repo)

  3. Execute the script:

    • Paste the entire SQL content

    • Click Execute SQL Script (Not "Execute SQL Statement")

    • Should see an output saying the number of queries executed and the updated rows

  4. Verify:

    • Disconnect and reconnect to refresh

    • Expand Tables to see: user, task, status, user_task

    • Right-click any table → View Data to verify records

Step 3: Environment Variables

Environment variables allow us to configure our application differently based on where it's running (development vs production) without changing code.

Create a .env file in the example-api folder

# .env file for local development
NODE_ENV=development

# For production testing (get this from Render)
DATABASE_URL=postgresql://user:password@host:5432/database

[!IMPORTANT] Never commit .env files to git! They contain sensitive credentials. Always add .env to your .gitignore file.

Database Configuration Code

Show the trainees how the code switches between databases:

// example-api/index.js

// Development: local SQLite
const developmentConfig = {
  client: "sqlite3",
  //...
};

// Production: remote PostgreSQL
const productionConfig = {
  client: "pg",
  //...
};

Step 4: Test Both Environments

Now we'll test our API with both databases to see the differences.

Testing development (SQLite)

npm run dev

Open your browser:

What's happening

  • Reading from local tasks.sqlite3 file

  • Perfect for development and testing

Testing production (PostgreSQL)

# Update .env to use production
# DATABASE_URL=your-render-database-url

npm run prod

Open your browser to the same endpoints:

What's happening:

  • Reading from remote PostgreSQL on Render

  • Slightly slower (network latency)

  • Same data, different database engine

[!TIP] You can show the diagram again to illustrate how you are connecting to different databases based on environment.

Key Concepts to Emphasize

Environment Variables

  • Never hardcode database credentials

  • Different configs for dev/staging/production

  • .env file locally, Render UI for production

  • Always add .env to .gitignore

Migration Best Practices

  • Always backup before migrating

  • Monitor after deployment

  • Keep development and production schemas in sync

Alternative: Knex Migrations (Optional)

[!TIP] For production projects, you'll likely use migration tools like Knex.js to version-control your schema changes. This is beyond today's scope but worth exploring for your projects.

Questions to Explore

  • What happens if the remote database goes down?

  • How do we update the database after the first setup?

  • Can multiple developers work with the same production database?

Summary & Q&A

Key takeaways

  • SQL aggregates are your friends for calculations: reporting, dashboard stats

  • Security: Never trust user input, always validate.

  • Choose the right database for your use case

Last updated