Master SQL: Complete Beginner to Advanced Guide
Your complete journey from absolute beginner to industry expert
Module 1: Introduction to SQL
SQL (Structured Query Language) is a standard language for storing, manipulating and retrieving data in databases.
Why is SQL important? SQL is crucial for managing relational databases, enabling data analysts, developers, and database administrators to interact with data efficiently.
Diagram: How databases work
Database: An organized collection of structured information, or data, typically stored electronically in a computer system.
DBMS (Database Management System): Software that interacts with end-users, applications, and the database itself to capture and analyze data.
RDBMS (Relational Database Management System): A DBMS that is based on the relational model, where data is organized into tables (relations).
- MySQL: Popular open-source relational database.
- PostgreSQL: Advanced open-source relational database, known for its extensibility and SQL compliance.
- SQLite: Self-contained, serverless, zero-configuration, transactional SQL database engine.
- SQL Server: Microsoft's relational database management system.
- Oracle: Enterprise-grade commercial relational database system.
Module 2: SQL Setup
Installing MySQL
Step-by-step guide for Windows, macOS, and Linux.
Installing PostgreSQL
Detailed instructions for various operating systems.
Using SQLite
How to get started with SQLite, typically requiring no installation.
- MySQL Workbench: Official GUI tool for MySQL.
- DBeaver: Universal database tool for developers and database administrators.
- pgAdmin: The most popular and feature-rich open source administration and development platform for PostgreSQL.
- VS Code Extensions: Useful extensions for SQL development in Visual Studio Code.
Module 3: Database Fundamentals
Table: A collection of related data entries, consisting of columns and rows.
Column: A set of data values of a particular type, one for each row of the table.
Row (Record/Tuple): A single entry in a table, containing data for each column.
- INT: Integer numbers (e.g., 1, 100, -5).
- VARCHAR(size): Variable-length string (e.g., 'Hello World').
- TEXT: Large text strings.
- DATE: Date values (e.g., '2023-10-26').
- BOOLEAN: True/False values.
- FLOAT: Approximate floating-point numbers; the precision (single or double) depends on the database.
- DECIMAL(P,S): Exact numeric value with specified precision and scale.
Exercise: Identify Data Types
For a table storing customer information (name, age, email, registration_date, total_spent), suggest appropriate SQL data types for each column.
Module 4: Basic SQL Commands
CREATE DATABASE: Used to create a new SQL database.
CREATE DATABASE company;
CREATE TABLE: Used to create a new table in a database.
CREATE TABLE employees (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    salary DECIMAL(10,2)
);
INSERT INTO: Used to insert new rows of data into a table.
INSERT INTO employees (id, name, salary) VALUES (1, 'John Doe', 50000.00);
SELECT: Used to retrieve data from a database.
SELECT * FROM employees;
WHERE: Used to filter records based on a specified condition.
SELECT * FROM employees WHERE salary > 40000;
UPDATE: Used to modify existing records in a table.
UPDATE employees SET salary = 60000.00 WHERE id = 1;
DELETE: Used to delete existing records from a table.
DELETE FROM employees WHERE id = 1;
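To try these commands end to end without installing a server, one option is Python's built-in sqlite3 module. This is only a sketch under assumptions: the choice of Python and SQLite is ours, the second employee row is made up for illustration, and CREATE DATABASE is skipped because SQLite treats each file (or in-memory connection) as one database.

```python
import sqlite3

# In-memory SQLite database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# CREATE TABLE: same schema as the employees example.
conn.execute(
    "CREATE TABLE employees ("
    " id INT PRIMARY KEY,"
    " name VARCHAR(100),"
    " salary DECIMAL(10,2))"
)

# INSERT: add two rows (the second row is hypothetical sample data).
conn.execute("INSERT INTO employees (id, name, salary) VALUES (1, 'John Doe', 50000.00)")
conn.execute("INSERT INTO employees (id, name, salary) VALUES (2, 'Jane Roe', 35000.00)")

# SELECT with WHERE: only rows above the threshold come back.
above_40k = conn.execute("SELECT name FROM employees WHERE salary > 40000").fetchall()

# UPDATE: change one row, identified by its primary key.
conn.execute("UPDATE employees SET salary = 60000.00 WHERE id = 1")
new_salary = conn.execute("SELECT salary FROM employees WHERE id = 1").fetchone()[0]

# DELETE: remove that row again and check what is left.
conn.execute("DELETE FROM employees WHERE id = 1")
remaining = conn.execute("SELECT name FROM employees").fetchall()
```

Note that both UPDATE and DELETE carry a WHERE clause here; without it they would affect every row in the table.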
Module 5: Filtering Data
AND: Combines two or more conditions, returns true if all conditions are true.
SELECT * FROM employees WHERE salary > 50000 AND department = 'Sales';
OR: Combines two or more conditions, returns true if any condition is true.
SELECT * FROM employees WHERE department = 'Sales' OR department = 'Marketing';
NOT: Negates a condition.
SELECT * FROM employees WHERE NOT department = 'HR';
IN: Used to specify multiple possible values for a column.
SELECT * FROM employees WHERE department IN ('Sales', 'IT', 'HR');
BETWEEN: Used to select values within a given range.
SELECT * FROM employees WHERE salary BETWEEN 40000 AND 60000;
LIKE: Used in a WHERE clause to search for a specified pattern in a column.
Wildcards (% and _):
- `%` represents zero, one, or multiple characters.
- `_` represents a single character.
SELECT * FROM employees WHERE name LIKE 'J%'; -- Names starting with J
SELECT * FROM employees WHERE name LIKE '_ohn'; -- Names like 'John' or 'Cohn'
IS NULL / IS NOT NULL: Used to test for empty values (NULL).
SELECT * FROM employees WHERE email IS NULL;
SELECT * FROM employees WHERE email IS NOT NULL;
Exercise: Advanced Filtering
Write a query to find employees in the 'IT' or 'Marketing' department whose salary is between 45000 and 70000, and whose name contains 'an'.
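A point from this module that often trips people up is why NULL needs IS NULL rather than `=`. The sketch below (not a solution to the exercise) shows the difference, along with the `_` wildcard. It assumes Python's built-in sqlite3 module and a small made-up employees table with an email column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INT PRIMARY KEY, name VARCHAR(100), email VARCHAR(100))")
# Hypothetical sample rows; None is stored as SQL NULL.
conn.executemany(
    "INSERT INTO employees (id, name, email) VALUES (?, ?, ?)",
    [(1, "John", "john@example.com"), (2, "Joan", None), (3, "Cohn", None)],
)

# '= NULL' never matches: comparing anything with NULL is unknown, not true.
equals_null = conn.execute("SELECT name FROM employees WHERE email = NULL").fetchall()

# IS NULL is the correct test for missing values.
is_null = conn.execute(
    "SELECT name FROM employees WHERE email IS NULL ORDER BY id"
).fetchall()

# '_' stands for exactly one character, so '_ohn' matches John and Cohn but not Joan.
like_ohn = conn.execute(
    "SELECT name FROM employees WHERE name LIKE '_ohn' ORDER BY id"
).fetchall()
```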
Module 6: Sorting & Aggregation
ORDER BY: Sorts the result set of a query in ascending or descending order.
SELECT * FROM employees ORDER BY salary DESC;
LIMIT: Restricts the number of rows returned by a query.
SELECT * FROM employees ORDER BY salary DESC LIMIT 5;
DISTINCT: Returns only unique values in a specified column.
SELECT DISTINCT department FROM employees;
COUNT(): Returns the number of rows that match a specified criterion.
SELECT COUNT(*) FROM employees;
SUM(): Calculates the sum of a set of numbers.
SELECT SUM(salary) FROM employees;
AVG(): Calculates the average of a set of numbers.
SELECT AVG(salary) FROM employees;
MIN(): Returns the smallest value in a set.
SELECT MIN(salary) FROM employees;
MAX(): Returns the largest value in a set.
SELECT MAX(salary) FROM employees;
Module 7: GROUP BY & HAVING
GROUP BY: Groups rows that have the same values in specified columns into a summary row.
HAVING: Used to filter groups based on a specified condition (similar to WHERE, but for groups after aggregation).
SELECT department, AVG(salary) as avg_salary
FROM employees
GROUP BY department
HAVING AVG(salary) > 50000
ORDER BY avg_salary DESC;
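The difference between WHERE (filters rows before grouping) and HAVING (filters groups after aggregation) is easier to see with data. The following sketch runs the query above against a few made-up rows, assuming Python's built-in sqlite3 module; the names and salaries are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (name VARCHAR(100), department VARCHAR(50), salary DECIMAL(10,2))"
)
# Hypothetical sample data: IT averages 60000, Sales 42500, HR 52000.
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Ann", "IT", 70000),
        ("Bob", "IT", 50000),
        ("Cid", "Sales", 45000),
        ("Dee", "Sales", 40000),
        ("Eve", "HR", 52000),
    ],
)

# HAVING keeps only the groups whose average exceeds 50000.
high_avg_departments = conn.execute("""
    SELECT department, AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department
    HAVING AVG(salary) > 50000
    ORDER BY avg_salary DESC
""").fetchall()

# WHERE removes individual rows first, then the survivors are grouped.
well_paid_counts = conn.execute("""
    SELECT department, COUNT(*)
    FROM employees
    WHERE salary > 45000
    GROUP BY department
    ORDER BY department
""").fetchall()
```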
Module 8: SQL Joins
Joins are used to combine rows from two or more tables based on a related column between them.
Types of Joins
INNER JOIN: Returns records that have matching values in both tables.
SELECT employees.name, departments.dept_name
FROM employees
INNER JOIN departments ON employees.dept_id = departments.id;
LEFT JOIN: Returns all records from the left table, and the matching records from the right table. If there is no match, the result is NULL from the right side.
SELECT employees.name, departments.dept_name
FROM employees
LEFT JOIN departments ON employees.dept_id = departments.id;
RIGHT JOIN: Returns all records from the right table, and the matching records from the left table. If there is no match, the result is NULL from the left side.
SELECT employees.name, departments.dept_name
FROM employees
RIGHT JOIN departments ON employees.dept_id = departments.id;
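FULL OUTER JOIN: Returns all records from both tables, paired where they match. Where a row has no match, the columns from the other table are NULL. (MySQL does not support FULL OUTER JOIN.)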
SELECT employees.name, departments.dept_name
FROM employees
FULL OUTER JOIN departments ON employees.dept_id = departments.id;
SELF JOIN: A table is joined with itself. Used to combine and compare rows within the same table. Employees without a manager are left out of this result; use a LEFT JOIN to keep them.
SELECT E1.name AS Employee, E2.name AS Manager
FROM employees E1
JOIN employees E2 ON E1.manager_id = E2.id;
CROSS JOIN: Returns the Cartesian product of the sets of rows from the joined tables.
SELECT employees.name, departments.dept_name
FROM employees CROSS JOIN departments;
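To see how the join types differ on the same data, here is a small sketch assuming Python's built-in sqlite3 module. The two departments and two employees are made up; Bob deliberately has no department so that the inner and left joins return different results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (id INT PRIMARY KEY, dept_name VARCHAR(50));
    CREATE TABLE employees (id INT PRIMARY KEY, name VARCHAR(100), dept_id INT);
    -- Hypothetical sample data: Bob has no department.
    INSERT INTO departments VALUES (1, 'IT'), (2, 'Sales');
    INSERT INTO employees VALUES (1, 'Ann', 1), (2, 'Bob', NULL);
""")

# INNER JOIN: only employees with a matching department.
inner_rows = conn.execute("""
    SELECT employees.name, departments.dept_name
    FROM employees
    INNER JOIN departments ON employees.dept_id = departments.id
    ORDER BY employees.id
""").fetchall()

# LEFT JOIN: every employee; dept_name is NULL (None) when nothing matches.
left_rows = conn.execute("""
    SELECT employees.name, departments.dept_name
    FROM employees
    LEFT JOIN departments ON employees.dept_id = departments.id
    ORDER BY employees.id
""").fetchall()

# CROSS JOIN: every employee paired with every department (2 x 2 rows).
cross_count = conn.execute(
    "SELECT COUNT(*) FROM employees CROSS JOIN departments"
).fetchone()[0]
```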
Module 9: Advanced SQL
A subquery is a query embedded inside another SQL query.
SELECT name, salary
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);
A CTE (Common Table Expression) is a temporary named result set that you can reference within a single SQL statement (SELECT, INSERT, UPDATE, DELETE).
WITH HighSalaryEmployees AS (
SELECT name, salary
FROM employees
WHERE salary > 50000
)
SELECT name FROM HighSalaryEmployees;
Window functions perform a calculation across a set of table rows that are somehow related to the current row.
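Unlike GROUP BY, a window function keeps every input row and attaches the computed value to it. The sketch below illustrates this with AVG() used as a window function. It assumes Python's built-in sqlite3 module linked against SQLite 3.25 or newer (the first version with window functions). The PARTITION BY clause and the sample rows are our additions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name VARCHAR(100), department VARCHAR(50), salary INT)")
# Hypothetical sample data.
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Ann", "IT", 70000), ("Bob", "IT", 50000), ("Cid", "Sales", 40000)],
)

# Each row keeps its own name and salary, plus the average of its department.
rows = conn.execute("""
    SELECT name, salary,
           AVG(salary) OVER (PARTITION BY department) AS dept_avg
    FROM employees
    ORDER BY name
""").fetchall()
```

All three employees appear in the result; a GROUP BY on department would have collapsed them into two summary rows.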
ROW_NUMBER()
SELECT name, salary,
    ROW_NUMBER() OVER (ORDER BY salary DESC) AS rn
FROM employees;
RANK()
SELECT name, salary,
    RANK() OVER (ORDER BY salary DESC) AS rnk
FROM employees;
DENSE_RANK()
SELECT name, salary,
    DENSE_RANK() OVER (ORDER BY salary DESC) AS drnk
FROM employees;
LEAD() & LAG()
SELECT name, salary,
    LAG(salary, 1, 0) OVER (ORDER BY salary) AS prev_salary,
    LEAD(salary, 1, 0) OVER (ORDER BY salary) AS next_salary
FROM employees;
Module 10: Database Relationships
Database relationships define how tables are connected to each other.
One-to-One: Each record in Table A can have only one matching record in Table B, and vice versa.
One-to-Many: A record in Table A can have many matching records in Table B, but a record in Table B has only one matching record in Table A.
Many-to-Many: A record in Table A can have many matching records in Table B, and a record in Table B can have many matching records in Table A. Requires a junction/associative table.
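A junction table is easiest to understand from a concrete schema. The sketch below models a many-to-many relationship between students and courses through an enrollments table, assuming Python's built-in sqlite3 module. The table names, column names, and rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE courses (id INTEGER PRIMARY KEY, title TEXT);
    -- Junction table: one row per (student, course) pair.
    CREATE TABLE enrollments (
        student_id INTEGER REFERENCES students(id),
        course_id INTEGER REFERENCES courses(id),
        PRIMARY KEY (student_id, course_id)
    );
    INSERT INTO students VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO courses VALUES (10, 'SQL Basics'), (20, 'Data Modeling');
    INSERT INTO enrollments VALUES (1, 10), (1, 20), (2, 10);
""")

# One student, many courses.
courses_for_ann = conn.execute("""
    SELECT c.title
    FROM courses c
    JOIN enrollments e ON e.course_id = c.id
    WHERE e.student_id = 1
    ORDER BY c.id
""").fetchall()

# One course, many students.
students_in_sql_basics = conn.execute("""
    SELECT s.name
    FROM students s
    JOIN enrollments e ON e.student_id = s.id
    WHERE e.course_id = 10
    ORDER BY s.id
""").fetchall()
```

The composite primary key on the junction table prevents the same student from being enrolled in the same course twice.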
Module 11: Normalization
Normalization is the process of organizing the columns and tables of a relational database to minimize data redundancy and improve data integrity.
First Normal Form (1NF): Eliminate repeating groups in individual tables. Create a separate table for each set of related data and identify each set of related data with a primary key.
Second Normal Form (2NF): Meet all the requirements of the first normal form. Remove subsets of data that apply to multiple rows of a table and place them in separate tables.
Third Normal Form (3NF): Meet all the requirements of the second normal form. Remove columns that are not directly dependent upon the primary key.
Boyce-Codd Normal Form (BCNF): A stronger version of 3NF. Deals with cases where there are overlapping candidate keys.
Module 12: Indexing
Indexes are special lookup tables that the database search engine can use to speed up data retrieval.
Clustered index: Determines the physical order of data in a table. A table can have only one clustered index.
Non-clustered index: Does not alter the physical order of the table. Contains the indexed fields and pointers to the actual data rows.
Module 13: Transactions
A transaction is a sequence of operations performed as a single logical unit of work: either all of its operations occur or none of them do.
ACID Properties
- Atomicity: All operations in a transaction either complete or none complete.
- Consistency: Ensures that the database changes states only in allowed ways.
- Isolation: Concurrent transactions execute independently without interference.
- Durability: Ensures that once a transaction has been committed, it will remain committed even in the event of a system failure.
Transaction Commands
BEGIN TRANSACTION; -- or START TRANSACTION;
-- SQL statements here
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT; -- Make changes permanent
-- ROLLBACK; -- Undo changes if an error occurs
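The all-or-nothing behavior can be demonstrated with a transfer that fails halfway. This is a sketch under assumptions: it uses Python's built-in sqlite3 module, a hypothetical `transfer` helper, and a CHECK constraint (not in the example above) so that an overdraft raises an error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint is an assumption added so an overdraft fails loudly.
conn.execute("CREATE TABLE accounts (id INT PRIMARY KEY, balance INT CHECK (balance >= 0))")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 0)])
conn.commit()


def transfer(conn, from_id, to_id, amount):
    """Move amount between two accounts; apply both updates or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, to_id)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, from_id)
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 1, 2, 60)       # both updates are committed
failed = transfer(conn, 1, 2, 500)  # the debit violates the CHECK, so the credit is undone too
balances = conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall()
```

In the failed transfer the credit to account 2 had already run, yet it does not appear in the final balances, because the rollback discards the whole unit of work.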
Module 14: Views
A view is a virtual table based on the result-set of an SQL statement. A view contains rows and columns, just like a real table. The fields in a view are fields from one or more real tables in the database.
CREATE VIEW EmployeeSalaries AS
SELECT name, salary
FROM employees
WHERE department = 'IT';
Practical Usage: Simplify complex queries, enhance security, and present data in a customized way.
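Because a view is virtual, it stores no data of its own and always reflects the current contents of the underlying table. The sketch below shows this with the EmployeeSalaries view, assuming Python's built-in sqlite3 module and made-up employee rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name VARCHAR(100), department VARCHAR(50), salary INT)")
# Hypothetical sample data.
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Ann", "IT", 70000), ("Bob", "Sales", 45000)],
)

conn.execute("""
    CREATE VIEW EmployeeSalaries AS
    SELECT name, salary
    FROM employees
    WHERE department = 'IT'
""")

# Query the view like a table; only IT rows and the two chosen columns are visible.
view_rows = conn.execute("SELECT * FROM EmployeeSalaries").fetchall()

# A new IT employee shows up in the view without recreating it.
conn.execute("INSERT INTO employees VALUES ('Cid', 'IT', 65000)")
view_names_after = conn.execute(
    "SELECT name FROM EmployeeSalaries ORDER BY name"
).fetchall()
```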
Module 15: Stored Procedures
A stored procedure is prepared SQL code that you can save and reuse. If you find yourself writing the same SQL query over and over again, save it as a stored procedure and then just call it to execute. The example uses MySQL syntax; other databases differ.
DELIMITER //
CREATE PROCEDURE GetEmployeeByID (IN emp_id INT)
BEGIN
SELECT * FROM employees WHERE id = emp_id;
END //
DELIMITER ;
-- To call the procedure:
CALL GetEmployeeByID(1);
Module 16: Triggers
A trigger is a set of SQL statements that automatically "fires" or executes when a specified event occurs in a table (e.g., an INSERT, UPDATE, or DELETE operation). The example uses MySQL syntax.
DELIMITER //
CREATE TRIGGER before_employee_insert
BEFORE INSERT ON employees
FOR EACH ROW
BEGIN
IF NEW.salary < 30000 THEN
SET NEW.salary = 30000;
END IF;
END //
DELIMITER ;
Module 17: Performance Optimization
Query Optimization Techniques
- EXPLAIN: Use the EXPLAIN (or EXPLAIN ANALYZE) statement to understand how your database executes a query.
- Avoid SELECT *: Only select the columns you need to reduce data transfer.
- Proper Indexing: Create indexes on columns used in WHERE clauses, JOIN conditions, and ORDER BY clauses.
- Query Tuning: Rewrite inefficient queries, e.g., using joins instead of subqueries where appropriate.
EXPLAIN SELECT * FROM employees WHERE salary > 50000;
Real-World Example: Slow Query Fix
Imagine a query taking too long. We analyze it with EXPLAIN and find a full table scan. Adding an index to the 'salary' column dramatically speeds up the query.
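The before-and-after of that fix can be reproduced on a small scale. The sketch below assumes Python's built-in sqlite3 module and uses SQLite's EXPLAIN QUERY PLAN, which is SQLite's counterpart to EXPLAIN. The `plan` helper, the index name, and the generated rows are hypothetical, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)")
# Hypothetical data: 1000 employees with salaries from 30000 to 79950.
conn.executemany(
    "INSERT INTO employees (name, salary) VALUES (?, ?)",
    [(f"emp{i}", 30000 + i * 50) for i in range(1000)],
)


def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


query = "SELECT name FROM employees WHERE salary > 50000"

plan_before = plan(query)  # reports a scan of the whole table

conn.execute("CREATE INDEX idx_employees_salary ON employees (salary)")

plan_after = plan(query)   # reports a search using the new index
```

Printing `plan_before` and `plan_after` shows the planner switching from a full scan to an index search once the index exists.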
Module 18: SQL Security
SQL Injection
A code injection technique used to attack data-driven applications, in which malicious SQL statements are inserted into an entry field for execution.
-- Malicious input: ' OR 1=1 --
SELECT * FROM users WHERE username = '' OR 1=1 --' AND password = 'password';
Prevention Methods
- Parameterized Queries (Prepared Statements): The safest way to prevent SQL injection.
- Input Validation: Sanitize and validate all user input.
- Least Privilege: Grant users only the necessary permissions.
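The difference between pasting input into SQL text and binding it as a parameter can be shown with the malicious input from above. This sketch assumes Python's built-in sqlite3 module and a made-up users table. The plaintext password column is only there to keep the demo short; a real system would store a hash.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Demo table only; real systems store password hashes, not plaintext.
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'secret')")

malicious = "' OR 1=1 --"

# Unsafe: the input becomes part of the SQL text and rewrites the query's logic.
unsafe_sql = (
    "SELECT * FROM users WHERE username = '" + malicious + "' AND password = 'x'"
)
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: '?' placeholders pass the input as data, never as SQL.
safe_rows = conn.execute(
    "SELECT * FROM users WHERE username = ? AND password = ?",
    (malicious, "x"),
).fetchall()
```

The unsafe query returns every user because `OR 1=1` is always true and `--` comments out the password check. The parameterized query returns nothing because no user is literally named `' OR 1=1 --`.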
User Permissions & Roles
CREATE USER 'analyst'@'localhost' IDENTIFIED BY 'secure_password';
GRANT SELECT ON company.employees TO 'analyst'@'localhost';
FLUSH PRIVILEGES; -- optional: MySQL applies GRANT immediately; needed only after editing the grant tables directly
Best Practices: Regular security audits, strong passwords, keep software updated.
Module 19: Real-World Projects
Apply your knowledge with these practical projects:
Database Design
Tables: Books, Authors, Members, Loans. Relationships: One-to-many (Author to Books), Many-to-many (Books to Members via Loans).
SQL Queries
Examples: Add new book, borrow book, return book, list overdue books.
Database Design
Tables: Products, Customers, Orders, OrderItems, Categories.
SQL Queries
Examples: Calculate total sales, find top customers, product inventory.
Database Design
Tables: Patients, Doctors, Appointments, MedicalRecords.
SQL Queries
Examples: Schedule appointments, retrieve patient history.
Database Design
Tables: Accounts, Transactions, Customers.
SQL Queries
Examples: Deposit, withdrawal, transfer funds, view transaction history.
Database Design
Tables: Students, Courses, Enrollments, Instructors.
SQL Queries
Examples: Enroll student, assign grades, list courses.
Module 20: SQL for Data Analysts
Key Performance Indicator (KPI) Queries
-- Total Revenue
SELECT SUM(amount) FROM sales;
-- Number of New Customers Last Month
SELECT COUNT(DISTINCT customer_id)
FROM customers
WHERE registration_date >= '2023-09-01' AND registration_date < '2023-10-01'; -- half-open range also covers timestamps on the last day
Sales Analysis
-- Sales by Product Category
SELECT c.category_name, SUM(oi.quantity * oi.price) as total_sales
FROM order_items oi
JOIN products p ON oi.product_id = p.id
JOIN categories c ON p.category_id = c.id
GROUP BY c.category_name
ORDER BY total_sales DESC;
Customer Segmentation
-- High-Value Customers (e.g., top 10% by total spend)
WITH CustomerSpend AS (
SELECT c.id, c.name, SUM(oi.quantity * oi.price) as total_spent
FROM customers c
JOIN orders o ON c.id = o.customer_id
JOIN order_items oi ON o.id = oi.order_id
GROUP BY c.id, c.name
)
SELECT id, name, total_spent
FROM CustomerSpend
WHERE total_spent > (SELECT PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY total_spent) FROM CustomerSpend) -- aggregate form (PostgreSQL syntax); returns a single value
ORDER BY total_spent DESC;
Business Reports
How to generate various reports using SQL, covering aspects like inventory, customer demographics, and more.
Module 21: SQL for Backend Developers
CRUD APIs with SQL
How to implement Create, Read, Update, Delete operations in backend applications interacting with a database.
-- Create (Insert)
INSERT INTO users (username, email, password_hash) VALUES ('newuser', 'user@example.com', 'hashed_password');
-- Read (Select)
SELECT id, username, email FROM users WHERE id = ?;
-- Update
UPDATE users SET email = 'new_email@example.com' WHERE id = ?;
-- Delete
DELETE FROM users WHERE id = ?;
Database Integration
Connecting various backend frameworks (e.g., Node.js with Express, Python with Django/Flask) to SQL databases.
Transactions in Backend
Managing database transactions within application logic to ensure data integrity for multi-step operations.
Pagination
Implementing efficient pagination for large datasets to improve application performance and user experience.
SELECT * FROM products ORDER BY id LIMIT 10 OFFSET 20; -- Get 10 products, skipping the first 20
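In a backend, the page size and position normally arrive as request parameters. The sketch below wraps the LIMIT/OFFSET query in a helper and, as an alternative technique not covered above, shows keyset pagination, which filters on the last id seen instead of skipping rows. It assumes Python's built-in sqlite3 module; the helper names and the 50 generated products are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
# Hypothetical data: products with ids 1..50.
conn.executemany(
    "INSERT INTO products (id, name) VALUES (?, ?)",
    [(i, f"product {i}") for i in range(1, 51)],
)


def page_by_offset(page_size, offset):
    """LIMIT/OFFSET pagination: the database still walks past the skipped rows."""
    rows = conn.execute(
        "SELECT id FROM products ORDER BY id LIMIT ? OFFSET ?", (page_size, offset)
    )
    return [row[0] for row in rows]


def page_after(last_seen_id, page_size):
    """Keyset pagination: continue after the last id the client has seen."""
    rows = conn.execute(
        "SELECT id FROM products WHERE id > ? ORDER BY id LIMIT ?",
        (last_seen_id, page_size),
    )
    return [row[0] for row in rows]


offset_page = page_by_offset(10, 20)  # skip 20 rows, take 10
keyset_page = page_after(20, 10)      # take 10 rows with id greater than 20
```

Both calls return the same page here. On large tables the keyset form tends to scale better for deep pages, since it can jump straight to the starting id through the primary key.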
Module 22: SQL Interview Preparation
100 Beginner Questions
Covers basic commands, data types, and simple queries.
100 Intermediate Questions
Focuses on joins, subqueries, aggregate functions, and normalization.
100 Advanced Questions
Includes window functions, CTEs, performance tuning, and complex problem-solving.
Each question will include a detailed answer and explanation.
Module 23: Interactive Quizzes
Test your knowledge with our interactive quizzes!
Beginner Quiz
20 Multiple Choice Questions
Intermediate Quiz
20 Multiple Choice Questions
Advanced Quiz
20 Multiple Choice Questions
Features: Instant feedback, score calculation, progress tracking, and achievement badges.
Module 24: SQL Cheat Sheet
Download our comprehensive SQL cheat sheet for quick reference!
Includes:
- Common DDL/DML Commands
- Join Syntax (INNER, LEFT, RIGHT, FULL)
- Aggregate Functions (COUNT, SUM, AVG, MIN, MAX)
- Window Functions (ROW_NUMBER, RANK, LEAD, LAG)
- Best Practices & Optimization Tips
Module 25: SQL Learning Roadmap
Structured roadmaps to guide your learning journey.
30-Day Roadmap
Fast-track your SQL basics.
60-Day Roadmap
Cover intermediate to advanced concepts.
90-Day Roadmap
Become an SQL master with projects and optimization.
Additional Features
You'll find interactive notes and tips throughout the modules that you can expand for more detail.
Earn badges as you complete quizzes and modules!
Receive a certificate upon successful completion of the entire course and final assessment.
Learn from common pitfalls and how to avoid them in SQL.
- Forgetting `WHERE` clause with `UPDATE`/`DELETE`.
- Misunderstanding `LEFT JOIN` vs `INNER JOIN`.
- Not indexing large tables.