SQL Notes
Basics
-
Create and use the database (MySQL as an example)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
CREATE DATABASE gregs_list; USE gregs_list; CREATE TABLE doughnut_list ( doughnut_id INT NOT NULL AUTO_INCREMENT, doughnut_name VARCHAR(10) DEFAULT NULL, doughnut_type VARCHAR(8) DEFAULT NULL, PRIMARY KEY (doughnut_id), FOREIGN KEY (doughnut_type) REFERENCES foo(doughnut_type) ); -- Variable Char, up to 10 and 8 chars. DESC doughnut_list; -- Describe the table. -- DROP TABLE doughnut_list; -- Delete the table. INSERT INTO doughnut_list (doughnut_name, doughnut_type) VALUES ('hello', 'world'); -- The order doesn't matter, but they must match -- We can also insert multiple rows at onces (...), (...) SELECT * FROM doughnut_list; -- Show the table. * means all columns. SELECT doughnut_name FROM doughnut_list; -- Just show the name column SELECT * FROM doughnut_list WHERE doughnut_name = 'hello' AND doughnut_type = 'world'; -- Show the table selectively. = instead of == -- Create a copy of a table CREATE TABLE foo_copy AS SELECT * FROM foo;
-
For
INSERT
, there are three variations from above:- Changing the order of the columns, as long as the values match them in the same order.
- Omitting column names altogether, as long as we add all the values in the same order.
- Leaving some columns out, as long as the values we’re adding match the columns.
-
\'
or''
to insert data with the single quote. -
Single line comments:
--
; multi-line comments:/**/
. -
In the terminal:
1 2 3 4
mysql -u root -p ... show databases; use foo; show tables;
-
We can
INSERT
DEFAULT
andNULL
as well. -
INSERT INTO foo SELECT * FROM bar
is also possible. -
Convert data types with
CAST(x AS foo)
andCONVERT(x, foo)
. -
FORMAT(number, decimal)
formats the number. e.g.FORMAT(2.3401, 2)
gives 2.34. -
CHAR(9)
is tab,CHAR(10)
is line feed,CHAR(13)
is carriage return.CHAR(13, 10)
is CRLF. -
An index speeds up the queries by letting the DB go directly to a row instead of doing a full scan. MySQL creates indexes for primary keys, foreign keys, and unique keys by default. Avoid creating indexes on columns that are updated frequently as it slows down the insert, update, and delete.
1
CREATE INDEX invoice_date_index ON invoices (invoice_date);
-
DECIMAL(5, 2)
means 5 digits in total with 2 decimal points inside (e.g. 123.45).
SELECT
-
Both
<>
and!=
mean not equal. -
Use
IS NULL
to select null ones:WHERE doughnut_name IS NULL;
. -
WHERE location LIKE '%CA';
selects all the values that end with CA.%
replaces anything while_
replaces one letter. -
WHERE value BETWEEN 1 AND 10;
. -
WHERE language IN ('Java', 'Python', 'C', 'C++');
is equivalent to usingOR
. Plus, we haveWHERE language NOT IN ('Java', 'Python', 'C', 'C++');
. -
NOT
goes right afterWHERE
when used withBETWEEN
andLIKE
:WHERE NOT location LIKE '%CA';
WHERE NOT value BETWEEN 1 AND 10;
. -
ORDER BY foo DESC, bar;
can be added to the end (DESC means descending.) -
Select from two tables:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
SELECT o.OrderID, c.CustomerName, c.OrderDate FROM Orders o JOIN Customers c ON o.CustomerID=c.CustomerID; -- Self-join is also possible -- join a table from another db: foo.table -- Or do the implicit join SELECT o.OrderID, c.CustomerName, c.OrderDate FROM Orders o, Customers c WHERE o.CustomerID=c.CustomerID; -- Or with USING SELECT foo, bar FROM a JOIN b USING (same_column); -- Or natural join (based on columns with same names) SELECT foo, bar FROM a NATURAL JOIN b; -- Cross join creates the Cartesian product SELECT foo, bar FROM a CROSS JOIN b;
-
LIMIT 100
limits the number of rows to 100.LIMIT 2, 3
gets 3 rows that start with the third row (rows are also 0-index based). -
We can also use
AS
likeSELECT invoice - payment - credit AS balance_due
andSELECT CONCAT(first_name, ' ', last_name) AS full_name
. Simple alias are also possibleSELECT foo AS bar
. -
Use
/
for division andDIV
for integer division. Use%
orMOD
for modulo. -
LEFT
,DATE_FORMAT
, andROUND
:1 2 3 4 5 6 7 8
-- LEFT for string prefixes SELECT first_name, last_name, CONCAT(LEFT(first_name, 1), LEFT(last_name, 1)) AS initials FROM foo; SELECT invoice_date, DATE_FORMAT(invoice_date, '%m/%d/%y') AS 'MM/DD/YY', DATE_FORMAT(invoice_date, '%e-%b-%Y') AS 'DD-Mon-YYYY' FROM invoices ORDER BY invoice_date; -- Round to the nearest integer ROUND(total) -- Round to one decimal place ROUND(total, 1)
-
SELECT DISTINCT foo
eliminates duplicates. -
LIKE '%foo'
matches zero or more chars andLIKE '-foo'
matches one char. -
Union.
1 2 3 4 5 6 7 8
-- No duplicates SELECT column_name(s) FROM table1 UNION SELECT column_name(s) FROM table2; -- Allow duplicates SELECT column_name(s) FROM table1 UNION ALL SELECT column_name(s) FROM table2;
DELETE
-
DELETE FROM doughnut_list WHERE doughnut_type = 'bar';
. -
DELETE FROM doughnut_list;
deletes the entire table. -
Use
SELECT
first to make sure what we’re deleting is correct.
UPDATE
-
UPDATE doughnut_list SET doughnut_type = 'barbar' WHERE doughnut_type = 'bar';
. -
Without
WHERE
, every possible one will be updated. -
Use comma to separate different columns after
SET
. -
It can do basic math:
SET cost = cost + 1
.
ALTER
-
ALTER TABLE doughnut_list ADD COLUMN doughnut_id INT NOT NULL AUTO_INCREMENT FIRST, ADD PRIMARY KEY (doughnut_id);
-
FIRST
,SECOND
,LAST
,BEFORE foo
,AFTER foo
can be used. -
ALTER TABLE doughnut_list RENAME TO foo;
. -
ALTER TABLE doughnut_list CHANGE COlUMN foo bar INT;
changes foo to bar with int. -
ALTER TABLE doughnut_list MODIFY COLUMN foo VARCHAR(10);
changes the data type. We can also doFIRST
orLAST
. -
ALTER TABLE doughnut_list DROP COLUMN foo;
deletes a column.
Summary Queries
-
AVG()
,SUM()
,MIN()
,MAX()
,COUNT()
are aggregate functions. -
GROUP BY foo_id HAVING AVG(bar) > 42
collapses each group into one row.HAVING
does the search after the aggregating, whereasWHERE
does it before grouping. -
WITH ROLLUP
adds a summary roll at the end of eachGROUP
. -
GROUPING
returns 1 if null because of being in a summary row fromWITH ROLLUP
. Otherwise (in a grouped row), return 0. -
Aggregate window functions don’t collapse each group into one row. An empty
OVER()
treats the entire result set as a partition.ORDER BY
makes the values cumulative. -
Frames with
ROWS
andRANGE
-
Named windows can also be used
WINDOW vendor_window AS (PARTITION BY vendor_id)
. And thenOVER vendor_window
is possible.
Subqueries
-
Example.
1 2
SELECT foo FROM bar WHERE foo_id IN (SELECT foo_id FROM foobar WHERE x = 'x');
-
ANY
,SOME
,ALL
can be used. E.g.,WHERE x > ALL (...)
. -
Correlated subqueries are executed once for each row in the main query, whereas uncorrelated subqueries are run only once.
1 2 3 4 5 6
-- Correlated subquery SELECT vendor_id, invoice_number, invoice_total FROM invoices i WHERE invoice_total > (SELECT AVG(invoice_total) FROM invoices WHERE vendor_id = i.vendor_id); -- Correlated subquery with EXISTS. EXISTS tests if the subquery returns a result set. SELECT vendor_id, vendor_name, vendor_state FROM vendors WHERE NOT EXISTS (SELECT * FROM invoices WHERE vendor_id = vendors.vendor_id);