Databases & Python Where is my mind?
Last Time: types of databases; basics of the relational model; introduction to SQL
Our database for today:

artists (Model: Artists; has_many albums)
    id            int, pk
    name          varchar

albums (Model: Album; has_one artist, has_many songs)
    id            int, pk
    artist_id     int, fk
    name          varchar

songs (Model: Song; has_one album)
    id            int, pk
    album_id      int, fk
    title         varchar
    track_number  int
    date_added    timestamp
    total_time    int
Most programming languages handle databases the same way:
1. Connect to the database
2. Create some sort of “cursor” or “statement”
3. Construct a query
4. Ask the cursor to execute the query
5. Interact with the results (typically using iteration)
Python follows this pattern:
    import sqlite3
    conn = sqlite3.connect('itunes.db')
    cursor = conn.cursor()
    query = "select name from artists"
    results = cursor.execute(query)
    for r in results:
        print(r)
$ python all.py
(u'Sneaker Pimps',)
(u'Arlo Guthrie',)
(u'Dropkick Murphys',)
....
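The five steps can be tried without the itunes.db file. The sketch below assumes an in-memory database and a three-row artists table created only for illustration; the rows are sample data, not the real library.

```python
import sqlite3

# Step 1: connect. ":memory:" is an assumption standing in for itunes.db.
conn = sqlite3.connect(":memory:")

# Illustrative setup only: a tiny artists table with sample rows.
conn.execute("create table artists (id integer primary key, name varchar)")
conn.execute(
    "insert into artists (name) values "
    "('Sneaker Pimps'), ('Arlo Guthrie'), ('Dropkick Murphys')"
)

cursor = conn.cursor()                          # Step 2: create a cursor
query = "select name from artists order by id"  # Step 3: construct a query
results = cursor.execute(query)                 # Step 4: execute it
names = [row[0] for row in results]             # Step 5: iterate over the rows
print(names)

conn.close()
```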
http://docs.python.org/library/sqlite3.html
Other ways to get the results:
    results = cursor.execute(query)
    # either fetch every remaining row as a list:
    all_results = results.fetchall()
    # or fetch one row at a time:
    first_result = results.fetchone()
    another_result = results.fetchone()
    # if no more results:
    another_result is None  # true!
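How fetchone() and fetchall() share one cursor position is easier to see on a small example. This is a sketch using an in-memory database with illustrative sample rows (an assumption, not the itunes.db data).

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: stand-in for itunes.db
conn.execute("create table artists (id integer primary key, name varchar)")
conn.execute(
    "insert into artists (name) values "
    "('Sneaker Pimps'), ('Arlo Guthrie'), ('Dropkick Murphys')"
)

cursor = conn.cursor()
cursor.execute("select name from artists order by id")

first = cursor.fetchone()      # the first row, as a tuple
rest = cursor.fetchall()       # every row not yet consumed
after_end = cursor.fetchone()  # nothing left, so this is None

conn.close()
```

Both methods consume rows, so fetchall() returns only what earlier fetchone() calls have not already taken.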
Each row is represented by a tuple:
    first_result = results.fetchone()
    print(first_result)     # prints (u'Sneaker Pimps',)
    print(first_result[0])  # prints Sneaker Pimps
By setting up the connection differently, we can treat rows as dictionaries:
    import sqlite3
    conn = sqlite3.connect('itunes.db')
    conn.row_factory = sqlite3.Row
    cursor = conn.cursor()
    query = "select name from artists"
    results = cursor.execute(query)
    for r in results:
        print(r['name'])
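A sqlite3.Row still behaves like a tuple, so both styles of access work on the same row. A minimal sketch, assuming an in-memory database with one illustrative artist:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: stand-in for itunes.db
conn.row_factory = sqlite3.Row      # must be set before the cursor is created
conn.execute("create table artists (id integer primary key, name varchar)")
conn.execute("insert into artists (name) values ('Sneaker Pimps')")

row = conn.execute("select id, name from artists where id = 1").fetchone()

by_name = row["name"]   # dictionary-style access by column name
by_index = row[1]       # tuple-style access by position still works
columns = row.keys()    # column names, in select order

conn.close()
```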
Building SQL queries dynamically:
    query = "select name from artists"
Not too realistic; usually, we’ll want a particular artist:
    query = "select id from artists where name = 'Pixies'"
Of course, we don’t usually want to hard-code names like that:
    band_name = 'Pixies'
    query = "select id from artists where name = '%s'" % band_name
What’s wrong here?
    band_name = "Old 97's"
    query = "select id from artists where name = '%s'" % band_name
If you’re lucky, the query will just error out...
steven% python query.py
query was: select id from artists where name = 'Old 97's'
Traceback (most recent call last):
  File "all.py", line 12, in <module>
    results = cursor.execute(query)
sqlite3.OperationalError: near "s": syntax error
steven%
If you’re unlucky, it could be much worse:
steven% python query.py
select id from artists where name = 'old'); drop table artists; --'
steven%
http://xkcd.com/327/
So, what should we do instead? Use placeholders:
    band_name = 'Pixies'
    query = "select id from artists where name = ?"
    # parameters are passed as a sequence, so a single value needs a one-element tuple
    results = cursor.execute(query, (band_name,))
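To see that placeholders handle both troublesome inputs from the earlier slides, here is a small sketch. The helper name find_artist_id and the in-memory sample table are assumptions made for illustration.

```python
import sqlite3

def find_artist_id(conn, band_name):
    """Return the id of the named artist, or None if there is no such row.

    Hypothetical helper: the value is bound through a placeholder, so
    quotes inside band_name are treated as data, never as SQL.
    """
    row = conn.execute(
        "select id from artists where name = ?", (band_name,)
    ).fetchone()
    return row[0] if row is not None else None

conn = sqlite3.connect(":memory:")  # assumption: stand-in for itunes.db
conn.execute("create table artists (id integer primary key, name varchar)")
conn.executemany(
    "insert into artists (name) values (?)",
    [("Pixies",), ("Old 97's",)],
)

old97_id = find_artist_id(conn, "Old 97's")  # the apostrophe is harmless
injected = find_artist_id(conn, "old'); drop table artists; --")
remaining = conn.execute("select count(*) from artists").fetchone()[0]
```

The injection string is simply compared against the name column; it matches nothing and the table is untouched.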
# Larger example, from the Python sqlite3 docs:
    for t in [('2006-03-28', 'BUY', 'IBM', 1000, 45.00),
              ('2006-04-05', 'BUY', 'MSOFT', 1000, 72.00),
              ('2006-04-06', 'SELL', 'IBM', 500, 53.00),
             ]:
        c.execute('insert into stocks values (?,?,?,?,?)', t)
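The same inserts can also be written with executemany(), which takes the statement once and a sequence of parameter tuples. A sketch, assuming an in-memory database and a stocks table whose column names are chosen here for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: throwaway database
c = conn.cursor()
# Assumed table layout matching the five values in each tuple.
c.execute(
    "create table stocks (date text, trans text, symbol text, qty real, price real)"
)

trades = [
    ("2006-03-28", "BUY", "IBM", 1000, 45.00),
    ("2006-04-05", "BUY", "MSOFT", 1000, 72.00),
    ("2006-04-06", "SELL", "IBM", 500, 53.00),
]
# One call binds each tuple to the five placeholders in turn.
c.executemany("insert into stocks values (?,?,?,?,?)", trades)
conn.commit()

row_count = c.execute("select count(*) from stocks").fetchone()[0]
```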
Challenge: Find all albums by the Pixies
1. Find the artist id
2. Find all albums with that artist id
Challenge: Find all albums by the Pixies
    import sqlite3
    conn = sqlite3.connect('itunes.db')
    cursor = conn.cursor()
    artist_query = "select id from artists where name = 'Pixies'"
    results = cursor.execute(artist_query)
    r = results.fetchone()
    artist_id = r[0]
What if I don’t have anything by the Pixies?
    album_query = "select * from albums where artist_id = ?"
    albums = cursor.execute(album_query, (artist_id,))
    for a in albums:
        print(a[2])  # third column is name
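Putting the two steps together, and answering the "what if I don't have anything by the Pixies?" question, might look like the sketch below. The function name albums_by_artist and the sample rows are assumptions; the key point is that fetchone() returns None when no artist matches, so indexing it directly would raise a TypeError.

```python
import sqlite3

def albums_by_artist(conn, artist_name):
    """Two-step lookup: artist id first, then that artist's album names."""
    row = conn.execute(
        "select id from artists where name = ?", (artist_name,)
    ).fetchone()
    if row is None:
        # No such artist: return an empty list instead of failing on row[0].
        return []
    artist_id = row[0]
    rows = conn.execute(
        "select name from albums where artist_id = ? order by name",
        (artist_id,),
    )
    return [r[0] for r in rows]

conn = sqlite3.connect(":memory:")  # assumption: stand-in for itunes.db
conn.executescript("""
    create table artists (id integer primary key, name varchar);
    create table albums (id integer primary key, artist_id integer, name varchar);
    insert into artists (id, name) values (1, 'Pixies'), (2, 'Arlo Guthrie');
    insert into albums (artist_id, name) values
        (1, 'Death to the Pixies'), (1, 'Bossa Nova'), (1, 'Surfer Rosa'),
        (2, 'Alice''s Restaurant');
""")

pixies_albums = albums_by_artist(conn, "Pixies")
nobody_albums = albums_by_artist(conn, "Nobody In Particular")
```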
Or, we could do it in one query using an inner join:
    select alb.name
    from artists art
    inner join albums alb on art.id = alb.artist_id
    where art.name = 'Pixies';

    name
    Death to the Pixies
    Bossa Nova
    Surfer Rosa
Additional important SQL tricks: Table aliasing:
    select alb.name
    from albums alb, artists art
    where art.name = 'Pixies'
    and alb.artist_id = art.id
Additional important SQL tricks: Column aliasing:
    select alb.name as album_name
    from albums alb, artists art
    where art.name = 'Pixies'
    and alb.artist_id = art.id

    for r in results:
        print(r['album_name'])
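Table aliases, a column alias, the inner join, and a placeholder can all be combined in one query run from Python. This is a sketch with an in-memory database and illustrative rows; reading r["album_name"] relies on the connection using sqlite3.Row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: stand-in for itunes.db
conn.row_factory = sqlite3.Row      # lets us read columns by their alias
conn.executescript("""
    create table artists (id integer primary key, name varchar);
    create table albums (id integer primary key, artist_id integer, name varchar);
    insert into artists (id, name) values (1, 'Pixies'), (2, 'Arlo Guthrie');
    insert into albums (artist_id, name) values
        (1, 'Bossa Nova'), (1, 'Surfer Rosa'), (2, 'Alice''s Restaurant');
""")

query = """
    select alb.name as album_name
    from artists art
    inner join albums alb on art.id = alb.artist_id
    where art.name = ?
    order by alb.name
"""
album_names = [r["album_name"] for r in conn.execute(query, ("Pixies",))]
print(album_names)
```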
Additional important SQL tricks: Ordering:
    select name from artists order by name asc
    select title from songs order by date_added desc
Additional important SQL tricks: Limiting the number of results:
    select title, total_time
    from songs
    order by total_time desc
    limit 5
“Five longest songs in database.”
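Ordering and limiting can be wrapped in a small function so the number of results is a parameter. A sketch, assuming a cut-down songs table with only title and total_time and made-up track names; the helper name longest_songs is hypothetical. SQLite accepts a placeholder in the limit clause.

```python
import sqlite3

def longest_songs(conn, n):
    """Return (title, total_time) for the n longest songs, longest first."""
    query = """
        select title, total_time
        from songs
        order by total_time desc
        limit ?
    """
    return conn.execute(query, (n,)).fetchall()

conn = sqlite3.connect(":memory:")  # assumption: stand-in for itunes.db
conn.execute(
    "create table songs (id integer primary key, title varchar, total_time int)"
)
conn.executemany(
    "insert into songs (title, total_time) values (?, ?)",
    [("Track 1", 100), ("Track 2", 300), ("Track 3", 200),
     ("Track 4", 500), ("Track 5", 400), ("Track 6", 600)],
)

top_five = longest_songs(conn, 5)
```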
Additional important SQL tricks: Wildcards:
    select name from artists where name like 'A%'
“All artists whose name starts with ‘A’”
    select name from albums where name like '%blues%';
All albums with “blues” in the title...
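Wildcard searches work with placeholders too: the % signs go into the bound value, not into the SQL text. A sketch with made-up album names and a hypothetical helper, albums_containing. By default SQLite's like is case-insensitive for ASCII letters, which is why 'blues' matches 'Blues' here.

```python
import sqlite3

def albums_containing(conn, word):
    """Return album names that contain word anywhere in the name."""
    pattern = "%" + word + "%"  # wildcards are part of the parameter value
    rows = conn.execute(
        "select name from albums where name like ? order by name",
        (pattern,),
    )
    return [r[0] for r in rows]

conn = sqlite3.connect(":memory:")  # assumption: stand-in for itunes.db
conn.execute("create table albums (id integer primary key, name varchar)")
conn.executemany(
    "insert into albums (name) values (?)",
    [("Delta Blues Sampler",), ("Blues at Midnight",), ("Quiet Evening",)],
)

blues_albums = albums_containing(conn, "blues")
```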
Additional important SQL tricks: Aggregate functions:
    select count(id) from albums where artist_id = 278

    select count(albums.id)
    from albums
    inner join artists on albums.artist_id = artists.id
    where artists.name = 'Pixies'
Additional important SQL tricks: Aggregate functions:
    select count(albums.id) as c, artists.name
    from albums
    inner join artists on albums.artist_id = artists.id
    group by artists.id
    limit 3

    name              count
    Sneaker Pimps     1
    Arlo Guthrie      4
    Dropkick Murphys  3
Additional important SQL tricks: Aggregate functions:
    select count(albums.id) as c, artists.name
    from albums
    inner join artists on albums.artist_id = artists.id
    group by artists.id
    order by c desc
    limit 3

    name              count
    Ani DiFranco      17
    León Gieco        12
    Bad Religion      12
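The grouped count can be run from Python in the same way as the earlier queries. The sketch below uses placeholder artist and album names in an in-memory database. It also adds a secondary sort on the artist name, which is an addition to the query above: with order by c desc alone, the relative order of artists with equal counts is not guaranteed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: stand-in for itunes.db
conn.executescript("""
    create table artists (id integer primary key, name varchar);
    create table albums (id integer primary key, artist_id integer, name varchar);
    insert into artists (id, name) values
        (1, 'Artist A'), (2, 'Artist B'), (3, 'Artist C'), (4, 'Artist D');
    insert into albums (artist_id, name) values
        (1, 'A1'), (1, 'A2'),
        (2, 'B1'),
        (3, 'C1'), (3, 'C2'), (3, 'C3'),
        (4, 'D1');
""")

query = """
    select artists.name, count(albums.id) as c
    from albums
    inner join artists on albums.artist_id = artists.id
    group by artists.id
    order by c desc, artists.name asc   -- name breaks ties (added here)
    limit 3
"""
top_three = conn.execute(query).fetchall()
print(top_three)
```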