Where is my mind?

Databases & Python: Where is my mind?

Last Time:
- Types of databases
- Basics of the relational model
- Introduction to SQL

Our database for today:

artists (Model: Artists; has_many albums)
    id            int, pk
    name          varchar

albums (Model: Album; has_one artist, has_many songs)
    id            int, pk
    artist_id     int, fk
    name          varchar

songs (Model: Song; has_one album)
    id            int, pk
    album_id      int, fk
    title         varchar
    track_number  int
    date_added    timestamp
    total_time    int

Most programming languages handle databases the same way:
1. Connect to the database
2. Create some sort of “cursor” or “statement”
3. Construct a query
4. Ask the cursor to execute the query
5. Interact with the results (typically using iteration)

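Python follows this pattern:

    import sqlite3

    conn = sqlite3.connect('itunes.db')   # 1. connect
    cursor = conn.cursor()                # 2. create a cursor
    query = "select name from artists"    # 3. construct a query
    results = cursor.execute(query)       # 4. execute it
    for r in results:                     # 5. iterate over the results
        print(r)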

    $ python all.py
    (u'Sneaker Pimps',)
    (u'Arlo Guthrie',)
    (u'Dropkick Murphys',)
    ....

http://docs.python.org/library/sqlite3.html

Other ways to get the results:

    # Either fetch every row at once...
    results = cursor.execute(query)
    all_results = results.fetchall()

    # ...or fetch one row at a time.
    results = cursor.execute(query)
    first_result = results.fetchone()
    another_result = results.fetchone()

    # if no more results:
    another_result is None  # True!

Each row is represented by a tuple:

    first_result = results.fetchone()
    print(first_result)     # prints (u'Sneaker Pimps',)
    print(first_result[0])  # prints Sneaker Pimps
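The fetch methods are easy to try out without itunes.db. The following is a minimal sketch that assumes a throwaway in-memory database with three sample artist rows; the table layout mirrors the artists table, but the setup code is an illustration, not part of the original slides.

```python
import sqlite3

# Assumption: an in-memory database stands in for itunes.db.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("create table artists (id integer primary key, name varchar)")
cursor.executemany(
    "insert into artists (name) values (?)",
    [("Sneaker Pimps",), ("Arlo Guthrie",), ("Dropkick Murphys",)],
)

query = "select name from artists order by id"

# fetchall() returns every remaining row as a list of tuples.
all_results = cursor.execute(query).fetchall()

# fetchone() returns one row per call, then None once the rows run out.
results = cursor.execute(query)
first_result = results.fetchone()
second_result = results.fetchone()
third_result = results.fetchone()
no_more = results.fetchone()

print(all_results)
print(first_result[0])   # index into the tuple to get the bare value
print(no_more is None)
conn.close()
```

Checking for None before indexing is what keeps a loop built on fetchone() from failing at the end of the result set.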

By setting up the connection differently, we can treat rows as dictionaries:

    import sqlite3

    conn = sqlite3.connect('itunes.db')
    conn.row_factory = sqlite3.Row   # rows can now be indexed by column name
    cursor = conn.cursor()
    query = "select name from artists"
    results = cursor.execute(query)
    for r in results:
        print(r['name'])
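A short sketch of what a sqlite3.Row offers, again assuming an in-memory database with one sample row rather than the real itunes.db. It shows that name-based and position-based access both work on the same row.

```python
import sqlite3

# Assumption: an in-memory database stands in for itunes.db.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row   # set before creating the cursor
cursor = conn.cursor()
cursor.execute("create table artists (id integer primary key, name varchar)")
cursor.execute("insert into artists (name) values (?)", ("Pixies",))

row = cursor.execute("select id, name from artists").fetchone()

by_name = row["name"]    # access by column name
by_index = row[1]        # positional access still works
columns = row.keys()     # column names, in select order
as_dict = dict(row)      # copy into a plain dictionary if needed

print(by_name, by_index, columns, as_dict)
conn.close()
```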

Building SQL queries dynamically:

    query = "select name from artists"

Not too realistic; usually, we’ll want a particular artist:

    query = "select id from artists where name = 'Pixies'"

Of course, we don’t usually want to hard-code names like that:

    band_name = 'Pixies'
    query = "select id from artists where name = '%s'" % band_name

What’s wrong here?

    band_name = "Old 97's"
    query = "select id from artists where name = '%s'" % band_name

If you’re lucky, the query will just error out...

    steven% python query.py
    query was: select id from artists where name = 'Old 97's'
    Traceback (most recent call last):
      File "all.py", line 12, in <module>
        results = cursor.execute(query)
    sqlite3.OperationalError: near "s": syntax error
    steven%

If you’re unlucky, it could be much worse:

    steven% python query.py
    select id from artists where name = 'old'); drop table artists; --'
    steven%

http://xkcd.com/327/

So, what should we do instead? Use placeholders:

    band_name = 'Pixies'
    query = "select id from artists where name = ?"
    # Parameters are passed as a sequence, so a single value needs a one-element tuple.
    results = cursor.execute(query, (band_name,))
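To see the difference side by side, here is a minimal sketch that runs the “Old 97’s” lookup both ways. It assumes an in-memory database with a single sample row; the variable names unsafe_query and safe_query are made up for this illustration.

```python
import sqlite3

# Assumption: an in-memory database stands in for itunes.db.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("create table artists (id integer primary key, name varchar)")
cursor.execute("insert into artists (name) values (?)", ("Old 97's",))

band_name = "Old 97's"

# Unsafe: the apostrophe in the name ends the SQL string literal early.
unsafe_query = "select id from artists where name = '%s'" % band_name
try:
    cursor.execute(unsafe_query)
    unsafe_error = None
except sqlite3.OperationalError as exc:
    unsafe_error = str(exc)

# Safe: the value travels separately from the SQL text.
safe_query = "select id from artists where name = ?"
row = cursor.execute(safe_query, (band_name,)).fetchone()

print(unsafe_error)
print(row)
conn.close()
```

With the placeholder, the driver handles quoting, so the same code works for any band name.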

    # Larger example, from the Python sqlite3 docs:
    for t in [('2006-03-28', 'BUY', 'IBM', 1000, 45.00),
              ('2006-04-05', 'BUY', 'MSOFT', 1000, 72.00),
              ('2006-04-06', 'SELL', 'IBM', 500, 53.00),
             ]:
        c.execute('insert into stocks values (?,?,?,?,?)', t)
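The same inserts can also be written without the explicit loop by using executemany(), which runs one statement once per parameter tuple. This is a sketch under assumptions: the stocks table definition below is chosen to fit the five values and is created in an in-memory database.

```python
import sqlite3

# Assumption: an in-memory database with a stocks table matching the five values.
conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute(
    "create table stocks (date text, trans text, symbol text, qty real, price real)"
)

purchases = [
    ("2006-03-28", "BUY", "IBM", 1000, 45.00),
    ("2006-04-05", "BUY", "MSOFT", 1000, 72.00),
    ("2006-04-06", "SELL", "IBM", 500, 53.00),
]

# One statement, executed once for each tuple in the list.
c.executemany("insert into stocks values (?,?,?,?,?)", purchases)
conn.commit()

row_count = c.execute("select count(*) from stocks").fetchone()[0]
print(row_count)
conn.close()
```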

Challenge: Find all albums by the Pixies

1. Find the artist id
2. Find all albums with that artist id

Challenge: Find all albums by the Pixies

    import sqlite3

    conn = sqlite3.connect('itunes.db')
    cursor = conn.cursor()
    artist_query = "select id from artists where name = 'Pixies'"
    results = cursor.execute(artist_query)
    r = results.fetchone()
    artist_id = r[0]

What if I don’t have anything by the Pixies?
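In that case fetchone() returns None, and indexing it with r[0] raises a TypeError. One way to guard against this is sketched below; find_artist_id is a hypothetical helper name, and the in-memory database with one sample artist is an assumption for the sake of a runnable example.

```python
import sqlite3


def find_artist_id(cursor, name):
    """Return the id of the named artist, or None if there is no such artist."""
    row = cursor.execute(
        "select id from artists where name = ?", (name,)
    ).fetchone()
    # fetchone() gives None when the query matched nothing.
    return row[0] if row is not None else None


# Assumption: an in-memory database stands in for itunes.db.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("create table artists (id integer primary key, name varchar)")
cursor.execute("insert into artists (name) values (?)", ("Pixies",))

pixies_id = find_artist_id(cursor, "Pixies")
missing_id = find_artist_id(cursor, "No Such Band")

print(pixies_id)
print(missing_id)
conn.close()
```

The caller can then test for None and skip the album query instead of crashing.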

    album_query = "select * from albums where artist_id = ?"
    # Parameters are passed as a sequence, even when there is only one.
    albums = cursor.execute(album_query, (artist_id,))
    for a in albums:
        print(a[2])  # third column is name

Or, we could do it in one query using an inner join:

    select alb.name
    from artists art
    inner join albums alb
    on art.id = alb.artist_id
    where art.name = 'Pixies';

    name
    Death to the Pixies
    Bossa Nova
    Surfer Rosa
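Run from Python, the join combines naturally with a placeholder for the artist name. The sketch below assumes an in-memory database populated with a few sample rows; the order by clause is added only to make the output order predictable.

```python
import sqlite3

# Assumption: an in-memory database with sample rows stands in for itunes.db.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("create table artists (id integer primary key, name varchar)")
cursor.execute(
    "create table albums (id integer primary key, artist_id int, name varchar)"
)
cursor.executemany(
    "insert into artists (id, name) values (?, ?)",
    [(1, "Pixies"), (2, "Arlo Guthrie")],
)
cursor.executemany(
    "insert into albums (artist_id, name) values (?, ?)",
    [
        (1, "Death to the Pixies"),
        (1, "Bossa Nova"),
        (2, "Alice's Restaurant"),
        (1, "Surfer Rosa"),
    ],
)

join_query = """
    select alb.name
    from artists art
    inner join albums alb on art.id = alb.artist_id
    where art.name = ?
    order by alb.id
"""
album_names = [row[0] for row in cursor.execute(join_query, ("Pixies",))]

print(album_names)
conn.close()
```

If there is nothing by the requested artist, the list is simply empty, so no separate None check is needed.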

Additional important SQL tricks: Table aliasing:

    select alb.name
    from albums alb, artists art
    where art.name = 'Pixies'
    and alb.artist_id = art.id

Additional important SQL tricks: Column aliasing:

    select alb.name as album_name
    from albums alb, artists art
    where art.name = 'Pixies'
    and alb.artist_id = art.id

The alias becomes the column name in the results:

    # requires conn.row_factory = sqlite3.Row
    for r in results:
        print(r['album_name'])

Additional important SQL tricks: Ordering:

    select name from artists order by name asc

    select title from songs order by date_added desc

Additional important SQL tricks: Limiting the number of results:

    select title, total_time from songs
    order by total_time desc
    limit 5

“Five longest songs in database.”
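The limit can itself be a placeholder, which is handy when the number of rows comes from user input. A minimal sketch, assuming an in-memory songs table with made-up titles and durations:

```python
import sqlite3

# Assumption: an in-memory database with made-up songs stands in for itunes.db.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute(
    "create table songs (id integer primary key, title varchar, total_time int)"
)
cursor.executemany(
    "insert into songs (title, total_time) values (?, ?)",
    [("Short", 120), ("Long", 480), ("Medium", 240), ("Longest", 600)],
)

how_many = 2
query = "select title, total_time from songs order by total_time desc limit ?"
longest = cursor.execute(query, (how_many,)).fetchall()

print(longest)
conn.close()
```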

Additional important SQL tricks: Wildcards:

    select name from artists where name like 'A%'

“All artists whose name starts with ‘A’”

    select name from albums where name like '%blues%';

All albums with “blues” in the title...
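Wildcard searches work with placeholders too: the % signs go into the parameter value, not into the SQL text. The sketch below assumes an in-memory albums table with made-up album names. Note that SQLite's like is case-insensitive for ASCII letters by default, and that any % or _ inside the search term would itself act as a wildcard.

```python
import sqlite3

# Assumption: an in-memory database with made-up album names.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("create table albums (id integer primary key, name varchar)")
cursor.executemany(
    "insert into albums (name) values (?)",
    [("Texas Blues Nights",), ("Kind of Green",), ("Bluesy Mornings",)],
)

search_term = "blues"
pattern = "%" + search_term + "%"   # wildcards are part of the bound value

query = "select name from albums where name like ? order by id"
matches = [row[0] for row in cursor.execute(query, (pattern,))]

print(matches)
conn.close()
```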

Additional important SQL tricks: Aggregate functions:

    select count(id) from albums where artist_id = 278

    select count(albums.id)
    from albums
    inner join artists
    on albums.artist_id = artists.id
    where artists.name = 'Pixies'

Additional important SQL tricks: Aggregate functions:

    select count(albums.id) as c, artists.name
    from albums
    inner join artists
    on albums.artist_id = artists.id
    group by artists.id
    limit 3

    name              count
    Sneaker Pimps     1
    Arlo Guthrie      4
    Dropkick Murphys  3

Additional important SQL tricks: Aggregate functions:

    select count(albums.id) as c, artists.name
    from albums
    inner join artists
    on albums.artist_id = artists.id
    group by artists.id
    order by c desc
    limit 3

    name              count
    Ani DiFranco      17
    León Gieco        12
    Bad Religion      12
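The grouped count can be tried end to end with a small sketch. It assumes an in-memory database; the album rows and their counts are placeholders invented for the example, not the figures from the real library.

```python
import sqlite3

# Assumption: an in-memory database; album titles and counts are placeholders.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("create table artists (id integer primary key, name varchar)")
cursor.execute(
    "create table albums (id integer primary key, artist_id int, name varchar)"
)
cursor.executemany(
    "insert into artists (id, name) values (?, ?)",
    [(1, "Ani DiFranco"), (2, "Pixies"), (3, "Arlo Guthrie")],
)
cursor.executemany(
    "insert into albums (artist_id, name) values (?, ?)",
    [(1, "A1"), (1, "A2"), (1, "A3"), (2, "P1"), (3, "G1"), (3, "G2")],
)

query = """
    select count(albums.id) as c, artists.name
    from albums
    inner join artists on albums.artist_id = artists.id
    group by artists.id
    order by c desc
    limit 2
"""
top_artists = cursor.execute(query).fetchall()

print(top_artists)   # each row is (album count, artist name)
conn.close()
```

Because the query selects the count first, each result tuple has the count at index 0 and the artist name at index 1.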