Before class
Reading
- "The Waste Lands" by John Emshwiller and Jeremy Singer-Vine, The Wall Street Journal, 2013. Presented by Julia Harte.
- "Top Secret America" by Dana Priest and William M. Arkin, The Washington Post, 2010. Presented by Daniel Drepper.
Due Monday, March 10 (enjoy the nice weekend!) Get as far as you can on analyzing the table of service calls in New York City for January and February 2014. Expect to have to use SELECT, GROUP BY, ORDER BY and WHERE clauses in your queries. Try to write a lede of a story about what New Yorkers were complaining about in Mayor DeBlasio's first two months in office. Be sure to send us a) your lede, and b) a document showing the SQL for the key queries you wrote. We don't need all of them, but there may be several that contribute to your lede. Turn something in, even if you can't get very far. Without it, we can't help you.
NOTE: I made a mistake on the original practice database, which only had one day, not two months' worth of complaints. Because of the size, I've created another download that you can import into your practice database the same way you did before, this time called janrequests.sql. It contains only two weeks' worth of complaints, from Mayor DeBlasio's first two full weeks in office. If you don't remember how to import, go back to the video and skip to the very end. It'll show you the screens.
In class
Resources
MySQL videos, practice data and tip sheets
- Video on installing MySQL on your Macbooks. This includes instructions on how to load the practice data onto your computer. I had trouble starting the SQL Server until I rebooted. Others had trouble with the 10.7 version but found the 10.6 version OK.
- The practice data in the form of a MySQL dump file. As of Sunday morning, it includes the corrected version of New York City service requests, called janrequests.
- Tips for working with MySQL Workbench, including how to get rid of the 1,000-record limit.
- A cheat sheet for how to use MySQL for filtering. If you want to follow along, practice importing the self-contained dump file into your practice database from this file. It has both of the tables that are used in the cheat sheet.
- A cheat sheet for using MySQL for summing and counting using group-by queries.
Other tools for the week
- Open Refine. This tool helps you cluster records in dirty data and perform rudimentary analysis. For more information, watch these screencasts.
- SQLite Manager, a plugin for the Firefox that allows you to use your browser is a tool for managing SQLite databases (includes a tool for importing CSV files)
Troy Thibodeaux's gentle introduction to SQL using SQLite
- Part 1 and Part 2. (Don't worry about the creation of databases for now -- start with selecting)
- Practice dataset #1, csv of independent expenditures from Federal Elections Commission for 2014 elections so far
Other SQL tip sheets
Structured Query Language is such a standard way to look things up in databases that there are a gizillion tutorials already out in the wild. Being with W3Schools.com. There are slightly different flavors of SQL depending on which software you use, but in general they all have the same features.
There are good IRE tip sheets for Microsoft Access, but none of you have that available to you on your Macs. That said, the tip sheets give you a good overview of WHY you want to graduate from a spreadsheet, and the kinds of things a "relational" database can do that are not ideal in spreadsheets.