Saturday 4:30 p.m.–4:50 p.m. in Terrace
Diving into Open Data with IPython Notebook & pandas
Julia Evans
- Audience level: Intermediate
Description
I'll walk you through Python's best tools for getting a grip on some new open data: IPython Notebook and pandas. I'll show you how to read in data, clean it up, graph it, and draw some conclusions, using some open data about the number of cyclists on Montréal's bike paths as an example.
Abstract
The idea is to take a sample data set (counts of cyclists on Montréal's bike paths) and:
- clean up the data (fix date formatting issues, remove null values, ...)
- graph the data
- download some weather data from the weather office website and look at the correlation between weather and number of people biking
- do some aggregation to find out how many people bike on Mondays vs Wednesdays
- talk about possible directions to take the project (make a model using scikit-learn or PyMC!)
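
As a rough illustration of the steps above, here is a minimal pandas sketch. It is not the talk's actual notebook: the inline CSV rows are made up for the example (they are not real Montréal counts), and the column names (`Path A`, `Mean Temp (C)`), the `;` separator, and the day-first date format are all assumptions about what such files might look like.

```python
import io

import pandas as pd

# Made-up sample rows, NOT real Montréal data. The column names, the ';'
# separator and the day-first date format are assumptions for illustration.
BIKES_CSV = """Date;Path A
01/01/2012;35
02/01/2012;83
03/01/2012;
04/01/2012;144
05/01/2012;197
06/01/2012;146
07/01/2012;98
08/01/2012;95
"""

# Made-up daily mean temperatures, standing in for a weather office download.
WEATHER_CSV = """Date,Mean Temp (C)
2012-01-01,-3.5
2012-01-02,1.0
2012-01-03,-8.0
2012-01-04,2.5
2012-01-05,4.0
2012-01-06,3.0
2012-01-07,0.5
2012-01-08,-1.0
"""


def load_bikes(text):
    """Read the cyclist counts, fix the date format, and drop null rows."""
    df = pd.read_csv(io.StringIO(text), sep=";")
    # Dates are written day-first, so parse them with an explicit format.
    df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%Y")
    return df.set_index("Date").dropna()


def load_weather(text):
    """Read the weather table, indexed by date."""
    return pd.read_csv(io.StringIO(text), parse_dates=["Date"], index_col="Date")


bikes = load_bikes(BIKES_CSV)
weather = load_weather(WEATHER_CSV)

# Graphing is one line in a notebook (requires matplotlib):
# bikes["Path A"].plot()

# Line the two tables up by date and measure how the counts track temperature.
combined = bikes.join(weather, how="inner")
correlation = combined["Path A"].corr(combined["Mean Temp (C)"])

# Aggregate by day of the week (0 = Monday, 2 = Wednesday, 6 = Sunday).
by_weekday = bikes.groupby(bikes.index.dayofweek)["Path A"].sum()

print(by_weekday)
print("correlation with temperature:", round(correlation, 2))
```

With real data, the inline strings would be replaced by the downloaded files, and `by_weekday.loc[0]` and `by_weekday.loc[2]` would give the Monday and Wednesday totals to compare.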