Funny, I just started doing similar work with the data a few days ago. It's definitely an interesting dataset that's been fun to play with.
From the privacy side of things, and to prove a point, I've been working on de-anonymizing the data for the past few days by comparing it against a few other datasets from FOIA and data.cityofchicago.org. So far it's been surprisingly easy to find the plate and driver info. The sort of myopic approach to privacy with these large datasets is pretty shocking. (I don't plan on making the code/results public, except maybe to Chicago's own data folk.)