Open data: Transport Canada vehicle recalls

By Leslie Young

There’s a new addition to the federal government’s data.gc.ca portal: vehicle recalls from Transport Canada.

I filed an Access to Information request for this dataset a while ago (see the interactive app I built with it here). After the story was published, I contacted the Treasury Board’s data.gc.ca people to let them know that this data should be publicly available for download. They replied that they agreed.

And now it’s up. There are two sets: one for all recalls, and one for those within the last sixty days. The former is updated monthly, the latter daily.

We had already uploaded the dataset to our own Open Data page, but it’s good to have updated versions coming out.

It’s certainly useful stuff and just the kind of thing that belongs on an open data site.

Tell the government about your apps!

By Leslie Young

I blogged recently about an online application I built using Transport Canada’s Road Safety Recalls Database. The underlying data for my story was already available online, but not easily available for download.

I called their media department to try to obtain a copy of the database that was powering their already public recall search tool, but was told that this was impossible. I ended up filing an ATIP request for the information. This is really overkill when the information is already public, just not in a convenient format… but it seemed the only way to get a copy of the database. 

After we released our story on vehicle recalls, I received a tweet suggesting I submit my app to the federal government’s open data portal. They have a form that lets you tell them how you used data.gc.ca data. Here is what I wrote:

Application Name : Interactive: Vehicle recalls

URL : http://www.globalnews.ca/Pages/storyFullWidth.aspx?id=6442664891

Description : This is an interactive web application built using Transport Canada’s Road Safety Recalls Database. It allows users to look up and compare recall information on different vehicles. This data was not on data.gc.ca. Although a searchable version of the database is on Transport Canada’s website, I was required to file an ATIP request for a copy of the data. I would argue that this data should be made available on data.gc.ca, as it would allow this kind of application to be constantly updated with the latest recall information, rather than just be a snapshot. There are many similar databases on GoC websites, which are public and searchable, but do not allow users to download the raw data. These should be added to the open data site.

The next day, I received this response:

Good afternoon,

Thank you for contacting the Government of Canada’s Open Data Portal.

We would like to thank you for bringing this application to our attention. We agree that this data, as well as its application, should be made available on the Open Data Portal.

After receiving your email this morning, we contacted Transport Canada to suggest that they provide the raw data on the portal. We anticipate that this dataset will be available on the Government of Canada’s Open Data Portal in the future.

Thank you,

Open Data

Government of Canada

Based on this experience, I would suggest to people that they write the Open Data site and share their apps, and even suggest new datasets. It seems they read their email, and maybe we’ll start seeing useful data appearing on the site.

In the meantime, you can download my copy of the vehicle recalls data here.

Interactive vehicle recall data

By Leslie Young

I recently put together an interactive feature that lets users explore vehicle recall data and compare recalls on different cars. The full news story is here, with a nice TV report by Global Toronto’s Sean O’Shea.

The data for this is drawn from Transport Canada’s Road Safety Recalls Database, a searchable version of which is available on their website, although I had to file an Access to Information request to get the underlying data.

What excited me about this project is that it’s truly interactive. What you see on the screen completely changes depending on what you type. Everyone will have a different experience because everyone will want to search and compare different vehicles. Hopefully, it’s both informative and fun.

I used Tableau Public to power this feature. I’m liking this tool a lot lately because of the way you can link a variety of different visualizations together to really explore a given topic.

You do this by creating filters and applying them to multiple charts - in this case, I created some “Quick filters” and set them to “Global.”

It’s not perfect, though. Tableau is powerful, but far from intuitive to use. This project took a lot of trial and error, plus email exchanges with Tableau staff, to put together. The free Public version also has a limit of 100,000 rows, which meant that I had to summarize my data and show only recalls from 1990 and later.
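For anyone running into the same row limit, here is a minimal sketch in Python of that kind of summarizing. It is not the process I used for this feature, just one way it could be done. The column names (`year`, `make`, `model`) and the sample rows are made up for illustration and are not taken from the Transport Canada file.

```python
from collections import Counter

ROW_LIMIT = 100_000  # row cap mentioned above for the free Public version
FIRST_YEAR = 1990    # cutoff year mentioned above


def summarize(rows, first_year=FIRST_YEAR):
    """Collapse per-recall rows into one row per (year, make, model) with a count.

    `rows` is an iterable of dicts; the keys are hypothetical column names.
    """
    counts = Counter()
    for row in rows:
        year = int(row["year"])
        if year < first_year:
            continue  # drop anything older than the cutoff
        counts[(year, row["make"], row["model"])] += 1
    return [
        {"year": year, "make": make, "model": model, "recalls": n}
        for (year, make, model), n in sorted(counts.items())
    ]


# Invented sample rows, for illustration only.
sample = [
    {"year": "1988", "make": "Ford", "model": "Escort"},
    {"year": "1995", "make": "Ford", "model": "Escort"},
    {"year": "1995", "make": "Ford", "model": "Escort"},
    {"year": "2001", "make": "Honda", "model": "Civic"},
]

summary = summarize(sample)
fits = len(summary) <= ROW_LIMIT  # check the result against the row cap
```

The trade-off is that the summarized table only carries counts, so any per-recall detail you want to show has to be kept as an extra grouping column, which pushes the row count back up.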

Still, pretty fun. Last week, I also did a similar visualization (under the hood at least) on MPs’ expenses.

Freedom of Information and the PDF

By Leslie Young

PDFs are a fact of life in data journalism. Most of the time, when you request “an electronic file” or “an electronic document” in your Freedom of Information request, the end result is a PDF.

While it’s a step up from simply getting a stack of papers as your response, a PDF response is annoying in a number of ways. It’s hard to work with. Unlike a spreadsheet format (Excel, .CSV, etc.), you can’t analyze a PDF. You often can’t copy and paste or export the data – sometimes Acrobat won’t even recognize the document as text!

Some departments do respond with an Excel file. Some are even nice enough, when I ask, to email me the Excel version of the PDF response that they had previously sent me. They don’t have to do this, and I really appreciate it.

But it seems like the default electronic file format is the PDF, which means that I will spend hours trying to force the information into a friendlier format. It doesn’t stop me from doing the story, it just makes it more difficult.

So why do many departments seem to favour the PDF? I decided to ask the Treasury Board Secretariat, the body charged with administering the federal government’s Access to Information legislation.

Here’s what they said.

Question:

I would like to know why, when a requester requests an electronic document, the response is usually provided as a PDF.

Why do departments seem to prefer releasing information as PDFs instead of a more open electronic file format, such as an Excel spreadsheet? This is particularly relevant in the case of a request for information from a database, which since it’s a table filled with numbers, would be more useful to a journalist in an Excel or other format.

Answer:

Our government is committed to openness and transparency which is why we are pursuing the Open Government initiative that will continue to make government data freely available, and currently requires all completed ATI summaries to be posted online within 30 calendar days of being readied.  Current Access to Information regulations direct departments to provide information in the format requested wherever possible, and our government continues to update and add to the already hundreds of thousands of data sets and the amount of information available to Canadians online in various formats.  Where alternative formats are not available or suitable, the government will respond with a pdf version in order to ensure that requests for information are still carried out effectively.

So it seems all you need to do is ask – very specifically. It’s a valuable lesson. Next time, I will make sure to ask for a .CSV, and see what happens.

In which I find a use for Elections Canada data

By Leslie Young

During the last federal election, we heard about complaints of election irregularities, loosely defined. These included things like harassing phone calls, slashed tires and much more, largely in Guelph and some Toronto ridings.

At that time, I thought an informative little feature would be to look into how many complaints there have been over the last few elections. Maybe there were a lot of complaints, so this year wasn’t unusual. Or maybe the opposite.

So, I looked into Elections Canada’s reports. After every election, the agency issues a report that, in part, summarizes the number of complaints it received. But since you can register a complaint up to ten years after an incident, these reports don’t give a full count.

So I called Elections Canada, and they couldn’t give me the numbers. Eventually, I decided it was worthwhile to file an Access to Information request for the complaints. Unsatisfied with my first response, I filed a second, more detailed request.

This is what I got (one of three files):

Access to Information request, Elections Canada

This is a PDF document, printed and re-scanned a couple of times, then redacted, then scanned again. Un-copy-and-paste-able, un-crackable - this is data journalism hell.

Why so much redaction? According to the letter I received from Elections Canada, some information was withheld according to sections 16.3 (which gives special rules to the Chief Electoral Officer) and 19(1) (general privacy considerations) of the Access to Information Act.

Also, they write that “the Office of the Commissioner of Canada Elections must be able to maintain the confidentiality of its files in order to protect the presumption of innocence”. They go on to write, “the exemptions are applied to protect against the risk that a specific allegation might mistakenly be viewed as substantiated or used for political purposes.”

Looking at what I had, and unable to do any detailed analysis on my documents, I sat on this for months. I had no idea what to do with it.

Until, of course, the fraudulent robocalls story hit the news. Then, I went through my file with DocumentCloud, printed it out, got out my highlighter and counted through every mention of telephone calls by hand.

I found 30 complaints that referred to calls saying that polling stations had been moved.
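I did this count by hand. If the documents had come as usable text, a keyword tally along these lines could have produced a first pass. This is only a sketch under that assumption: the patterns are my own guesses at useful search terms, the sample complaints are invented, and anything it flags would still need to be read by a person.

```python
import re

# Hypothetical search patterns; tune them to the wording in the actual documents.
CALL = re.compile(r"\b(?:tele)?phone|\bcall(?:s|ed|er|ers)?\b", re.IGNORECASE)
POLL = re.compile(r"\bpoll(?:ing)?\b", re.IGNORECASE)
MOVED = re.compile(r"\b(?:moved|relocated|changed)\b", re.IGNORECASE)


def mentions_moved_poll_call(text):
    """True if the text mentions a call, a poll, and a move or change."""
    return all(pattern.search(text) for pattern in (CALL, POLL, MOVED))


def count_matches(complaints):
    """Count the complaint texts that match all three patterns."""
    return sum(1 for text in complaints if mentions_moved_poll_call(text))


# Invented examples, not taken from the Elections Canada response.
sample = [
    "Received a telephone call saying my polling station had moved.",
    "Campaign signs were removed from my lawn.",
    "Caller claimed the poll location was changed to another address.",
    "Repeated late-night phone calls from a campaign office.",
]

total = count_matches(sample)
```

A tally like this will miss odd phrasings and OCR errors, so it is best treated as a way to decide which pages to read first, not as the final number.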

Since then, we’ve heard that Elections Canada has had 30,000 contacts on this issue. The story goes on.