Wednesday, August 31, 2016

Importing CSV/TSV data: COUNT aggregate function

Refresh your data journalism skills. Let's look at San Francisco police dept incident database for the years 2003-2013 available here: data.sfgov.org. For convenience, let's use a snapshot of it stored as Google spreadsheet. It contains about 15000 records, one record per incident.

Totals
Let's create a Pie chart with totals for each crime category.

1. Login to charte.ca and create a new Pie chart.

2. In the Data panel, click "Import" (you can find it right under the data grid).

3. On wizard screen #1, paste data from the spreadsheet (we will re-use this data in all examples):


4. On Wizard screen #2, select series (crime category), category(NONE) and value (NONE) columns:


5. On Wizard screen #3 (filtering), leave all fields blank - we do not need any filtering.

6. On wizard screen #4, keep aggregate function selection at COUNT, so data import routine simply counts correspondent records.

7. On Wizard screen #5 (sorting), leave default values in all fields and click "Done"

8. As a result, you will get this data grid and chart:


Totals by year
Naturally, now we are curious how crime picture progressed through the years. Initial steps are the same, but the chart type is Line now.



1. On Wizard screen #2, select series (crime category), category(year) and value (NONE) columns:


2. On Wizard screen #3 (filtering), leave all fields blank - we do not need any filtering.

3. Click through wizards screens #4 and #5 and get this data grid and chart:

Seasonal drinking
It would be nice to see if there is a seasonal pattern in DRUNKINNESS offense.

1. On Wizard screen #2, select series (year), category(month) and value (NONE) columns:


2. On Wizard screen #3 (filtering), select crime category - DRUNKINNESS.


3. Click through wizards screens #4 and #5 and get this data grid and chart:


Apparently, we did not discover any really interesting patterns here. No spikes on big holiday months. Maybe just a crackdown on drunkards in May-Oct 2013.

Incident resolution
Finally, it would be nice to learn how SF police deals with the offenders. Let's add incident resolution to the picture.

1. On Wizard screen #2, select series (resolution), category(year) and value (NONE) columns:


2. On Wizard screen #3 (filtering), select crime category - DRUNKINNESS (apparently, this type of offense is the focus of our little research) and a few types of resolution that are common for this type of offence: "ARREST,BOOKED", "NONE", or "ARREST, CITED".



3. Click through wizards screens #4 and #5 and get this data grid and chart:


Apparently, during the 2015 crackdown, police officers were instructed to take serious offenders into custody, give warnings to peaceful random folks, and use less citations.

As you see, in all these examples we used only one kind of aggregate functions - COUNT. Read about AVG and SUM functions here.

5 comments:

  1. I think this is one of the most important information for me. And i am glad reading your article. 야한동영상

    Click this link
    야설

    ReplyDelete
  2. It’s really a cool and useful piece of info. I’m glad that you shared this useful info with us. Please keep us informed like this. Thanks for sharing. 일본야동

    Click this link
    한국야동

    ReplyDelete
  3. This is the best article for anyone that wants to learn about this subject. 한국야동닷컴

    Click this link
    국산야동

    ReplyDelete
  4. I delight in, cause I found just what I was looking for. You have ended my four day long hunt! God Bless you man. Have a great day. 국산야동

    Click this link
    야설

    ReplyDelete
  5. It’s an amazing paragraph in support of all the web visitors; they will get benefit from it I am sure. 중국야동넷

    Click this link
    야설

    ReplyDelete