Spatial Data Visualization

Ming Yu Liew


Dataset

Traffic Crashes - Vehicles - Dataset source, March-December 2020

Traffic Crashes - Crashes - Related dataset, March-December 2020

SR1050 Instruction Manual - Explains what some of the data elements represent

This dataset contains about 1.33 million records and 72 attributes (columns), such as crash date, vehicle info (brand, model, year, registration), and unit type (bike, self-driving). Each record is a vehicle (referred to as a unit in the data) that was involved in a traffic crash recorded by the Chicago Police Department.


Questions and Visualizations


Question 1

Domain Question: Which vehicles cause the most accidents?

Data Question: What is the distribution of UNIT_NO for each UNIT_TYPE?

Attributes: UNIT_NO, UNIT_TYPE, (total number of accidents reported)

We want to know which vehicles cause the most accidents. The unit type involved in the most accidents will most likely be vehicles with drivers, but it is helpful to check which other unit types, such as driverless vehicles and bicycles, are involved in many accidents.
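The data question above can be sketched as a nested count. This is a minimal illustration, not the code behind the visualization: it assumes the Vehicles CSV has been read into a list of dicts (as csv.DictReader would produce), and the sample rows and the "UNKNOWN" fallback label are hypothetical.

```python
from collections import Counter, defaultdict


def unit_no_distribution(rows):
    """Return {UNIT_TYPE: Counter of UNIT_NO values} for the given records."""
    distribution = defaultdict(Counter)
    for row in rows:
        # Blank unit types are grouped under "UNKNOWN" (an assumption).
        unit_type = row.get("UNIT_TYPE") or "UNKNOWN"
        distribution[unit_type][row["UNIT_NO"]] += 1
    return distribution


# Hypothetical sample rows; real values come from the Vehicles dataset.
sample_rows = [
    {"UNIT_NO": "1", "UNIT_TYPE": "DRIVER"},
    {"UNIT_NO": "2", "UNIT_TYPE": "DRIVER"},
    {"UNIT_NO": "1", "UNIT_TYPE": "DRIVER"},
    {"UNIT_NO": "2", "UNIT_TYPE": "BICYCLE"},
    {"UNIT_NO": "1", "UNIT_TYPE": ""},
]

distribution = unit_no_distribution(sample_rows)
# Total records per unit type, largest first.
totals = sorted(
    ((unit_type, sum(counts.values())) for unit_type, counts in distribution.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
```

Summing each inner counter gives the total number of records per unit type, which is the quantity a bar chart for this question would plot.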

Question 2

Domain Question: How safe is it to bike in Chicago?

Data Question: What percentage of accidents involving bicycles resulted in an injury?

Attributes: UNIT_TYPE, CRASH_TYPE, (total number of accidents reported)

Instead of creating a visualization for bicycles only, we can use a selector on the visualization for Question 1 and show the same percentage for each unit type.
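A rough sketch of the percentage calculation follows. It assumes each unit record already carries the CRASH_TYPE of its crash (for example after joining the Vehicles and Crashes datasets), and the label used to mark an injury crash is passed in as a parameter because the exact value in the data is an assumption here. The sample rows are hypothetical.

```python
from collections import defaultdict


def injury_share_by_unit_type(rows, injury_label):
    """Return {UNIT_TYPE: percentage of records whose CRASH_TYPE equals injury_label}."""
    totals = defaultdict(int)
    injured = defaultdict(int)
    for row in rows:
        unit_type = row["UNIT_TYPE"]
        totals[unit_type] += 1
        if row["CRASH_TYPE"] == injury_label:
            injured[unit_type] += 1
    # Note: this counts unit records, not distinct crashes.
    return {t: 100.0 * injured[t] / totals[t] for t in totals}


# Assumed label; check the actual CRASH_TYPE values in the data before use.
INJURY_LABEL = "INJURY AND / OR TOW DUE TO CRASH"
NO_INJURY_LABEL = "NO INJURY / DRIVE AWAY"

# Hypothetical sample rows.
sample_rows = [
    {"UNIT_TYPE": "BICYCLE", "CRASH_TYPE": INJURY_LABEL},
    {"UNIT_TYPE": "BICYCLE", "CRASH_TYPE": INJURY_LABEL},
    {"UNIT_TYPE": "BICYCLE", "CRASH_TYPE": INJURY_LABEL},
    {"UNIT_TYPE": "BICYCLE", "CRASH_TYPE": NO_INJURY_LABEL},
    {"UNIT_TYPE": "DRIVER", "CRASH_TYPE": INJURY_LABEL},
    {"UNIT_TYPE": "DRIVER", "CRASH_TYPE": NO_INJURY_LABEL},
]

shares = injury_share_by_unit_type(sample_rows, INJURY_LABEL)
```

Computing the share for every unit type at once is what lets a selector switch between unit types without recomputing anything.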

Visualization


Question 3

Domain Question: Do more people get injured when there are more occupants in the vehicle?

Data Question: How does occupant count relate to total injuries and what is the distribution for each point?

Attributes: INJURIES_TOTAL, OCCUPANT_CNT, (total number of accidents reported)

We want to know whether having more people involved in an accident results in more injuries. This can help us assess the safety of vehicles that carry many passengers, such as buses.
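One way to relate the two attributes is to group INJURIES_TOTAL by OCCUPANT_CNT and summarize each group. The sketch below assumes both fields are available on the same record (INJURIES_TOTAL may need to be joined in from the related Crashes dataset) and skips records where either field is blank. The sample rows are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def injuries_by_occupant_count(rows):
    """Return {occupant count: (number of records, mean INJURIES_TOTAL)}."""
    groups = defaultdict(list)
    for row in rows:
        # Skip records with a missing value in either field (an assumption).
        if not row.get("OCCUPANT_CNT") or not row.get("INJURIES_TOTAL"):
            continue
        occupants = int(float(row["OCCUPANT_CNT"]))
        groups[occupants].append(float(row["INJURIES_TOTAL"]))
    return {occ: (len(vals), mean(vals)) for occ, vals in sorted(groups.items())}


# Hypothetical sample rows.
sample_rows = [
    {"OCCUPANT_CNT": "1", "INJURIES_TOTAL": "0"},
    {"OCCUPANT_CNT": "1", "INJURIES_TOTAL": "1"},
    {"OCCUPANT_CNT": "3", "INJURIES_TOTAL": "2"},
    {"OCCUPANT_CNT": "3", "INJURIES_TOTAL": "4"},
    {"OCCUPANT_CNT": "", "INJURIES_TOTAL": "1"},
]

summary = injuries_by_occupant_count(sample_rows)
```

The record count per group matters as much as the mean: high occupant counts are rare, so their averages rest on few records.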

Question 4

Domain Question: How severe are the injuries of the people involved in accidents?

Data Question: What is the distribution of the different injury types?

Attributes: INJURIES_FATAL, INJURIES_INCAPACITATING, INJURIES_NON_INCAPACITATING, INJURIES_REPORTED_NOT_EVIDENT, INJURIES_NO_INDICATION

By linking this with the visualization from Question 3, we can see the most common type of injury suffered, especially in accidents reported with higher total injuries.
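The distribution of injury types can be sketched by summing each injury column over the selected records. The column names are the attributes listed above; treating blank cells as zero is an assumption, and the sample rows are hypothetical.

```python
INJURY_COLUMNS = [
    "INJURIES_FATAL",
    "INJURIES_INCAPACITATING",
    "INJURIES_NON_INCAPACITATING",
    "INJURIES_REPORTED_NOT_EVIDENT",
    "INJURIES_NO_INDICATION",
]


def injury_totals(rows):
    """Sum each injury column over the given records."""
    totals = {column: 0 for column in INJURY_COLUMNS}
    for row in rows:
        for column in INJURY_COLUMNS:
            # Blank or missing cells count as zero (an assumption).
            totals[column] += int(float(row.get(column) or 0))
    return totals


# Hypothetical sample rows.
sample_rows = [
    {"INJURIES_FATAL": "0", "INJURIES_INCAPACITATING": "1",
     "INJURIES_NON_INCAPACITATING": "2", "INJURIES_REPORTED_NOT_EVIDENT": "0",
     "INJURIES_NO_INDICATION": "1"},
    {"INJURIES_FATAL": "", "INJURIES_INCAPACITATING": "0",
     "INJURIES_NON_INCAPACITATING": "1", "INJURIES_REPORTED_NOT_EVIDENT": "1",
     "INJURIES_NO_INDICATION": "3"},
]

totals = injury_totals(sample_rows)
most_common_type = max(totals, key=totals.get)
```

Passing in only the records selected in the Question 3 view would give the linked behaviour described above.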

Visualization


Question 5

Domain Question: Which months have more accidents?

Data Question: What is the distribution of accidents per month?

Attributes: CRASH_DATE, (total number of accidents reported)

The main motivation is to find out how road activity changes over the seasons and how different events may affect the number of accidents occurring.
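A monthly count can be sketched by parsing CRASH_DATE and tallying by month. The timestamp format below is an assumption about how the CSV export encodes dates, and the sample rows are hypothetical. Because each record is a unit, a crash with several units is counted once per unit here.

```python
from collections import Counter
from datetime import datetime

# Assumed timestamp format; adjust to match the actual export.
DATE_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def crashes_per_month(rows, date_format=DATE_FORMAT):
    """Return a Counter mapping month number (1-12) to number of records."""
    counts = Counter()
    for row in rows:
        timestamp = datetime.strptime(row["CRASH_DATE"], date_format)
        counts[timestamp.month] += 1
    return counts


# Hypothetical sample rows.
sample_rows = [
    {"CRASH_DATE": "03/15/2020 02:30:00 PM"},
    {"CRASH_DATE": "03/20/2020 08:00:00 AM"},
    {"CRASH_DATE": "07/04/2020 11:45:00 PM"},
]

monthly = crashes_per_month(sample_rows)
```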

Question 6

Domain Question: On which days do most accidents occur?

Data Question: What is the distribution of accidents per day over each month?

Attributes: CRASH_DATE, (total number of accidents reported)

The motivation is the same as the previous question, but we look into more detail by checking the distribution per day over a whole month.
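The per-day view can be sketched by filtering records to one month and tallying by day of the month. As with any parsing of CRASH_DATE, the timestamp format is an assumption, and the sample rows are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Assumed timestamp format; adjust to match the actual export.
DATE_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def crashes_per_day(rows, month, date_format=DATE_FORMAT):
    """Return a Counter mapping day of month to number of records in the given month."""
    counts = Counter()
    for row in rows:
        timestamp = datetime.strptime(row["CRASH_DATE"], date_format)
        if timestamp.month == month:
            counts[timestamp.day] += 1
    return counts


# Hypothetical sample rows.
sample_rows = [
    {"CRASH_DATE": "07/04/2020 11:45:00 PM"},
    {"CRASH_DATE": "07/04/2020 09:10:00 AM"},
    {"CRASH_DATE": "07/10/2020 05:00:00 PM"},
    {"CRASH_DATE": "03/15/2020 02:30:00 PM"},
]

july = crashes_per_day(sample_rows, month=7)
busiest_day, busiest_count = july.most_common(1)[0]
```

Selecting a month in the monthly view and passing it as the month argument is one way to drive the daily view from the same data.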

Visualization


Spatial View


The map below is made using Leaflet, and the map tiles are from OpenStreetMap.