A research team planned to study Australian road transport crash fatalities from 2010 to 2018
(inclusive). As a team member, you were given the dataset about Australian Road Death
Fatalities (https://data.gov.au/data/dataset/australian-road-deaths-database), and were
requested to analyse the data and prepare a report about your work and findings.
The dataset can be downloaded from Blackboard or the above website. The dataset contains
basic demographic and crash details of Australian road crashes between 1989 and 2019. As
the team does not have a specific goal for the analysis, you are free to explore the
data and report anything you find interesting or significant. However, you must limit your
research and analysis to the years 2010 to 2018.
The potential audiences include other researchers, business representatives, and government
agencies. They may have limited ICT or mathematical knowledge.
To prepare the report, please include the following sections:
1. Introduction
Provide an introduction to the problem. Include background material as appropriate: who
cares about this problem, what impact it has, where the data comes from, and what the
dimensions and structure of the data are.
2. Data Setup
Describe how to load the data, and how the pre-processing is performed.
The original dataset is not ready for analysis, and its layout differs from the datasets
used in earlier practicals. This means some pre-processing is needed, either for the whole
dataset or for the subset required by each sub-task described later.
Once you have some ideas for exploratory or advanced analysis, you need to adjust the form
of the dataset. This can be achieved either by manipulating records in R (by transposition or
subsetting), or with other tools (e.g. Notepad or Excel) before reading them into R. Please
explain your solution in this section.
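
As a rough illustration of the subsetting step described above, the following R sketch keeps only the records from 2010 to 2018. It uses a small made-up data frame in place of the real file, and the column names (Year, State, Age) and the file name in the comment are assumptions that should be checked against the header of the downloaded CSV.

```r
# Toy stand-in for the downloaded file. The values are invented and the
# column names (Year, State, Age) are assumptions, not the confirmed layout.
raw <- data.frame(
  Year  = c(1995, 2009, 2010, 2014, 2018, 2019),
  State = c("NSW", "Vic", "Qld", "WA", "NSW", "Tas"),
  Age   = c(34, 51, 19, 72, 45, 28)
)

# With the real file this step might instead look like (hypothetical file name):
# raw <- read.csv("ardd_fatalities.csv", stringsAsFactors = FALSE)

# Keep only the study period, 2010 to 2018 inclusive.
study <- subset(raw, Year >= 2010 & Year <= 2018)

# Inspect the dimensions and structure of the result.
print(dim(study))
str(study)
```

Filtering inside R, rather than editing the file by hand, keeps the pre-processing repeatable and easy to describe in the report.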
3. Exploratory Data Analysis
3.1 One-variable analysis
One-variable analysis studies one variable (one row or one column) at a time. For example,
we can select a particular Australian state or year to get a column of numbers, and a
histogram can then be used to show how those numbers are distributed.
Perform 2 one-variable analyses. Plot one graph for each variable. Explain the finding for
each graph.
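
A one-variable analysis of this kind might be sketched in R as follows. The ages are randomly generated for illustration only, and the variable name ages is hypothetical; a real analysis would use a column taken from the pre-processed dataset.

```r
# Synthetic ages for illustration only; these are not real crash records.
set.seed(1)
ages <- round(runif(200, min = 0, max = 95))

# Histogram with 10-year bins; hist() also returns the bin counts.
h <- hist(ages,
          breaks = seq(0, 100, by = 10),
          main   = "Age distribution (synthetic data)",
          xlab   = "Age (years)")

# Numeric summary to support the written explanation of the graph.
print(summary(ages))
print(h$counts)
```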
3.2 Two-variable analysis
ICT110 Introduction to Data Science Assignment 2
Two-variable analysis studies the relation between two variables. For example, we can select
“Diseases of the nervous system” and “Year”, then a time series (scatter) plot can be drawn.
Or, we can select “2015” and “Causes”.
Perform 2 two-variable analyses. Plot one graph for each variable. Explain the finding for
each graph.
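
One possible two-variable analysis is the number of records per year plotted against the year. The R sketch below uses randomly generated records, and the column names Year and Deaths are assumptions chosen for the example.

```r
# Synthetic records for illustration only: one row per fatality.
set.seed(2)
records <- data.frame(Year = sample(2010:2018, size = 500, replace = TRUE))

# Count the rows in each year, giving one (Year, Deaths) pair per year.
per_year <- as.data.frame(table(Year = records$Year), responseName = "Deaths")
per_year$Year <- as.integer(as.character(per_year$Year))

# Time series plot: points joined by lines.
plot(per_year$Year, per_year$Deaths, type = "b",
     main = "Records per year (synthetic data)",
     xlab = "Year", ylab = "Number of records")
```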
4. Advanced Analysis
4.1 Clustering
Briefly explain the concept of clustering and k-means.
Perform 1 clustering analysis to group years according to a selected cause.
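
As a minimal sketch of grouping years with k-means, the R code below clusters nine years by a single yearly count. The counts are invented for the example, and the choice of three clusters is an assumption; in the report the number of clusters should be justified from the data.

```r
# Invented yearly counts for illustration only; not taken from the dataset.
yearly <- data.frame(
  Year   = 2010:2018,
  Deaths = c(120, 118, 115, 90, 88, 92, 60, 63, 58)
)

# k-means starts from random centres, so fix the seed and try several starts.
set.seed(42)
fit <- kmeans(yearly$Deaths, centers = 3, nstart = 20)

# Attach the cluster label to each year and show the result.
yearly$Cluster <- fit$cluster
print(yearly)
print(fit$centers)
```

The cluster numbers themselves are arbitrary labels; what matters is which years end up sharing a label.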
4.2 Linear Regression
Briefly explain the concept of linear regression.
Perform 2 linear regression analyses. Plot the learned models.
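
A linear regression and a plot of the fitted model might be sketched in R as follows. The yearly counts are invented for illustration, and the column names Year and Deaths are assumptions.

```r
# Invented yearly counts for illustration only; not taken from the dataset.
yearly <- data.frame(
  Year   = 2010:2018,
  Deaths = c(130, 126, 121, 119, 112, 110, 104, 101, 97)
)

# Fit a straight line: Deaths = intercept + slope * Year.
model <- lm(Deaths ~ Year, data = yearly)
print(coef(model))

# Plot the observations, then draw the fitted line on top.
plot(yearly$Year, yearly$Deaths,
     main = "Linear trend (synthetic data)",
     xlab = "Year", ylab = "Number of records")
abline(model)
```

The slope estimates the average change in the count per year, which is usually the quantity a non-technical reader cares about most.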