Q1) [14 Marks] Consider the hourly pedestrian count data collected at the Melbourne Central station in...

Question

Q1) [14 Marks] Consider the hourly pedestrian count data collected at the Melbourne Central station in Melbourne over the one-month period in March 2022. This dataset is given as a CSV file, named “MelbCentPedCntMarch2022.csv”.

1.1) Plot the histogram for the count data. Comment on the shape. How many modes can be observed in the data?

1.2) Fit a single Gaussian model N(μ, σ) to the distribution of the data, where μ is the mean and σ is the standard deviation of the Gaussian distribution. Find the maximum likelihood estimate (MLE) of the parameters, i.e., the mean μ and the standard deviation σ. Plot the obtained (single Gaussian) density distribution along with the histogram on the same graph.

1.3) Fit a mixture of Gaussians model to the distribution of the data using number of Gaussians equal to 4 (four). Use R programming to perform this. Provide the mixing coefficients, mean and standard deviation for each of the Gaussians found. Plot all these Gaussians on top of the histogram plot. Include a plot of the combined density distribution as well (use different colors for the density plots in the same graph).

1.4) Provide a plot of the log likelihood values obtained over the iterations and comment on them.

1.5) Comment on the distribution models obtained in Q1.2 and Q1.3. Which one is better?

melbcentpedcntmarch2022-4vsqv2tz.csv

Subhanbasha · Accepted Answer

Answers
Question 1.
Ans:
The data were imported into RStudio, where all of the analysis of the hourly pedestrian count data collected at the Melbourne Central station was carried out in R.
The file is in CSV format, so read.csv() was used to load the data.
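A minimal sketch of the loading step is given here. The helper name load_counts() is hypothetical, and the answer does not state the name of the count column, so the sketch assumes the counts are in the last numeric column of the file unless a column name is passed explicitly. Check names() on the loaded data frame before relying on that default.

```r
# Sketch only: load_counts() and its default column choice are assumptions,
# not part of the original answer.
load_counts <- function(path, count_col = NULL) {
  ped <- read.csv(path, header = TRUE)
  if (is.null(count_col)) {
    # Assumption: the hourly count is the last numeric column in the file.
    numeric_cols <- names(ped)[vapply(ped, is.numeric, logical(1))]
    if (length(numeric_cols) == 0) stop("no numeric column found in ", path)
    count_col <- numeric_cols[length(numeric_cols)]
  }
  ped[[count_col]]
}

# File name taken from the question; it is assumed to be in the working directory.
csv_path <- "MelbCentPedCntMarch2022.csv"
if (file.exists(csv_path)) {
  counts <- load_counts(csv_path)
  print(summary(counts))
}
```

If the count column turns out to have a different position, pass its name as the second argument instead of relying on the default.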
Question 1.1
Ans: The histogram of the pedestrian count data is as follows.
From the plot, the data do not follow a normal distribution, which is an assumption of many parametric statistical tests. The distribution appears to be multimodal: on the left-hand side the data have a right-tailed (right-skewed) shape, while from about 1000 upward the data look approximately normal. So one part of the data looks normal but the other part does not. To bring the data closer to normal, some transformation would need to be applied; after that the data may look more nearly normal (Gaussian).
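The visual check described above could be sketched roughly as follows. The vector counts here is a synthetic stand-in generated only for illustration (it is not the Melbourne Central data, and its shape is an assumption); replace it with the real count column. The sketch draws the histogram on the density scale, overlays a single Gaussian with maximum likelihood parameters, and counts local maxima of a kernel density estimate as a rough guide to the number of modes. That count depends on the bandwidth, so it should be read alongside the plot, not as a definitive answer.

```r
# Illustration only: synthetic stand-in for the count column, NOT the real data.
set.seed(1)
counts <- c(rexp(300, rate = 1 / 150), rnorm(400, mean = 1500, sd = 300))

# Maximum likelihood estimates for a single Gaussian.
# The MLE of the standard deviation divides by n, unlike sd(), which uses n - 1.
mu_hat <- mean(counts)
sigma_hat <- sqrt(mean((counts - mu_hat)^2))

# Histogram on the density scale so the fitted curves are comparable to it.
hist(counts, breaks = 40, freq = FALSE,
     main = "Hourly pedestrian counts", xlab = "Count")

# Single Gaussian with the MLE parameters (red) and a kernel density estimate (blue).
curve(dnorm(x, mean = mu_hat, sd = sigma_hat), add = TRUE, col = "red", lwd = 2)
d <- density(counts)
lines(d, col = "blue", lwd = 2)

# Rough mode count: local maxima of the kernel density estimate.
n_peaks <- sum(diff(sign(diff(d$y))) == -2)
cat("MLE mean:", mu_hat, " MLE sd:", sigma_hat, " KDE peaks:", n_peaks, "\n")
```

A poor match between the red Gaussian curve and the histogram, together with more than one peak in the blue curve, would support the observation that the data are not normal.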
