Winter 2023 MATH 341/345 Project
Deployments of Safety Cars in Formula One in XXXXXXXXXX
(Version 1. February 5, 2023)

Introduction:
This project aims at modeling the frequencies of safety car deployments per race in Formula
One and the time intervals between safety car deployments in XXXXXXXXXX. A safety car in
Formula One is deployed while the “yellow flags” are waved by the marshals and the Race
Director decides that it is necessary to remove any hazards on the race track or that the racing
cars need to slow down due to unfavorable track conditions (e.g., heavy rain). When a safety
car is deployed, in addition to the yellow flags, each driver sees “SC” boards on the sides of the
track. Moreover, the same information is displayed on the steering wheel of each racing car.

Safety cars and yellow flags are important components of Formula One racing to protect
drivers’ and marshals’ lives. When the safety car is leading the race, each racing car needs to
bunch up and follow the safety car without overtaking any other cars, unless they are allowed
to unlap themselves. As the safety car goes around the track at a much slower speed than the
normal racing pace, marshals can quickly remove any hazards on the track and improve the
track condition without worrying about fast-moving racing cars.

However, even with strict regulations under the yellow flag condition, accidents happen,
especially during wet weather races. A notable recent incident happened at the 2014 Japanese
Grand Prix, when a very promising young French driver Jules Bianchi of Marussia collided with a
tractor crane under the “double yellow flag” condition. A “double yellow flag” condition
indicates that marshals may be present on the track and the driver needs to prepare to stop, if
necessary. Bianchi lost control of the car due to aquaplaning on the wet surface and suffered a
fatal injury as a result of the collision with the tractor crane.

The FIA (governing body of the Formula One races) took the incident very seriously and
implemented a number of safety measures. One of them was the introduction of the “virtual
safety car (VSC)”. Under VSC conditions, each driver needs to slow their car down to the posted
speed limit, usually resulting in a 35 to 40% speed reduction. Because it is a “virtual” safety car,
the actual safety car is not deployed under VSC; rather, each racing car is equipped with a
device that automatically slows it down to the posted speed limit.

Even with the introduction of VSC in 2015, under severe conditions, safety cars are deployed
once in a while. Here, an interesting question arises: Did the introduction of VSC change the
frequency of safety car deployments? This is an important question to answer for race
strategists, as the deployment of a safety car means that each team needs to react quickly to
adjust their tire strategies. Each driver is required to make at least one pit stop to change their
tires during the race, and a pit stop under the safety car condition implies that they can save
about 20 seconds, possibly gaining several precious positions in the race without overtaking. At
the same time, fresh tires typically make the racing car more drivable, increasing the chances of
catching and overtaking the other racing cars in front after the pit stop.
Related article: https://www.mclaren.com/racing/2019/canadian-grand-prix/how-make-right-call-safety-car

Note that the importance of understanding probability is emphasized in this article.

Your Tasks in This Project:
Your main task in this project is to analyze the safety car deployment data in Formula One to
determine whether there are any changes in the frequency of safety car deployments between
the pre-VSC era XXXXXXXXXX and post-VSC era XXXXXXXXXX. That involves fitting reasonable
distribution(s) to the data for the number of safety car deployments per race and time intervals
between the safety car deployments in these two time periods. Then, by comparing these two
distributions, you are asked to conclude whether strategic adjustments were necessary to
account for increased/decreased safety car deployments after VSC was introduced in 2015. The
dataset is originally retrieved from Kaggle
(https://www.kaggle.com/datasets/jtrotman/formula-1-race-events), but it was further
augmented by adding Type, Round, TotalRounds, TotalLaps, and Condition. These additional
pieces of information were taken from the Wikipedia entries for the Formula One races.

A thorough and complete analysis of the main task above is sufficient to receive full credit for
this project. That is, you are not required to do any additional programming beyond what is
given unless you choose to do so. However, you are probably interested in doing a more detailed
analysis of the dataset to make your analysis useful and interesting for the participating
Formula One teams. To help you analyze the dataset in more detail, the dataset provided
(augmented_safety_cars.csv) contains additional information such as type of the circuit
(permanent or street) and track condition (dry, mixed, or wet).

In addition, you will be asked to watch an interesting video titled “What Does An F1 Strategist
Do?” (https://youtu.be/4CFkltWIc8o) so that you can see what Formula One strategists actually
do before, during, and after each race. At the same time, you will see how they interact with
racers, mechanics, race engineers, data analysts, and team principals.

How This Project Works:
This project consists of three parts: Probability Questions, Statistics Questions, and project
write-up. For the Probability and Statistics Questions, you need to answer the questions given
below. For the project write-up, you may choose to summarize the results based on the R code
given. However, to make the project more interesting, you are encouraged to carry out
additional analysis. If you find anything interesting, you may choose to write about your
interesting finding(s) instead. To make sure that what you decide to write in your write-up is
appropriate, please talk to the instructor before you do anything. The instructor will be happy
to assist you with additional programming if necessary.



Probability Questions (12 Points in Total):
1. Watch “What Does An F1 Strategist Do?” (https://youtu.be/4CFkltWIc8o) and describe
how the Formula One strategist position is related to your major(s) in a paragraph or
two. Note: Everyone on your team needs to write a separate paragraph or two. (2pts)
2. Suppose that you look at each of $n$ different laps in Formula One races. Why is checking
whether or not each of these laps was led by a safety car a binomial experiment?
(2pts)
3. Why is it reasonable to assume that the number of safety car deployments in a fixed
period of time (e.g., five seasons) follows the Poisson distribution (approximately)?
Recall the relationship between the binomial and Poisson distributions, and state what
happens to $n$ (the number of laps) and $p$ (the probability that each lap is led by a safety
car). (2pts)
4. Why is it reasonable to assume that the time intervals between safety car deployments
are (approximately) exponentially distributed? (2pts)
5. Suppose that we consider two time periods of Formula One racing (2010 – 2014 and
2015 – XXXXXXXXXX). Is it safe to assume that the number of safety car deployments in each of
these two time periods is independent of each other? In other words, is it reasonable to
say that the number of safety car deployments in 2010 – 2014 does not significantly
influence the number of safety car deployments in 2015 – 2019? Justify. (2pts)
6. Recall the memoryless property of the exponential distribution, which says that
$P(X \ge t_1 + t_2 \mid X \ge t_1) = P(X \ge t_2)$, $t_1 \ge 0$, $t_2 \ge 0$, if and only if $X$ is exponentially
distributed. What does this imply regarding the probability that the next safety car
deployment is 5 races from now given that it has been 3 races since the last safety car
deployment? Comment. (2pts)
Note: The above phenomenon is known as “the waiting time paradox”.
7. (Optional) Any questions you have about this project.

Statistics Questions (Read W23MATH341Project.R and run the program to answer these
questions. Look for “SQ” in the comments in the R code to identify which part of the code is
referring to which question.) (20 Points in Total):
Note: The length of each race is set to 1, which is reasonable given that each race has
approximately the same race distance. According to the data, the first safety car deployment in
2010 occurred at lap 2 of Round 2 (which was a 58-lap race) and the second deployment
occurred at lap 1 of Round 4 (which was a 56-lap race). Thus, the first duration is simply
(Round 1) + (Deployment in Round 2) = 1 + 2/58 = XXXXXXXXXX. Then, the second duration, which is the
time between these two deployments, is given by (Remaining laps in Round 2) + Round 3
+ (Deployment in Round 4) = (1 – 2/ XXXXXXXXXX/56 = XXXXXXXXXX.
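The bookkeeping in the note above can be sketched in a few lines of Python. This is an illustration only and is separate from the course's R script: the helper name `season_position` is hypothetical, and the sketch assumes that durations are measured from the start of Round 1 with every race counted as one unit of time.

```python
def season_position(round_no, lap, total_laps):
    """Position of a deployment on a season 'clock' where each race has length 1.

    Hypothetical helper: completed rounds count as whole units, and the lap
    of the deployment is converted to a fraction of the current race.
    """
    return (round_no - 1) + lap / total_laps


# Values quoted in the note: lap 2 of Round 2 (58 laps), lap 1 of Round 4 (56 laps).
first_deployment = season_position(2, 2, 58)
second_deployment = season_position(4, 1, 56)

# First duration: from the start of the season to the first deployment.
first_duration = first_deployment - 0.0

# Second duration: time elapsed between the two deployments.
second_duration = second_deployment - first_deployment

print(f"first duration  = {first_duration:.4f} races")
print(f"second duration = {second_duration:.4f} races")
```

Taking differences of positions gives the same result as adding the remaining fraction of Round 2, the whole of Round 3, and the elapsed fraction of Round 4.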
1. Look at the histograms of the number of safety car deployments per race. Do these
histograms suggest that the data are Poisson distributed (approximately)? Or, is there
any clear evidence against that? Comment. (2pts)
2. The best-fit Poisson pmfs, as represented by the blue dotted lines, use
lambda=mean(first_half) and lambda=mean(second_half) for the first and second half of
the 2010’s, respectively. Explain why it makes sense to use these values. (2pts)
3. Look at the histograms of the time intervals between two safety car deployments. Do
these histograms suggest that the data are exponentially distributed (approximately)?
Or, is there any clear evidence against that? Comment. (2pts)
4. The best-fit exponential pdfs, as represented by the blue dotted lines, use
rate=1/mean(interval1) and rate=1/mean(interval2) for the first and second half of the
2010’s, respectively. Explain why it makes sense to use these values. (2pts)
5. Report mean(first_half), mean(interval1), mean(second_half), and mean(interval2).
Then, describe how mean(first_half) and mean(interval1), as well as mean(second_half)
and mean(interval2), are approximately related to each other. After that, explain why
that happens by recalling the distributions you identified for the number of safety car
deployments and the time interval between two safety car deployments. (2pts)
6. Running a two-sample t-test to compare means, or constructing a confidence interval
for the difference in means, using the time interval data may potentially lead to wrong
results. Explain why in terms of normality and independence. (2pts)
7. Explain why the concerns you mentioned in the previous question are actually not
concerning for this dataset. (2pts)
8. The t.test() function in R gives the one- and two-sample t-test results for the mean or
difference in means, including the confidence intervals and p-values. The parameter
var.equal in the t.test() function specifies whether or not the common variance can be
assumed (if yes, TRUE, and otherwise, FALSE). For comparing the time intervals, can we
assume common variance? Comment. Recall that the mean and standard deviation are
equal to each other in the case of exponential distribution. (2pts)
9. Report the results of the t.test() function (95% confidence interval, degrees of freedom
used, and p-value) for the var.equal=TRUE and var.equal=FALSE cases. (2pts)
10. Based on the results above, discuss whether or not there is any statistically significant
change in the distribution of the safety car deployments between these two time
periods. (2pts)
11. (Optional) The Kolmogorov-Smirnov test is a one- and two-sample test that directly
compares the cumulative distribution function(s) of the data. In the one-sample case, a
researcher hypothesizes the underlying distribution and sees how well the cumulative
distribution function (cdf) estimated from the data (known as the empirical cdf) matches
that of the hypothesized distribution. In the two-sample case, the two empirical cdf’s
are directly compared. Do the test results show any evidence of deviation from
the exponential distribution for the time interval data? Also, are these two datasets
significantly different from each other? Justify your conclusion by reporting the p-values
and interpreting these p-values. (Extra credit: 1pt)
12. (Optional) The quantile-quantile (Q-Q) plot is a visual tool to see if the dataset of
interest follows a certain distribution. Although the Q-Q plot is typically used for the
normal distribution, for this project, we use the Q-Q plot for the exponential
distribution. If the points follow a straight line on the Q-Q plot, that is an
indication that the dataset follows the exponential distribution well. Present the Q-Q
plots for the time interval datasets (pre- and post-VSC) and comment. (Extra credit: 1pt)
13. (Optional) Another important aspect of the dataset is the independence of the
observations. A common assumption
Answered 2 days After Feb 28, 2023

Solution

Banasree answered on Mar 02 2023
27 Votes
Probability Question:
1.Ans.
As a mechanical engineering major, I find the role of an F1 strategist particularly intriguing. The position requires a strong understanding of the technical aspects of racing as well as strategic thinking and quick decision-making skills. My coursework has covered a variety of topics related to the design and operation of racing vehicles, which is essential to understanding the data that the strategist must analyze and interpret during a race. Additionally, the ability to think critically and make quick decisions is a skill that I have honed throughout my coursework and co-curricular activities. The role of an F1 strategist requires a unique combination of technical knowledge and strategic thinking, making it a fascinating career option for those with a background in mechanical engineering.
As a statistics major, I find the role of an F1 strategist particularly interesting because of the importance of data analysis in the decision-making process. The ability to analyze and interpret large amounts of data is crucial to making informed decisions as a strategist. Additionally, statistical modeling can help predict the frequency of safety car deployments and the time intervals between deployments, which can be used to inform strategic decisions during the race. The role of an F1 strategist requires a unique combination of technical knowledge, strategic thinking, and statistical analysis, making it a fascinating career option for those with a background in statistics.
2.Ans.
Checking whether or not each of the laps was led by a safety car can be considered a binomial experiment because it has the following characteristics:
1. The experiment consists of a fixed number of trials, namely the n laps being examined.
2. Each trial has only two possible outcomes: either the lap was led by a safety car or it was not.
3. The outcomes of the trials are assumed to be independent of each other. That is, whether one lap was led by a safety car is assumed not to affect the likelihood of another lap being led by a safety car.
4. The probability of success (i.e., the probability of a lap being led by a safety car) is assumed to be constant for each trial.
5. Under these conditions, the binomial distribution can be used to calculate the probability that a certain number of the n laps were led by a safety car.
3.Ans.
The Poisson distribution is related to the binomial distribution in the following way: when the number of trials in a binomial experiment (i.e., the number of laps in a race) becomes very large and the probability of success in each trial (i.e., the probability that a lap is led by a safety car) becomes very small, the binomial distribution converges to the Poisson distribution with the mean parameter equal to the product of the number of trials and the probability of success.
In the case of safety car deployments in Formula One, the number of laps in a race (i.e., the number of trials) can be very large, and the probability that each lap is led by a safety car is typically very small. Therefore, if we consider the number of safety car deployments over a fixed period of time (e.g., five seasons), it is reasonable to approximate the distribution of the number of safety car deployments by the Poisson distribution with the mean parameter equal to the product of the total number of laps in the five seasons and the probability of a lap being led by a safety car.
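As a rough numerical check of this limiting argument, the sketch below compares binomial and Poisson probabilities while holding the mean np fixed and letting n grow. It is an illustration in Python, not part of the course's R script; the mean of 2 and the values of n are arbitrary choices made for the example, not estimates from the safety car data.

```python
import math


def binom_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(exactly k events) for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def max_gap(n, lam, k_max=20):
    """Largest pointwise difference between Binomial(n, lam/n) and Poisson(lam)."""
    p = lam / n
    return max(
        abs(binom_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(k_max + 1)
    )


lam = 2.0  # hypothetical mean count, chosen only for illustration
gaps = {n: max_gap(n, lam) for n in (20, 200, 2000)}

for n, gap in gaps.items():
    print(f"n = {n:5d}, p = {lam / n:.4f}, max |binomial - Poisson| = {gap:.6f}")
```

The gap shrinks as n increases and p = λ/n decreases, which is the sense in which the binomial distribution is approximated by the Poisson distribution.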
4.Ans.
It is reasonable to assume that the time intervals between safety car deployments are (approximately) exponentially distributed for several reasons. Firstly, the occurrence of safety car deployments is a random and unpredictable event, and the exponential distribution is often used to model the time between occurrences of rare events. Secondly, the exponential distribution has a memoryless property, which means that the probability of a safety car being deployed in a given time interval is independent of the time since the last deployment. This property is often observed in situations where events occur randomly and independently over time, making the exponential distribution a natural choice for modeling the time between safety car deployments in Formula One races.
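One way to see where the exponential shape comes from is to treat each lap as an independent trial with a small chance that a deployment begins, and then compare the resulting waiting-time probabilities with the exponential formula. The Python sketch below does this under stated assumptions: the rate of 0.5 deployments per race and the 60 laps per race are hypothetical values chosen for illustration, and independence between laps is a modeling assumption rather than a property established from the data.

```python
import math


def geometric_survival(t_races, laps_per_race, p_per_lap):
    """P(no deployment starts during the first t_races races) when each lap
    is an independent trial with probability p_per_lap of a deployment."""
    n_laps = round(t_races * laps_per_race)
    return (1 - p_per_lap) ** n_laps


def exponential_survival(t_races, rate_per_race):
    """P(X > t) for an exponential waiting time X with the given rate."""
    return math.exp(-rate_per_race * t_races)


rate = 0.5   # hypothetical deployments per race
laps = 60    # hypothetical number of laps per race
p = rate / laps

times = (0.5, 1, 2, 5)
diffs = [
    abs(geometric_survival(t, laps, p) - exponential_survival(t, rate))
    for t in times
]

for t, d in zip(times, diffs):
    print(f"t = {t} races, |lap-by-lap - exponential| = {d:.5f}")
```

With many laps and a small per-lap probability, the lap-by-lap waiting-time probabilities stay close to the exponential curve, which mirrors the binomial-to-Poisson relationship for the counts.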
5.Ans.
It is not safe to assume that the number of safety car deployments in each of the two time periods (2010-2014 and 2015-2019) is independent of each other. There are many factors that can affect the number of safety car deployments in a given time period, including changes in track conditions, changes in the rules and regulations, and changes in the behavior of the drivers. Some of these factors may have carried over from one time period to another, making it difficult to assume independence between them.
For example, the introduction of the virtual safety car (VSC) in 2015 could have affected the number of safety car deployments in the 2015-2019 time period. It is possible that teams became more conservative with their tire strategies under VSC conditions, leading to fewer safety car deployments overall. Alternatively, teams could have become more aggressive with their driving under VSC conditions, leading to more safety car deployments overall. These types of carryover effects make it difficult to assume independence between the two time periods, and therefore it is important to analyze them separately to understand the underlying trends and factors that influence safety car deployments.
6.Ans.
Let's say that the average time between safety car deployments is 10 races, which means that the rate parameter of the exponential distribution is λ = 1/10.
Using the memoryless property of the exponential distribution, we can calculate the probability that the next safety car deployment is 5 races from now given that it has been 3 races since the last safety car deployment:
P(X > 8 | X > 3) = P(X > 5)
where X is the time between safety car deployments.
Since X follows an exponential distribution with λ = 1/10, we can find P(X > 5) as follows:
P(X > 5) = e^(-5λ) = e^(-1/2) ≈ 0.6065
Therefore, the probability that the next safety car deployment is 5 races from now given that it has been 3 races since the last safety car deployment is approximately 0.6065.
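The calculation above can be checked numerically with a short Python sketch. It reuses the answer's assumed average of 10 races between deployments (λ = 1/10), which is an illustrative value rather than an estimate from the data, and the function name `survival` is introduced here only for the example.

```python
import math

rate = 1 / 10  # assumed rate from the answer above: one deployment per 10 races


def survival(t, rate):
    """P(X > t) for an exponentially distributed waiting time X."""
    return math.exp(-rate * t)


# Conditional probability P(X > 8 | X > 3) = P(X > 8) / P(X > 3).
conditional = survival(8, rate) / survival(3, rate)

# Unconditional probability P(X > 5).
unconditional = survival(5, rate)

print(f"P(X > 8 | X > 3) = {conditional:.4f}")
print(f"P(X > 5)         = {unconditional:.4f}")
```

The two values agree, so under the exponential model the 3 races already elapsed do not change the probability of waiting at least 5 more races.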
7.Ans.
No.
Statistical Questions:
SQ - This code is an R script that analyzes safety car deployments in Formula 1 races during the 2010s. The script loads a dataset of augmented safety car deployments (including additional information such as type of race track, lap numbers, and conditions), extracts data for the 2010s, and...