EXAMINATION OFFICE
IU.ORG
ADVANCED WORKBOOK
Task for: DLMDSAS01 – Advanced Statistics
Note on copyright and plagiarism:
Please take note that IU Internationale Hochschule GmbH holds the copyright to the examination tasks. We
expressly object to the publication of tasks on third-party platforms. In the event of a violation, IU Internationale
Hochschule is entitled to injunctive relief. We would like to point out that every submitted written assignment is
checked using plagiarism software. We therefore suggest not sharing solutions under any circumstances, as
this may give rise to the suspicion of plagiarism.
The workbook is based on the output of the parameter generator. The produced numbers are ?1 to ?20 and are used
in the assignment. Only start the assignment after generating the numbers (which is done individually).
Task 1: Basic Probabilities and Visualizations (1)
Please provide the requested visualization as well as the numeric results. In both cases, please explain how you
realized these (calculations, code, steps…) and why these are the appropriate tools. Do not forget to include the scale
of each graphic so that a reader can read the numbers represented.
• If ?1 is 0: A vote with outcome “for” or “against” follows a Bernoulli distribution where P(vote = “for”) = ?2.
Represent the proportion of “for” and “against” in this single Bernoulli trial using a graphic and a
percentage. Can an expectation be calculated? Justify your answer, stating all necessary hypotheses.
• If ?1 is between 1 and 3:
The number of meteorites falling on an ocean in a given year can be modelled by one of the following
distributions. Give a graphic showing the probability of one, two, three… meteorites falling (until the
probability remains provably less than 0.5% for any bigger number of meteorites). Calculate the
expectation and median and mark them on this graphic:
o If ?1 is 1: a Poisson distribution with an expectation of λ = ?2
o If ?1 is 2: a negative binomial distribution with number of successes r = ?2 and p = ?3.
o If ?1 is 3: a geometric distribution counting the number of Bernoulli trials with p = ?2 until it
succeeds.
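For the Poisson case (?1 = 1), the truncation rule and the two summary values can be sketched as follows. This is a minimal sketch, not a reference solution: `lam = 3.0` is a hypothetical stand-in for ?2, the function names are illustrative, and plotting is left out. The stopping rule relies on the Poisson probabilities decreasing once k exceeds λ, so every later value stays below the threshold.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(N = k) for a Poisson distribution with expectation lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def poisson_table(lam: float, threshold: float = 0.005) -> list:
    """Probabilities for k = 0, 1, 2, ... up to the last k worth plotting.

    Past k > lam the probabilities decrease, so the first value below
    the threshold in that region bounds all later ones as well.
    """
    probs = []
    k = 0
    while True:
        p = poisson_pmf(k, lam)
        if k > lam and p < threshold:
            break
        probs.append(p)
        k += 1
    return probs


def poisson_median(lam: float) -> int:
    """Smallest k whose cumulative probability reaches 0.5."""
    k = 0
    cdf = poisson_pmf(0, lam)
    while cdf < 0.5:
        k += 1
        cdf += poisson_pmf(k, lam)
    return k


lam = 3.0  # hypothetical value standing in for ?2
table = poisson_table(lam)
expectation = lam  # the expectation of a Poisson distribution is lam itself
median = poisson_median(lam)
```

The list `table` holds the bar heights for k = 0, 1, 2, …, and `expectation` and `median` are the positions to mark on the horizontal axis.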
Task 2: Basic Probabilities and Visualizations (2)
Let T be the random variable with the time to hear an owl from your room’s open window (in hours). Assume that
the probability that you still need to wait to hear the owl after t hours is one of the following:
• If ?4 is 0: the probability is given by $?5\, e^{-?6\, t} + ?7\, e^{-?8\, t}$
• If ?4 is 1: the probability is given by $?5\, e^{-?6\, t^{2}} + ?7\, e^{-?8\, t^{8}}$
• If ?4 is 2: the probability is given by $?5\, e^{-?6 \sqrt{t}} + ?7\, e^{-?8 \sqrt{t^{3}}}$
• If ?4 is 3: the probability is given by $?5\, e^{-?6\, t^{2}} + ?7\, e^{-?8\, t^{2}}$
Find the probability that you need to wait between 2 and 4 hours to hear the owl. Compute and display the
graph of the probability density function as well as a histogram by the minute. Compute the mean, variance,
and quartiles of the waiting time and display them in the graphics.
Please pay attention to the various units of time!
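The quantities requested here all follow from the survival probability P(T > t). A minimal sketch for the first case (?4 = 0) is given below. The values of `A`, `B`, `C`, `D` are hypothetical stand-ins for ?5, ?6, ?7, ?8, chosen so that A + C = 1 (which P(T > 0) = 1 requires); the function names are illustrative, and the closed forms for mean and variance hold only for this sum-of-exponentials case.

```python
import math

# Hypothetical stand-ins for ?5, ?6, ?7, ?8 (not taken from the workbook).
A, B, C, D = 0.5, 1.0, 0.5, 0.5


def survival(t: float) -> float:
    """P(T > t) with t in hours, case ?4 = 0."""
    return A * math.exp(-B * t) + C * math.exp(-D * t)


def density(t: float) -> float:
    """Probability density f(t) = -d/dt P(T > t)."""
    return A * B * math.exp(-B * t) + C * D * math.exp(-D * t)


def prob_between(a: float, b: float) -> float:
    """P(a <= T <= b) = P(T > a) - P(T > b)."""
    return survival(a) - survival(b)


def minute_bin(k: int) -> float:
    """Histogram height for minute k: P(k/60 <= T < (k+1)/60), in hours."""
    return survival(k / 60) - survival((k + 1) / 60)


def quantile(q: float, lo: float = 0.0, hi: float = 1000.0) -> float:
    """Solve P(T > t) = 1 - q by bisection (the survival function decreases)."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if survival(mid) > 1 - q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# E[T] is the integral of P(T > t); E[T^2] is twice the integral of t * P(T > t).
mean_hours = A / B + C / D
variance_hours2 = 2 * (A / B**2 + C / D**2) - mean_hours**2

p_2_to_4 = prob_between(2, 4)
quartiles = [quantile(q) for q in (0.25, 0.5, 0.75)]
```

Note the units: `survival` and `density` take hours, while `minute_bin` converts a minute index to hours before evaluating.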
Task 3: Transformed Random Variables
A type of network router has a bandwidth total to first hardware failure called S, expressed in terabytes. The
random variable S is modelled by a distribution whose density is given by one of the following functions:
• (if ?9 = 0): $f_S(s) = \frac{1}{\theta} e^{-s/\theta}$
• (if ?9 = 1): $f_S(s) = \frac{1}{24\theta^{5}} s^{4} e^{-s/\theta}$
• (if ?9 = 2): $f_S(s) = \frac{1}{\theta}$ for $s \in [0, \theta]$
with a single parameter θ. Consider the bandwidth total to failure T of the sequence of two routers of the
same type (one being brought up automatically when the first is broken).
Express T in terms of the bandwidth totals to failure of the single routers, S1 and S2. Formulate realistic assumptions
about these random variables. Calculate the density function of the variable T.
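As an illustration of the kind of computation involved, here is a sketch for the first density (?9 = 0). It writes T for the total of the dual system and S_1, S_2 for the single routers, and it assumes that T = S_1 + S_2 with S_1 and S_2 independent and identically distributed. These are modelling assumptions that the answer itself has to state and justify. Under them, the density of the sum is the convolution of the two single-router densities:

```latex
f_T(t) = \int_0^t f_S(s)\, f_S(t-s)\, ds
       = \int_0^t \frac{1}{\theta} e^{-s/\theta} \cdot \frac{1}{\theta} e^{-(t-s)/\theta}\, ds
       = \frac{1}{\theta^{2}} e^{-t/\theta} \int_0^t ds
       = \frac{t}{\theta^{2}} e^{-t/\theta}, \qquad t \ge 0 .
```

The other two densities lead to different integrals, and for the uniform case the integration limits depend on whether t lies below or above θ.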
Given an experiment with the dual-router system yielding a sample t1, t2, …, tn, calculate the likelihood
function for θ. Propose a transformation of this likelihood function whose maximum is attained at the same
parameter value and is easier to compute.
An actual experiment is performed: the infrastructure team has obtained the bandwidth totals to failure given by
the sequence ?10 of numbers. Estimate the model parameter by maximum likelihood and compute the
expectation of the bandwidth total to failure of the dual-router system.
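A minimal sketch of the estimation step for the first density (?9 = 0) follows. It assumes the two routers fail independently, so that the dual-system total has density t/θ² · e^(−t/θ); the sample values are hypothetical stand-ins for ?10 and the function names are illustrative. The logarithm is used as the transformation because it turns the product into a sum without moving the maximum.

```python
import math


def log_likelihood(theta: float, sample: list) -> float:
    """Log-likelihood of theta for the density t / theta^2 * exp(-t / theta)."""
    n = len(sample)
    return (sum(math.log(t) for t in sample)
            - 2 * n * math.log(theta)
            - sum(sample) / theta)


def mle_theta(sample: list) -> float:
    """Setting the derivative -2n/theta + sum(t)/theta^2 to zero gives mean/2."""
    return sum(sample) / (2 * len(sample))


# Hypothetical bandwidth totals in terabytes (stand-in for ?10).
sample = [1.2, 0.7, 3.1, 2.4, 1.6]

theta_hat = mle_theta(sample)
# Under this model E[T] = 2 * theta, so the plug-in estimate is 2 * theta_hat.
expected_total = 2 * theta_hat
```

Evaluating `log_likelihood` slightly to either side of `theta_hat` is a quick way to check that the closed form really is a maximum.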
Task 4: Hypothesis Test
Over a long period of time, the production of 1000 high-quality hammers in a factory seems to have reached a
weight with an average of ?11 (in g) and standard deviation of ?12 (in g). Propose a model for the weight of the
hammers including a probability distribution for the weight. Provide all the assumptions needed for this model
to hold (even the uncertain ones). What parameters does this model have?
One aims at answering one of the following questions about a new production system:
• (if ?13 = 0): Does the new system make more constant weights?
• (if ?13 = 1): Does the new system make lower weights?
• (if ?13 = 2): Does the new system make higher weights?
• (if ?13 = 3): Does the new system make less constant weights?
To answer this question, a random sample of newly produced hammers is evaluated, yielding the weights in ?14.
What hypotheses can you propose to test the question? What test and decision rule can you use to decide whether
the new system answers the given question? Express the decision rules as logical statements involving critical
values. What error probabilities can you suggest and why? Perform the test and draw the conclusion to answer
the question.
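As one possible shape of such a decision rule, the sketch below covers the "lower weights" question (?13 = 1). It assumes normally distributed weights and treats the long-run standard deviation as known, which makes a one-sided z-test applicable; the numbers are hypothetical stand-ins for ?11, ?12 and ?14, and the function name is illustrative. The questions about more or less constant weights concern the variance and would need a different test, for example a chi-square test, which is not shown.

```python
import math
from statistics import NormalDist, mean


def lower_mean_z_test(sample, mu0, sigma, alpha=0.05):
    """One-sided z-test of H0: mu >= mu0 against H1: mu < mu0.

    Returns the test statistic, the critical value, and whether H0 is
    rejected. The rule is: reject H0 if and only if z < critical.
    """
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / math.sqrt(n))
    critical = NormalDist().inv_cdf(alpha)  # negative for alpha < 0.5
    return z, critical, z < critical


# Hypothetical values: mu0 stands in for ?11, sigma for ?12, weights for ?14.
mu0, sigma = 500.0, 5.0
weights = [495, 498, 492, 501, 494, 496]

z, critical, reject = lower_mean_z_test(weights, mu0, sigma)
```

Here `alpha` is the probability of a type I error; the type II error probability depends on the true mean and has to be discussed separately.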
Task 5: Regularized Regression
Given the values of an unknown function f: ℝ → ℝ at some selected points, we try to calculate the parameters
of a model function using OLS as a distance and a ridge regularization:
• (if ?15 = 0): a polynomial model function of thirteen β_i parameters:
$g(x) = \beta_0 + \beta_1 x + \beta_2 x^{2} + \cdots + \beta_{12} x^{12}$
• (if ?15 = 2): a polynomial model function of eleven β_i parameters:
$g(x) = \beta_0 + \beta_1 x + \beta_2 x^{2} + \cdots + \beta_{10} x^{10}$
Calculate the OLS estimate and the ridge-regularized OLS estimates for the parameters, given the sample points
of the graph of f with the values y = ?16.
Provide a graphical representation of the graphs of the approximating functions and the data points.
Remember to include the steps of your computation, which are more important than the actual computations.
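The computation steps can be sketched in a dependency-free way as below. This is only an illustration under stated assumptions: the data points and the penalty weight `lam` are hypothetical, the degree is kept at 2 for readability, the penalty is applied to all coefficients including the intercept, and the function names are illustrative. For degree 10 or 12 the monomial basis is badly conditioned, so a numerical linear-algebra library would normally be the better choice.

```python
def design_matrix(xs, degree):
    """Rows [1, x, x^2, ..., x^degree] for each sample point x."""
    return [[x**j for j in range(degree + 1)] for x in xs]


def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - tail) / m[i][i]
    return z


def ridge_fit(xs, ys, degree, lam):
    """Minimise sum (y - X b)^2 + lam * sum b_j^2 via (X^T X + lam I) b = X^T y.

    lam = 0 gives the plain OLS estimate.
    """
    X = design_matrix(xs, degree)
    p = degree + 1
    xtx = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * y for row, y in zip(X, ys)) for i in range(p)]
    return solve(xtx, xty)


# Hypothetical sample points lying exactly on y = x^2.
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [1.0, 0.25, 0.0, 0.25, 1.0]

ols_coefficients = ridge_fit(xs, ys, degree=2, lam=0.0)
ridge_coefficients = ridge_fit(xs, ys, degree=2, lam=1.0)
```

Comparing the two coefficient lists shows the shrinkage effect of the penalty: the ridge coefficients are pulled towards zero relative to the OLS ones.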
Task 6: Bayesian Estimates
(following Hogg, McKean & Craig, exercise 11.2.2)
Let X1, X2, …, X10 be a random sample from a gamma distribution with α = 3 and β = 1/θ. Suppose we believe
that θ follows a gamma distribution with α = ?17 and β = ?18, and suppose we have a trial (x1, …, xn) with an
observed x̄ = ?19.
a) Find the posterior distribution of θ.
b) What is the Bayes point estimate of θ associated with the square-error loss function?
c) What is the Bayes point estimate of θ using the mode of the posterior distribution?
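The conjugate update behind parts a) to c) can be sketched as follows. The sketch assumes the parametrization in which β is a scale parameter, so the gamma density is proportional to x^(α−1) · e^(−x/β) and each observation has density proportional to θ³ x² e^(−θx). Multiplying the likelihood by the prior then gives a gamma posterior with shape α + 3n and scale β / (n·x̄·β + 1). The numeric values are hypothetical stand-ins for ?17, ?18 and ?19, and the function name is illustrative.

```python
def posterior_parameters(alpha0, beta0, n, xbar, shape=3):
    """Shape and scale of the gamma posterior of theta.

    Prior: theta ~ Gamma(alpha0, scale beta0).
    Data:  n observations from Gamma(shape, scale 1/theta) with mean xbar.
    """
    alpha_post = alpha0 + shape * n
    beta_post = beta0 / (n * xbar * beta0 + 1)
    return alpha_post, beta_post


# Hypothetical stand-ins for ?17, ?18 and ?19.
alpha0, beta0, xbar, n = 10, 2.0, 18.2, 10

alpha_post, beta_post = posterior_parameters(alpha0, beta0, n, xbar)

# Square-error loss leads to the posterior mean, shape * scale.
bayes_mean = alpha_post * beta_post
# The mode of a gamma distribution with shape > 1 is (shape - 1) * scale.
bayes_mode = (alpha_post - 1) * beta_post
```

If the course material uses the rate parametrization instead, the same steps apply with β replaced by its reciprocal.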