
Homework 5 of STAT 3355 Data Analysis for Statisticians & Actuaries
Due: 2:30pm October 26 (Tuesday), 2021
Let’s work on the dataset diamonds in the package ggplot2. You can use the following
code to load the data. Use necessary code to read the description of the dataset, which
contains 53940 samples and 10 variables.
# Install the package if you have not already
install.packages("ggplot2")
# Load the package
library(ggplot2)
# Load the diamonds dataset
data("diamonds")
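One way to read the dataset description and confirm the stated size is sketched below. It assumes ggplot2 is already installed; the variable name dims is only illustrative.

```r
library(ggplot2)   # provides the diamonds dataset
data("diamonds")

# Open the help page describing each variable (interactive sessions only)
if (interactive()) help("diamonds", package = "ggplot2")

# Check the size stated in the assignment and inspect the column types
dims <- dim(diamonds)   # number of rows, number of columns
print(dims)
str(diamonds)
```

The help page documents the units and levels of each variable, which is useful when writing axis labels later.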
Problem 1 (1× 5 = 5 points)
Use ggplot2 to visualize the data. You need to paste the resulting plots and related code
in order to get the full points. For each ggplot2 plot:
• make it complete/readable; in other words, it should include axis label(s), a title, and
a legend if necessary;
• write 1–2 sentences about what the chart tells you about the data.
(a) Choose a bin number or a binwidth (Hint: See page 11 of lecture 04c.pdf), explain
why, and create a histogram of carat.
(b) Make a scatter plot of y = price against x = carat and set the color to clarity.
(c) Make a scatter plot of y = price against x = carat and add a smooth line to each
group of points defined by clarity.
(d) Make a scatter plot of y = price against x = carat and facet it by clarity.
(e) Show carat vs cut: make a point plot, a jitter plot, a box plot, and a violin plot, respectively.
Which one is the best for visualization?
https://elearning.utdallas.edu/webapps/blackboard/execute/content/file?cmd=view&mode=designer&content_id=_3086018_1&course_id=_171899_1
Problem 2 (1× 5 = 5 points)
Use ggplot2 to recreate the following plots with titles. You need to paste the new plots and
related code in order to get full points.
(a) Recreate the following two plots, add a short title, and comment on the merits of each
one compared to the other.
[Figure: two bar charts of count (y-axis, 0 to 5000) by clarity (x-axis: I1, SI2, SI1, VS2, VS1, VVS2, VVS1, IF), with legend cut (Fair, Good, Very Good, Premium, Ideal). The first shows all cuts in a single panel; the second has one panel per cut.]
(b) Recreate the following plot and add a short title
[Figure: price (y-axis, 0 to 20000) against carat (x-axis), with legend clarity (I1, SI2, SI1, VS2, VS1, VVS2, VVS1, IF).]
(c) Recreate the following plot and add a short title
[Figure: price (y-axis, 0 to 15000) against clarity (x-axis: I1, SI2, SI1, VS2, VS1, VVS2, VVS1, IF), with legend cut (Fair, Good, Very Good, Premium, Ideal).]
(d) Recreate the following plot and add a short title
[Figure: price (y-axis, 0 to 15000) against carat (x-axis, 0 to 3), with legend cut (Fair, Good, Very Good, Premium, Ideal).]
(e) Recreate the following plot and add a short title (Hint: Choose binwidth = 0.1)
[Figure: density (y-axis, 0.0 to 0.8) against depth (x-axis), one panel per cut (Fair, Good, Very Good, Premium, Ideal).]

Solution

Suraj answered on Oct 26 2021
Solution 1:
The first plot, a histogram of carat, is produced as follows:
R-Code:
ggplot(diamonds, aes(carat)) + geom_histogram(color = "red", binwidth = 0.1) + labs(title = "Histogram", y = "Frequency")
Here, we can see that the distribution of the carat variable is positively skewed, as there is a long tail towards the right side.
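Part (a) also asks for a reason behind the chosen binwidth, which the answer above does not give. A minimal sketch of one data-driven choice is the Freedman-Diaconis rule, binwidth = 2 * IQR / n^(1/3). This rule is an assumption here and is not necessarily the one on page 11 of the lecture; the names fd_binwidth and p_hist are illustrative. The last lines check the skewness claim by comparing the mean with the median.

```r
library(ggplot2)

# Freedman-Diaconis rule: binwidth = 2 * IQR / n^(1/3)
carat <- diamonds$carat
fd_binwidth <- 2 * IQR(carat) / length(carat)^(1/3)

# Histogram of carat using the computed binwidth
p_hist <- ggplot(diamonds, aes(x = carat)) +
  geom_histogram(binwidth = fd_binwidth, color = "white") +
  labs(title = "Distribution of diamond carat", x = "Carat", y = "Frequency")
print(p_hist)

# A mean above the median is consistent with a right-skewed distribution
right_skewed <- mean(carat) > median(carat)
print(right_skewed)
```

A narrower binwidth than 0.1 tends to reveal more fine structure in carat, at the cost of a noisier plot, so either choice can be defended as long as the reasoning is stated.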
The R code for the next plot, a scatter plot, is as follows:
ggplot(diamonds, aes(carat, price, color = clarity)) + geom_point() + labs(title = "Scatter Plot")
Here, we can see that there is a positive relationship between carat and price for every clarity grade.
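The statement about each clarity grade can be checked numerically rather than read off the plot alone. The sketch below computes the Pearson correlation between carat and price within each clarity level. It is a supplementary check, not part of the original answer, and the names by_clarity and cor_by_clarity are illustrative.

```r
library(ggplot2)

# Split the data into one data frame per clarity grade
by_clarity <- split(diamonds, diamonds$clarity)

# Pearson correlation between carat and price within each grade
cor_by_clarity <- sapply(by_clarity, function(d) cor(d$carat, d$price))
print(round(cor_by_clarity, 2))
```

Correlation only measures linear association, so it complements the scatter plot but does not replace it.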
The next plot is given with R-code as...