10/22/2020 Final Project
Final Project
Important: Please create your GitHub repository on the AU-R-Programming organization by Monday
(Nov. 2nd 2020) at 2pm and use this repository to work on this assignment. Your final submission will be done
via Canvas in a single html file (in which you will specify the name of the corresponding GitHub repository)
and is due by Friday, Dec. 4th 2020 at 11.59pm (no late work is accepted). The version submitted on Canvas
will have to correspond to the last version on the GitHub repository and you will receive zero points if you
make modifications to your work after Dec. 4th 2020 at 11.59pm.
The final project will be evaluated on 100 points and the goal is to develop an R package implementing linear
regression as highlighted in Section 6.4 of the book
(https://smac-group.github.io/ds/section-functions.html#section-example-continued-least-squares-function).
The package must contain the basic functions to perform linear regression (e.g. estimate the coefficient
vector $\beta$) and obtain different statistics from the procedure. Using the notation from the book and
without using any of the linear regression functions already available in R (i.e. all outputs must be
produced using formulas provided in the book and in this document), the basic outputs from the procedure
must be the following:
Confidence intervals: the user must be able to choose the significance level $\alpha$ to obtain for the
$1 - \alpha$ confidence intervals for $\beta$ and whether to use the asymptotic or bootstrap approach for this.
Plots (with e.g. ggplot2) including:
1. Residuals vs fitted-values (fitted values are $\hat{y} = X\hat{\beta}$).
2. qq-plot of residuals
3. Histogram (or density) of residuals
Mean Square Prediction Error (MSPE) computed in matrix form:
$$\text{MSPE} := \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2,$$
where $n$ is the number of observations in the data (i.e. number of rows).
F-test: compute the statistic in matrix form and output the corresponding p-value. With $\bar{y}$ representing the
sample mean of $y$, let
$$\text{SSM} := \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2, \qquad \text{SSE} := \sum_{i=1}^{n} (y_i - \hat{y}_i)^2,$$
and $\text{DFM} = p - 1$ and $\text{DFE} = n - p$. Then we can define $\text{MSM} = \text{SSM}/\text{DFM}$ and
$\text{MSE} = \text{SSE}/\text{DFE}$ and obtain the F-statistic as follows:
$$F^* = \frac{\text{MSM}}{\text{MSE}}.$$
Using the appropriate distribution in R, compute $P(F > F^*)$ which corresponds to the p-value.
Help documentation for all functions (for example using the roxygen2 package)
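As a rough illustration of the MSPE and F-test formulas listed among the required outputs, the following R sketch computes both from a response vector and a design matrix. The function name `fit_stats` is hypothetical (it is not one of the package's functions), the design matrix is assumed to already contain a column of ones for the intercept, and the data are the example vectors from the solution's documentation.

```r
# Hypothetical helper: fit_stats() is an illustration, not part of the package.
fit_stats <- function(y, X) {
  y <- as.vector(y)
  X <- as.matrix(X)   # assumed to include a column of ones for the intercept
  n <- length(y)
  p <- ncol(X)

  # Least-squares estimate and fitted values, y_hat = X %*% beta_hat
  beta_hat <- solve(t(X) %*% X, t(X) %*% y)
  y_hat <- as.vector(X %*% beta_hat)

  # MSPE in matrix form: (1/n) * (y - y_hat)' (y - y_hat)
  res <- y - y_hat
  mspe <- as.numeric(t(res) %*% res) / n

  # Sums of squares and degrees of freedom for the F-test
  ssm <- sum((y_hat - mean(y))^2)
  sse <- sum(res^2)
  dfm <- p - 1
  dfe <- n - p

  # F* = MSM / MSE, and p-value P(F > F*) from the F(dfm, dfe) distribution
  f_star <- (ssm / dfm) / (sse / dfe)
  p_value <- pf(f_star, df1 = dfm, df2 = dfe, lower.tail = FALSE)

  list(mspe = mspe, f_star = f_star, p_value = p_value)
}

y <- c(25, 36, 12, 45, 26, 82, 14, 35, 21, 45, 32)
x <- c(10, 35, 62, 42, 15, 32, 18, 24, 38, 26, 43)
out <- fit_stats(y, cbind(1, x))
print(out)
```

Here `pf(..., lower.tail = FALSE)` gives the upper-tail probability directly, which avoids computing `1 - pf(...)` by hand.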
The package will be made available for download on a GitHub repository in the AU-R-Programming
organization and the submission will be an html file on Canvas. The html file will be a so-called
vignette which indicates the name of the GitHub repository (and package) and in which you explain and
give examples of how to use the package functions for all the desired outputs using one of the
datasets on the Canvas course page.
Up to 20 bonus points will be given for the final projects if other features are added for the package (e.g. a website
with vignette, an example Shiny app that uses the package, the use of the Rcpp package).
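The three residual plots named among the required outputs could be sketched along the following lines. This is only an assumption-laden illustration: it uses base R graphics rather than ggplot2 to stay dependency-free, the function name `resid_plots` is hypothetical, and the design matrix is assumed to include a column of ones.

```r
# Hypothetical helper: resid_plots() is an illustration, not part of the package.
resid_plots <- function(y, X) {
  y <- as.vector(y)
  X <- as.matrix(X)   # assumed to include a column of ones for the intercept

  # Least-squares fit, fitted values and residuals
  beta_hat <- solve(t(X) %*% X, t(X) %*% y)
  fitted <- as.vector(X %*% beta_hat)
  res <- y - fitted

  # Draw the three panels side by side and restore the layout afterwards
  old_par <- par(mfrow = c(1, 3))
  on.exit(par(old_par))

  # 1. Residuals vs fitted values
  plot(fitted, res, xlab = "Fitted values", ylab = "Residuals",
       main = "Residuals vs fitted")
  abline(h = 0, lty = 2)

  # 2. Normal qq-plot of residuals
  qqnorm(res, main = "Normal Q-Q plot of residuals")
  qqline(res)

  # 3. Histogram of residuals
  hist(res, xlab = "Residuals", main = "Histogram of residuals")

  invisible(list(fitted = fitted, residuals = res))
}

y <- c(25, 36, 12, 45, 26, 82, 14, 35, 21, 45, 32)
x <- c(10, 35, 62, 42, 15, 32, 18, 24, 38, 26, 43)
```

Calling `resid_plots(y, cbind(1, x))` draws the three panels on the active graphics device and invisibly returns the fitted values and residuals.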
Answered Same Day Oct 22, 2021

Solution

Naveen answered on Oct 31 2021
Testing1/.Rbuildignore
^.*\.Rproj$
^\.Rproj\.user$
Testing1/.Rproj.user/E408E4C2/sources/s-0DD8B8D6/1ECE7019-contents
#' Simple Linear Regression
#'
#' The function \code{my_lm()} is used to fit a simple linear regression model.
#'
#' @param response A vector or matrix
#' @param covariates A vector or matrix
#' @param alpha level of significance, the default significance level is **0.05**.
#'
#' @return Returns \code{Residuals}, \code{beta hat},
#' \code{sigma hat}, \code{Variance of beta hat},
#' \code{Confidence Intervals of beta}, \code{fitted values},
#' \code{response} and \code{covariate} values
#'
#' @author Naveen Kumar M.Sc., \emph{Email}: \email{[email protected]} OR \emph{WhatsApp}: \href{https://wa.me/918688896472}{Click Here}
#' @examples
#' y <- c(25,36,12,45,26,82,14,35,21,45,32)
#' x <- c(10,35,62,42,15,32,18,24,38,26,43)
#' lm_model <- my_lm(y, x, alpha = 0.1)
#' print(lm_model)
#'
#' @seealso \link{my_qqplot} for making \code{Normal Q-Q} plot, \link{my_resid_fit} for the residual vs fitted plot,
#' \link{my_hist_resid} for the histogram of residuals, \link{my_MSPE} for the Mean Square Prediction Error,
#' \link{my_F} for F cal, F cri & p values.
#'
#' @export
my_lm = function(response, covariates, alpha = 0.05) {
  # Make sure data formats are appropriate
  response <- as.vector(response)
  covariates <- as.matrix(covariates)
  # Define parameters
  n <- length(response)
  p <- ncol(covariates)
  df <- n - p
  # Estimate beta through Eq. (6.1)
  xtx.inv <- solve(t(covariates)%*%covariates)
  beta.hat <- xtx.inv%*%t(covariates)%*%response
  # Compute fitted values and residuals
  fitted.val <- covariates%*%beta.hat
  resid <- response - fitted.val
  # Estimate of the residual variance (sigma2) from Eq. (6.3),
  # stored as a scalar so that it can scale the p x p matrix below
  sigma2.hat <- as.numeric(t(resid)%*%resid)/df
  # Estimate of the variance of the estimated beta from Eq. (6.2)
  var.beta <- sigma2.hat*xtx.inv
  # Asymptotic confidence interval based on alpha; the standard errors are
  # the square roots of the diagonal of var.beta (one per coefficient)
  quant <- 1 - alpha/2
  se.beta <- sqrt(diag(var.beta))
  ci.beta <- cbind(lower = as.vector(beta.hat) - qnorm(p = quant)*se.beta,
                   upper = as.vector(beta.hat) + qnorm(p = quant)*se.beta)
  # Return all estimated values
  return(list(residuals = resid, beta = beta.hat,
              sigma2 = sigma2.hat, variance_beta = var.beta,
              ci = ci.beta, fitted.values = fitted.val,
              Response = response, Covariates = covariates))
}
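The function above only covers the asymptotic (normal-quantile) interval, while the assignment also asks for a bootstrap option. A minimal sketch of one common variant, the nonparametric percentile bootstrap over resampled rows, is given below. The helper names `ls_beta` and `boot_ci` and the argument `B` (number of resamples) are hypothetical and not taken from the package, and the design matrix is assumed to include a column of ones.

```r
# Hypothetical helpers: ls_beta() and boot_ci() are illustrations only.

# Least-squares coefficients for response y and design matrix X
ls_beta <- function(y, X) {
  as.vector(solve(t(X) %*% X, t(X) %*% y))
}

# Percentile bootstrap confidence intervals for each coefficient
boot_ci <- function(y, X, alpha = 0.05, B = 1000) {
  y <- as.vector(y)
  X <- as.matrix(X)
  n <- length(y)
  p <- ncol(X)

  # Re-estimate beta on B data sets obtained by resampling rows with replacement
  draws <- vapply(seq_len(B), function(b) {
    idx <- sample.int(n, size = n, replace = TRUE)
    ls_beta(y[idx], X[idx, , drop = FALSE])
  }, numeric(p))
  draws <- matrix(draws, nrow = p)   # force a p x B matrix, also when p = 1

  # Empirical alpha/2 and 1 - alpha/2 quantiles, one row per coefficient
  ci <- t(apply(draws, 1, quantile, probs = c(alpha / 2, 1 - alpha / 2)))
  colnames(ci) <- c("lower", "upper")
  ci
}

set.seed(1)
y <- c(25, 36, 12, 45, 26, 82, 14, 35, 21, 45, 32)
x <- c(10, 35, 62, 42, 15, 32, 18, 24, 38, 26, 43)
ci <- boot_ci(y, cbind(1, x), alpha = 0.1, B = 500)
print(ci)
```

With only 11 observations the bootstrap and asymptotic intervals may differ noticeably, so a package function would probably expose the choice through an argument rather than pick one silently.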
Testing1/.Rproj.user/E408E4C2/sources/s-0DD8B8D6/29C2EAED-contents
#' Simple Linear Regression
#'
#' The function \code{my_lm()} is used to fit a simple linear regression model.
#'
#' @param response A vector or matrix
#' @param covariates A vector or matrix
#' @param alpha level of significance, the default significance level is **0.05**.
#'
#' @return Returns \code{Residuals}, \code{beta hat},
#' \code{sigma hat}, \code{Variance of beta hat},
#' \code{Confidence Intervals of beta}, \code{fitted values},
#' \code{response} and \code{covariate} values
#'
#' @author Naveen Kumar M.Sc., \emph{Email}: \email{[email protected]} OR \emph{WhatsApp}: \href{https://wa.me/918688896472}{Click Here}
#' @seealso \link{my_qqplot} for making \code{Normal Q-Q} plot, \link{my_resid_fit} for the residual vs fitted plot,
#' \link{my_hist_resid} for the histogram of residuals, \link{my_MSPE} for calculating the Mean Square Prediction Error,
#' \link{my_F} for F cal, F cri & p values.
#'
#'
#' @examples
#' y <- c(25,36,12,45,26,82,14,35,21,45,32)
#' x <- c(10,35,62,42,15,32,18,24,38,26,43)
#' lm_model <- my_lm(y, x, alpha =...