Given the subset of genes and their expression levels (mRNAseq), perform a descriptive analysis and visualization. Use the worksheet as an example.
A template in R Jupyter Notebook format is provided (AssignmentWeek2.pdf). Each comment line describes a step; add the R code for that step directly below it.
1) Install package 'ggplot2' - if not already installed.
2) Load package 'ggplot2'.
3) Read the dataset from file 'BRCAMergedWAVE.csv'. The documentation is located in file 'DataDictionary.pdf'.
4) Report the number of rows, the number of columns, and the dimensions of the dataset.
5) Display the list of variables (the column names, which come from the first row of the file, numbered 0).
6) Provide descriptive statistics for all variables in the dataset.
7) Draw a histogram of age (Diagnosis.Age) showing the number of cancer diagnoses at each age.
8) Draw a scatterplot of variable/gene ABI1 as a function of age. Add a smoothing line.
9) Draw a scatterplot of variable/gene WASF1 as a function of age. Add a smoothing line.
10) Turn in the assignment as a plain R script file (do NOT submit a Jupyter Notebook file).
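
For orientation, steps 1 to 8 might look roughly like the sketch below in a plain R script. This is only one possible layout, not the expected solution. Several details are assumptions rather than part of the assignment: the CRAN mirror URL, the object names (brca, age_hist, abi1_plot), the one-year histogram bins, and the loess smoother. The small synthetic data frame is made up and is used only when the CSV file is not in the working directory, so that the script still runs.

```r
# Step 1: install ggplot2 only if it is missing (the mirror URL is an assumption).
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2", repos = "https://cloud.r-project.org")
}

# Step 2: load the package.
library(ggplot2)

# Step 3: read the dataset. If the file is absent, fall back to a small
# synthetic data frame with MADE-UP values so the rest of the sketch runs.
csv_path <- "BRCAMergedWAVE.csv"
if (file.exists(csv_path)) {
  brca <- read.csv(csv_path)
} else {
  set.seed(1)
  brca <- data.frame(
    Diagnosis.Age = sample(30:85, 50, replace = TRUE),
    ABI1 = rnorm(50),
    WASF1 = rnorm(50)
  )
}

# Step 4: number of rows, number of columns, and both together.
print(nrow(brca))
print(ncol(brca))
print(dim(brca))

# Step 5: variable (column) names.
print(names(brca))

# Step 6: descriptive statistics for every variable.
print(summary(brca))

# Step 7: histogram of age at diagnosis, one bar per year of age (assumed bin width).
age_hist <- ggplot(brca, aes(x = Diagnosis.Age)) +
  geom_histogram(binwidth = 1) +
  labs(x = "Age at diagnosis", y = "Number of cases")
print(age_hist)

# Step 8: ABI1 expression against age, with a loess smoothing line (assumed smoother).
abi1_plot <- ggplot(brca, aes(x = Diagnosis.Age, y = ABI1)) +
  geom_point() +
  geom_smooth(method = "loess", formula = y ~ x)
print(abi1_plot)
```

Step 9 follows the same pattern as step 8 with WASF1 on the y axis. In a plain script, the plots are wrapped in print() so that they are drawn when the file is run with source() or Rscript.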