Principal Component Analysis (PCA)
2020-11-18
Chapter 1 Prerequisites
To run the code in this chapter, you will need to install a number of packages, which are listed below. The recommended way to install them is through BiocManager.
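As a minimal sketch of installing through BiocManager, the snippet below checks which packages are missing and installs only those. The helper names `missing_packages` and `install_missing` are hypothetical, and the `pkgs` vector contains only `bookdown`, the one package this chapter names explicitly; extend it with the packages you actually need.

```r
# Packages to check; "bookdown" is the only one named in the text.
pkgs <- c("bookdown")

# Hypothetical helper: return the subset of `pkgs` that is not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Hypothetical helper: install the missing packages through BiocManager.
# Requires network access; BiocManager itself is fetched from CRAN if absent.
install_missing <- function(pkgs) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) == 0) {
    return(invisible(character(0)))
  }
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  BiocManager::install(to_install, update = FALSE, ask = FALSE)
  invisible(to_install)
}
```

Calling `install_missing(pkgs)` performs the installation. Setting `update = FALSE` and `ask = FALSE` keeps the call non-interactive and leaves already installed packages untouched, which is a design choice for this sketch and not a requirement.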
The bookdown package can be installed from CRAN or GitHub:
##
## There is a binary version available but the source version is later:
## binary source needs_compilation
## bookdown 0.20 0.21 FALSE
##
## The downloaded binary packages are in
## /var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T//Rtmp5rrhzI/downloaded_packages
##
## There is a binary version available but the source version is later:
## binary source needs_compilation
## vctrs 0.3.4 0.3.5 TRUE
We will also use a dataset in this chapter. It contains several thousand genes measured in 23 samples. The samples come from four groups and were measured in two batches.
# read the raw data (tab-separated; lines starting with "#" are skipped)
data <- read.table("data/b1_b2_data.gct", sep = "\t", header = TRUE, comment.char = "#")
# remove the first two columns, which are not used
data <- data[, -c(1:2)]
# transpose the data so that genes are in columns and samples are in rows
data <- t(data)
# read the metadata
metadata <- read.table("data/b1_b2_sampleinfofile.txt", sep = "\t", header = TRUE, comment.char = "#")
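After loading, it can help to confirm that the transposed matrix has the expected orientation and that its sample names line up with the metadata. The sketch below uses a small made-up table in place of the real files. The column names `Name`, `Description`, `sample`, `group` and `batch`, and the helper `check_samples`, are assumptions for illustration and are not taken from the actual data.

```r
# Toy stand-in for the expression table: two annotation columns followed by
# one column per sample. All names and values are made up.
toy <- data.frame(
  Name        = c("gene1", "gene2", "gene3"),
  Description = c("a", "b", "c"),
  sample1     = c(1.0, 2.0, 3.0),
  sample2     = c(4.0, 5.0, 6.0),
  stringsAsFactors = FALSE
)

# Same steps as above: drop the two annotation columns, then transpose so
# that samples are in rows and genes are in columns.
toy_expr <- t(toy[, -c(1:2)])

# Toy metadata with a hypothetical sample identifier column.
toy_meta <- data.frame(
  sample = c("sample1", "sample2"),
  group  = c("A", "B"),
  batch  = c(1, 2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: report the dimensions and whether the row names of
# the expression matrix match the identifiers in the metadata.
check_samples <- function(expr, meta, id_col) {
  stopifnot(is.matrix(expr), is.numeric(expr))
  ids <- as.character(meta[[id_col]])
  list(
    n_samples = nrow(expr),
    n_genes   = ncol(expr),
    ids_match = setequal(rownames(expr), ids)
  )
}

res <- check_samples(toy_expr, toy_meta, "sample")
```

For the real data, `n_samples` would be expected to equal 23. Note that `read.table` passes column names through `make.names` by default, so sample names containing characters such as `-` or spaces may differ from the identifiers in the metadata file unless `check.names = FALSE` is set.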