Chapter 2 Let’s get started

2.1 Installation

First, download the latest version from Bioconductor:

Alternatively, you can install the latest GitHub version of the package:

The mixOmics package should directly import the following packages: igraph, rgl, ellipse, corpcor, RColorBrewer, plyr, parallel, dplyr, tidyr, reshape2, methods, matrixStats, rARPACK, gridExtra. For Apple mac users: if you are unable to install the imported package rgl, you will need to install the XQuartz software first.

2.2 Load the package

Check that there is no error when loading the package, especially for the rgl library (see above).

2.3 Upload data

The examples we give in this vignette use data that are already part of the package. To upload your own data, check first that your working directory is set, then read your data from a .txt or .csv format, either by using File > Import Dataset in RStudio or via one of these command lines:

For more details about the arguments used to modify those functions, type ?read.csv or ?read.table in the R console.

2.4 Quick start in mixOmics

Each analysis should follow this workflow:

  1. Run the method
  2. Graphical representation of the samples
  3. Graphical representation of the variables

Then use your critical thinking and additional functions and visual tools to make sense of your data! (some of which are listed in 1.2.2) and will be described in the next Chapters.

For instance, for Principal Components Analysis, we first load the data:

Then use the following steps:

This is only a first quick-start, there will be many avenues you can take to deepen your exploratory and integrative analyses. The package proposes several methods to perform variable, or feature selection to identify the relevant information from rather large omics data sets. The sparse methods are listed in the Table in 1.2.2.

Following our example here, sparse PCA can be applied to select the top 5 variables contributing to each of the two components in PCA. The user specifies the number of variables to selected on each component, for example, here 5 variables are selected on each of the first two components (keepX=c(5,5)):

You can see know that we have considerably reduced the number of genes in the plotVar correlation circle plot.

Do not stop here! We are not done yet. You can enhance your analyses with the following:

  • Have a look at our manual and each of the functions and their examples, e.g. ?pca, ?plotIndiv, ?sPCA, …

  • Run the examples from the help file using the example function: example(pca), example(plotIndiv), …

  • Have a look at our website that features many tutorials and case studies,

  • Keep reading this vignette, this is just the beginning!