Getting ready for the course

Author

Wen Zhong

Published

October 23, 2025

To complete the workshop tasks, you will need to bring your own laptop. The required software is compatible with all major operating systems, including Windows, macOS, and Linux. Although the tasks are not particularly computationally intensive, at least 8 GB of RAM is recommended, especially when working with large datasets.

Web servers

Part of the workshops will be web-based and will not require any local installation; just make sure you have a modern web browser installed (e.g., Google Chrome, Firefox, or Safari). Check that you can access and navigate each of the following web servers:

olinkWrapper: https://olink-wrapper.serve.scilifelab.se/
methylR: https://methylr.serve.scilifelab.se/
Shiny-WGCNA: https://shinywgcna.serve.scilifelab.se/

Familiarize yourself with the basic workflow of each application by reading the provided tutorial files. Pay close attention to the specific data requirements for each tool, as they differ (e.g., a zipped folder for methylR versus separate CSV files for olinkWrapper and Shiny-WGCNA).
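If your input files for methylR sit in a regular folder, one way to produce the zipped folder is a short Python script like the sketch below. The folder name `my_methylation_data` and the archive name `methylr_upload` are hypothetical placeholders, and the sketch assumes the folder itself should be the top-level entry of the archive; check the methylR tutorial for the exact layout it expects.

```python
import shutil
from pathlib import Path


def zip_folder(folder, archive_stem):
    """Compress `folder` into `<archive_stem>.zip` and return the archive path."""
    folder = Path(folder)
    if not folder.is_dir():
        raise NotADirectoryError(f"{folder} is not a directory")
    # root_dir/base_dir keep the folder itself as the top-level entry in the zip.
    archive = shutil.make_archive(
        str(archive_stem), "zip", root_dir=folder.parent, base_dir=folder.name
    )
    return Path(archive)


if __name__ == "__main__":
    # Hypothetical names: replace them with your own folder and archive name.
    source = Path("my_methylation_data")
    if source.is_dir():
        print("Created", zip_folder(source, "methylr_upload"))
    else:
        print(f"Folder {source} not found; edit the path before running.")
```

The CSV files for olinkWrapper and Shiny-WGCNA need no such step, since those tools take the files separately.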

Google Colab: Some of the workshop exercises will be run on Google’s Colab server https://colab.research.google.com/. Therefore, you will need a Google account. If you don’t have one, please create an account before the course. Once you have access to Google Colab, go through the provided guidelines https://colab.research.google.com/. Make a copy of the following IPython Notebook https://colab.research.google.com/drive/1-Eyy4px28tEm2qkBkH6_2o1jjdwaLSKT?usp=sharing to your Google Drive and run the code to test its functionality.

Software installation

Some of the tasks in the workshops will be performed using local installations of R and Python.

To install R (version 4 or later) and RStudio, follow these links: https://cran.r-project.org/ and https://posit.co/products/open-source/rstudio/?sid=1. Both are free and should be straightforward to install.
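Once you have done that, open RStudio and install the following packages: dplyr, tidyverse, ggplot2, umap, glmnet, igraph, vegan, MOFA2, MOFAdata, Seurat, and hdf5r. Note that MOFA2 and MOFAdata are distributed through Bioconductor rather than CRAN, so install them with BiocManager::install() instead of install.packages().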
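Install Python https://www.python.org/downloads/ (version 3 or later) and the following packages: numpy, pandas, h5py, scipy, scikit-learn (imported as sklearn), and mofapy2. The workshops also use argparse, which is part of the Python standard library and needs no separate installation.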
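To confirm that the Python packages are available before the course, a quick check along the following lines may help. This is a minimal sketch: the `REQUIRED` list and the `find_missing` helper are names made up for this example, and the script only tests whether each module can be located, not whether its version is suitable.

```python
import importlib.util
import sys

# Import names of the packages listed above (scikit-learn is imported as "sklearn").
REQUIRED = ["numpy", "pandas", "h5py", "scipy", "argparse", "sklearn", "mofapy2"]


def find_missing(module_names):
    """Return the module names that cannot be located in this environment."""
    missing = []
    for name in module_names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Treat any lookup failure as "not installed".
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    missing = find_missing(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

Run the script with the same Python interpreter you plan to use in the workshops; any name reported as missing still needs to be installed.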

If you are new to R or Python, here are some guides to get started.

Beginner’s Guide to R: https://education.rstudio.com/learn/beginner/
Beginner’s Guide to Python: https://wiki.python.org/moin/BeginnersGuide/

Reading material

During the lectures and the workshops, we will use several machine learning techniques that you should be familiar with. We will not delve deep into the mathematical background of these techniques; instead, we will focus on the purpose and limitations of each method.

A Summary of the Basic Machine Learning Models

Basic machine learning methods: https://towardsdatascience.com/a-summary-of-the-basic-machine-learning-models-e0a65627ecbe/
Supervised Machine Learning: https://www.datacamp.com/blog/supervised-machine-learning

Principal component analysis (PCA)

Papers: https://doi.org/10.1098/rsta.2015.0202
Videos: https://youtube.com/watch?v=FgakZw6K1QQ&ab_channel=StatQuestwithJoshStarmer

Applications: https://doi.org/10.1016/j.drudis.2017.01.005, https://doi.org/10.17485/ijst/2017/v10i20/91294

Dimensionality reduction techniques

UMAP R tutorial: https://cran.r-project.org/web/packages/umap/vignettes/umap.html
t-SNE R tutorial: https://www.r-bloggers.com/2019/05/quick-and-easy-t-sne-analysis-in-r/

Factor analysis

MOFA: https://biofam.github.io/MOFA2/

Linear regression

Videos: https://youtube.com/watch?v=hokALdIst8k&ab_channel=StatQuestwithJoshStarmer

Methods to find associations between different omics

Canonical correspondence analysis (CCA) R tutorial: https://rpubs.com/Devy/902673
Partial least squares (PLS): https://cran.r-project.org/web/packages/pls/vignettes/pls-manual.pdf
R package that implements CCA and PLS: https://doi.org/10.1093/bioinformatics/btp515, https://mixomics.org/methods/spls/

Transformers in single-cell omics: https://doi.org/10.1038/s41592-024-02353-z
Spatial transcriptomics: https://doi.org/10.1101/2025.01.20.634005

Moreover, here are some nice examples of studies that use multi-omics data in biomedical research.

Relationship between different omics and health: https://doi.org/10.1038/s41467-020-18148-7
Multi-omics to better understand diseases: https://doi.org/10.1016/j.tig.2022.08.005, https://doi.org/10.1038/s10038-020-00842-5
Review of multiple multi-omics studies: https://doi.org/10.1002/mco2.315