Getting ready for the course

Author

Wen Zhong

Published

August 25, 2025

To complete the workshop tasks, you will need to bring your own laptop. The required software is compatible with all major operating systems, including Windows, macOS, and Linux. Although the tasks will not be particularly computationally intensive, 8 GB of RAM is recommended for good performance, especially when working with large datasets.

Web servers

Part of the workshops will be web-based and will not require any local installation; just make sure you have a modern web browser installed (e.g., Google Chrome, Firefox, Safari). Check that you can access and navigate each of the following web servers:

Familiarize yourself with the basic workflow of each application by reading the provided tutorial files. Pay close attention to the specific data requirements for each tool, as they differ (e.g., a zipped folder for methylR versus separate CSV files for olinkWrapper and Shiny-WGCNA).
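If a tool expects a zipped folder, the archive can be created from Python rather than through the file manager. The snippet below is only a minimal sketch using the standard library: the helper names `zip_folder` and `list_archive` are made up for illustration, and the layout that each tool expects inside the archive is an assumption to verify against its tutorial.

```python
import shutil
import zipfile
from pathlib import Path


def zip_folder(folder: Path) -> Path:
    """Create <folder>.zip next to the folder and return the archive path.

    Hypothetical helper for illustration; not part of any course tool.
    """
    folder = Path(folder).resolve()
    if not folder.is_dir():
        raise NotADirectoryError(f"{folder} is not a directory")
    # make_archive appends the ".zip" extension to base_name itself.
    archive = shutil.make_archive(
        base_name=str(folder),
        format="zip",
        root_dir=folder.parent,  # paths inside the archive are relative to this
        base_dir=folder.name,    # keep the folder itself as the top-level entry
    )
    return Path(archive)


def list_archive(archive: Path) -> list[str]:
    """Return the sorted names stored in a zip archive."""
    with zipfile.ZipFile(archive) as zf:
        return sorted(zf.namelist())
```

For example, `zip_folder(Path("example_input"))` would write `example_input.zip` beside a folder called `example_input` (a placeholder name), and `list_archive` lets you confirm what ended up inside before uploading.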

Software installation

Some of the tasks in the workshops will be performed using local installations of R and Python.
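Once both are installed, a quick sanity check can confirm that they are reachable from the command line. The script below is a minimal sketch, not an official course check: the function name `environment_summary` is illustrative, and it assumes that the R installation exposes the `Rscript` executable on your PATH.

```python
import platform
import shutil
import sys


def environment_summary() -> dict:
    """Collect basic facts about the local Python and R installations."""
    return {
        "python_version": platform.python_version(),
        "python_executable": sys.executable,
        "operating_system": platform.system(),
        # shutil.which returns None when Rscript is not found on the PATH.
        "rscript_path": shutil.which("Rscript"),
    }


if __name__ == "__main__":
    for key, value in environment_summary().items():
        print(f"{key}: {value}")
```

A value of `None` for `rscript_path` does not necessarily mean that R is missing; on Windows in particular, R is often installed without being added to the PATH.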

If you are new to R or Python, here are some guides to get started.

Reading material

During the lectures and the workshops, we will use several machine learning techniques that you should be familiar with. We will not delve deep into the mathematical background of these techniques; instead, we will focus on the purpose and limitations of each method.

  • A Summary of the Basic Machine Learning Models

Basic machine learning methods: https://towardsdatascience.com/a-summary-of-the-basic-machine-learning-models-e0a65627ecbe/
Supervised Machine Learning: https://www.datacamp.com/blog/supervised-machine-learning

  • Principal component analysis (PCA)

Papers: https://doi.org/10.1098/rsta.2015.0202
Videos: https://youtube.com/watch?v=FgakZw6K1QQ&ab_channel=StatQuestwithJoshStarmer

Applications: https://doi.org/10.1016/j.drudis.2017.01.005, https://doi.org/10.17485/ijst/2017/v10i20/91294

  • Dimensionality reduction techniques

UMAP R tutorial: https://cran.r-project.org/web/packages/umap/vignettes/umap.html
t-SNE R tutorial: https://www.r-bloggers.com/2019/05/quick-and-easy-t-sne-analysis-in-r/

  • Factor analysis

MOFA: https://biofam.github.io/MOFA2/

  • Linear regression

Videos: https://youtube.com/watch?v=hokALdIst8k&ab_channel=StatQuestwithJoshStarmer

  • Methods to find associations between different omics

Canonical correspondence analysis (CCA) R tutorial: https://rpubs.com/Devy/902673
Partial least squares (PLS): https://cran.r-project.org/web/packages/pls/vignettes/pls-manual.pdf
R package that implements CCA and PLS: https://doi.org/10.1093/bioinformatics/btp515, https://mixomics.org/methods/spls/

Moreover, here are some nice examples of studies that use multi-omics data in biomedical research.