teaching

Materials for courses I teach

Below is a collection of teaching materials covering statistical and machine learning techniques with both practical applications and detailed mathematical derivations. Many include complete implementations from scratch in R.

  • Data Integration using random forest:
    Learn how to use random forests to integrate multi-omics datasets, extract proximity measures, and perform downstream analyses such as clustering and dimensionality reduction.

  • Data Integration using mixOmics:
    Explore supervised and unsupervised data integration techniques using the mixOmics package, including PLS, sPLS, and DIABLO, with full implementation guidance.

  • Data Integration using MOFA:
    Understand the theory behind Multi-Omics Factor Analysis (MOFA) and how to use it to uncover latent factors that are shared across data types or specific to one of them, including an implementation in R from scratch.

  • Data Integration using mixKernel:
    A hands-on and theoretical introduction to kernel-based integration approaches, including how to build customized kernels and combine them with PCA.

  • PCA basics:
    Covers Principal Component Analysis from first principles, including variance maximization, the connection to SVD, visualization, and a full R implementation.

  • Random forest basics:
    A conceptual and mathematical breakdown of random forests: how trees are built, how splits are selected, and how to interpret feature importance.

  • Mixed models:
    A detailed explanation of linear mixed models, including GLS derivations from scratch, variance components, the intraclass correlation coefficient (ICC), and REML vs. ML estimation, plus how to fit models with lme4 and interpret random effects.

  • t-SNE and UMAP:
    Dimensionality reduction methods explained through detailed mathematical steps, from pairwise distances to probabilities, with R implementations of both t-SNE and UMAP from scratch.

  • Introduction to Self-Organizing Maps:
    Understand the theory behind SOMs and how to implement and visualize them in R. Includes clustering and interpretation strategies for omics data.

  • Independent Component Analysis:
    A complete mathematical derivation of ICA, including contrast functions, whitening, and optimization, plus an implementation of FastICA in R.