Syllabus
Getting Started
Please read through this document. It is not long, and knowing this information about the course will save us all time.
- Class Times: Tuesdays and Thursdays, 3:30–4:50 pm
- First class: January 7, 2025
- Location: CCIS 1-140
- Instructor: Martha White (whitem at ualberta.ca)
- Lab: No lab. Instead, TAs hold office hours for extra help.
- eClass: The link to the eClass course
- Textbook: The Machine Learning Notes for this course.
TAs
(in alphabetical order)
- Alex Ayoub
- Sarosh Dandoti
- Marcos Jose
- Matthew Vandergrift
Contact Information: cmput467@ualberta.ca
Office hours:
Listed on eClass.
Course Objective
Machine Learning is all about analyzing high-dimensional data. The goal of this second course in machine learning is to expand on the foundations from the first course. We will revisit several concepts, including how models can be estimated from data, sound estimation principles, generalization, and evaluating models, but with the additional nuances that come from handling high-dimensional inputs. Topics include: optimization approaches (constrained optimization, Hessians, matrix solutions), kernel machines, neural networks, dimensionality reduction, latent variables, feature selection, more advanced methods for assessing generalization (cross-validation, bootstrapping), and an introduction to temporal data and missing data.
This course relies on the concepts in CMPUT 267 - Machine Learning I.
Overview
- Multivariate probability, covering both discrete and continuous cases
- Analyzing high-dimensional data
- Introduction to nonlinear models and representations of data
- Learning generative models
- Basics of ML theory, for generalization and convergence rates
- Introduction to missing data
- Estimating uncertainty
Learning Outcomes
By the end of the course, you should understand…
- The design process for solving a real data analysis problem:
- identifying key data issues (high-dimensionality, dependence, missing data)
- identifying an appropriate model and optimization problem
- identifying an appropriate algorithm and understanding its assumptions
- The utility of data re-representation approaches
- including dimensionality reduction approaches and neural networks
- commonalities behind how complexity is introduced to a variety of different problems
- Generalization, and how it relates to the complexity of the hypothesis class
- More advanced optimization concepts, including proximal methods
- Evaluation of learned models
- including resampling approaches to use data efficiently
- the role of cross-validation to select hyperparameters
- evaluating generative models, in addition to predictive models
By the end of the course, you will have improved your skills in…
- Implementing more advanced estimation approaches (e.g., optimization algorithms for neural networks)
- Applying mathematical concepts to solve real data problems
- Problem solving, by facing open-ended data analysis problems and needing to both formulate the problem and identify appropriate algorithms to solve it
- Understanding the (mathematical) derivations for advanced machine learning algorithms, allowing you to understand new machine learning concepts more quickly
Topics
- Multivariate Probability basics
- discrete multivariate distributions
- continuous multivariate distributions
- covariance matrices and multivariate Gaussians
- entropy and KL-divergences
- Matrices
- eigenvalues and singular value decompositions
- orthogonality
- matrix norms
- Optimizing an n-D function
- gradients and Hessians
- convexity, positive definiteness
- quasi-second-order algorithms (and relation to stepsize selection)
- constrained optimization and proximal methods
- Generalized linear models
- Mixture models
- utility for modeling, as a basic generative model
- expectation-maximization
- Assessing generalization and evaluating models
- cross-validation and bootstrapping
- assessing generative models
- Regularization and overfitting
- l1 for feature selection
- early stopping and validation sets
- Fixed basis representations
- re-representing data using similarities to prototypes
- high-dimensional phenomena and the curse of dimensionality
- kernel similarities (RBFs, matching kernel)
- Learned data representations
- dimensionality reduction and principal components analysis
- neural networks
- More powerful generative models using data representations
- variational auto-encoders
- learning algorithm (ELBO and reparameterization)
- ML theory basics
- generalization theory basics
- convergence rates for SGD
- Handling missing data
- matrix completion and imputation
- direct methods
- Temporal data and partial observability
- recurrent neural networks
- transformers
Knowledge Prerequisites
This course follows CMPUT 267 and relies on an understanding of the basic concepts in ML taught in that course. We will review many of these concepts, but now in more advanced settings (e.g., maximum likelihood for mixture models and variational autoencoders). The course relies on more knowledge of calculus and linear algebra than was needed for CMPUT 267. An eagerness to understand the mathematics underlying machine learning is a must.
Pre-requisites
- One of MATH 115, 118, 145 or 155 (Calculus II)
- MATH 125 or 127 (Linear algebra)
- CMPUT 204 (Algorithms)
- CMPUT 267 (ML I)
Office hours
There are no labs; instead, TAs will host office hours to answer questions.
Readings / Notes
You are expected to read the corresponding sections of the notes before each class, as class time will discuss the topic in more detail and address questions about the material. You will have marked Reading Exercises associated with the readings.
All readings are from the (in progress) machine learning notes. These are designed to be short, so that you can read every chapter. I recommend not printing the notes, since later parts are likely to be modified. You will be notified if I make substantive changes. Minor typos will be fixed without announcement.
Grading
- Four Quizzes (on Assignments): 24%
- Midterm: 25%
- Final exam: 35%
- Reading exercises (5): 12%
- Assignment Submission (4): 4%
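For concreteness, the final mark is a weighted sum of these components. The short sketch below only illustrates that arithmetic with made-up component scores (out of 100); it is not part of the official grading procedure.

```python
# Illustration of the weighted final mark; the component scores are hypothetical.
weights = {"quizzes": 0.24, "midterm": 0.25, "final": 0.35,
           "reading": 0.12, "submission": 0.04}
scores = {"quizzes": 80, "midterm": 70, "final": 75,
          "reading": 90, "submission": 100}

# Weighted sum over the five components (weights add to 1.0).
final_mark = sum(weights[k] * scores[k] for k in weights)
print(final_mark)  # 19.2 + 17.5 + 26.25 + 10.8 + 4.0 = 77.75
```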
Marks will be converted to Letter Grades at the end of the course, based on relative performance. There are no set boundaries, because each year we modify assessments and exams and there is some variability in performance. Set boundaries would penalize students in a year where we inadvertently made a question too difficult. A good indicator for final performance is performance on the exams, which are a large percentage of the grade. If you fail the midterm and final exams (less than 50% on both), then you will likely get an F in the course.
Late Policy
Late work will not be accepted and will receive 0 marks. If you have a personal issue (e.g., serious illness) that impacts your ability to submit on time, please email the instructor as soon as possible, before the deadline, so that an appropriate plan can be made.
Academic Honesty
Academic honesty is taken seriously; for detailed information see https://www.deanofstudents.ualberta.ca/en/AcademicIntegrity.aspx. There is absolutely no talking or interaction during exams. This includes no passing of any items. If you do so, then it will be assumed you are cheating and your exam will be taken away.
For assignments, you are allowed and encouraged to collaborate with others. However, you cannot directly copy another person's work. Plagiarism is strictly prohibited in all parts of the course.