Syllabus
Getting Started
Please read through this document. It is not long, and knowing this information about the course will save us all time.
- Class Times: Tuesdays and Thursdays, 3:30–4:50 pm
- First class: January 7, 2025
- Location: CCIS 1-140
- Instructor: Martha White (whitem at ualberta.ca)
- Lab: No lab. Instead, TAs hold office hours for extra help.
- eClass: The link to the eClass course
- Textbook: The Machine Learning Notes for this course.
TAs
(in alphabetical order)
- Alex Ayoub
- Sarosh Dandoti
- Marcos Jose
- Matthew Vandergrift
Contact Information: cmput467@ualberta.ca
Office hours:
Listed on eClass.
Course Objective
Machine Learning is all about analyzing high-dimensional data. The goal of this second course in machine learning is to expand on the foundations from the first course. We will revisit several concepts, including how models can be estimated from data, sound estimation principles, generalization, and evaluating models, but with the additional nuances that come from handling high-dimensional inputs. Topics include: optimization approaches (constrained optimization, Hessians, matrix solutions), kernel machines, neural networks, dimensionality reduction, latent variables, feature selection, more advanced methods for assessing generalization (cross-validation, bootstrapping), and an introduction to temporal data and missing data.
This course relies on the concepts in CMPUT 267 - Machine Learning I.
Overview
- Multivariate probability, covering both discrete and continuous cases
- Analyzing high-dimensional data
- Introduction to nonlinear models and representations of data
- Learning generative models
- Basics of ML theory, for generalization and convergence rates
- Introduction to missing data
- Estimating uncertainty
Learning Outcomes
By the end of the course, you should understand…
- The design process for solving a real data analysis problem:
- identifying key data issues (high-dimensionality, dependence, missing data)
- identifying an appropriate model and optimization problem
- identifying an appropriate algorithm and understanding its assumptions
- The utility of data re-representation approaches
- including dimensionality reduction approaches and neural networks
- commonalities behind how complexity is introduced to a variety of different problems
- Generalization, and how it relates to the complexity of the hypothesis class
- More advanced optimization concepts, including proximal methods
- Evaluation of learned models
- including resampling approaches to use data efficiently
- the role of cross-validation to select hyperparameters
- evaluating generative models, in addition to predictive models
By the end of the course, you will have improved your skills in…
- Implementing more advanced estimation approaches (e.g., optimization algorithms for neural networks)
- Applying mathematical concepts to solve real data problems
- Problem solving, by facing open-ended data analysis problems and needing to both formulate the problem and identify appropriate algorithms to solve it
- Understanding the (mathematical) derivations for advanced machine learning algorithms, allowing you to understand new machine learning concepts more quickly
Topics
- Multivariate Probability basics
- discrete multivariate distributions
- continuous multivariate distributions
- covariance matrices and multivariate Gaussians
- entropy and KL-divergences
- Matrices
- eigenvalues and singular value decompositions
- orthogonality
- matrix norms
- Optimizing an n-D function
- gradients and Hessians
- convexity, positive definiteness
- quasi-second-order algorithms (and relation to stepsize selection)
- constrained optimization and proximal methods
- Generalized linear models
- Mixture models
- utility for modeling, as a basic generative model
- expectation-maximization
- Assessing generalization and evaluating models
- cross-validation and bootstrapping
- assessing generative models
- Regularization and overfitting
- l1 for feature selection
- early stopping and validation sets
- Fixed basis representations
- re-representing data using similarities to prototypes
- high-dimensional phenomena and the curse of dimensionality
- kernel similarities (RBFs, matching kernel)
- Learned data representations
- dimensionality reduction and principal components analysis
- neural networks
- More powerful generative models using data representations
- variational auto-encoders
- learning algorithm (ELBO and reparameterization)
- ML theory basics
- generalization theory basics
- convergence rates for SGD
- Handling missing data
- matrix completion and imputation
- direct methods
- Temporal data and partial observability
- recurrent neural networks
- transformers
Knowledge Prerequisites
This course follows CMPUT 267 and relies on an understanding of the basic concepts in ML taught in that course. We will review many of these concepts, but now in more advanced settings (e.g., maximum likelihood for mixture models and variational autoencoders). The course relies on more knowledge of calculus and linear algebra than was needed for CMPUT 267. An eagerness to understand the mathematics underlying machine learning is a must.
Pre-requisites
- One of MATH 115, 118, 145 or 155 (Calculus II)
- MATH 125 or 127 (Linear algebra)
- CMPUT 204 (Algorithms)
- CMPUT 267 (ML I)
Office hours
There are no labs; instead, TAs will host office hours to answer questions.
Readings / Notes
You are expected to read the corresponding sections of the notes before each class, as class time will discuss the topic in more detail and address questions about the material. You will have marked Reading Exercises associated with the readings.
All readings are from the (in progress) machine learning notes. These are designed to be short, so that you can read every chapter. I recommend not printing the notes, since later parts are likely to be modified. You will be notified if I make substantive changes. Minor typos will be fixed without announcement.
Grading
- Four Quizzes (on Assignments): 24%
- Midterm: 25%
- Final exam: 35%
- Reading exercises (5): 12%
- Assignment Submission (4): 4%
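For concreteness, the final mark is a weighted sum of these components. The short sketch below only illustrates that arithmetic with made-up component scores (out of 100); it is not part of the official grading procedure.

```python
# Illustration of the weighted final mark; the component scores are hypothetical.
weights = {"quizzes": 0.24, "midterm": 0.25, "final": 0.35,
           "reading": 0.12, "submission": 0.04}
scores = {"quizzes": 80, "midterm": 70, "final": 75,
          "reading": 90, "submission": 100}

# Weighted sum over the five components (weights add to 1.0).
final_mark = sum(weights[k] * scores[k] for k in weights)
print(final_mark)  # 19.2 + 17.5 + 26.25 + 10.8 + 4.0 = 77.75
```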
Marks will be converted to Letter Grades at the end of the course, based on relative performance. There are no set boundaries, because each year we modify assessments and exams and there is some variability in performance. Set boundaries would penalize students in a year where we inadvertently made a question too difficult. A good indicator for final performance is performance on the exams, which are a large percentage of the grade. If you fail the midterm and final exams (less than 50% on both), then you will likely get an F in the course.
Late Policy
Late work will not be accepted and will receive 0 marks. If you have a personal issue (e.g., serious illness) that impacts your ability to submit on time, please email the instructor as soon as possible, before the deadline, so that an appropriate plan can be made.
Academic Honesty
Academic honesty is taken seriously; for detailed information see https://www.deanofstudents.ualberta.ca/en/AcademicIntegrity.aspx. There is absolutely no talking or interaction during exams. This includes no passing of any items. If you do so, then it will be assumed you are cheating and your exam will be taken away.
For assignments, you are allowed and encouraged to collaborate with others. However, you cannot directly copy another person's work. Plagiarism is strictly prohibited in all parts of the course.