Syllabus
- Class Times:
- Tuesdays and Thursdays, 15:30–16:50 (3:30–4:50 pm)
- First class:
- September 1, 2022
- Location:
- CAB 243
- Instructor:
- Martha White (whitem at ualberta.ca)
- Lab:
- No Lab.
- eClass
- https://eclass.srv.ualberta.ca/course/view.php?id=71307
- Syllabus
- A PDF including the same information as below.
- Textbook
- The Intermediate Machine Learning Notes.
TAs
(in alphabetical order)
- Samuel Neumann
- Haseeb Shah
- Hugo Luis Silva
Office hours:
Listed on eClass.
Course Objective
Machine Learning is all about analyzing high-dimensional data. The goal of this second course in machine learning is to expand on the foundations from the first course. We will revisit several concepts, including how models can be estimated from data, sound estimation principles, generalization, and evaluating models, but with the additional nuances that come from handling high-dimensional inputs. Topics include: optimization approaches (constrained optimization, Hessians, matrix solutions), kernel machines, neural networks, dimensionality reduction, latent variables, feature selection, more advanced methods for assessing generalization (cross-validation, bootstrapping), and an introduction to non-iid data and missing data.
This course relies on the concepts in CMPUT 267 - Basics of Machine Learning.
Overview
- Multivariate probability, covering both discrete and continuous cases
- Analyzing high dimensional data
- Introduction to nonlinear models and representations of data
- Learning generative models
- Basics of ML theory, for generalization and convergence rates
- Introduction to missing data
Learning Outcomes
By the end of the course, you should understand…
- The design process for solving a real data analysis problem:
- identifying key data issues (high-dimensionality, dependence, missing data)
- identifying an appropriate model and optimization problem
- identifying an appropriate algorithm and understanding its assumptions
- Representative instances of different learning algorithms
- maximum likelihood for a broader range of problem settings
- Data re-representation approaches, including dimensionality reduction approaches and neural networks
- Generalization, and how it relates to the complexity of the hypothesis class
- More advanced optimization concepts, including proximal methods
- Evaluation of learned models
- including resampling approaches to use data efficiently
- the role of cross-validation to select hyperparameters
- evaluating generative models, in addition to predictive models
By the end of the course, you will have improved your skills in…
- Implementing more advanced estimation approaches (e.g., optimization algorithms for neural networks), and being comfortable doing so in a new language (Julia)
- Applying mathematical concepts to solve real data problems
- Problem solving, by facing open-ended data analysis problems and needing to both formulate the problem and identify appropriate algorithms to solve it
Topics
- Multivariate probability basics
- discrete multivariate distributions
- continuous multivariate distributions
- covariance matrices and multivariate Gaussians
- entropy and KL-divergences
- Matrices
- eigenvalues and singular value decomposition
- orthogonality
- matrix norms
- Optimizing an n-D function
- gradients and Hessians
- convexity, positive definiteness
- quasi-second-order algorithms (and their relation to stepsize selection)
- constrained optimization and proximal methods
- Generalized linear models
- Mixture models
- expectation-maximization
- Assessing generalization and evaluating models
- cross-validation and bootstrapping
- assessing generative models
- Regularization and overfitting
- l1 regularization for feature selection (sparsity)
- early stopping and validation sets
- Fixed basis representations
- re-representing data using similarities to prototypes
- high-dimensional phenomena and the curse of dimensionality
- kernel similarities (RBFs, matching kernel)
- Learned data representations
- dimensionality reduction and principal components analysis
- neural networks
- Nonlinear prediction and generative models using data representations
- nonlinear predictors using GLM losses
- nonlinearity for generative models (variational auto-encoder)
- ML theory basics
- generalization theory basics
- convergence rates for SGD
- Handling missing data
- matrix completion and imputation
- direct methods
Knowledge Prerequisites
This course follows CMPUT 267 and relies on an understanding of the basic concepts in ML taught in that course. We will review many of these concepts, but now in more advanced settings (e.g., maximum likelihood for mixture models). The course relies on more knowledge of calculus and linear algebra than was needed for CMPUT 267. The numerical methods course (CMPUT 340) is complementary and useful for CMPUT 367, and so is a recommended co-requisite. An excitement to understand the mathematics underlying machine learning is a must.
Pre-requisites
- One of MATH 115, 118, 145 or 155 (Calculus II)
- MATH 125 or 127 (Linear algebra)
- CMPUT 204 (Algorithms)
- CMPUT 267 (Basics of ML)
Office hours
There are no labs, but TAs will host office hours to answer questions.
Readings / Notes
You are expected to read the corresponding sections of the notes on a class's topic before class; each class will discuss the topic in more detail and address questions about the material.
All readings are from the (in-progress) machine learning notes. These are designed to be short, so that you can read every chapter. I recommend not printing the notes, since later parts are likely to be modified (even if only a little bit).
Grading
- Quiz: 5%
- Midterm: 20%
- Final exam: 35%
- Assignments (4): 30%
- Thought Questions: 10%
Marks will be converted to Letter Grades at the end of the course, based on relative performance. There are no set boundaries, because each year we modify exams and there is some variability in performance. Set boundaries would penalize students in a year where we inadvertently made a question too difficult. A good indicator for final performance is performance on the exams, which are a large percentage of the grade. If you fail both exams (less than 50% on both), then you will likely get an F in the course.
Late Policy
Late work will not be accepted and will receive a mark of 0.
Academic Honesty
All assignments and exams are individual, except when collaboration is explicitly allowed. All sources used in solving problems must be acknowledged, e.g., websites, books, research papers, and personal communication. Academic honesty is taken seriously; for detailed information, see https://www.deanofstudents.ualberta.ca/en/AcademicIntegrity.aspx.