Syllabus
- Class Times:
- Tuesdays and Thursdays, 15:30–16:50 (3:30–4:50 pm)
- First class:
- September 1, 2022
- Location:
- CAB 243
- Instructor:
- Martha White (whitem at ualberta.ca)
- Lab:
- No Lab.
- eClass
- https://eclass.srv.ualberta.ca/course/view.php?id=71307
- Syllabus
- A PDF including the same information as below.
- Textbook
- The Intermediate Machine Learning Notes.
TAs
(in alphabetical order)
- Samuel Neumann
- Haseeb Shah
- Hugo Luis Silva
Office hours:
Listed on eClass.
Course Objective
Machine Learning is all about analyzing high-dimensional data. The goal of this second course in machine learning is to expand on the foundations from the first course. We will revisit several concepts, including how models can be estimated from data, sound estimation principles, generalization, and evaluating models, but with the additional nuances that come from handling high-dimensional inputs. Topics include: optimization approaches (constrained optimization, Hessians, matrix solutions), kernel machines, neural networks, dimensionality reduction, latent variables, feature selection, more advanced methods for assessing generalization (cross-validation, bootstrapping), and an introduction to non-iid data and missing data.
This course relies on the concepts in CMPUT 267 - Basics of Machine Learning.
Overview
- Multivariate probability, covering both discrete and continuous cases
- Analyzing high dimensional data
- Introduction to nonlinear models and representations of data
- Learning generative models
- Basics of ML theory, for generalization and convergence rates
- Introduction to missing data
Learning Outcomes
By the end of the course, you should understand…
- The design process for solving a real data analysis problem:
- identifying key data issues (high-dimensionality, dependence, missing data)
- identifying an appropriate model and optimization problem
- identifying an appropriate algorithm and understanding its assumptions
- Representative instances of different learning algorithms
- maximum likelihood for a broader range of problem settings
- Data re-representation approaches, including dimensionality reduction approaches and neural networks
- Generalization, and how it relates to the complexity of the hypothesis class
- More advanced optimization concepts, including proximal methods
- Evaluation of learned models
- including resampling approaches to use data efficiently
- the role of cross-validation to select hyperparameters
- evaluating generative models, in addition to predictive models
By the end of the course, you will have improved your skills in…
- Implementing more advanced estimation approaches (e.g., optimization algorithms for neural networks), and being comfortable doing so in a new language (Julia)
- Applying mathematical concepts to solve real data problems
- Problem solving, by facing open-ended data analysis problems and needing to both formulate the problem and identify appropriate algorithms to solve it
Topics
- Multivariate probability basics
- discrete multivariate distributions
- continuous multivariate distributions
- covariance matrices and multivariate Gaussians
- entropy and KL-divergences
- Matrices
- eigenvalues and singular value decomposition
- orthogonality
- matrix norms
- Optimizing an n-D function
- gradients and Hessians
- convexity, positive definiteness
- quasi-second-order algorithms (and their relation to stepsize selection)
- constrained optimization and proximal methods
- Generalized linear models
- Mixture models
- expectation-maximization
- Assessing generalization and evaluating models
- cross-validation and bootstrapping
- assessing generative models
- Regularization and overfitting
- l1 regularization for feature selection (sparsity)
- early stopping and validation sets
- Fixed basis representations
- re-representing data using similarities to prototypes
- high-dimensional phenomena and the curse of dimensionality
- kernel similarities (RBFs, matching kernel)
- Learned data representations
- dimensionality reduction and principal components analysis
- neural networks
- Nonlinear prediction and generative models using data representations
- nonlinear predictors using GLM losses
- nonlinearity for generative models (variational auto-encoder)
- ML theory basics
- generalization theory basics
- convergence rates for SGD
- Handling missing data
- matrix completion and imputation
- direct methods
Knowledge Prerequisites
This course follows CMPUT 267 and relies on an understanding of the basic concepts in ML taught in that course. We will review many of these concepts, but now in more advanced settings (e.g., maximum likelihood for mixture models). The course relies on more knowledge of calculus and linear algebra than was needed for CMPUT 267. The numerical methods course (CMPUT 340) is complementary and useful for CMPUT 367, and so is a recommended co-requisite. An excitement to understand the mathematics underlying machine learning is a must.
Pre-requisites
- One of MATH 115, 118, 145 or 155 (Calculus II)
- MATH 125 or 127 (Linear algebra)
- CMPUT 204 (Algorithms)
- CMPUT 267 (Basics of ML)
Office hours
There are no labs, but TAs will host office hours to answer questions.
Readings / Notes
You are expected to read the corresponding sections of the notes on a class's topic before class; each class will discuss the topic in more detail and address questions about the material.
All readings are from the (in-progress) machine learning notes. These are designed to be short, so that you can read every chapter. I recommend not printing the notes, since later parts are likely to be modified (even if only a little bit).
Grading
- Quiz: 5%
- Midterm: 20%
- Final exam: 35%
- Assignments (4): 30%
- Thought Questions: 10%
Marks will be converted to Letter Grades at the end of the course, based on relative performance. There are no set boundaries, because each year we modify exams and there is some variability in performance. Set boundaries would penalize students in a year where we inadvertently made a question too difficult. A good indicator for final performance is performance on the exams, which are a large percentage of the grade. If you fail both exams (less than 50% on both), then you will likely get an F in the course.
Late Policy
Late work will not be accepted and will receive a mark of 0.
Academic Honesty
All assignments and exams are individual, except when collaboration is explicitly allowed. All sources used in solving problems must be acknowledged, e.g., websites, books, research papers, and personal communication. Academic honesty is taken seriously; for detailed information, see https://www.deanofstudents.ualberta.ca/en/AcademicIntegrity.aspx.