Mathematics of Big Data

Readings should be done before class. All resources (including lecture slides, homework, starter files, homework solutions, and articles) can be found under the Resources tab.

Topics, readings, and timing are flexible and subject to change.

Week Topics Homework
Supervised Learning
Week 1
Introduction to Big Data
Linear Regression
Normal Equations and Optimization Techniques
Linear Algebra Review
Covariance Matrix
Read:
Murphy 1 (all sections)
Murphy 7.{1,...,5}
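
Example (Week 1, optional): a minimal NumPy sketch of linear regression fit by solving the normal equations on synthetic data. The data, seed, and variable names below are illustrative only, not course starter code.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))                   # synthetic design matrix
    w_true = np.array([1.5, -2.0, 0.5])
    y = X @ w_true + 0.1 * rng.normal(size=100)     # noisy targets

    Xb = np.hstack([X, np.ones((100, 1))])          # append an intercept column
    w_hat = np.linalg.solve(Xb.T @ Xb, Xb.T @ y)    # normal equations, no explicit inverse
    print(w_hat)                                    # approximately [1.5, -2.0, 0.5, 0.0]
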
Week 2
Gaussian Distribution
Linear Regression (Probabilistic Approach)
Gradient Descent
Newton's Method
Logistic Regression
Exponential Family
Generalized Linear Models
Read:
Murphy 8.{1,2,3,5} \ 8.{3.4,3.5}, 9.{1,2.2,2.4,3}

Due:
Homework 1
Brainstorm for midterm project
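
Example (Week 2, optional): a short sketch of batch gradient descent applied to logistic regression on synthetic, separable data. Step size and iteration count are arbitrary illustrative choices.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 2))
    y = (X[:, 0] - X[:, 1] > 0).astype(float)       # synthetic binary labels

    w = np.zeros(2)
    step = 0.1
    for _ in range(500):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)  # gradient of the average negative log-likelihood
        w -= step * grad
    print(w)                                        # points roughly along [1, -1]
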
Week 3
Probability Review
Generalized Linear Models continued
Poisson Regression
Softmax Regression
Covariance Matrix
Multivariate Gaussian Distribution
Marginalized Gaussian and the Schur Complement
Read:
Murphy 9.7, 4.{1,2,3,4,5,6} (important background)

Due:
Homework 2
Project Proposal (<1 page)
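
Example (Week 3, optional): a small sketch of the multivariate Gaussian log-density, evaluated directly from the covariance matrix. Illustrative only; the mu and Sigma values below are made up.

    import numpy as np

    def mvn_logpdf(x, mu, Sigma):
        """Log-density of N(mu, Sigma) at x."""
        d = len(mu)
        diff = x - mu
        _, logdet = np.linalg.slogdet(Sigma)        # log |Sigma|
        quad = diff @ np.linalg.solve(Sigma, diff)  # (x - mu)^T Sigma^{-1} (x - mu)
        return -0.5 * (d * np.log(2.0 * np.pi) + logdet + quad)

    mu = np.zeros(2)
    Sigma = np.array([[2.0, 0.5],
                      [0.5, 1.0]])
    print(mvn_logpdf(np.array([0.3, -0.2]), mu, Sigma))
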
Week 4
Dimensionality Reduction
Spectral Decomposition
Singular Value Decomposition
Principal Component Analysis
Generative Learning Algorithms
Gaussian Discriminant Analysis
Cholesky Decomposition
Due:
Final Project Proposal
Homework 3
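
Example (Week 4, optional): a minimal sketch of PCA via the SVD of the centered data matrix, reading off principal directions and explained variance. Synthetic data; keeping two components is an arbitrary choice.

    import numpy as np

    rng = np.random.default_rng(2)
    X = rng.normal(size=(500, 5)) @ rng.normal(size=(5, 5))   # correlated synthetic features
    Xc = X - X.mean(axis=0)                                   # center the data

    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:2]                      # top-2 principal directions
    scores = Xc @ components.T               # data projected onto them
    explained = s[:2] ** 2 / (s ** 2).sum()  # fraction of variance explained
    print(explained)
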
Week 5
Naive Bayes
L1 Regularization and Sparsity
Lasso
Support Vector Machines
Kernels
Read:
Murphy 14.{1,2,3,4} \ 14.{4.4}
MapReduce: Simplified Data Processing on Large Clusters

Due:
Homework 4
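
Example (Week 5, optional): the soft-thresholding operator, which is how L1 regularization (lasso) drives small coefficients exactly to zero. Illustrative sketch; the threshold value is arbitrary.

    import numpy as np

    def soft_threshold(w, lam):
        """Proximal operator of lam * ||w||_1: shrink toward zero, clip at zero."""
        return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)

    print(soft_threshold(np.array([3.0, -0.2, 0.5]), 0.4))  # small entries become zero
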
Unsupervised Learning
Week 6
Introduction to Unsupervised Learning
Clustering
K-Means
Mixture of Gaussians
Jensen's inequality
Expectation-Maximization (EM) Algorithm
Read:
Murphy 11.{1,2,3,4} \ 11.{4.6,4.9}
Pegasos: Primal Estimated sub-GrAdient SOlver for SVM
Random Features for Large-Scale Kernel Machines

Due:
Homework 5
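
Example (Week 6, optional): a bare-bones K-means (Lloyd's algorithm) loop on synthetic 2-D clusters. The initialization, cluster count, and iteration count are arbitrary illustrative choices, not a reference implementation.

    import numpy as np

    rng = np.random.default_rng(3)
    X = np.vstack([rng.normal(loc, 0.3, size=(100, 2))
                   for loc in ([0.0, 0.0], [3.0, 3.0], [0.0, 3.0])])  # three synthetic blobs

    k = 3
    centers = X[rng.choice(len(X), size=k, replace=False)]            # random data points as init
    for _ in range(20):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)                                 # assignment step
        centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                            for j in range(k)])                       # update step
    print(centers)
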
Week 7
Summary of the EM Algorithm
EM for MAP estimation
Kernel PCA
One Class Support Vector Machines
Learning Theory
Read:
Murphy 12.2.{0,1,2,3}, 14.4.4
Support Vector Method for Novelty Detection

Due:
Homework 6
Midterm Project Work
Week 8 (spring break)
Work on your midterm projects.
Read:
None
Due:
None
Midterm Project Presentation; Project Due
Week 9
Be ready to present your midterm projects in class. Your submission should include all relevant code and the .tex files for your write-up.
Read:
None
Due:
Midterm presentation and slides; Midterm project write-up.
Learning Theory
Week 10
Bayesian Learning
Bayesian Logistic and Linear Regression (review)
Bayesian Inference
Intractable Integrals and Motivation for Approximate Methods
Learning Theory
Read:
Large-Scale Sparse Principal Component Analysis with Application to Text Data
On the Convergence Properties of the EM Algorithm

Due:
Homework 7
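
Example (Week 10, optional): Bayesian linear regression in the conjugate case (Gaussian prior on the weights, known noise variance), where the posterior is Gaussian in closed form. The prior and noise settings below are illustrative.

    import numpy as np

    rng = np.random.default_rng(5)
    X = rng.normal(size=(50, 3))
    w_true = np.array([1.0, -1.0, 2.0])
    sigma2, tau2 = 0.25, 10.0                       # noise variance, prior variance (made up)
    y = X @ w_true + np.sqrt(sigma2) * rng.normal(size=50)

    A = X.T @ X / sigma2 + np.eye(3) / tau2         # posterior precision
    post_cov = np.linalg.inv(A)
    post_mean = post_cov @ (X.T @ y) / sigma2       # posterior mean, close to w_true
    print(post_mean)
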
Recommender Systems
Week 11 (only if time permits)
Introduction to Recommender Systems
Collaborative Filtering
Non-Negative Matrix Factorization
Using Non-Negative Matrix Factorization for Topic Modelling
Read:
Murphy 27.6.2
Netflix Update: Try This at Home

Due:
Homework 8
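
Example (Week 11, optional): non-negative matrix factorization with the standard multiplicative updates (Lee and Seung) on a random nonnegative matrix. The rank and iteration count are arbitrary illustrative choices.

    import numpy as np

    rng = np.random.default_rng(6)
    V = rng.random((20, 10))                        # nonnegative data matrix (e.g., counts)
    r = 4                                           # latent rank (illustrative)
    W = rng.random((20, r)) + 1e-3
    H = rng.random((r, 10)) + 1e-3

    for _ in range(200):
        H *= (W.T @ V) / (W.T @ W @ H + 1e-9)       # multiplicative update for H
        W *= (V @ H.T) / (W @ H @ H.T + 1e-9)       # multiplicative update for W
    print(np.linalg.norm(V - W @ H))                # Frobenius reconstruction error
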
Graph Methods
Week 12 (only if time permits)
(TBD)
Read:
Murphy 10.{1,2,3,4,5,6}

Due:
Work on final project
Week 13
Early final project presentations
Read:
None

Due:
Work on Final Project
Week 14
Final project presentation
Due:
Final Project Presentation Slides
Work on Final Project
Finals week
Final Project Due
Due:
Finish writing up final project