Kameron Decker Harris

Schedule

Legend: * = asynchronous

Date | Topic | Assignments | Other
9/23 Wed | Introduction and logistics | A1 released | lesson 1
9/25 Fri | Linear regression 1 | A1 due before class | lesson 2; least squares solution Gdrive
9/28 Mon | Linear regression 2 | watch for tomorrow: polynomial fits on Gdrive | lesson 3; optional extras: SVD overview, SVD & least squares
9/29 Tue | Polynomials & overfitting | | lesson 4; notebook
9/30 Wed | SVD | A2 released, code, tex | lesson 5
10/2 Fri | PCA | | lesson 6
10/5 Mon | Homework help, PCA 2 | | lesson 7; PCA notebook
10/6 Tue | Bias-variance and overfitting/underfitting | | lesson 8; Linear algebra extra 12-1 pm
10/7 Wed | Ridge regression 1 | | lesson 9; notebook
10/9 Fri | Ridge regression 2 | | lesson 10
10/12 Mon | Ridge 3, Probability and priors | A2 due before midnight (pdf on Canvas, code on github) | lesson 11
10/13 Tue* | Lasso (asynchronous) | A3 released, tex, code | Watch Gdrive video on Lasso, sparsity notebook lasso 1
10/14 Wed | Lasso 2 | | lesson 13
10/16 Fri | A2 feedback | Note: typo in A3 fixed | solutions on Piazza
10/19 Mon | Classification intro | | lesson 14
10/20 Tue | Logistic regression 1 | | lesson 15
10/21 Wed | Logistic 2: losses, gradient descent | | lesson 16
10/23 Fri | GD details, SGD | A3 due | lesson 17
10/26 Mon | Mini-batching and convexity | A4 tex code (note: due Monday 11/2) | lesson 18
10/27 Tue | Getting started with nonlinear models: nearest-neighbors and trees | |
10/28 Wed | Features and kernels | |
10/30 Fri | Project discussion | |
11/2 Mon | | A4 due |

Resources

There are no required textbooks for the course, but several that I recommend are:

  1. Hastie, Tibshirani and Friedman "The Elements of Statistical Learning"
  2. James, Witten, Hastie and Tibshirani "An Introduction to Statistical Learning"
  3. Goodfellow, Bengio and Courville "Deep Learning"
  4. Boyd and Vandenberghe "Convex Optimization"
More resources that are great:

Syllabus

Course information

Time: M,T,W,F 2–2:50 pm (sync & async)

Place: Remote

Communication: This website, Piazza, occasional email blasts

Lessons: Zoom (link on Canvas)

Office hours: W 10-11 am, F 10-11 am, Zoom (link on Canvas)

Prerequisites:

Contact

Name: Kameron Decker Harris

Office: CF 475

Phone: 360-650-7366

Email: kameron.harris@wwu.edu

Learning in an unusual year

This academic quarter is not normal. We are all adjusting to the new online environment and many of us may be dealing with extra stress and uncertainty. My goal is to facilitate your learning as best I can.

The class is listed as synchronous (Zoom); however, a minority of lessons will be provided asynchronously (video posted to Google Drive). The class schedule (this website) will keep you up to date at least a week in advance about which lessons require synchronous attendance. I will also be offering multiple office hours, but if these do not work for you and you would like to meet, send me an email with three times that will work for you.

Course description and outcomes

Covers important machine learning research areas such as neural nets, kernel methods, graphical models, Bayesian learning, decision tree learning, evolutionary computation and computational learning theory. Models and algorithms from these research areas will be analyzed.

On completion of CSCI 471, students will demonstrate:

On completion of CSCI 571, students will demonstrate:

Grading

All students will be graded on homework assignments, participation and quizzes, and a final project. There will be no exams. Students in 571 must also give a slideshow presentation to the class on an advanced machine learning topic. Some homework assignments will include extra, more difficult problems that will be graded only for graduate students.

I may curve grades at the end of the course if it provides a spread of grades that more accurately represents the quality of work in the class. I will only do this if it improves your grade.

Participation and quizzes

Part of your grade will reflect whether or not you attend the synchronous sessions, since these will be interactive and the class will benefit from being able to ask questions and work in breakout groups. Short quizzes may be used to check your understanding of background and recently covered material.

Deadlines

Please refer to this website for the most up-to-date deadlines. You have 2 late days that you can use for up to two assignments (i.e., you may submit 1 assignment 2 days late or 2 assignments 1 day late each). If you miss a deadline or expect to miss one due to a medical or family emergency, please contact me as soon as possible to discuss arrangements.

Submitting your work

Program and project code will be developed and stored in git version control repositories, which can be accessed in Linux using the command line client. Specifically, we will use private GitHub repositories under the "kamdh-teaching" organization. For each assignment, you will receive an invitation link to create a repository. Complete the assignment in a local copy of the repo, and submit by pushing your final changes to GitHub. You will receive feedback in the same repository in a branch called "grading."

Written homework must be submitted in a single PDF, with any math preferably typeset in LaTeX. You may write an assignment out by hand and scan it, but it must be legible and neat. Using your phone as a “scanner” is not ideal but okay so long as the result is as legible as a good photocopy, cropped, and submitted as a single PDF.

For group homework assignments, each student must write up and submit their own answers to the assignment. List everyone you worked with.

Homework guidelines

Teamwork is allowed on the assignments unless it is explicitly an individual assignment or problem. Working in groups is one of the best ways to learn from each other. An ideal group size is 3 people, since it's hard for everyone to contribute in large groups.

Each student must write up and submit their own answers to the assignment, with all group members listed. That means you may discuss the steps in an algorithm or mathematical argument, but write it out or type the code yourself. Doing this helps fix the ideas in your memory. Don't copy-paste from colleagues or the internet. Do not post your code to the internet.

Math problems are graded for correctness and quality of explanation ("X follows from Y, because Z"). If you make a math mistake early on that leads to an incorrect final result, you will still receive partial credit if the rest of the logic is sound.

Writing assignments are graded as they would be in a writing class, since communication is an important skill even for engineers and other technical people.

Coding language: We will be working exclusively in Python 3, one of the languages most commonly used by machine learning practitioners. For the assignments, you will be required to write your own algorithms using basic linear algebra routines in NumPy and SciPy, without any machine learning-specific libraries such as scikit-learn. Each assignment will specify which packages are allowed. In any case, your programs must run on the department computers running Linux. If you install Python on your home machine, I recommend the Anaconda distribution.
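To give a rough sense of what "basic linear algebra routines only" can look like, here is a minimal sketch of a least squares fit written with NumPy alone. The function names (fit_least_squares, add_intercept) and the toy data are hypothetical and not taken from any assignment. Treat it as an illustration under those assumptions, not as a template for graded work.

```python
import numpy as np


def add_intercept(x):
    """Stack a column of ones next to a 1-D feature vector (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([np.ones_like(x), x])


def fit_least_squares(X, y):
    """Return weights w minimizing ||X w - y||^2 using only NumPy (hypothetical helper)."""
    # np.linalg.lstsq solves the problem with an SVD-based routine,
    # so it also copes with rank-deficient design matrices.
    w, _residuals, _rank, _singular_values = np.linalg.lstsq(X, y, rcond=None)
    return w


# Toy, noiseless data generated from the line y = 1 + 2x (made up for illustration).
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x

w = fit_least_squares(add_intercept(x), y)
print("intercept, slope:", w)
```

Because the toy data is noiseless, the recovered intercept and slope should match the generating line up to floating-point error. No scikit-learn import is needed anywhere.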

Programs will be graded on correctness, clarity, and efficiency (in that order).

Technical assistance

If you are having problems with any of the machines in a Computer Science Department lab, contact CS Support at cs.support@wwu.edu.

In-class Behavior, Norms, and University Policies

In this course, you should act professionally, respectfully, and maturely. This includes both our time together and when interacting with classmates outside of class. Some of our discussions may deal with sensitive topics (e.g., politics and race). Please consider how your words may sound to others. It is okay to disagree, but I expect you to treat each other with civility.

Please review the University policies outlined at http://syllabi.wwu.edu regarding: