This GitHub repository contains code for the Introduction to Supervised Learning and Introduction to Deep Learning courses. You can run the notebooks on Google Colab. All code is in Python and all neural networks are built in PyTorch.
You will need to log in to, or sign up for, a Google account to open and run the notebooks on Google Colab. (Note: I have included details of my conda environment in the requirements.txt file for running offline, but it is highly recommended to use Google Colab during the tutorial to avoid any potential issues with package installations.)
We will use the California house price dataset.
Use Python packages to predict house prices, starting with linear regression on one input feature, then adding more terms (polynomial regression), more variables (multivariate linear regression), and finally regularisation.
Code up your own linear regression, gradient descent, and stochastic gradient descent. Compare the results to the Python packages scikit-learn and PyTorch.
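As a rough illustration of what coding up gradient descent for linear regression can look like, here is a minimal sketch in plain Python. It is not the notebook's implementation: the function name `fit_linear_gd`, the learning rate, the number of epochs, and the toy data are all assumptions made for this example.

```python
def fit_linear_gd(xs, ys, lr=0.05, epochs=2000):
    """Fit y ~ w * x + b by batch gradient descent on the mean squared error.

    Hypothetical helper for illustration; lr and epochs are arbitrary choices.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Residuals of the current fit on every sample
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of MSE = (1/n) * sum(error^2) with respect to w and b
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 3x + 1 (made up, not the California dataset)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 4.0, 7.0, 10.0, 13.0]
w, b = fit_linear_gd(xs, ys)
print(f"w = {w:.3f}, b = {b:.3f}")
```

A stochastic variant would compute the gradient from a single randomly chosen sample (or a small batch) per update instead of the whole dataset.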
We will use the Titanic survival rate dataset.
Use Python packages to compare different classification methods. We will build a logistic regression model and a classification tree, and at the end we will compare their true negatives, true positives, false negatives and false positives.
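The four quantities used for the comparison can be counted directly from the true and predicted labels. The sketch below assumes binary labels with 1 meaning "survived" and 0 meaning "did not survive"; the helper name `confusion_counts` and the label lists are invented for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    return {
        "tp": sum(1 for t, p in pairs if t == 1 and p == 1),
        "tn": sum(1 for t, p in pairs if t == 0 and p == 0),
        "fp": sum(1 for t, p in pairs if t == 0 and p == 1),
        "fn": sum(1 for t, p in pairs if t == 1 and p == 0),
    }


# Made-up labels and predictions, not results from the Titanic notebook
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
counts = confusion_counts(y_true, y_pred)
print(counts)
```

Computing these counts for each classifier on the same test set puts the two models side by side on the same four numbers.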
Code up your own logistic regression and classification trees. For logistic regression, we will consider how to interpret the coefficients we have learned. For the classification trees, we will think about which variable to split on by considering which variable leads to the greatest reduction in entropy.
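To make the splitting criterion concrete, the following sketch computes the reduction in entropy (information gain) for a categorical variable. The function names and the tiny dataset are assumptions for this example; the real Titanic columns and values will differ.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on the feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    weighted = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - weighted


# Made-up toy data: one variable separates the classes perfectly, the other not at all
survived = [1, 1, 1, 0, 0, 0, 0, 1]
sex = ["f", "f", "f", "m", "m", "m", "m", "f"]
port = ["S", "C", "S", "C", "S", "C", "S", "C"]

gain_sex = information_gain(sex, survived)
gain_port = information_gain(port, survived)
print(gain_sex, gain_port)
```

Under this criterion, the tree would split on the variable with the larger gain, which is `sex` in this toy example.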
We will use the same problem as part 1, the California house price dataset, although we are using a larger dataset. We will build a simple neural network in PyTorch and you will learn how to code up the training loop, following the steps discussed in class. Look out for signs of overfitting and explore different neural network structures. Compare the results to the linear regression model we built on Tuesday.
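For orientation, a typical PyTorch training loop has the shape sketched below. This is not the notebook's code: the synthetic data, the layer sizes, the optimizer, the learning rate, and the train/validation split are all assumptions chosen to keep the example self-contained.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Synthetic stand-in for the housing data: 256 samples, 8 input features
X = torch.randn(256, 8)
true_w = torch.randn(8, 1)
y = X @ true_w + 0.1 * torch.randn(256, 1)

# Hold out the last 56 samples to watch for overfitting
X_train, y_train = X[:200], y[:200]
X_val, y_val = X[200:], y[200:]

# Small fully connected network (layer sizes are arbitrary)
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

train_losses, val_losses = [], []
for epoch in range(200):
    model.train()
    optimizer.zero_grad()            # clear gradients from the previous step
    loss = loss_fn(model(X_train), y_train)
    loss.backward()                  # backpropagate
    optimizer.step()                 # update the weights
    train_losses.append(loss.item())

    model.eval()
    with torch.no_grad():            # no gradients needed for validation
        val_losses.append(loss_fn(model(X_val), y_val).item())

print(train_losses[0], train_losses[-1], val_losses[-1])
```

A training loss that keeps falling while the validation loss starts to rise is a common sign of overfitting.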
We will build a convolutional neural network (CNN) classifier to classify cats and dogs, using the Oxford-IIIT Pet dataset. We will go through the steps needed to create a CNN to predict if an image is a dog (1) or a cat (0). Pay attention to the size of the output at each convolutional layer and check for any errors before running the training loop. At the end, you can look at what different layers of the network are doing: in other words, which kernels/filters the network has learned.
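One way to check layer output sizes before training is to compute them by hand. The sketch below uses the standard formula for a convolution or pooling layer without dilation; the 128-pixel input size and the particular layers are assumptions for illustration, not the architecture used in the notebook.

```python
def conv2d_output_size(size, kernel_size, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer (dilation of 1 assumed)."""
    return (size + 2 * padding - kernel_size) // stride + 1


# Hypothetical stack applied to a 128 x 128 input image
after_conv1 = conv2d_output_size(128, kernel_size=3, padding=1)         # 3x3 conv, padding 1
after_pool1 = conv2d_output_size(after_conv1, kernel_size=2, stride=2)  # 2x2 max pool
after_conv2 = conv2d_output_size(after_pool1, kernel_size=5)            # 5x5 conv, no padding
print(after_conv1, after_pool1, after_conv2)
```

The final spatial size, multiplied by itself and by the number of output channels, gives the number of inputs the first fully connected layer must accept.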
Python for R users:
- Rebecca Barter blog post: From R to Python. Includes how to use Pandas dataframes, similar to R dataframes.
- R Vignettes: Python Primer. Includes an introduction to using classes, needed for the deep learning tutorial.