A Journey Through Fastbook (AJTFB) - Chapter 6: Regression
fastai
fastbook
regression
computer vision
key point
It's the "more things you can do with computer vision" chapter of "Deep Learning for Coders with fastai & PyTorch"! Having looked at both multiclass and multi-label classification, we now turn our attention to regression tasks. In particular, we'll look at the key point regression models covered in chapter 6. Soooo, let's go!
A regression task is all about predicting a continuous value rather than a particular category.
Here we’ll consider a particular type of regression problem called image regression, where the “independent variable is an image, and the dependent variable is one or more float.” Our model will be a key point model that aims to predict a point (i.e., two labels: the x and y coordinates) on the image, which in our example is the center of a person’s face.
Defining your DataBlock
Again, the DataBlock is a blueprint for everything required to turn your raw data (images and labels) into something that can be fed through a neural network (DataLoaders with a numerical representation of both your images and labels). Below is the one presented in this chapter.
```python
from fastai.vision.all import *

path = untar_data(URLs.BIWI_HEAD_POSE)
path.ls()
```
“There are 24 directories numbered from 01 to 24 (they correspond to the different people photographed), and a corresponding .obj file for each (we won’t need them here).”
Tip
Always EDA your dataset to make sure you understand how it is organized; this is especially important to ensure you create a good validation set without leakage from the training set.
Note
“… we should not just use a random splitter [so as] to ensure that our model can generalize to people that it hasn’t seen yet.” Instead, we use a splitter function that returns True for just one person, resulting in a validation set containing images of just that one person.
Looks like each person has multiple images, and for each image there is a text file telling us where the point is. We can write a function to get the .txt file for any given image like so:
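The chapter's version of this function (the post refers to it as `get_img_center`; the book names it `get_ctr`) looks roughly like the sketch below. The calibration details (an `rgb.cal` file holding a 3x3 intrinsics matrix per person) are assumptions from the dataset's layout rather than verified against the book:

```python
from pathlib import Path
import numpy as np

def img2pose(x):
    # frame_00003_rgb.jpg -> frame_00003_pose.txt
    # (drop the trailing 7 characters, "rgb.jpg", and append "pose.txt")
    return Path(f'{str(x)[:-7]}pose.txt')

def get_img_center(f):
    # Sketch: read the 3D head center from the image's companion pose file and
    # project it onto the image plane with the per-person camera intrinsics.
    cal = np.genfromtxt(Path(f).parent/'rgb.cal', max_rows=3)  # assumed 3x3 intrinsics
    ctr = np.genfromtxt(img2pose(f), skip_header=3)
    c1 = ctr[0] * cal[0][0] / ctr[2] + cal[0][2]
    c2 = ctr[1] * cal[1][1] / ctr[2] + cal[1][2]
    return np.array([c1, c2])
```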
Define the data types for our inputs and targets via the blocks argument.
Here our targets are of type PointBlock. “This is necessary so that fastai knows that the labels represent coordinates … it knows that when doing data augmentation, it should do the same augmentation to these coordinates as it does to the images.”
Define how we’re going to get our images via get_items.
We can just use get_image_files, since we will be passing the path into our DataBlock.dataloaders() method.
Define how, from the raw data, we’re going to create our labels via get_y.
We'll simply use the get_img_center function we defined above, since get_y will be receiving a bunch of paths to images.
We define a custom splitter using FuncSplitter, which gives us complete control over how our validation set is determined; here it will be all the images associated with person "13".
Define things we want to do for each item via item_tfms
Nothing for this example
Define things we want to do for each mini-batch of items via batch_tfms
For each mini-batch of data, we'll resize each image to 320x240 pixels and apply the default augmentations specified in aug_transforms. We'll also normalize our images using the ImageNet mean and standard deviation, since our pretrained model was trained on ImageNet.
Note
If you want to serialize your Learner, do not use lambda functions for defining your DataBlock methods! They can’t be pickled.
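To see why, try pickling each kind of function (a standalone illustration, not from the book):

```python
import pickle

def is_valid_person(o):
    # A named, module-level splitter function: picklable, so a Learner whose
    # DataBlock uses it can be exported (e.g., via learn.export()).
    return o.parent.name == '13'

pickled = pickle.dumps(is_valid_person)   # works fine

try:
    pickle.dumps(lambda o: o.parent.name == '13')
    lambda_picklable = True
except (pickle.PicklingError, AttributeError):
    lambda_picklable = False
# lambda_picklable ends up False: lambdas can't be pickled, so a DataBlock
# built with one can't be serialized along with its Learner.
```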
Since we know that "coordinates in fastai and PyTorch are always rescaled between -1 and 1," we can use those values for y_range when defining our Learner.
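Under the hood, fastai implements y_range with a scaled sigmoid over the final activations. A minimal sketch of that idea in plain PyTorch (the commented learner line is an assumption based on the chapter, not verified code):

```python
import torch

def sigmoid_range(x, lo, hi):
    # Scaled sigmoid: squashes any activation into the interval (lo, hi).
    return torch.sigmoid(x) * (hi - lo) + lo

# An activation of 0 lands exactly in the middle of the range:
mid = sigmoid_range(torch.tensor(0.0), -1, 1)   # -> 0.0

# Sketch of the corresponding Learner (vision_learner was cnn_learner in older fastai):
# learn = vision_learner(dls, resnet18, y_range=(-1, 1))
```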
Define your loss function
As we didn't define a loss function, fastai will pick one for us based on our task. Here it will be MSELoss (mean squared error loss).
“… when coordinates are used as the dependent variable, most of the time we’re likely to be trying to predict something as close as possible; that’s basically what MSELoss does”
```python
dls.loss_func
```

```
FlattenedLoss of MSELoss()
```
Metrics
Tip
“… MSE is already a useful metric for this task (although it’s probably more interpretable after we take the square root).”
Pick your loss and metrics according to your task …
For single-label classification: nn.CrossEntropyLoss and accuracy, precision, recall, f1, etc…
For multi-label classification: nn.BCEWithLogitsLoss and accuracy, precision, recall, f1, etc…
For regression: nn.MSELoss and the square root of the validation loss as the metric
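For the regression case, that square-root-of-MSE metric can be computed directly. A sketch in plain PyTorch (fastai also ships a built-in rmse metric you'd normally use instead):

```python
import torch

def rmse(preds, targs):
    # Root mean squared error: the square root of MSE, so it's in the same
    # units as the targets and easier to interpret than raw MSE.
    return torch.sqrt(torch.mean((preds - targs) ** 2))

preds = torch.tensor([0.0, 0.0])
targs = torch.tensor([3.0, 4.0])
rmse(preds, targs)  # sqrt((9 + 16) / 2) = sqrt(12.5)
```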
Resources
https://book.fast.ai - The book’s website; it’s updated regularly with new content and recommendations on everything from which GPUs to use, to how to run things locally and in the cloud, etc.