CS 5043: HW3: Convolutional Neural Networks
Objectives
- Implement general code that constructs Convolutional Neural
Network models based on a set of parameters.
- Identify two model configurations that perform "okay" on the
validation set (they should learn something, but performance does
not have to be stellar yet).
- Perform a full set of 5 rotations to demonstrate consistency in
the model performance.
Assignment Notes
- Deadline: Tuesday, March 8th @11:59pm.
- Hand-in procedure: submit a zip file to the HW3 dropbox on
Gradescope (details below)
- This work is to be done on your own. While general discussion
about Python, Keras and Tensorflow is encouraged, sharing
solution-specific code is inappropriate. Likewise, downloading
solution-specific code is not allowed.
- Do not submit MSWord documents.
Data Set
The Core50 data set
is a large database of videos of objects as they are being
moved/rotated under a variety of different lighting and background
conditions. Our general task is to classify the object being shown in a
single frame of one of these videos.
Data Organization
Provided Code
We are providing the following code posted on the main course web page:
- hw3_base.py: An experiment-execution module. Parameter
organization, loading data, executing experiment, saving
results
Prediction Problem
We will focus on the distinction between mugs, scissors, and glasses,
for which we only have five distinct example objects (though, for
each, we have many different perspectives and conditions). Our goal
is to construct a model that will be generally applicable: ideally, it
will be able to distinguish between any mug, any pair of
scissors, and any glasses. However, given the small number of
objects, this is a challenge. For the purposes of this assignment, we
will use three objects from each class for training and one distinct
object from each class for each of validation and testing. For rotation 0:
- Training class 1 (scissors): objects o11-o13
- Training class 2 (mugs): o41-o43
- Training class 3 (glasses): objects o26-o28
- Validation class 1: object o14
- Validation class 2: object o44
- Validation class 3: object o29
- Testing class 1: object o15
- Testing class 2: object o45
- Testing class 3: object o30
- Conditions: We will use all possible conditions
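One plausible way to generate the other rotations is to cycle each class's five objects. The exact rotation scheme used by the provided code is not specified here, so the cycling below is only an assumption; rotation 0 reproduces the split listed above:

```python
def rotation_objects(object_ids, rotation):
    """Split one class's five object ids into (train, validation, test).

    Cycles the id list by `rotation`: the first three ids train, the
    fourth validates, the fifth tests.  NOTE: this cycling scheme is an
    assumption, not necessarily what the provided code does.
    """
    n = len(object_ids)
    rotated = [object_ids[(i + rotation) % n] for i in range(n)]
    return rotated[:3], rotated[3], rotated[4]

# Rotation 0 for the scissors class reproduces the split above:
# training o11-o13, validation o14, testing o15
```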
Architectures
You will create two convolutional neural networks to distinguish the mugs, scissors,
and glasses: one will be a shallow network and the other will be a deep
network. Each will nominally have the following structure:
- One or more convolutional filters, each (possibly) followed by a
max pooling layer.
- Use your favorite activation function
- In most cases, each conv/pooling layer will involve some
degree of size reduction (striding)
- Convolutional filters should not be larger than 5x5
(as the size of the filter gets larger, the memory
requirements explode)
- Flatten
- One or more dense layers
- Choose your favorite activation function
- One output layer with three units (one for each class). The
activation for this layer should be softmax
- Loss: categorical cross-entropy
- Additional metric: categorical accuracy
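The structure above can be sketched as a single parameterized builder. This is a minimal sketch: the layer sizes, activation, and optimizer are placeholder choices, and `conv_layers`/`dense_layers` are hypothetical parameter names, not the provided code's interface.

```python
from tensorflow import keras
from tensorflow.keras import layers

def create_cnn(image_size, n_classes, conv_layers, dense_layers):
    """Build a CNN from parameter lists.

    conv_layers: list of dicts such as {'filters': 16, 'kernel': 3, 'pool': 2}
    dense_layers: list of hidden-layer sizes
    """
    model = keras.Sequential([keras.Input(shape=image_size)])
    # Conv blocks: each (possibly) followed by max pooling; kernels stay <= 5x5
    for spec in conv_layers:
        model.add(layers.Conv2D(spec['filters'], spec['kernel'],
                                padding='same', activation='elu'))
        if spec.get('pool'):
            model.add(layers.MaxPooling2D(spec['pool']))
    model.add(layers.Flatten())
    for units in dense_layers:
        model.add(layers.Dense(units, activation='elu'))
    # Output layer: one unit per class, softmax activation
    model.add(layers.Dense(n_classes, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam',
                  metrics=['categorical_accuracy'])
    return model
```

Varying the lengths of `conv_layers` and `dense_layers` changes the depth, so the same function can produce both the shallow and the deep model.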
Since the data set is relatively small (in terms of the number of
distinct objects), it is important to take steps to
address the over-fitting problem. Here are the key tools that you have:
- L1 or L2 regularization
- Dropout. Only use dropout with Dense layers
- Try to keep the number of trainable parameters small (1-2 million)
- Data augmentation of the training set data (never use data
augmentation for the validation or testing sets)
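As an illustration of the first two tools, a dense "head" might combine L2 kernel regularization with dropout between the Dense layers. The sizes and coefficients here are arbitrary placeholders:

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers

def dense_head(input_dim, n_classes=3, l2_weight=1e-4, dropout_rate=0.5):
    """Dense layers with L2 penalties on the kernels; dropout is applied
    only between Dense layers, never after conv/pooling layers."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(input_dim,)),
        layers.Dense(128, activation='elu',
                     kernel_regularizer=regularizers.l2(l2_weight)),
        layers.Dropout(dropout_rate),
        layers.Dense(n_classes, activation='softmax'),
    ])
```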
Experiments
- The primary objective is to get your model building code
working properly and to execute some simple experiments.
- Spend a little time informally narrowing down the
details of your two architectures, including the
hyper-parameters (layer sizes, dropout, regularization).
Don't spend a lot of time on this step
- Once you have made your choice of architecture for each,
you will perform five rotations for each model (so, a total of
10 independent runs)
- Figures 1 and 2: Learning curves (validation accuracy and
loss as a function of epoch) for the shallow and deep models.
Put all five curves on a single plot.
- Figure 3: Generate a histogram of test set accuracy (a
total of 10 samples). The shallow and deep samples should have
different colors (an alpha of 0.5 will give the histogram
values some transparency).
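A sketch of Figure 3, assuming the ten test accuracies have been collected into two lists of five; the file name and bin count are placeholder choices:

```python
import matplotlib
matplotlib.use('Agg')  # render without a display (e.g., on the supercomputer)
import matplotlib.pyplot as plt

def accuracy_histogram(shallow_acc, deep_acc, fname='figure3.png'):
    """Overlay the two test-accuracy histograms; alpha=0.5 keeps both
    colors visible where the bars overlap."""
    fig, ax = plt.subplots()
    ax.hist(shallow_acc, bins=10, alpha=0.5, label='shallow')
    ax.hist(deep_acc, bins=10, alpha=0.5, label='deep')
    ax.set_xlabel('Test set accuracy')
    ax.set_ylabel('Count')
    ax.legend()
    fig.savefig(fname)
    return fig
```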
Hints / Notes
- Start small: get the pipeline working first on a small,
feasible problem (classes_mini.pkl references a very small data
set and matches core50_hw3_mini.zip).
- We are using independent training/validation/testing
ImageDataGenerators to produce small batches of samples at a
given instant in time for training and evaluation. You should
keep this approach. Consider adding data augmentation for the
training set.
- We use a general function for creating networks that takes as
input a set of parameters that define the configuration of the
convolutional layers and dense layers. By changing these
parameters, we can even change the number of layers. This makes
it much easier to try a variety of things without having to
re-implement or copy a lot of code.
- Remember to check your model summary to make sure that it
matches your expectations
- Before executing on the supercomputer, look carefully at your
memory usage (our big model requires almost 10GB of memory)
- CPUS_PER_TASK in the batch file and at the command line should
be 10
- Our default batch size is 32. This is probably sufficient for
use on the supercomputer (it also works on moderate-sized laptops)
- steps_per_epoch controls how many training set batches are used
for each epoch of training. More takes more time to execute
each epoch, but you want this to be large enough to achieve a
reasonable approximation of the true error gradient.
- validation_fraction determines what fraction of the available
validation set you use to evaluate the model after each epoch.
The larger this is, the more time it takes to finish execution
of the epoch, but it also gives you a more stable estimate of
the true validation performance of the model.
- testing_fraction: keep at 0.5.
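The generator and epoch-size notes above can be sketched together. The augmentation parameters, batch counts, and the mapping from validation_fraction to a Keras validation_steps value are illustrative assumptions, not the provided code's exact behavior:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation only on the training generator; validation/testing stay plain
train_gen = ImageDataGenerator(rescale=1 / 255.0,
                               rotation_range=15,
                               width_shift_range=0.1,
                               height_shift_range=0.1,
                               horizontal_flip=True)
val_gen = ImageDataGenerator(rescale=1 / 255.0)
test_gen = ImageDataGenerator(rescale=1 / 255.0)

# Epoch-size knobs (numbers here are examples only)
steps_per_epoch = 100        # training batches drawn per epoch
validation_fraction = 0.25   # fraction of validation batches scored per epoch
n_val_batches = 40           # total validation batches available (example)
validation_steps = max(1, int(validation_fraction * n_val_batches))

# model.fit(train_iterator, epochs=50,
#           steps_per_epoch=steps_per_epoch,
#           validation_data=val_iterator,
#           validation_steps=validation_steps)
```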
What to Hand In
A single zip file that contains:
- All of your python code, including your network building code
- If your visualization code is a Jupyter Notebook, then export
a pdf of your notebook and include it
- File/Save and Export Notebook As/PDF
- Figures 1-3
- Your batch file(s)
- One sample stdout file
- A written reflection that answers the following questions:
- How many parameters were needed by your shallow and deep
networks?
- What can you conclude from the validation accuracy learning
curves for each of the shallow and deep networks? How
confident are you that you have created models that you
can trust?
- Did your shallow or deep network perform better with
respect to the test set? (no need for a statistical
argument here)
Include this reflection as a separate file or at the end of
your Jupyter notebook
Grading
- 20 pts: Clean code for model building (including documentation)
- 20 pts: Figure 1: Shallow/deep loss learning curves
- 20 pts: Figure 2: Shallow/deep accuracy learning curves
- 20 pts: Figure 3: Test set histograms
- 20 pts: Reflection
andrewhfagg -- gmail.com
Last modified: Tue Mar 8 13:09:44 2022