CS 5043: HW0
Executing DL Experiments on the Supercomputer
Objectives:
- Implement a shallow network that is capable of learning a
Boolean function.
- Implement experiment control code that executes a single
instance of a learning run and stores the results in a pickle
file.
- Use the supercomputer to execute a set of experiments.
- Implement a tool that brings the results together from the
different experiments so they can be represented in a common
set of figures.
Assignment notes:
- Deadline: Saturday, February 12th @11:59pm. Solutions may be
submitted up to 48 hours late, with a 10% penalty for each 24
hours. Note that HW1 will be assigned and due on the original
schedule.
- This work is to be done on your own. While general discussion
about Python, TensorFlow and Keras is okay, sharing
solution-specific code is inappropriate. Likewise, you may not
download code solutions to this problem from the network.
Data Set
All data sets for the class are available on the supercomputer in the
directory /home/fagg/aml_2022_datasets/. You do not need
to keep copies of these datasets in your own home directory on the
supercomputer.
For this homework assignment, we will be using hw0_dataset.pkl.
This file contains a single Python object: a dictionary with two
keys, ins and outs. The following code snippet will load the
data into your Python environment:
import pickle

with open("hw0_dataset.pkl", "rb") as fp:
    data = pickle.load(fp)
Notes:
- This snippet assumes that the file is in the same directory as
your notebook/Python code.
- There is only a training set (and no validation or test sets).
So, you will need to "fake" a validation set to use EarlyStopping.
- Take some time to examine the data.
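Since the file provides only a training set, one simple way to "fake" a validation set is to reuse a slice of the training data. Below is a minimal sketch; the function name, split fraction, and variable names are illustrative choices, not part of the assignment (passing validation_split=... to model.fit() is another option):

```python
def fake_validation_split(ins, outs, fraction=0.2):
    """Reuse a slice of the training data as a 'fake' validation set.

    This assignment ships no separate validation data, so this simply
    gives EarlyStopping a val_loss to monitor.  The 20% fraction is an
    arbitrary illustration.
    """
    n_val = max(1, int(len(ins) * fraction))
    # First n_val examples become the "validation" set; the rest train.
    return (ins[n_val:], outs[n_val:]), (ins[:n_val], outs[:n_val])
```

The returned pairs can then be passed as, e.g., model.fit(train_ins, train_outs, validation_data=(val_ins, val_outs), callbacks=[...]).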
Part 1: Network
- Write a function that constructs a relatively shallow neural network
that can regenerate the outputs from the corresponding inputs.
- The first thing that you try will likely not work. You will
need to think about the appropriate non-linearities to use and
to experiment with the number of layers and neurons.
- Once you have settled on an architecture that works well,
proceed to the next part.
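A model-construction function might be sketched as below, assuming the TensorFlow/Keras stack used in class. The hidden width and the activation functions are placeholders to tune, and the input/output sizes should come from the shapes of ins and outs:

```python
from tensorflow import keras

def build_model(n_inputs, n_outputs, n_hidden=10):
    """Build a shallow fully-connected network (illustrative sketch)."""
    inputs = keras.Input(shape=(n_inputs,))
    # A non-linear hidden layer is essential: a Boolean function is
    # generally not linearly separable, so a purely linear model fails.
    hidden = keras.layers.Dense(n_hidden, activation="tanh")(inputs)
    # Sigmoid output keeps predictions in [0, 1], matching Boolean targets.
    outputs = keras.layers.Dense(n_outputs, activation="sigmoid")(hidden)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mse")
    return model
```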
Part 2: Multiple Runs
- Use the supercomputer to perform 10 independent learning runs.
- Plot the learning curves (MSE as a function of epoch) for all
10 runs on the same figure.
- Compute the absolute prediction errors for all runs, combine
these data and generate a single histogram of the
absolute errors.
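Bringing the runs together might look like the sketch below. It assumes each run wrote a pickle containing a dictionary with 'loss' (per-epoch MSE) and 'abs_errors' (per-example absolute errors); the file-name pattern and dictionary keys are illustrative assumptions, not requirements:

```python
import glob
import pickle

import matplotlib
matplotlib.use("Agg")  # headless backend: compute nodes have no display
import matplotlib.pyplot as plt

def load_results(pattern="hw0_results_*.pkl"):
    """Load every per-run results pickle matching the (assumed) pattern."""
    results = []
    for fname in sorted(glob.glob(pattern)):
        with open(fname, "rb") as fp:
            results.append(pickle.load(fp))
    return results

def plot_results(results):
    # Learning curves: one line per run on a single shared figure.
    plt.figure()
    for r in results:
        plt.plot(r["loss"])
    plt.xlabel("Epoch")
    plt.ylabel("MSE")
    plt.savefig("learning_curves.png")

    # Histogram of absolute prediction errors pooled across all runs.
    plt.figure()
    pooled = [e for r in results for e in r["abs_errors"]]
    plt.hist(pooled, bins=50)
    plt.xlabel("Absolute error")
    plt.ylabel("Count")
    plt.savefig("error_histogram.png")
```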
Hints
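A batch file for the 10 runs might use a SLURM job array, sketched below. The partition, time limit, and the hw0.py script name with its --exp flag are placeholders; use the values from the class instructions for your allocation:

```shell
#!/bin/bash
# Illustrative SLURM batch file -- partition, time, and script name
# are placeholders, not the required values for this class.
#SBATCH --partition=normal
#SBATCH --ntasks=1
#SBATCH --time=00:30:00
#SBATCH --job-name=hw0
#SBATCH --output=hw0_%a_stdout.txt
#SBATCH --error=hw0_%a_stderr.txt
#SBATCH --array=0-9

# One array task per independent run; SLURM_ARRAY_TASK_ID is 0..9 and
# can seed/label the run so each experiment writes a distinct pickle
# and stdout file.
python hw0.py --exp $SLURM_ARRAY_TASK_ID
```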
Expectations
- Terminal MSE for the individual runs should be very low. If
this is not the case, then go back to your network design.
- It is very hard to train a network that generates the correct
output for every example. Getting a couple of examples wrong in
individual runs is okay.
What to Hand-In
Submit a zip file to the HW0 section of Gradescope (enter via Canvas)
that contains:
- your python code,
- your batch file,
- the learning curve figure,
- the histogram, and
- the stdout files from each of the 10 experiments.
Grading
- 30 pts: low MSE for every run
- 20 pts: few prediction errors greater than 0.4
- 20 pts: well structured code
- 10 pts: appropriate documentation
- 10 pts: figures are well formatted and have appropriate axis
labels
- 10 pts: 10 stdout files
- 10 pts (bonus): all prediction errors less than 0.4
andrewhfagg -- gmail.com
Last modified: Fri Feb 4 00:46:29 2022