CS 5043: HW1
Objectives:
- Implement a shallow network that is capable of learning a
Boolean function.
- Implement experiment control code that executes a single
instance of a learning run and stores the results in a pickle
file.
- Use the supercomputer to execute a set of experiments.
- Implement a tool that brings the results together from the
different experiments so they can be represented in a common
set of figures.
Assignment notes:
- Deadline: Thursday, February 18th @11:59pm. Solutions may be
submitted up to 48 hours late, with a 10% penalty for each 24-hour
period.
- This work is to be done on your own. While general discussion
about Python, TensorFlow and Keras is okay, sharing
solution-specific code is inappropriate. Likewise, you may not
download code solutions to this problem from the network.
Class Github Repository
The class GitHub repository is available here:
https://github.com/ahfagg/aml_2021.git
An up-to-date copy is kept on OSCER at: /home/fagg/aml_2021
This repository contains:
- Sample batch files (util)
- Code for homework assignments (code)
- Datasets for some of the homework assignments (datasets)
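The sample batch files in util are the authoritative starting point for running on OSCER. For orientation only, a SLURM array job typically has roughly this shape (the partition name, resource limits, script name, and argument convention below are all assumptions, not the actual sample files):

```bash
#!/bin/bash
#SBATCH --partition=normal
#SBATCH --ntasks=1
#SBATCH --mem=1G
#SBATCH --time=00:30:00
#SBATCH --array=0-9
#SBATCH --output=hw1_%a_stdout.txt
#SBATCH --error=hw1_%a_stderr.txt

# One array task per independent learning run; hw1.py and its
# --exp argument are hypothetical names for your own code.
python hw1.py --exp $SLURM_ARRAY_TASK_ID
```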
Data Set
The dataset is available in the Github repository and is called
hw1_dataset.pkl. This file contains one Python object, a
dictionary, that has two keys: ins and outs. The
following code snippet will load the data into your Python environment:
import pickle

fp = open("hw1_dataset.pkl", "rb")
foo = pickle.load(fp)
fp.close()
Notes:
- We have assumed that this file is in the same directory as
your notebook.
- There is only a training set (and no validation or test sets).
- Take some time to examine the data.
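A quick examination can be sketched as follows. Since the real hw1_dataset.pkl is not reproduced here, this example builds a small synthetic stand-in with the same structure (a dict with ins and outs keys; the shapes are made up), then loads and inspects it exactly as you would the real file:

```python
import pickle
import numpy as np

# Synthetic stand-in with the same structure as hw1_dataset.pkl.
# The shapes (32 examples, 8 inputs, 1 output) are illustrative only.
stand_in = {"ins": np.random.randint(0, 2, size=(32, 8)).astype(float),
            "outs": np.random.randint(0, 2, size=(32, 1)).astype(float)}
with open("stand_in_dataset.pkl", "wb") as fp:
    pickle.dump(stand_in, fp)

# Load it back the same way you would load hw1_dataset.pkl.
with open("stand_in_dataset.pkl", "rb") as fp:
    data = pickle.load(fp)

# Inspect the keys and the array shapes before designing a network.
print(sorted(data.keys()))                    # ['ins', 'outs']
print(data["ins"].shape, data["outs"].shape)  # (32, 8) (32, 1)
```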
Part 1: Network
- Write a function that constructs a relatively shallow neural network
that can regenerate the output from the corresponding inputs.
- The first thing that you try will likely not work. You will need to
think about the appropriate non-linearities to use and to play with the number
of layers and neurons.
- Once you have settled on an architecture that works well, then
proceed to the next part.
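One possible shape for such a construction function, assuming TensorFlow/Keras. The hidden-layer size and activations below are illustrative guesses, not a working architecture; expect to tune them (the input/output sizes would come from the shapes of ins and outs):

```python
import numpy as np
import tensorflow as tf

def build_model(n_outputs, n_hidden=16):
    # A shallow fully-connected network. tanh gives the hidden layer
    # the non-linearity needed for Boolean functions (e.g., XOR);
    # sigmoid keeps outputs in [0, 1]. n_hidden is a guess to tune.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(n_hidden, activation="tanh"),
        tf.keras.layers.Dense(n_outputs, activation="sigmoid"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
                  loss="mse")
    return model

# The input dimension is inferred from the first batch of data.
model = build_model(n_outputs=1)
pred = model.predict(np.zeros((4, 8)))
print(pred.shape)  # (4, 1)
```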
Part 2: Multiple Runs
- Use the supercomputer to perform 10 independent learning runs.
- Plot the learning curves (MSE as a function of epoch) for all
10 runs on the same figure.
- Compute the absolute prediction errors for all runs, combine
these data and generate a histogram of the absolute errors.
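The collection tool can be sketched along these lines. The per-run file names, the dictionary keys ("loss", "abs_errors"), and the figure names are all assumptions; match them to whatever your experiment control code actually stores. Here the per-run pickles are fabricated in place so the sketch is self-contained; on the supercomputer each learning run would have written its own file:

```python
import pickle
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless rendering, e.g., on OSCER
import matplotlib.pyplot as plt

# Synthetic stand-ins for the 10 per-run results pickles.
for run in range(10):
    results = {"loss": list(np.exp(-0.05 * np.arange(100)) * (1 + 0.1 * run)),
               "abs_errors": list(np.abs(np.random.randn(64)) * 0.1)}
    with open("hw1_results_%02d.pkl" % run, "wb") as fp:
        pickle.dump(results, fp)

# Overlay all 10 learning curves on one figure and pool the errors.
all_errors = []
plt.figure()
for run in range(10):
    with open("hw1_results_%02d.pkl" % run, "rb") as fp:
        results = pickle.load(fp)
    plt.plot(results["loss"], label="run %d" % run)
    all_errors.extend(results["abs_errors"])
plt.xlabel("Epoch")
plt.ylabel("MSE")
plt.legend()
plt.savefig("learning_curves.png")

# Histogram of the combined absolute prediction errors.
plt.figure()
plt.hist(all_errors, bins=50)
plt.xlabel("Absolute prediction error")
plt.ylabel("Count")
plt.savefig("error_histogram.png")
```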
Hints
Expectations
- Terminal MSE for the individual runs should be very low. If
this is not the case, then go back to your network design.
- It is very hard to train a network that generates the correct
output for every example. Getting a couple of examples wrong in individual runs is okay.
What to Hand-In
Submit a zip file to the HW1 section of Gradescope (enter via Canvas)
that contains:
- your python code,
- the learning curve figure, and
- the histogram.
Grading
- 40 pts: low MSE for every run
- 20 pts: few prediction errors greater than 0.4
- 20 pts: well structured code
- 10 pts: appropriate documentation
- 10 pts: figures are well formatted and have appropriate axis labels
- 10 pts (bonus): all prediction errors less than 0.4
andrewhfagg -- gmail.com
Last modified: Fri Feb 12 01:16:18 2021