CS 5043: HW1
Objectives:
- Implement a shallow network that is capable of learning a
Boolean function.
- Implement experiment control code that executes a single
instance of a learning run and stores the results in a pickle
file.
- Use the supercomputer to execute a set of experiments.
- Implement a tool that brings the results together from the
different experiments so they can be represented in a common
set of figures.
Assignment notes:
- Deadline: Thursday, February 18th @11:59pm. Solutions may be
submitted up to 48 hours late, with a 10% penalty for each 24-hour
period.
- This work is to be done on your own. While general discussion
about Python, TensorFlow and Keras is okay, sharing
solution-specific code is inappropriate. Likewise, you may not
download code solutions to this problem from the network.
Class Github Repository
The class GitHub repository is available here:
https://github.com/ahfagg/aml_2021.git
An up-to-date copy is kept on OSCER at: /home/fagg/aml_2021
This repository contains:
- Sample batch files (util)
- Code for homework assignments (code)
- Datasets for some of the homework assignments (datasets)
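The sample batch files in util are the authoritative starting point for running on OSCER. For orientation only, a SLURM array job typically has roughly this shape (the partition name, resource limits, script name, and argument convention below are all assumptions, not the actual sample files):

```bash
#!/bin/bash
#SBATCH --partition=normal
#SBATCH --ntasks=1
#SBATCH --mem=1G
#SBATCH --time=00:30:00
#SBATCH --array=0-9
#SBATCH --output=hw1_%a_stdout.txt
#SBATCH --error=hw1_%a_stderr.txt

# One array task per independent learning run; hw1.py and its
# --exp argument are hypothetical names for your own code.
python hw1.py --exp $SLURM_ARRAY_TASK_ID
```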
Data Set
The dataset is available in the Github repository and is called
hw1_dataset.pkl. This file contains one Python object, a
dictionary, that has two keys: ins and outs. The
following code snippet will load the data into your Python environment:
import pickle

fp = open("hw1_dataset.pkl", "rb")
foo = pickle.load(fp)
fp.close()
Notes:
- We have assumed that this file is in the same directory as
your notebook.
- There is only a training set (and no validation or test sets).
- Take some time to examine the data.
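A quick examination can be sketched as follows. Since the real hw1_dataset.pkl is not reproduced here, this example builds a small synthetic stand-in with the same structure (a dict with ins and outs keys; the shapes are made up), then loads and inspects it exactly as you would the real file:

```python
import pickle
import numpy as np

# Synthetic stand-in with the same structure as hw1_dataset.pkl.
# The shapes (32 examples, 8 inputs, 1 output) are illustrative only.
stand_in = {"ins": np.random.randint(0, 2, size=(32, 8)).astype(float),
            "outs": np.random.randint(0, 2, size=(32, 1)).astype(float)}
with open("stand_in_dataset.pkl", "wb") as fp:
    pickle.dump(stand_in, fp)

# Load it back the same way you would load hw1_dataset.pkl.
with open("stand_in_dataset.pkl", "rb") as fp:
    data = pickle.load(fp)

# Inspect the keys and the array shapes before designing a network.
print(sorted(data.keys()))                    # ['ins', 'outs']
print(data["ins"].shape, data["outs"].shape)  # (32, 8) (32, 1)
```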
Part 1: Network
- Write a function that constructs a relatively shallow neural network
that can regenerate the output from the corresponding inputs.
- The first thing that you try will likely not work. You will need to
think about the appropriate non-linearities to use and to play with the number
of layers and neurons.
- Once you have settled on an architecture that works well, then
proceed to the next part.
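One possible shape for such a construction function, assuming TensorFlow/Keras. The hidden-layer size and activations below are illustrative guesses, not a working architecture; expect to tune them (the input/output sizes would come from the shapes of ins and outs):

```python
import numpy as np
import tensorflow as tf

def build_model(n_outputs, n_hidden=16):
    # A shallow fully-connected network. tanh gives the hidden layer
    # the non-linearity needed for Boolean functions (e.g., XOR);
    # sigmoid keeps outputs in [0, 1]. n_hidden is a guess to tune.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(n_hidden, activation="tanh"),
        tf.keras.layers.Dense(n_outputs, activation="sigmoid"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
                  loss="mse")
    return model

# The input dimension is inferred from the first batch of data.
model = build_model(n_outputs=1)
pred = model.predict(np.zeros((4, 8)))
print(pred.shape)  # (4, 1)
```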
Part 2: Multiple Runs
- Use the supercomputer to perform 10 independent learning runs.
- Plot the learning curves (MSE as a function of epoch) for all
10 runs on the same figure.
- Compute the absolute prediction errors for all runs, combine
these data and generate a histogram of the absolute errors.
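The collection tool can be sketched along these lines. The per-run file names, the dictionary keys ("loss", "abs_errors"), and the figure names are all assumptions; match them to whatever your experiment control code actually stores. Here the per-run pickles are fabricated in place so the sketch is self-contained; on the supercomputer each learning run would have written its own file:

```python
import pickle
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless rendering, e.g., on OSCER
import matplotlib.pyplot as plt

# Synthetic stand-ins for the 10 per-run results pickles.
for run in range(10):
    results = {"loss": list(np.exp(-0.05 * np.arange(100)) * (1 + 0.1 * run)),
               "abs_errors": list(np.abs(np.random.randn(64)) * 0.1)}
    with open("hw1_results_%02d.pkl" % run, "wb") as fp:
        pickle.dump(results, fp)

# Overlay all 10 learning curves on one figure and pool the errors.
all_errors = []
plt.figure()
for run in range(10):
    with open("hw1_results_%02d.pkl" % run, "rb") as fp:
        results = pickle.load(fp)
    plt.plot(results["loss"], label="run %d" % run)
    all_errors.extend(results["abs_errors"])
plt.xlabel("Epoch")
plt.ylabel("MSE")
plt.legend()
plt.savefig("learning_curves.png")

# Histogram of the combined absolute prediction errors.
plt.figure()
plt.hist(all_errors, bins=50)
plt.xlabel("Absolute prediction error")
plt.ylabel("Count")
plt.savefig("error_histogram.png")
```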
Hints
Expectations
- Terminal MSE for the individual runs should be very low. If
this is not the case, then go back to your network design.
- It is very hard to train a network that generates the correct
output for every example. Getting a couple of examples wrong in individual runs is okay.
What to Hand-In
Submit a zip file to the HW1 section of Gradescope (enter via Canvas)
that contains:
- your python code,
- the learning curve figure, and
- the histogram.
Grading
- 40 pts: low MSE for every run
- 20 pts: few prediction errors greater than 0.4
- 20 pts: well structured code
- 10 pts: appropriate documentation
- 10 pts: figures are well formatted and have appropriate axis labels
- 10 pts (bonus): all prediction errors less than 0.4
andrewhfagg -- gmail.com
Last modified: Fri Feb 12 01:16:18 2021