CS 5043: HW7: Attention
Assignment notes:
- Deadline: Tuesday, April 26th @11:59pm.
- Hand-in procedure: submit a zip file to Gradescope
- This work is to be done on your own. While general discussion
about Python, Keras, and TensorFlow is encouraged, sharing
solution-specific code is inappropriate. Likewise, downloading
solution-specific code is not allowed.
- Do not submit MSWord documents.
The Problem
We are using the same problem space as in the previous homework
assignment. However, rather than using an RNN approach to connect
information across the amino acid chains, we will employ Attention
mechanisms. This approach dramatically reduces the number of layers
through which gradients must be propagated.
Data Set
Expect an updated data set; it will have the same form as the data
used for HW 6.
The data set is available on SCHOONER:
- /home/fagg/datasets/pfam: directory tree containing the data
(including two zip files)
The data are already partitioned into five independent folds, with the
classes stratified across the folds (the samples for class k are
distributed equally across the five folds). However, the different
classes have different numbers of examples, with as much as a 1:10
ratio between the minority and majority classes.
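One common way to compensate for this kind of class imbalance is inverse-frequency class weighting during training. The sketch below is not part of the assignment spec; the function name and the example labels are hypothetical, and the resulting dict is in the form accepted by Keras model.fit(class_weight=...).

```python
import numpy as np

def compute_class_weights(labels):
    """Inverse-frequency class weights for imbalanced data.

    labels: 1D integer array of class indices.
    Returns a dict mapping class index -> weight; rarer classes
    receive proportionally larger weights.
    """
    classes, counts = np.unique(labels, return_counts=True)
    total = counts.sum()
    n_classes = len(classes)
    return {int(c): total / (n_classes * n) for c, n in zip(classes, counts)}

# Toy example: class 1 is three times rarer than class 0,
# so it receives three times the weight.
labels = np.array([0, 0, 0, 1])
weights = compute_class_weights(labels)
```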
Deep Learning Experiment
Objective: Create an Attention-based model to perform the amino acid
family classification. The architecture should be along the following
lines:
- Embedding layer
- (optional) 1D Convolutional layers
- Multiple Attention layers. I recommend investigating
tf.keras.layers.MultiHeadAttention
- One or more Dense layers, with the output using a softmax
non-linearity
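The pipeline above might be sketched in Keras as follows. This is a minimal illustration, not a prescribed solution: every hyper-parameter value (vocabulary size, sequence length, embedding dimension, head count, layer count) is a placeholder you must tune, and the residual-plus-LayerNormalization pattern around each attention layer is one common design choice, not a requirement.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_attention_model(n_tokens=25, seq_len=100, n_classes=10,
                          embed_dim=32, n_heads=4, n_attn_layers=2):
    """Embedding -> Conv1D -> stacked MultiHeadAttention -> Dense softmax.

    All default argument values are placeholders for illustration.
    """
    inputs = layers.Input(shape=(seq_len,))
    x = layers.Embedding(n_tokens, embed_dim)(inputs)

    # Optional 1D convolution: mixes local context before attention
    x = layers.Conv1D(embed_dim, kernel_size=3, padding='same',
                      activation='elu')(x)

    for _ in range(n_attn_layers):
        # Self-attention: query and value both come from x
        attn = layers.MultiHeadAttention(num_heads=n_heads,
                                         key_dim=embed_dim)(x, x)
        # Residual connection + normalization (a common stabilizing choice)
        x = layers.LayerNormalization()(x + attn)

    # Collapse the sequence dimension before the classification head
    x = layers.GlobalMaxPooling1D()(x)
    x = layers.Dense(64, activation='elu')(x)
    outputs = layers.Dense(n_classes, activation='softmax')(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['sparse_categorical_accuracy'])
    return model
```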
Performance Reporting
Once you have selected a reasonable architecture and set of
hyper-parameters, produce the following figures:
- Figure 0: Network architectures from plot_model()
- Figure 1: Training set accuracy as a function of epoch for each
of the five rotations.
- Figure 2: Validation set accuracy as a function of epoch for
each of the rotations.
- Figure 3: Histogram of accuracy across the test folds, with a
vertical line at the average accuracy (also report this average
as text).
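Figure 3 might be produced along these lines with matplotlib. This is a sketch, not a required implementation: the function name, bin count, and styling are all placeholder choices, and the accuracy values passed in would be your measured per-fold test accuracies.

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')  # non-interactive backend for saving to file
import matplotlib.pyplot as plt

def plot_test_accuracy_histogram(accuracies, fname='figure3.png'):
    """Histogram of per-fold test accuracies with a vertical line
    at the mean; the mean is also shown as text in the title."""
    accuracies = np.asarray(accuracies)
    mean_acc = accuracies.mean()

    fig, ax = plt.subplots()
    ax.hist(accuracies, bins=10)
    ax.axvline(mean_acc, color='red', linestyle='--')
    ax.set_xlabel('Test accuracy')
    ax.set_ylabel('Count')
    ax.set_title('Mean test accuracy: %.3f' % mean_acc)
    fig.savefig(fname)
    plt.close(fig)
    return mean_acc
```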
What to Hand In
Turn in a single zip file that contains:
- All of your python code (.py) and any notebook file (.ipynb)
[Gradescope can render notebook files directly - no need to
convert to pdf!]
- Figures 0-3
Grading
- 20 pts: Clean, general code for model building (including
in-code documentation)
- 15 pts: Figure 0
- 15 pts: Figure 1
- 15 pts: Figure 2
- 15 pts: Figure 3
- 20 pts: Reasonable test set performance for all rotations
References
- Full Data Set
J. Mistry, S. Chuguransky, L. Williams, M. Qureshi, G. A. Salazar,
E. L. L. Sonnhammer, S. C. E. Tosatto, L. Paladin, S. Raj,
L. J. Richardson, R. D. Finn, and A. Bateman. Pfam: The protein
families database in 2021. Nucleic Acids Research (2020).
doi: 10.1093/nar/gkaa913
- Keras Multi-headed Attention Layer
andrewhfagg -- gmail.com
Last modified: Fri Apr 15 01:35:47 2022