CS 5043: HW6: Convolutional Neural Networks

Assignment notes:

Data Set

The Core50 data set is a large database of videos of objects as they are being moved/rotated under a variety of different lighting and background conditions. Our task is to classify the object being shown in a single frame of one of these videos.

Data Organization

Data Handling

The following functions will read in a set of PNG files:

import os
import re
import numpy as np
import png     # pypng package

def readPngFile(filename):
    '''
    Read a single PNG file
    
    filename = fully qualified file name
    
    Return: 3D numpy array (rows x cols x chans)
    
    Note: all pixel values are floats in the range 0.0 .. 1.0
    
    This implementation relies on the pypng package
    '''
    #print("reading:", filename)
    # Load in the image meta-data
    r = png.Reader(filename)
    it = r.read()
    
    # Load in the image itself and convert to a 2D array (rows x (cols*chans))
    image_2d = np.vstack(list(map(np.uint8, it[2])))
    
    # Reshape into rows x cols x chans (pypng's read() returns width = it[0], height = it[1])
    image_3d = np.reshape(image_2d,
                          (it[1], it[0], it[3]['planes'])) / 255.0
    return image_3d

def read_images_from_directory(directory, file_regexp):
    '''
    Read a set of images from a directory.  All of the images must be the same shape.
    
    directory = Directory to search
    
    file_regexp = a regular expression to match the file names against
    
    Return: 4D numpy array (images x rows x cols x chans)
    '''
    
    # Get all of the file names
    files = sorted(os.listdir(directory))
    
    # Construct a list of images from the files that match the regular expression
    list_of_images = [readPngFile(directory + "/" + f) for f in files if re.search(file_regexp, f) ]
    
    # Stack the list into a 4D numpy array (images x rows x cols x chans)
    return np.array(list_of_images, dtype=np.float32)

def read_image_set_from_directories(directory, spec):
    '''
    Read a set of images from a set of directories
    
    directory  = base directory to read from
    
    spec = n x 2 array of subdirs and file regexps
    
    Return: 4D numpy array (images x rows x cols x chans)
    
    '''
    out = read_images_from_directory(directory + "/" + spec[0][0], spec[0][1])
    for sp in spec[1:]:
        out = np.append(out, read_images_from_directory(directory + "/" + sp[0], sp[1]), axis=0)
    return out

Here is an example of using the top-level function to create a data set in which the cans are positive examples of the class and the mugs are negative examples:

directory2 = '/home2/fagg/datasets/core50/core50_128x128/s1'
ins_pos = read_image_set_from_directories(directory2, [['o21', '.*00.png'], ['o22', '.*00.png']])
ins_neg = read_image_set_from_directories(directory2, [['o41', '.*00.png'], ['o42', '.*00.png']])
outs_pos = np.ones(ins_pos.shape[0])
outs_neg = np.zeros(ins_neg.shape[0])

ins = np.append(ins_pos, ins_neg, axis=0)
outs = np.append(outs_pos, outs_neg, axis=0)
Note that this is a tiny example: we are only loading in one out of every 100 images. In practice, you should be able to load and process all of these files.
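
Because the positive and negative examples are stacked in two contiguous blocks, you will generally want to shuffle them (and hold some examples out for validation) before training. Here is a minimal sketch with numpy; the variable names follow the example above, and the 80/20 split is an illustrative choice, not part of the assignment:

import numpy as np

# Shuffle the examples so that positives and negatives are interleaved
perm = np.random.permutation(ins.shape[0])
ins = ins[perm]
outs = outs[perm]

# Hold out the last 20% of the shuffled examples for validation
n_train = int(0.8 * ins.shape[0])
ins_train, outs_train = ins[:n_train], outs[:n_train]
ins_val, outs_val = ins[n_train:], outs[n_train:]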


Part 1: Data / TensorFlow Exploration

  1. Create a simple TensorFlow pipeline that uses average pooling to reduce the size of an input image by a factor of 2 and by a factor of 4. Show the original image and the two reduced images (a sketch covering both parts of this exploration appears after this list).

  2. Create a pipeline that first converts the original image to grayscale (tf.image.rgb_to_grayscale will do this for you) and then convolves two different 3x3 filters with the grayscale image. The two filters are one-pixel-wide "bar" detectors that are most sensitive to edges at 45 and -45 degrees (the book gives you examples of bar detectors at 0 and 90 degrees). Show the original image, the grayscale image, and the two filtered images (also covered in the sketch below).

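Here is a minimal sketch of both pipelines, assuming TensorFlow 2.x with eager execution and the 4D array ins built above; the variable names and the specific -1/+2 kernel values are illustrative choices, not requirements of the assignment:

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

# One frame from the data set: (rows, cols, 3), pixel values in 0.0 .. 1.0
image = ins[0]

# Pooling and convolution ops expect a 4D tensor: (batch, rows, cols, chans)
image_4d = tf.convert_to_tensor(image[np.newaxis, ...], dtype=tf.float32)

# Part 1.1: average pooling
# 2x2 window, stride 2 -> half-size image; 4x4 window, stride 4 -> quarter-size image
pooled_2 = tf.nn.avg_pool2d(image_4d, ksize=2, strides=2, padding='SAME')
pooled_4 = tf.nn.avg_pool2d(image_4d, ksize=4, strides=4, padding='SAME')

fig, axes = plt.subplots(1, 3)
axes[0].imshow(image)
axes[1].imshow(pooled_2[0].numpy())
axes[2].imshow(pooled_4[0].numpy())

# Part 1.2: grayscale conversion followed by 3x3 bar detectors
gray = tf.image.rgb_to_grayscale(image_4d)          # (1, rows, cols, 1)

# One-pixel-wide line detectors for the +45 and -45 degree diagonals
bar_p45 = np.array([[-1, -1,  2],
                    [-1,  2, -1],
                    [ 2, -1, -1]], dtype=np.float32)
bar_m45 = np.array([[ 2, -1, -1],
                    [-1,  2, -1],
                    [-1, -1,  2]], dtype=np.float32)

# Stack into a (3, 3, 1, 2) filter bank: 1 input channel, 2 output channels
filters = np.stack([bar_p45, bar_m45], axis=-1)[:, :, np.newaxis, :]

# Convolve both filters over the grayscale image in a single conv2d call
filtered = tf.nn.conv2d(gray, filters, strides=1, padding='SAME')

fig, axes = plt.subplots(1, 4)
axes[0].imshow(image)
axes[1].imshow(gray[0, :, :, 0].numpy(), cmap='gray')
axes[2].imshow(filtered[0, :, :, 0].numpy(), cmap='gray')   # +45 degree response
axes[3].imshow(filtered[0, :, :, 1].numpy(), cmap='gray')   # -45 degree response
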

Part 2: Classification

Create:

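As one possible starting point, here is a minimal sketch of a small tf.keras convolutional network for the binary can-vs-mug labels built above, trained on the shuffled train/validation split from the earlier sketch; the architecture, optimizer, and epoch count are illustrative assumptions, not the assignment specification:

import tensorflow as tf

# Small illustrative CNN for 128x128x3 inputs and a single binary output
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, (3, 3), activation='relu', input_shape=(128, 128, 3)),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

# Train on the shuffled training split; monitor the held-out validation split
history = model.fit(ins_train, outs_train,
                    epochs=10,
                    validation_data=(ins_val, outs_val))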

Hints / Notes


andrewhfagg -- gmail.com

Last modified: Thu Apr 5 21:37:18 2018