CMPSCI 377: Operating Systems
Lab 4: File Systems
Due Date: December 13, 2002 23:30
Addendum
December 13: Due to various circumstances, I've decided to reduce
the late penalty for a Saturday turn-in to 3% of your lab grade (from 10%).
Additional days will continue to accumulate at 10%. The absolute last day to
hand-in will be Tuesday (with a penalty of 33% of your lab grade). Also -
if you are handing in late, please update your hand-in directory on a daily
basis so that our robot picks up your latest copy, as it may be better
grade-wise to hand-in an incomplete version of your lab earlier. However -
we will only grade one of these copies and you must tell us which one to grade
(if we do not hear anything, we will grade the first version that you
hand in).
December 12: A couple people have asked about this: creating a
directory using a name that already exists (whether it be a file or a
directory) should result in an exception being raised.
Creating a file using a name that already exists will depend on
whether it is a file or a directory: if a directory, you should raise
an exception; if it is a file, then the file size should be reset to 0
(and any blocks associated with the file deallocated).
December 12: Note typo fix in the computation of the number
of valid file blocks.
December 11: In case I was not clear enough in class today:
you should not be making changes to any of the supplied classes
(unless you want to spruce up UserInterface.java). Any changes need
to be cleared by me first (and only a few proposed changes will be
accepted).
December 11: Minor correction in the move specification
// If name1 is a directory:
// If name2 does not exist, then name2 inherits the
--> // contents of name1
//
// If name2 is a directory, then name2/nm1 inherits the
// contents of name1
//
December 10: A small update has been made to UserInterface.java
that fixes the "cat of a zero-length file" problem.
A number of people have asked about blockPtr[]. Note that in all of our
Inodes, blockPtr is of a fixed length (20 to be exact). The reason
we do this is that when we write our data out to the simulated disk, it
must be placed in a known location (e.g., the starting location of each
Inode must be known within the set of bytes representing the directory
so that we know how to unpack the data). If different Inodes had different
sizes (e.g., different numbers of blockPtr's or different length names),
then the Inodes would not start at known and regularly-spaced locations.
So - most of the time, the values stored within blockPtr are going to be
invalid. It is only as we assign blocks to the directory that we fill in
the entries with numbers that correspond to blocks drawn from the free block
pool. Specifically, a directory will have exactly one block assigned to
it (so blockPtr[1..19] will contain invalid entries), and a file will
have any number of valid entries (from 0 to 20). How many valid
entries will a file have? ((size-1)/1024)+1 (assuming -1/N = -1)...
The derivation is left to the reader.
The byte offset of the first free byte within the file's last block is size % 1024.
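One caveat when translating the block-count formula into Java: integer division truncates toward zero, so -1/1024 evaluates to 0 rather than -1, and the formula as written would report one block for an empty file. The following is a minimal sketch of one way to handle this. The class and method names are hypothetical and are not part of the supplied classes.

```java
// Hypothetical helper; names are illustrative and not part of the supplied classes.
class BlockMath {
    static final int BLOCK_SIZE = 1024;

    // Number of blockPtr entries that are valid for a file of the given size.
    // Java's / truncates toward zero (-1 / 1024 == 0), so size 0 is handled explicitly.
    static int validBlocks(int size) {
        if (size == 0) {
            return 0;
        }
        return ((size - 1) / BLOCK_SIZE) + 1;
    }

    // Offset of the first free byte within the last block of the file.
    // A result of 0 means the file ends exactly on a block boundary.
    static int firstFreeOffset(int size) {
        return size % BLOCK_SIZE;
    }
}
```

For example, a file of 1025 bytes occupies two blocks, and its first free byte is at offset 1 of the second block.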
The job of AllocBlocks() is to find blocks in the free block pool, and
destructively modify blockPtr[] so that it points to these new blocks
(recall that Java hands a method a reference to the array and not a copy, so
changes to its elements are visible to the caller). Also,
these newly assigned blocks must be removed from the pool.
The job of DeallocBlocks() is to return a set of known blocks to the
pool.
Finally: a number of people have had trouble with this: integers in
your computers are stored in binary format (signed integers are stored
in two's-complement binary, to be exact). Whether we write these
numbers in decimal, hex, octal, or base 42, is just about making it
easier for us humans to read the numbers (and nothing else). So - it does
not make any sense to implement code that performs conversions between the
different formats (unless you are talking about a user interface).
So - how does one manipulate these numbers in Java? The easiest operators
to use are / and % (the integer, not the floating-point operators), and
<< and >> (the bit shift left and bit shift right operators).
December 5: FileSystemAbstract.java has been changed. One line in the
specification of create() read:
// 2. The file does not exist
and has now been changed to read:
// 2. The directory does not exist
December 4: See below for some example logs from the test scripts.
Also - you will find these variable declarations
in FileSystemAbstract.java, even though they are not used anywhere. You may
choose to use them as you see fit (or you may ignore them):
protected int f_index; // Set by getFileRef()
protected DirectoryBlock db; // Set by getFileRef()
protected String fname; // Set by getFileRef()
December 3: Having trouble with missing String methods
split() and matches()? Make sure that you are using version 1.4 or
1.4.1 of the JVM.
All of the files that this page references are bundled into
filesystem.tar.
Purpose of Assignment
In this assignment, you will learn about how the abstract notion of a
file system is mapped onto a raw disk.
You may work with one other person in the class. If
you choose to do so, please send email to the professor and the TAs
in order to indicate who your partner will be and which of the two
of you will be handing in the program (if this is different from lab 3). Do not share
code with, or accept code from, anyone else or the net.
If you choose to turn in your program late (up to 4 days with
penalties), you must send email when you are ready for it to be picked up.
Where to do your work
For this lab, place your final submission in the following directory:
~/cs377/lab4
If your files are not located in the right place or our robot cannot read
your lab at the time of submission, then we will not be able to
grade it. In order to make sure that our robot can pick up your program,
please check things by executing the ~fagg/bin/checkhw script with the
lab4 parameter. Programs that are not handed in properly will not be graded.
You need only hand-in one copy of your program.
The Problem
We will simulate a disk using Java's RandomAccessFile class. This
approach allows us to open the simulated disk in a combined read/write
mode and allows us to arbitrarily position the read/write head
anywhere in the file between read and write operations. Many of these
low-level details are already handled for you in the provided classes, but
understanding how they work will aid in your debugging process.
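To make the mechanism concrete, here is a minimal sketch of block-level access through java.io.RandomAccessFile. The SimDisk class and its method names are hypothetical stand-ins written for this example. The supplied classes already do this work for you, and their actual interface may differ.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical stand-in for the supplied disk code; names are illustrative only.
class SimDisk {
    static final int BLOCK_SIZE = 1024;
    private final RandomAccessFile file;

    SimDisk(String path) throws IOException {
        // "rw" opens the file for both reading and writing.
        file = new RandomAccessFile(path, "rw");
    }

    // Read one whole block: position the head at its start, then fill the buffer.
    byte[] readBlock(int blockNum) throws IOException {
        byte[] buf = new byte[BLOCK_SIZE];
        file.seek((long) blockNum * BLOCK_SIZE);
        file.readFully(buf);
        return buf;
    }

    // Write one whole block back to the same location.
    void writeBlock(int blockNum, byte[] data) throws IOException {
        file.seek((long) blockNum * BLOCK_SIZE);
        file.write(data, 0, BLOCK_SIZE);
    }

    void close() throws IOException {
        file.close();
    }
}
```

Because every block starts at blockNum * 1024, a block can be read, changed in memory, and written back without touching its neighbors.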
The simulated disk is partitioned into fixed-size blocks of 1024 bytes
each (there are 256 of these blocks). Our file system that is built
on top of these blocks will use blocks in one of three ways: to
represent the free space, to represent directories, and to represent
file data. The specifics of how this representation is done are
already handled (for the most part) by the provided classes:
DirectoryBlockAbstract, FreeBlockAbstract, and DataBlock. Here are
the rules for our implementation:
Free Blocks
There is exactly one free block (the map of free space) in our file system,
and it is stored in block 0. We use a bitmap representation to capture the free
block information: each byte in the block represents the free state
of 8 blocks, and a '1' in the corresponding bit location means that the
block is free. Note that there are far more bytes in the free block
than there are blocks in our file system (256 blocks need only 32 bytes,
so we will only use a subset of the bytes).
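The bitmap test and update can be sketched with /, %, and bit shifts. This handout does not say which bit within a byte corresponds to which block; the sketch assumes block n lives in byte n/8 at bit n%8, counting from the least significant bit. The FreeMap class and its methods are hypothetical, so check FreeBlockAbstract for the layout actually used.

```java
// Hypothetical bitmap helper; the bit order within each byte is an assumption.
class FreeMap {
    private final byte[] bits = new byte[1024];   // one disk block worth of bytes

    // A 1 bit means the block is free.
    boolean isFree(int block) {
        return (bits[block / 8] & (1 << (block % 8))) != 0;
    }

    void markFree(int block) {
        bits[block / 8] |= (byte) (1 << (block % 8));
    }

    void markUsed(int block) {
        bits[block / 8] &= (byte) ~(1 << (block % 8));
    }

    // Return the lowest-numbered free block among the first 'total' blocks,
    // marking it used, or -1 if none is free.
    int allocate(int total) {
        for (int b = 0; b < total; b++) {
            if (isFree(b)) {
                markUsed(b);
                return b;
            }
        }
        return -1;
    }
}
```

Note that no conversion between number bases appears anywhere: the bytes are manipulated directly.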
Data Blocks
Data blocks are dedicated entirely to the representation of file data,
so all 1024 bytes are used for this purpose (i.e., there is no
specific structure).
Directory Blocks
Directory blocks consist of an array of file descriptors (16 to be exact).
Each file descriptor (or inode) contains the following information:
- boolean used_p: set to true if this inode is used
- boolean file_p: set to true if this inode is a file (false if it
is a directory).
- byte[] name: the name of the file/directory (up to 20 bytes). Valid
characters are a-z,A-Z,0-9, ".", and "_".
- short[] blockPtr: a list of blocks (in order) that contain the
contents of the file/directory. If this is a file, 0 to 20 of these entries
might be filled with valid block references. If it is a directory, then
exactly one (the first) entry will be a valid block reference.
- short size: the length of the file in bytes (not used for
directories). This size implicitly specifies the number of blocks
that are allocated for this file.
A directory holds at most 16 entries and is considered full once all 16 are in use.
Files are limited in length by the number of blockPtr entries (20 blocks, or
20480 bytes), and a file of that length is considered full.
blockPtr entries for files must be dynamically allocated, depending
upon the size of the file.
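The field sizes listed above are consistent with 16 inodes filling one 1024-byte block exactly, provided each boolean occupies one byte and each short two bytes. That storage size is an assumption made for this sketch, and the InodeLayout class is hypothetical. The supplied DirectoryBlockAbstract defines the real layout.

```java
// Hypothetical layout arithmetic; assumes one byte per boolean and two per short.
class InodeLayout {
    static final int NAME_BYTES = 20;
    static final int NUM_BLOCK_PTRS = 20;
    static final int INODES_PER_BLOCK = 16;

    // used_p + file_p + name + blockPtr[] + size
    static final int INODE_BYTES = 1 + 1 + NAME_BYTES + 2 * NUM_BLOCK_PTRS + 2;

    // Byte offset of inode i within its directory block.
    static int inodeOffset(int i) {
        return i * INODE_BYTES;
    }

    // Largest file that the blockPtr array can describe.
    static int maxFileBytes() {
        return NUM_BLOCK_PTRS * 1024;
    }
}
```

Under these assumptions each inode takes 64 bytes, which is why fixed-length names and a fixed-length blockPtr array let every inode start at a known, regularly spaced offset.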
Files are specified in the file system with respect to the root
directory (i.e., there is no concept of changing the current working
directory). Thus, full-path file names may contain the characters
specified above and the character '/' as the directory separator. It
is considered legal to not include a / at the beginning of an absolute
path (so /foo/bar/baz.txt and foo/bar/baz.txt are both legal names for
the same file). However, it is not considered legal for a path to
contain two or more /'s in a row (i.e., /foo/bar//baz.txt is not
a legal name).
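The naming rules above can be checked with String.split() and String.matches(). The sketch below is one possible approach, not the required one. PathUtil is a hypothetical helper, and two of its choices are assumptions that the rules do not settle: a path consisting only of "/" is taken to mean the root directory, and a trailing "/" is rejected.

```java
// Hypothetical path helper; not part of the supplied classes.
class PathUtil {
    // Split a path into its components, or return null if the path is illegal.
    static String[] components(String path) {
        // A single leading '/' is optional.
        String p = path.startsWith("/") ? path.substring(1) : path;
        if (p.length() == 0) {
            return new String[0];          // assumed to mean the root directory
        }
        // The limit -1 keeps trailing empty strings, so "a//b" and "a/" are caught below.
        String[] parts = p.split("/", -1);
        for (int i = 0; i < parts.length; i++) {
            // One to twenty characters drawn from a-z, A-Z, 0-9, '.', and '_'.
            if (!parts[i].matches("[a-zA-Z0-9._]{1,20}")) {
                return null;
            }
        }
        return parts;
    }
}
```

With this helper, /foo/bar/baz.txt and foo/bar/baz.txt both yield the same three components, while /foo/bar//baz.txt is rejected because of its empty component.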
Your Job
Your job in this lab is to employ these low-level building blocks in building
a simple implementation of a file system. Here is some of the functionality
that you will implement:
- listing of files and directories
- creating/removing directories
- creating/removing files
- appending bytes onto files
- reading bytes from files
- moving a file from one location to another
On top of these primitives, you will implement the following:
- copying of a file to another file
- appending a file onto another file
We also provide some operations that utilize your primitives, including:
- importing a file from the base file system
- exporting a file to the base file system
- appending a few, fixed bytes to a file (for testing purposes)
The Details
You are being provided with the following classes. You must use these classes without
changes.
Your job is to implement the following classes:
- FreeBlock.java: extension of FreeBlockAbstract. You must implement the following
methods:
- FreeBlock() (just call super())
- AllocBlocks(): allocate a set of blocks
- DeallocBlocks(): deallocate a set of blocks
- DirectoryBlock.java: extension of DirectoryBlockAbstract. You must implement the
following methods:
- DirectoryBlock() (just call super())
- findName(): find the name-specified entry in the directory.
- numEntries(): return the number of valid entries in the directory.
- AllocFree(): find an unused entry in the directory, mark it as allocated,
and set the entry's name.
- FileSystem.java: extension of FileSystemAbstract. You
must implement the following methods:
- FileSystem(): call super() and perform any other
initialization that you will need
- list(): list a named file or the contents of a named
directory
- createDir(): create a named directory
- deleteDir(): delete a named directory
- create(): create a zero-length file
- delete(): delete a file
- read(String name): read all of the bytes in the named file
- read(String name, int offset, int length): read a
subset of bytes from the named file
- existsFile(): test whether a file exists
- existsDirectory(): test whether a directory exists
- append(): append a sequence of bytes to the end of a file
- move(): move a file/directory from one location to another
- fileCopy(): copy the contents of a file to a new file.
Implement this on top of your create() and append() methods
(do not manipulate the file system directly).
- fileAppend(): append the contents of one file onto another file.
Implement this on top of your create() and append() methods
(do not manipulate the file system directly).
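As an illustration of building the last two operations purely on top of the primitives, here is a sketch written against a hypothetical MiniFs interface. The method signatures are assumptions (the real ones are specified in FileSystemAbstract.java), the sketch also relies on read(), and MemoryFs is an in-memory stand-in that exists only so the example can run.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface; the real signatures are in FileSystemAbstract.java.
interface MiniFs {
    void create(String name);
    void append(String name, byte[] data);
    byte[] read(String name);
}

// In-memory stand-in used only to exercise the sketch.
class MemoryFs implements MiniFs {
    private final Map<String, byte[]> files = new HashMap<String, byte[]>();

    public void create(String name) {
        files.put(name, new byte[0]);      // an existing file is reset to length 0
    }

    public void append(String name, byte[] data) {
        byte[] old = files.get(name);
        byte[] grown = new byte[old.length + data.length];
        System.arraycopy(old, 0, grown, 0, old.length);
        System.arraycopy(data, 0, grown, old.length, data.length);
        files.put(name, grown);
    }

    public byte[] read(String name) {
        return files.get(name).clone();
    }
}

class FileOps {
    // Copy: read the source first, then create (or truncate) the destination
    // and append the bytes. Reading first keeps a self-copy from losing data.
    static void fileCopy(MiniFs fs, String from, String to) {
        byte[] contents = fs.read(from);
        fs.create(to);
        fs.append(to, contents);
    }

    // Append: read the source, then append its bytes to the destination.
    static void fileAppend(MiniFs fs, String from, String to) {
        fs.append(to, fs.read(from));
    }
}
```

Neither FileOps method touches blocks, inodes, or the free map, so any bug in them can be isolated from bugs in the primitives.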
An important note for the FileSystem class: each of these individual
operations needs to be atomic. In other words, after each operation,
you must ensure that the complete state of the file system is
preserved in the simulated disk itself. If you do this properly, when
your program quits between operations (e.g., by the request of the
user), the program will be able to start again using the stored state
without any problems and without loss of data. We will be checking for
this.
For more details on the specifics of what you need to do
Implementation Process
You should implement your program incrementally, testing the
individual pieces before you move on to the next piece. This is not only a
good approach to programming (yes, we generally design the whole
system beforehand), but it also ensures that you have some basic
functionality to show by the due date. Here is one
possible ordering of implementation that worked well for me:
- File/directory listing, directory creation
- directory deletion
- file creation (zero-length files only)
- file deletion
- appending bytes to the end of a file
- reading bytes from a file
- copying files
- appending files onto other files
- moving files/directories (DO THIS ONE LAST)
Debugging
A very good way to do your debugging is to examine the Disk0 file
directly after your file system makes changes to it. The easy way to do this
under unix is to use a program called "octal dump":
od -v -t x1a Disk0 | less
This will show you the hex value of each byte in the file and
the corresponding ascii character (16 bytes are shown on each line).
The first column gives you the offset within the file in octal
(each block is octal 2000 in length). If you look at the provided
Disk1 file using this method, you will be able to find the
blocks that correspond to directories and to the file contents (and you will
be able to do this with your own Disk0 once you have started to
implement your file system).
Caveats
Be careful about "caching": when you load a block from the simulated
disk into an instance of one of the Block classes and you make a
change to this block, make sure that you write this block back
out to the simulated disk. This way, your user will be able to quit
the user interface at any time, and your disk will be left in
a valid state for later.
Testing
We will be testing your program in a variety of ways. Most of this
testing will be done automatically in the following way: we have a
variety of test files (some samples are below). These test files will
be piped to your program and we will check the output. This is done
using a command line in unix like:
java UserInterface < cmdtest5.txt
You should make sure that your program properly handles this form of
input (if you do not change UserInterface and provide all of the
necessary functionality, you should be all set). If this does not
work, we will not be able to give you many "Correctness" points.
Here are some of the test files we will use (roughly ordered by level
of difficulty):
Here are the outputs that are generated by our implementation for each
of the above test files (yours don't have to match exactly, but they
should be very similar):
Random Notes
Bytes and characters are not the same thing (characters in Java are
actually 2 bytes). You personally don't have to deal with this, but
this is why we jump through various hoops in manipulating file/directory
name entries.
We provide a special disk loaded with interesting contents that
you are welcome to explore and manipulate. It is called "Disk1" - to
use it, just copy it over to "Disk0" and your program should be able
to use it with no problems. If it does not, then there is something
wrong with your implementation (and you must correct it in order to
get full points).
What to Hand In
Place in your Hand-In directory the following:
- All of the java and class files associated with your program
(including the ones that we provide).
- A "README" file that contains:
- Your name(s)
- Your UID(s)
- The name of your top-level java class
- A list that describes the functionality that you have implemented (including any
"extra" things that you have added).
Grading
This lab project will count for 30% of
your lab grade (so 11.7% of your final grade).
- Program Design (supporting methods and extra data structures): 25%
- Program Implementation and correctness: 55%
- directory creation/deletion: 10%
- file creation/deletion: 10%
- file/directory listing: 10%
- file import/export (if you get your FileSystem interface right,
this is implemented for you already): 5%
- file copying/appending: 10%
- moving files/directories: 10%
- In-line Documentation: 20%
Java Hints
This page is online at http://www-all.cs.umass.edu/~fagg/classes/377/labs/lab4/index.html
Andrew H. Fagg