CMPSCI 377: Operating Systems
Lab 4: File Systems

Due Date: May 12, 2003 23:30


All of the files that this page references are bundled into filesystem.tar.


Purpose of Assignment

In this assignment, you will learn about how the abstract notion of a file system is mapped onto a raw disk.

You may work with one other person in the class. Do not share code with, or accept code from, anyone else or the net.

If you choose to turn in your program late (up to 4 days with penalties), you must send email when you are ready for it to be picked up.

Where to do your work

For this lab, place your final submission in the following directory:

~/cs377/lab4

If your files are not located in the right place, or our robot cannot read your lab at the time of submission, then we will not be able to grade it. To make sure that our robot can pick up your program, please check things by executing the checkhw script with the lab4 parameter. Programs that are not handed in properly will not be graded.

You need only hand in one copy of your program.

The Problem

We will simulate a disk using the RandomAccessFile class in Java. This approach allows us to open the simulated disk in a combined read/write mode and to position the read/write head anywhere in the file between read and write operations. Many of these low-level details are already handled for you in the provided classes, but understanding how they work will aid in your debugging process.

The simulated disk is partitioned into 256 fixed-size blocks of 1024 bytes each. The file system built on top of these blocks uses each block in one of two ways: to represent a directory or to represent file data. The specifics of how this representation is done are already handled (for the most part) by the provided classes DirectoryBlockAbstract and DataBlock. Here are the rules for our implementation.
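To make the positioning idea concrete, here is a minimal sketch of block-level reads and writes with java.io.RandomAccessFile. The class BlockIoSketch and its methods are hypothetical names used only for illustration; they are not the provided lab classes, which already do this work for you. The sketch uses a scratch file rather than Disk0.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper for illustration only; not one of the provided classes.
class BlockIoSketch {
    static final int BLOCK_SIZE = 1024; // bytes per block, from the lab description
    static final int NUM_BLOCKS = 256;  // blocks on the simulated disk

    // Read one whole block into a fresh buffer.
    static byte[] readBlock(RandomAccessFile disk, int blockNum) throws IOException {
        byte[] buf = new byte[BLOCK_SIZE];
        disk.seek((long) blockNum * BLOCK_SIZE); // position the "head"
        disk.readFully(buf);                     // read exactly BLOCK_SIZE bytes
        return buf;
    }

    // Write one whole block at its position on the disk.
    static void writeBlock(RandomAccessFile disk, int blockNum, byte[] buf) throws IOException {
        if (buf.length != BLOCK_SIZE) {
            throw new IllegalArgumentException("buffer must be exactly one block");
        }
        disk.seek((long) blockNum * BLOCK_SIZE);
        disk.write(buf);
    }

    public static void main(String[] args) throws IOException {
        File scratch = File.createTempFile("disk", ".bin"); // scratch file, not Disk0
        scratch.deleteOnExit();
        try (RandomAccessFile disk = new RandomAccessFile(scratch, "rw")) {
            disk.write(new byte[NUM_BLOCKS * BLOCK_SIZE]); // zero-filled disk image
            byte[] block = new byte[BLOCK_SIZE];
            block[0] = 42;
            writeBlock(disk, 3, block);
            System.out.println(readBlock(disk, 3)[0]); // prints 42
        }
    }
}
```

Opening the file in "rw" mode is what allows reads and writes to be interleaved on the same handle; seek moves the file pointer without transferring any data.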

Blocks

Each block reserves a nextBlkPtr (of type short) to point to the next block. This short occupies the first two bytes of every block. Block 0 is the root directory. Block 1 is a data block whose short points to one of the available free blocks (a free block is one that is not in use). That free block in turn points to the next available free block, and so on, so that all the free blocks are linked together; the last free block points to block 0. The short is also used to point to the next valid block of a file. For all directory blocks, this short is set to 0. All blocks other than Block 0 and Block 1 are available for use as directory blocks or data blocks.
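The free-list rules above can be sketched on an in-memory disk image. Everything named here (FreeListSketch, format, countFree, allocate) is hypothetical and is not the provided API. The sketch also makes several assumptions that the description does not state: the short is stored high byte first (as RandomAccessFile.writeShort would write it), a freshly formatted disk chains the free blocks in ascending order, and an allocated block has its pointer reset to 0.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of the free-list layout; not the provided classes.
class FreeListSketch {
    static final int BLOCK_SIZE = 1024;
    static final int NUM_BLOCKS = 256;

    // The next-block pointer is the short in the first two bytes of a block.
    // ByteBuffer is big-endian by default (an assumption about the disk format).
    static short getNext(byte[] disk, int blockNum) {
        return ByteBuffer.wrap(disk).getShort(blockNum * BLOCK_SIZE);
    }

    static void setNext(byte[] disk, int blockNum, short next) {
        ByteBuffer.wrap(disk).putShort(blockNum * BLOCK_SIZE, next);
    }

    // Assumed initial layout: block 1 -> 2 -> 3 -> ... -> 255 -> 0.
    static byte[] format() {
        byte[] disk = new byte[NUM_BLOCKS * BLOCK_SIZE];
        for (int b = 1; b < NUM_BLOCKS - 1; b++) {
            setNext(disk, b, (short) (b + 1));
        }
        setNext(disk, NUM_BLOCKS - 1, (short) 0); // last free block points to block 0
        return disk;
    }

    // Follow the chain that starts at block 1 until it reaches block 0.
    static int countFree(byte[] disk) {
        int count = 0;
        for (short b = getNext(disk, 1); b != 0; b = getNext(disk, b)) {
            count++;
            if (count > NUM_BLOCKS) {
                throw new IllegalStateException("free list contains a cycle");
            }
        }
        return count;
    }

    // Unlink the first free block and return its number, or 0 if none is left.
    static short allocate(byte[] disk) {
        short head = getNext(disk, 1);
        if (head == 0) {
            return 0; // free list is empty
        }
        setNext(disk, 1, getNext(disk, head)); // block 1 now skips the old head
        setNext(disk, head, (short) 0);        // assumed: new block starts with pointer 0
        return head;
    }

    public static void main(String[] args) {
        byte[] disk = format();
        System.out.println(countFree(disk)); // prints 254
        System.out.println(allocate(disk));  // prints 2
        System.out.println(countFree(disk)); // prints 253
    }
}
```

Freeing a block would be the reverse operation: point the freed block at the current head of the list, then point block 1 at the freed block.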

Data Blocks

Data blocks hold file data. Because 2 bytes of each block are used to store the short described above, only 1022 bytes per block are available for data.
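As a quick check on this arithmetic, the following sketch computes how many data blocks a file of a given length would occupy. DataBlockMath and blocksNeeded are hypothetical names; whether a zero-length file occupies a block at all depends on your implementation, and this sketch assumes it does not.

```java
// Hypothetical arithmetic helper; not part of the provided classes.
class DataBlockMath {
    static final int BLOCK_SIZE = 1024;               // bytes per block
    static final int PTR_SIZE = 2;                    // the short at the start of each block
    static final int PAYLOAD = BLOCK_SIZE - PTR_SIZE; // 1022 bytes of file data per block

    // Number of data blocks needed for fileLength bytes (rounded up).
    static int blocksNeeded(long fileLength) {
        if (fileLength < 0) {
            throw new IllegalArgumentException("length must be non-negative");
        }
        return (int) ((fileLength + PAYLOAD - 1) / PAYLOAD);
    }

    public static void main(String[] args) {
        System.out.println(blocksNeeded(1022)); // prints 1
        System.out.println(blocksNeeded(1023)); // prints 2
    }
}
```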

Directory Blocks

Directory blocks consist of an array of file descriptors (36 to be exact). Each file descriptor (or inode) contains the following information:

A directory can contain at most 36 entries; once all 36 are in use, it is considered full.

Files are limited in length only by the number of blocks available in the file system.

Files are specified in the file system with respect to the root directory (i.e., there is no concept of changing the current working directory). Thus, full-path file names may contain the characters specified above and the character '/' as the directory separator. It is considered legal to not include a / at the beginning of an absolute path (so /foo/bar/baz.txt and foo/bar/baz.txt are both legal names for the same file). However, it is not considered legal for a path to contain two or more /'s in a row (i.e., /foo/bar//baz.txt is not a legal name).
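The two path rules above (an optional leading /, and no two /'s in a row) can be sketched as follows. PathSketch and split are hypothetical names, not part of the provided code. The sketch does not check which characters are allowed in a name, and it makes two assumptions that the text does not settle: an empty path or a lone / refers to the root directory, and a trailing / is rejected.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical path splitter for illustration; not part of the provided code.
class PathSketch {
    // Returns the path components in order, or null if the path is not legal.
    static List<String> split(String path) {
        if (path == null || path.contains("//")) {
            return null; // two or more /'s in a row are not legal
        }
        // A leading / is optional, so drop it if present.
        String rest = path.startsWith("/") ? path.substring(1) : path;
        List<String> parts = new ArrayList<>();
        if (rest.isEmpty()) {
            return parts; // assumption: "" and "/" both mean the root directory
        }
        if (rest.endsWith("/")) {
            return null; // assumption: a trailing / is rejected
        }
        for (String component : rest.split("/")) {
            parts.add(component);
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(split("/foo/bar/baz.txt"));  // prints [foo, bar, baz.txt]
        System.out.println(split("foo/bar/baz.txt"));   // prints [foo, bar, baz.txt]
        System.out.println(split("/foo/bar//baz.txt")); // prints null
    }
}
```

Resolving a path then amounts to starting at the root directory and looking up each component in turn.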

Your Job

Your job in this lab is to employ these low-level building blocks in building a simple implementation of a file system. Here is some of the functionality that you will implement:

On top of these primitives, you will implement the following:

We also provide some operations that utilize your primitives, including:

The Details

You are being provided with the following classes. You must use these classes without changes.

Your job is to implement the following classes:

An important note for the FileSystem class: each of these individual operations needs to be atomic. In other words, after each operation, you must ensure that the complete state of the file system is preserved in the simulated disk itself. If you do this properly, then when your program quits between operations (e.g., at the request of the user), it will be able to start again from the stored state without any problems and without loss of data. We will be checking for this.

For more details on the specifics of what you need to do:

Implementation Process

You should implement your program incrementally, testing each piece before you move on to the next. This is a good approach to programming (yes, we generally design the whole system beforehand), and it also ensures that you have some basic functionality to show by the due date. Here is one possible ordering of implementation that worked well for me:

Debugging

A very good way to do your debugging is to examine the Disk0 file directly after your file system makes changes to it. An easy way to do this under Unix is to use a program called "octal dump":

od -v -t x1a Disk0 | less

This will show you the hex value of each byte in the file and the corresponding ASCII character (16 bytes are shown on each line). The first column gives you the offset within the file in octal (each block is octal 2000 in length). If you look at the provided Disk1 file using this method, you will be able to find the blocks that correspond to directories and to the file contents (and you will be able to do this with your own Disk0 once you have started to implement your file system).
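Since each block is 1024 bytes (octal 2000), converting between block numbers and the offsets in the first column is simple arithmetic. The sketch below shows both directions; OdOffsetSketch and its methods are hypothetical names, and the seven-digit zero padding is only a formatting choice for readability.

```java
// Hypothetical helper for reading od output; not part of the provided code.
class OdOffsetSketch {
    static final int BLOCK_SIZE = 1024; // octal 2000

    // Octal offset of the first byte of a block, zero-padded to seven digits.
    static String blockStart(int blockNum) {
        return String.format("%07o", (long) blockNum * BLOCK_SIZE);
    }

    // Block number that contains the byte at the given octal offset.
    static int blockOf(String octalOffset) {
        return (int) (Long.parseLong(octalOffset, 8) / BLOCK_SIZE);
    }

    public static void main(String[] args) {
        System.out.println(blockStart(1));      // prints 0002000
        System.out.println(blockStart(255));    // prints 0776000
        System.out.println(blockOf("0004020")); // prints 2
    }
}
```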

Caveats

Be careful about "caching": when you load a block from the simulated disk into an instance of one of the Block classes and you make a change to this block, make sure that you write this block back out to the simulated disk. This way, your user will be able to quit the user interface at any time, and your disk will be left in a valid state for later.
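The pitfall can be demonstrated with a small sketch: a change made to an in-memory copy of a block is not visible on disk until the block is written back. WriteBackSketch, load, store, and newDisk are hypothetical names that stand in for whatever the provided Block classes do, and the sketch works on a scratch file rather than Disk0.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical load/modify/store illustration; not the provided Block classes.
class WriteBackSketch {
    static final int BLOCK_SIZE = 1024;
    static final int NUM_BLOCKS = 256;

    // Create a zero-filled scratch disk image.
    static File newDisk() throws IOException {
        File f = File.createTempFile("disk", ".bin");
        f.deleteOnExit();
        try (RandomAccessFile disk = new RandomAccessFile(f, "rw")) {
            disk.write(new byte[NUM_BLOCKS * BLOCK_SIZE]);
        }
        return f;
    }

    // Load a copy of a block; later changes to the copy do not touch the disk.
    static byte[] load(File diskFile, int blockNum) throws IOException {
        try (RandomAccessFile disk = new RandomAccessFile(diskFile, "r")) {
            byte[] buf = new byte[BLOCK_SIZE];
            disk.seek((long) blockNum * BLOCK_SIZE);
            disk.readFully(buf);
            return buf;
        }
    }

    // Write the in-memory copy back so that the disk reflects the change.
    static void store(File diskFile, int blockNum, byte[] buf) throws IOException {
        try (RandomAccessFile disk = new RandomAccessFile(diskFile, "rw")) {
            disk.seek((long) blockNum * BLOCK_SIZE);
            disk.write(buf, 0, BLOCK_SIZE);
        }
    }

    public static void main(String[] args) throws IOException {
        File diskFile = newDisk();
        byte[] block = load(diskFile, 2);
        block[10] = 5;                                 // change only the in-memory copy
        System.out.println(load(diskFile, 2)[10]);     // prints 0 (disk is unchanged)
        store(diskFile, 2, block);                     // write the block back
        System.out.println(load(diskFile, 2)[10]);     // prints 5
    }
}
```

One way to stay safe is to write a block back immediately after every modification, before the operation returns.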

Testing

We will be testing your program in a variety of ways. Most of this testing will be done automatically: we have a variety of test files (some samples are below), which will be piped to your program, and we will check the output. This is done using a Unix command line like:

java UserInterface < cmdtest5.txt

You should make sure that your program properly handles this form of input (if you do not change UserInterface and provide all of the necessary functionality, you should be all set). If this does not work, we will not be able to give you many "Correctness" points.

Here are some of the test files we will use (roughly ordered by level of difficulty):

Here are the outputs that are generated by our implementation for each of the above test files (yours don't have to match exactly, but they should be very similar):

Random Notes

Bytes and characters are not the same thing (characters in Java are actually 2 bytes). You personally don't have to deal with this, but this is why we jump through various hoops in manipulating file/directory name entries.

We provide a special disk loaded with interesting contents that you are welcome to explore and manipulate. It is called "Disk1" - to use it, just copy it over to "Disk0" and your program should be able to use it with no problems. If it does not, then there is something wrong with your implementation (and you must correct it in order to get full points).

You may look at the javadoc from our code.

What to Hand In

Place in your Hand-In directory the following:

Grading

This lab project will count for 30% of your lab grade (so 11.7% of your final grade).

Java Hints


This page is online at http://www-all.cs.umass.edu/~fagg/classes/377/labs/lab4/index.html
Andrew H. Fagg