CS 3113: Project 4
File System Implementation (Files and Pipes)
Introduction
With project 3, you implemented a hierarchical directory structure on
top of a set of fixed-size blocks on a virtual disk. For this
project, you will augment your implementation to include the
representation of files and a form of pipe, including their content.
Files may occupy
multiple blocks on the disk, and will rely on the linked-list
structure that the blocks already support to connect these blocks
together. A pipe in MYFS connects to a file in your Linux file system.
We provide functionality that allows:
- opening/closing of MYFS files
- reading/writing content from/to MYFS files
- A set of application programs that manipulate files in the MYFS
system.
You will add functionality for:
- Opening/closing of MYFS pipes
- Reading/writing content from/to MYFS pipes
- Creation of hard links to an existing file or pipe
- Removal of a file or pipe
- Moving a file or pipe from one part of the MYFS file system to another
- Application programs to copy the contents of a file/pipe and to
compute statistics about a file/pipe
Objectives
By the end of this project, you should be able to:
- Describe the logical data structures that are used by an OS
to represent files.
- Show how these logical data structures change as files are
created and written to.
- Map the logical data structure onto a block-level data
storage device.
- Manipulate the block-level data storage device to support file
creation/deletion/reading/writing.
- Use a file manipulation API (a set of "system calls") to create
application programs.
Overview
Files in MYFS are represented using an index node (the meta-data
for the file) and some number of data blocks, the latter of which
contain the byte-level data that make up the content of a file. As in
Unix, there are standard file operations: open/close/read/write. For
this project, these operations are provided for you.
Pipes in MYFS act like Unix named pipes in the sense
that they appear as files in the file system: they are named entities
in the MYFS file system (and hence can be listed), and
they can be manipulated using the same file open/close/read/write
operations. The key difference is that a MYFS pipe stores its content
within a designated file contained in the host operating system.
Within MYFS, a pipe is represented using:
- An index node, whose type is set to T_PIPE
- Exactly one data block. This data block contains a path to a
file in the host operating system.
You are responsible for augmenting the file open/close/read/write
operations to also handle pipes.
A hard link (or link for short) is the mapping from a directory
entry to an index node. For directories, there is a one-to-one
mapping from directory entry to directory index node (the exceptions
are the root index node and the names . and ..). For files and
pipes, there can be any number of directory entries that refer to a
single index node. There is an application program that allows the
user to create new links to existing index nodes.
Your Responsibilities
The key new pieces that you are implementing / augmenting are:
- myfs_lib_support (TO BE AUGMENTED) provides reusable
functionality for manipulating files at the block level
- myfs_lib (TO BE AUGMENTED) provides a set of virtual
system calls that make up the user API for file/pipe manipulation.
Specifically:
- myfs_fopen() (augment for pipes)
- myfs_fclose() (augment for pipes)
- myfs_fread() (augment for pipes)
- myfs_fwrite() (augment for pipes)
- myfs_delete_file() (augment for multiple hard links)
- myfs_link() (implement)
- myfs_move() (implement)
- myfs_mkp() (implement)
- application program:
- myfs_lc: count the number of new lines and characters
in a file/pipe (implement). You must use the
MYFS API for this (it is not a system call itself!).
Proper Academic Conduct
The code solution to this project must be done individually. Do not
copy solutions to this problem from the network and do not look at or
copy solutions from others. However, you may use the net for
inspiration and you may discuss your general approach with
others. These sources must be documented in your README file.
Representing File Content
The rules for representing file content are:
- The index_node.size property reflects the number of bytes currently
stored in the file
- The index_node.type property must be T_FILE
- Only the blocks necessary to represent a file are allocated to
that file. This means that a file with zero size has no blocks
allocated to it
- The first DATA_BLOCK_SIZE bytes of a file will be represented
in the first block allocated to the file (which is pointed to
by index_node.content)
- Subsequent bytes will be represented in additional blocks.
Bytes DATA_BLOCK_SIZE to 2*DATA_BLOCK_SIZE-1 are stored in the
second block; bytes 2*DATA_BLOCK_SIZE to 3*DATA_BLOCK_SIZE-1
are stored in the third block
- The sequence of blocks is represented as a linked list, using
block.next to point to the next block. The last block
will point to UNALLOCATED_BLOCK
- A file can have at most MAX_BLOCKS_IN_FILE blocks allocated to it
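The layout rules above reduce to integer arithmetic on byte offsets. The following is a minimal sketch of that arithmetic, not part of the provided skeleton: the helper names are hypothetical, and the value used for DATA_BLOCK_SIZE is an assumption made only so the example compiles (the real constant is defined in myfs.h).

```c
#include <assert.h>

/* Assumed value for illustration only; the real constant is in myfs.h. */
#define DATA_BLOCK_SIZE 256
#define MAX_BLOCKS_IN_FILE 1000

/* Number of blocks needed to hold `size` bytes (zero for an empty file). */
int blocks_needed(int size)
{
    assert(size >= 0);
    return (size + DATA_BLOCK_SIZE - 1) / DATA_BLOCK_SIZE;
}

/* Position of the block in the linked list that holds byte `offset`
   (0 is the block pointed to by index_node.content). */
int block_index_for_offset(int offset)
{
    assert(offset >= 0);
    return offset / DATA_BLOCK_SIZE;
}

/* Position of byte `offset` inside its block. */
int offset_within_block(int offset)
{
    assert(offset >= 0);
    return offset % DATA_BLOCK_SIZE;
}

/* Largest file size that the block limit allows. */
int max_file_size(void)
{
    return MAX_BLOCKS_IN_FILE * DATA_BLOCK_SIZE;
}
```

With the assumed block size, a 257-byte file needs two blocks, and byte 256 is the first byte of the second block.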
Representing A Pipe
The data associated with a pipe are stored within a separate file in
the host operating system (Linux, in our case). A pipe is represented
using:
- An index_node of type T_PIPE
- A single data block that stores the file path in the Linux file
system.
- Note: At the time of creation, the underlying file for a pipe does
not have to exist.
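As a rough illustration of how the single data block of a pipe might be filled, the sketch below zeroes a block and copies the host path into its first bytes, as the myfs_mkp() description requires. The PIPE_DATA_SKETCH type, the function names, and the DATA_BLOCK_SIZE value are assumptions for this example and do not come from the skeleton.

```c
#include <assert.h>
#include <string.h>

/* Assumed value for illustration only; the real constant is in myfs.h. */
#define DATA_BLOCK_SIZE 256

/* Hypothetical stand-in for the data portion of a MYFS block. */
typedef struct {
    unsigned char data[DATA_BLOCK_SIZE];
} PIPE_DATA_SKETCH;

/* Zero the block, then copy the host path into its first bytes.
   Returns 0 on success, -1 if the path plus its terminator does not fit. */
int store_host_path(PIPE_DATA_SKETCH *block, const char *path_host)
{
    assert(block != NULL && path_host != NULL);
    size_t len = strlen(path_host);
    if (len + 1 > DATA_BLOCK_SIZE)
        return -1;
    memset(block->data, 0, DATA_BLOCK_SIZE);
    memcpy(block->data, path_host, len);
    return 0;
}

/* Recover the host path as a C string. This is safe because the block
   was zeroed before the path was copied in, so it is always terminated. */
const char *load_host_path(const PIPE_DATA_SKETCH *block)
{
    assert(block != NULL);
    return (const char *)block->data;
}
```

Rejecting a path that would fill the whole block is a design choice of this sketch: it guarantees that the stored path can later be read back as a terminated string.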
Representing Open Files/Pipes
myfs.h provides a definition of the MYFILE structure. This
structure is used to capture the details of a file that is already
opened by a program. In particular:
/**********************************************************************/
// Representing files (project 4!)
#define MAX_BLOCKS_IN_FILE 1000
typedef struct
{
  INDEX_NODE_REFERENCE index_node_reference;
  char mode;
  int offset;
  int fd_external;
  // Cache for file content details. Use of these is optional
  int n_data_blocks;
  BLOCK_REFERENCE block_reference_cache[MAX_BLOCKS_IN_FILE];
} MYFILE;
The properties are as follows:
- index_node_reference: the index node that represents the open
file/pipe.
- mode: character that captures how the file was opened
- 'r': file is opened for reading
- 'w' / 'a': file is opened for writing
- offset:
- Files: the current offset for reading or writing
- Pipes: always -1
- n_data_blocks: number of blocks allocated to the file; -1 for
pipes
- block_reference_cache: an array of block references. Entries
from n_data_blocks to the end of the array are
considered to be undefined. This array is not used for pipes
- fd_external:
- Pipes: the file descriptor in the host operating system
that references the underlying file for the pipe
- Files: always -1
MYFILE is exclusively an in-memory data structure.
It only exists for the duration of the process that opened the
file. This structure is all about providing the process the
information that it needs to access an already-opened file.
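The conventions for the unused fields (-1 for offset and n_data_blocks in pipes, -1 for fd_external in files) can be captured in two small constructors. This is only a sketch: the constructor names are hypothetical, and the two reference typedefs are assumptions standing in for the real definitions in myfs.h.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_BLOCKS_IN_FILE 1000

/* Assumed reference types; the real typedefs live in myfs.h. */
typedef unsigned short INDEX_NODE_REFERENCE;
typedef unsigned short BLOCK_REFERENCE;

typedef struct
{
    INDEX_NODE_REFERENCE index_node_reference;
    char mode;
    int offset;
    int fd_external;
    int n_data_blocks;
    BLOCK_REFERENCE block_reference_cache[MAX_BLOCKS_IN_FILE];
} MYFILE;

/* Allocate a MYFILE for a regular file; fd_external is unused (-1). */
MYFILE *new_myfile_for_file(INDEX_NODE_REFERENCE ref, char mode, int offset)
{
    assert(offset >= 0);
    MYFILE *fp = calloc(1, sizeof(MYFILE));
    if (fp == NULL)
        return NULL;
    fp->index_node_reference = ref;
    fp->mode = mode;
    fp->offset = offset;
    fp->fd_external = -1;
    fp->n_data_blocks = 0; /* caller may fill the optional cache later */
    return fp;
}

/* Allocate a MYFILE for a pipe; offset and n_data_blocks are unused (-1). */
MYFILE *new_myfile_for_pipe(INDEX_NODE_REFERENCE ref, char mode, int fd)
{
    assert(fd >= 0);
    MYFILE *fp = calloc(1, sizeof(MYFILE));
    if (fp == NULL)
        return NULL;
    fp->index_node_reference = ref;
    fp->mode = mode;
    fp->offset = -1;
    fp->fd_external = fd;
    fp->n_data_blocks = -1;
    return fp;
}
```

Because the structure is dynamically allocated, whoever closes the file is responsible for calling free() on it.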
Hard Links
The rules for hard links are as follows:
- When a file or pipe is first created, there is one link from
the parent
directory to the newly allocated index node.
- index_node.references = 1
- myfs_link(src, dest) will create a new directory entry
(corresponding to dest) that points to an existing index node
(identified by src)
- The index_node.references property is incremented.
- Removing a file or pipe:
- Remove the directory entry
- Decrement the index_node.references property
- If index_node.references is zero, then deallocate the
index_node and the content blocks
- Note: the myfs_delete_file() function that we supply already
works under the assumption that references == 1. You
must add the additional logic for values larger than 1.
MYFS File/Pipe Manipulation API
myfs_fopen()
MYFILE* myfs_fopen(char *cwd, char *path, char *mode);
Open a file or a pipe. path is either absolute or relative.
If it is relative, then it is taken to be relative to cwd.
mode is one of:
- "r": read
- File:
- Offset is set to zero
- Cache the list of all data blocks
- It is an error if the file or the parent of the
file does not exist
- Pipe:
- Open underlying Linux file for reading
- It is an error if the underlying file does not exist
- Directory:
- It is an error to attempt to open a directory
- "w": write with truncation
- File:
- Deallocate all of the data blocks associated with
the file
- Offset is set to zero
- If the file does not exist, then it is created
with size zero
- It is an error if the parent does not exist
- Pipe:
- Open underlying Linux file for writing with truncation
- If the underlying Linux file does not exist, then
it is created
- It is an error if the parent of the underlying
file does not exist
- Directory:
- It is an error to attempt to open a directory
- "a": write with append
- File:
- Offset is set to the end of the file
- If the file does not exist, then it is created
- It is an error if the parent does not exist
- Pipe:
- Open underlying Linux file for writing/appending
- If the underlying Linux file does not exist, then
it is created
- It is an error if the parent of the underlying
file does not exist
- Directory:
- It is an error to attempt to open a directory
If the file is opened successfully, then a dynamically allocated
MYFILE structure is returned that contains all of the meta data
associated with the open file.
myfs_fclose()
void myfs_fclose(MYFILE* fp);
Close the MYFS file or pipe and deallocate the MYFILE structure.
myfs_fread()
int myfs_fread(MYFILE *fp, unsigned char * buf, int len);
Read a specified number of bytes from a file/pipe that is open.
- buf must be large enough to hold at least len
bytes.
- The number of bytes actually read is returned.
- If the file offset is at the end of the file before this
function is called, then zero is returned, indicating that the
End-of-File has been reached
myfs_fwrite()
int myfs_fwrite(MYFILE *fp, unsigned char * buf, int len);
Write a specified number of bytes to a file/pipe that is open.
- buf must contain at least len bytes.
- The number of bytes actually written is returned.
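When the open entity is a pipe, reading and writing can be delegated to the host file descriptor stored in fd_external. The following sketch shows one way to do that with the POSIX read() and write() calls; the helper names and the choice to return -1 on a host error are assumptions.

```c
#include <assert.h>
#include <unistd.h>

/* Read up to `len` bytes from the host descriptor of a pipe.
   Returns the number of bytes read, 0 at end-of-file, or -1 on error. */
int host_read(int fd_external, unsigned char *buf, int len)
{
    assert(buf != NULL && len >= 0);
    ssize_t n = read(fd_external, buf, (size_t)len);
    return (int)n;
}

/* Write `len` bytes to the host descriptor of a pipe, retrying on
   short writes. Returns the number of bytes written, or -1 if nothing
   could be written because of an error. */
int host_write(int fd_external, const unsigned char *buf, int len)
{
    assert(buf != NULL && len >= 0);
    int total = 0;
    while (total < len) {
        ssize_t n = write(fd_external, buf + total, (size_t)(len - total));
        if (n <= 0)
            return (total > 0) ? total : -1;
        total += (int)n;
    }
    return total;
}
```

Inside myfs_fread() and myfs_fwrite(), a check on the type of the open entity would decide between this path and the block-based path used for regular files.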
Additional System Calls
int myfs_link(char *cwd, char *path_src, char *path_dst) (implement)
Create a new hard link from a new directory entry to an existing index
node.
- The existing index node is specified by path_src
(relative to cwd if necessary)
- The new directory entry is specified by path_dst (again,
relative to cwd if necessary)
- It is an error if the SRC child does not exist (i.e., you have
to be able to find the index node)
- It is an error if the DST parent does not exist
- It is an error if the DST parent is not a directory
- It is an error if the DST child exists
- It is an error if there are no free blocks to allocate for the
possible lengthening of the directory block linked list
- When the link is established:
- The size of the DST parent increases by one
- An entry is inserted into the DST parent directory list
- The references property of the specified index node is
increased by one
int myfs_delete_file(char *cwd, char *path) (augment)
Delete a file/pipe from a directory. Part of this implementation is
provided to you. You are adding functionality to allow for more than
one reference to an index node
- The file/pipe is specified by path (relative to
cwd)
- It is an error if the child does not exist
- It is an error if the child is a directory
- Remove the entry from the parent directory
- Subtract 1 from the index node's references property
- If the number of references is zero, then:
- Deallocate the index node
- Deallocate any content blocks
int myfs_move(char *cwd, char *src_path, char *dest_path) (implement)
Move a directory entry for a file or pipe to another directory/name.
- The source location is specified by src_path (relative
to cwd)
- The destination location is specified by dest_path
(relative to cwd)
- It is an error if the SRC child does not exist
- Three cases:
- The SRC and DEST parents are the same. The name
property in the directory entry is changed to the new
child name.
- The SRC child is a file/pipe and the DEST child is a
directory. The directory entry is removed from the SRC
parent and added to the DEST child directory. The entry
name stays the same. The size of the SRC directory is
reduced by one; the size of the DEST directory is
increased by one.
- The SRC child is a file/pipe, the DEST parent exists,
but the DEST child does not exist. The directory entry
is removed from the SRC parent and added to the DEST
parent, with the DEST child name. The size of the SRC directory is
reduced by one; the size of the DEST directory is
increased by one.
- In general, it is an error if a directory that must be extended
cannot be allocated the additional block.
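Before touching any blocks, it can help to decide which of the cases applies. The sketch below classifies a move from facts that a path lookup would supply. The enum, the function, and its parameters are hypothetical, and checking for an existing destination directory before the same-parent case is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    MOVE_ERROR,            /* the request is not allowed               */
    MOVE_RENAME_IN_PLACE,  /* same parent: only the entry name changes */
    MOVE_INTO_DIRECTORY,   /* DEST child is a directory: keep the name */
    MOVE_TO_NEW_NAME       /* DEST parent exists, DEST child does not  */
} MOVE_CASE;

MOVE_CASE classify_move(bool src_exists, bool src_is_dir,
                        bool dest_parent_exists, bool dest_exists,
                        bool dest_is_dir, bool same_parent)
{
    /* A destination can only be a directory if it exists. */
    assert(dest_exists || !dest_is_dir);

    if (!src_exists || src_is_dir)
        return MOVE_ERROR;          /* only files and pipes can be moved */
    if (dest_exists)
        return dest_is_dir ? MOVE_INTO_DIRECTORY : MOVE_ERROR;
    if (!dest_parent_exists)
        return MOVE_ERROR;
    return same_parent ? MOVE_RENAME_IN_PLACE : MOVE_TO_NEW_NAME;
}
```

Only the last two cases change the sizes of two different directories; the rename case leaves every directory size unchanged.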
int myfs_mkp(char *cwd, char *path_host, char *path_myfs) (implement)
Create a new pipe. The location of the pipe in MYFS is
path_myfs (relative to cwd, if necessary). The location
of the file in the host operating system is determined by
path_host
- It is an error if the child exists
- It is an error if the parent does not exist
- It is an error if the parent directory and the index node block
list cannot be extended (if they need to be)
- It is an error if a new block for the pipe cannot be allocated
- Allocate a new index node. Type = T_PIPE; references = 1; size
= 0
- Add a new entry to the parent directory
- Allocate a new data block. Set all bytes to zeros; copy the
path_host to the first data bytes within this block
- Increase the size of the parent index node
- Note: when multiple blocks must be allocated during pipe
creation, they are allocated in the following order: index node
block, parent directory block and finally the new directory
block.
MYFS Application Programs (new ones)
The high-level implementation of most of the MYFS application programs
is given. You will be responsible for implementing/modifying some of
the underlying system calls.
Note that although all of the application programs are described below
in terms of files, they operate on both files and pipes.
myfs_touch <fname> (provided)
- If the specified file does exist, then it is left unchanged
- If the specified file does not exist, then it is created with
zero content
- It is an error if the parent does not exist or if the specified
child is a directory
myfs_create <fname> (provided)
- If the specified file does not exist, then it is created.
- If the specified file does exist, then it is first truncated
(data blocks deallocated and the index_node.size set to zero).
- In either case, any input to STDIN is written to the file.
This process stops when an EOF is received from STDIN.
- It is an error if the parent does not exist or if the specified
child is a directory
myfs_append <fname> (provided)
- If the specified file does not exist, then it is created
- If the specified file does exist, then the file is opened, with
the offset set to the end of the file.
- In either case, any input to STDIN is written to the file.
This process stops when an EOF is received from STDIN.
- It is an error if the parent does not exist or if the specified
child is a directory
myfs_more <fname> (provided)
- Prints the contents of the specified file to STDOUT.
- It is an error if the parent or child does not exist, or if the specified
child is a directory
myfs_copy <SRC> <DEST> (provided)
- Copies the contents of SRC to DEST.
- It is an error if the SRC parent or child does not exist, or if the SRC
child is a directory.
- It is an error if the DEST parent does not exist.
- If DEST does exist and is a file, its existing file contents
are first deallocated.
myfs_link <SRC> <DEST> (provided)
- Links the SRC index node to a new directory entry specified by DEST
- It is an error if the SRC parent or child does not exist, or if the SRC
child is a directory.
- It is an error if the DEST parent does not exist or if the DEST
file already exists.
myfs_rm <name> (provided)
- Removes the specified file from its parent.
- If this is the only link from a directory to this child, then
the contents of this file and its index node are deallocated.
- It is an error if the parent or the child does not exist.
myfs_mkp <path_host> <path_myfs> (provided)
- Creates a pipe in the MYFS file system, as designated by
path_myfs
- The pipe connects to the file in the host file system (Linux in
our case), specified by path_host
- It is an error if the parent of path_myfs does not exist
- It is an error if the path_myfs exists
myfs_move <src> <dest> (provided)
Move a file or pipe from one location to another.
- It is an error if src does not exist
- It is an error if src is a directory
- dest may specify a non-existent file/pipe name.
- It is an error if the parent does not exist.
- It is an error if dest exists
- dest may specify a directory.
- (the only way that this can be the case is that the
directory exists)
Three cases:
- src and dest are within the same containing
directory. In this case, the name in the directory listing is
changed.
- src and dest specify an existing source file/pipe
and a non-existent destination file/pipe, respectively. The directory entry
in src is removed; and a new directory entry is added to
the parent of dest. The new entity name is specified by
dest
- src and dest specify an existing source file/pipe
and an existing destination directory. The directory entry
in src is removed; and a new directory entry is added to
dest. The entity name stays the same.
myfs_lc <name> (to be implemented)
- Counts the number of newlines and characters in the specified
file. Prints two integers on the same line to STDOUT in this
order.
- It is an error if the file does not exist.
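The counting itself is independent of MYFS and can be written against plain buffers. The sketch below is one possible shape for it: the LC_COUNTS type and function names are hypothetical, and the single space between the two printed integers is an assumption about the expected output format.

```c
#include <assert.h>
#include <stdio.h>

/* Running totals for one file or pipe. */
typedef struct {
    long newlines;
    long characters;
} LC_COUNTS;

/* Fold one chunk of bytes into the running totals. */
void lc_accumulate(LC_COUNTS *counts, const unsigned char *buf, int n)
{
    assert(counts != NULL && buf != NULL && n >= 0);
    for (int i = 0; i < n; ++i) {
        if (buf[i] == '\n')
            counts->newlines++;
    }
    counts->characters += n;
}

/* Print newlines, then characters, on one line of STDOUT. */
void lc_print(const LC_COUNTS *counts)
{
    assert(counts != NULL);
    printf("%ld %ld\n", counts->newlines, counts->characters);
}
```

In the application program, lc_accumulate() would be called once for each chunk returned by myfs_fread(), stopping when myfs_fread() returns zero, followed by a single call to lc_print().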
Deallocation
- Data blocks that are deallocated must have their bit in the
deallocation table turned from one to zero. In addition, the
next references for the blocks are changed to
UNALLOCATED_BLOCK.
- Index nodes that are deallocated must have their content
property set to UNALLOCATED_BLOCK.
- Directory entries that are removed must have their
index_node_reference set to UNALLOCATED_INDEX_NODE and their
name set to the empty string.
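The data block rule above can be sketched on a toy in-memory disk. Everything about the layout here is an assumption made for illustration: the DISK_SKETCH type, the bit ordering within each byte, the number of blocks, and the numeric value given to UNALLOCATED_BLOCK. The real definitions are in myfs.h.

```c
#include <assert.h>
#include <string.h>

/* Assumed values for illustration only; see myfs.h for the real ones. */
#define N_BLOCKS_SKETCH 64
#define UNALLOCATED_BLOCK 0xFFFF

typedef unsigned short BLOCK_REFERENCE;

/* Hypothetical in-memory disk: one table bit and one next link per block. */
typedef struct {
    unsigned char allocation_bits[N_BLOCKS_SKETCH / 8];
    BLOCK_REFERENCE next[N_BLOCKS_SKETCH];
} DISK_SKETCH;

/* Start with every block free and unlinked. */
void disk_sketch_init(DISK_SKETCH *d)
{
    assert(d != NULL);
    memset(d->allocation_bits, 0, sizeof d->allocation_bits);
    for (int i = 0; i < N_BLOCKS_SKETCH; ++i)
        d->next[i] = UNALLOCATED_BLOCK;
}

int is_allocated(const DISK_SKETCH *d, BLOCK_REFERENCE b)
{
    assert(d != NULL && b < N_BLOCKS_SKETCH);
    return (d->allocation_bits[b / 8] >> (b % 8)) & 1;
}

void mark_allocated(DISK_SKETCH *d, BLOCK_REFERENCE b)
{
    assert(d != NULL && b < N_BLOCKS_SKETCH);
    d->allocation_bits[b / 8] |= (unsigned char)(1u << (b % 8));
}

/* Free every block in the chain that starts at `first`.
   Returns the number of blocks that were freed. */
int deallocate_chain(DISK_SKETCH *d, BLOCK_REFERENCE first)
{
    assert(d != NULL);
    int freed = 0;
    BLOCK_REFERENCE b = first;
    while (b != UNALLOCATED_BLOCK) {
        assert(b < N_BLOCKS_SKETCH);
        BLOCK_REFERENCE following = d->next[b]; /* save before overwriting */
        d->allocation_bits[b / 8] &= (unsigned char)~(1u << (b % 8));
        d->next[b] = UNALLOCATED_BLOCK;
        b = following;
        freed++;
    }
    return freed;
}
```

The important detail is saving the next reference before overwriting it; otherwise the rest of the chain would be lost after the first block is cleared.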
Data Structure Examples
[Figure: example of creating a hard link to an existing file]
[Figure: example of creating a pipe and storing data in the pipe]
Examples
Example Interactions
Checklist
Write / Implement the following files / functions.
Supporting Materials
MYFS API (Library)
The documentation for the following can be found in the skeleton file.
- myfs_fopen() (augment)
- myfs_fclose() (augment)
- myfs_fwrite() (augment)
- myfs_fread() (augment)
- myfs_delete_file() (augment)
- myfs_link() (implement)
- myfs_move() (implement)
- myfs_mkp() (implement)
- myfs_list() (augment to place the character '|' after names that correspond to
PIPEs)
Application Programs
Implement:
- myfs_lc
The supplied application programs must be compiled and work properly:
- myfs_format
- myfs_inspect
- myfs_list
- myfs_mkd
- myfs_rmd
- myfs_touch
- myfs_rm
- myfs_append
- myfs_more
- myfs_create
- myfs_link
- myfs_copy
- myfs_move
- myfs_mkp
Submitting Your Program
- Your submission is composed of the files required to compile
all of the executables. Those highlighted in bold are the ones
that you are to create / complete. All others can be handed-in
as provided:
- myfs_format.c
- myfs.h
- myfs_inspect.c
- myfs_lib.c/h
- myfs_lib_support.c/h
- myfs_list.c
- myfs_lc.c
- myfs_mkd.c
- myfs_mkp.c
- myfs_rmd.c
- myfs_stats.c
- myfs_touch.c
- myfs_append.c
- myfs_more.c
- myfs_create.c
- myfs_copy.c
- myfs_link.c
- myfs_move.c
- myfs_rm.c
- vdisk.c/h
- Makefile: does the following:
- all: compiles all executables
- clean: deletes the executable files, any
intermediate files (.o, specifically)
- README.txt: documentation
- Include your name and project number at the top of the file
- Document any Internet resources that you
consulted (URLs). State NONE if there are none.
- Document any peer class members that you
discussed your solution with. Note that you may
not look at or share code that solves this
specific problem. State NONE if there are none.
- Documentation requirements: function-level and inline
documentation are important components of writing code. They help
you to organize your thoughts while programming and help to
communicate the method and the intent of the code to your future
self or to collaborators (of course, you will not have collaborators
in this class). While we will not be specifically grading
documentation in this class, we will not be able to comment on your
code unless it is sufficiently documented. This will be true whether
you are asking for help before the submission deadline or looking
for feedback on your solution after the deadline. In short, you
should take the time to properly document your code as you develop
it.
- Submit your files:
- To be counted as on time, your solution must be submitted to
the Gradescope server by 11:45:00 pm on Tuesday, Dec 8th.
Grade penalties will be imposed for
late submissions (see the syllabus for the details).
Grading Criteria
- README.txt (10%): contains the required information
- Correctness (90%): passes all of the (hidden) unit tests
- If you use non-constant global variables, then 10% will be
deducted from your grade. The exceptions to this rule are the two
global variables that are declared for you by us.
Downloads
The following file contains several header and C files: project4_skel.tar.
Notes:
Hints
- Don't implement everything at once. Instead, get
format and list working first.
- Any debugging code should only print to STDERR. This way, it
won't interfere with our testing procedures.
- Use the debug global variable to turn on debugging
output for many of the provided functions.
- Useful command: myfs_inspect -data # (where # is a
block number)
Addenda
- 2020-11-24: Added a description of the behavior for the
myfs_move executable
- 2020-11-25: Added updated find_entry_in_directory() function to
myfs_lib_support_skel_p4.tar
- myfs_find_directory_hole(): Should return 1 when a hole has
been found, and 0 (zero) if no hole was found. If you have a
fatal error, then you should simply exit. [myfs_create_file()
depends on this behavior].
- myfs_find_index_node_hole(): Should return an index node
reference if it finds one; UNALLOCATED_INDEX_NODE if it does
not find a hole. Fatal errors should exit.
- myfs_move.c added to skel_p4.tar
- 2020-12-05: removed the data block inspect executions for blocks
3 and 4 after the last rm of project4_test_7. These blocks were no longer being
used to store content.
- 2020-12-06: rm1 test on Gradescope changed. When removing a
pipe, you must make sure that its content block is
deallocated. The code that we distributed did not include this
test when deciding to deallocate.
- 2020-12-08: myfs_lc should exit with code zero, even if a file
is not found.
andrewhfagg at gmail.com
Last modified: Tue Dec 8 15:06:18 2020