Project 3 -- POSIX Files, Directories, Permissions, and Signals

Due Tuesday, December 3

(Note that due date is later than originally listed in the class schedule.)

This assignment will be worth 20% of your course grade.

(Note that value is greater than originally listed in the course syllabus.)

NOTE: This assignment, like the other projects in this class, is due at the beginning of the class period. This means that if you are even a minute late, you lose 20%. If you are worried about potentially being late, turn in your assignment ahead of time. Do this by submitting it electronically and then giving the hard copy to me or the TA during office hours, or by sliding it under my office door, within twenty-four hours after the time it is due. Do not send assignments to me through email or leave them in my departmental mailbox.

PART I.

When removing files on some operating systems, the default action is to simply move the file from its original location to a "trash can" or "recycling bin" which the user then needs to "empty" to really delete the file. While UNIX systems could adopt this approach, the typical implementation has instead been to give users the ability to use a single command to remove hard links to files. This means that, if there is only a single hard link to a file (as is typically the case), a user only needs to take a single step to remove his or her access to the file and reassign that file's data blocks to the list of free blocks on the system.

The up side to the standard UNIX method is that it is quick and easy. The down side is that it is quick and easy to do the wrong thing. This is particularly true if the user makes use of shell wildcards (glob patterns) or the -r option to recursively descend into sub-directories. For this reason, the -i flag is available for the rm command. When using this flag, a user is prompted whether or not to remove each file individually.

(As an aside, I encourage you to alias rm to rm -i so that you don't accidentally remove files that you don't intend to. Similarly, I encourage you to alias mv to mv -i and cp to cp -i so that you don't accidentally overwrite files when moving or copying them.)

The down side to rm -i is that it is tedious to use when removing many files, as you are likely to be doing with shell wildcards or the -r option. For this reason, we are going to implement a "remove safe" command, which we will call rms, that will implement a "trash can" approach for UNIX systems.



The Assignment

For this assignment, you will implement rms as follows.

FILES

When a user enters

    rms <filename>
where <filename> is the name of an ordinary file, rms will attempt to move <filename> to the user's "trash can." The trash can will simply be a directory specified by the TRASH environment variable. (Note that you will need to set this variable.)

If there is no file named <filename> in the trash can, rms will move <filename> to the trash can.

If there is already a file named <filename> in the trash can, rms will give the user the message

File <filename> exists in trash can, (o)verwrite, (c)ancel, (r)emove?
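
The file case above can be pictured with a short C sketch. It is only an illustration under stated assumptions, not a required design: the helper names trash_path and move_to_trash are hypothetical, the 4096-byte path buffer is an arbitrary choice, and rename() is assumed to be enough, which holds only when the file and the trash can live on the same file system (otherwise rename() fails with EXDEV and a copy-then-unlink fallback would be needed).

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/* Build "<trash>/<last component of path>" in out.
 * Returns 0 on success, -1 if the name is empty or does not fit. */
int trash_path(const char *trash, const char *path, char *out, size_t outlen)
{
    const char *slash = strrchr(path, '/');
    const char *name = slash ? slash + 1 : path;
    int n;

    if (*name == '\0')      /* e.g. "dir/": not handled in this sketch */
        return -1;
    n = snprintf(out, outlen, "%s/%s", trash, name);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/* Try to move path into the directory named by $TRASH.
 * Returns 0 if moved, 1 if that name already exists in the trash can
 * (the caller should then prompt the user), -1 on any other error. */
int move_to_trash(const char *path)
{
    const char *trash = getenv("TRASH");
    char dest[4096];
    struct stat st;

    if (trash == NULL || trash_path(trash, path, dest, sizeof dest) != 0)
        return -1;
    if (lstat(dest, &st) == 0)  /* something with this name is already there */
        return 1;
    if (errno != ENOENT)
        return -1;
    /* Assumption: same file system; rename() fails with EXDEV otherwise. */
    return rename(path, dest) == 0 ? 0 : -1;
}
```

A caller would print the prompt above when move_to_trash returns 1. Note that checking with lstat() and then calling rename() leaves a small window in which another process could create the destination.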

DIRECTORIES

When a user enters

    rms <dirname>
where <dirname> is the name of a directory, rms will attempt to move the directory and all of its ordinary files to the trash can.

If there is no directory named <dirname> in the trash can, rms will move <dirname> and all of its ordinary files to the trash can.

If there is already a directory named <dirname> in the trash can, rms will check to see if there are files with the same names in <dirname> and in <dirname> in the trash can. If there are no such files, rms will remove <dirname> in the trash can, then move <dirname> and all of its ordinary files to the trash can. If there are such files, rms will give the user the message

Directory <dirname> with conflicting files exists in trash can, (l)ist, (o)verwrite directory, (m)erge overwrite, (s)ave new, (i)ndividual/intermingle, (c)ancel, (r)emove?

RECURSIVE

When a user enters

    rms -r <dirname>
where <dirname> is the name of a directory, rms will attempt to move the directory and all of its ordinary files and subdirectories (recursively including their files and subdirectories) to the trash can.

If there is no directory named <dirname> in the trash can, rms will move <dirname> and all of its ordinary files and subdirectories to the trash can.

If there is already a directory named <dirname> in the trash can, rms will recursively check to see if there are files or subdirectories with the same names in <dirname> and in <dirname> in the trash can. If there are no such files or subdirectories, rms will move <dirname> and all of its ordinary files and subdirectories to the trash can. If there are such files or directories, rms will give the user the message

Directory <dirname> with conflicting files exists in trash can, (l)ist, (o)verwrite directory, (m)erge overwrite, (s)ave new, (u)pdate, (i)ndividual/intermingle, (c)ancel, (r)emove?
  • If the user enters "l", rms will list the path and file names of all of the conflicting files, then return to the message above, except that this time the (l)ist option will not be given.
  • If the user enters "o", rms will remove <dirname> and all of its files and subdirectories from the trash can, then move <dirname> and all of its ordinary files and subdirectories to the trash can.
  • If the user enters "m", rms will keep the ordinary files already in <dirname> in the trash can, then move all of the ordinary files of <dirname> to the trash can, overwriting any files with matching names there. It will also recursively descend into all of the subdirectories, treating them likewise.
  • If the user enters "s", rms will keep the ordinary files already in <dirname> in the trash can, then move to the trash can only those ordinary files of <dirname> whose names do not conflict with files already there, without overwriting anything. It will also recursively descend into all of the subdirectories, treating them likewise.
  • If the user enters "u", rms will keep the ordinary files already in <dirname> in the trash can, then move all of the ordinary files of <dirname> to the trash can, overwriting a file with a matching name only if the file from <dirname> has a more recent modification timestamp than the one already in the trash can. It will also recursively descend into all of the subdirectories, treating them likewise.
  • If the user enters "i", rms will look at the individual files and subdirectories within both directories and act as if it has been called as "rms <filename>" for all filenames in <dirname> and will act as if it has been called as "rms -r <subdirname>" for all subdirectories in <dirname>.
  • If the user enters "c", rms will cancel the remove operation leaving both directories named <dirname> alone.
  • If the user enters "r", rms will remove the ordinary files and subdirectories (recursively) from <dirname> and leave <dirname> in the trash can alone.
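
The (m)erge overwrite, (s)ave new, and (u)pdate options differ only in when a file already in the trash can gets replaced. The following C sketch captures that one decision. The function name should_overwrite is hypothetical, and comparing st_mtime at one-second resolution is an assumption made for portability; nothing in this handout requires it.

```c
#define _XOPEN_SOURCE 700
#include <sys/stat.h>

/* Decide whether a file from <dirname> should replace the file with the
 * same name in the trash can.
 * choice is the letter the user typed: 'm', 's', or 'u'.
 * src and dst hold the stat information of the two files.
 * Returns 1 to overwrite, 0 to keep the trash copy, -1 for other choices. */
int should_overwrite(char choice, const struct stat *src, const struct stat *dst)
{
    switch (choice) {
    case 'm':   /* merge overwrite: the incoming file always wins */
        return 1;
    case 's':   /* save new: never touch what is already in the trash */
        return 0;
    case 'u':   /* update: overwrite only if the incoming file is newer */
        return src->st_mtime > dst->st_mtime;
    default:
        return -1;
    }
}
```

In use, the two struct stat values would come from lstat() calls on the source file and on its counterpart in the trash can.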

IMPLEMENTATION

    Recursion

    For the non-recursive parts, you may implement rms as best you know how. For the recursive option to rms, however, you will have your code recursively fork children, one for each subdirectory found within the subtree starting at <dirname>, each one checking for conflicting file and directory names between its own directory and the corresponding subdirectory in the trash can. As soon as any of these processes (including the original parent rms and all of its descendants) finds a conflict, it will signal the rest of the processes to stop searching for conflicts. The original parent rms will then print the message listed above.
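
To make the fork-per-subdirectory search more concrete, here is a minimal C sketch. It rests on several assumptions that are not part of this handout: the names install_stop_handler and find_conflict are hypothetical; SIGUSR1 is an arbitrary choice of signal; kill(0, ...) reaches exactly the rms processes only if rms runs in its own process group (as it does when started from an interactive shell with job control); and a conflict is taken to mean a same-named entry in both places unless both entries are directories, which are descended into instead.

```c
#define _XOPEN_SOURCE 700
#include <dirent.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t stop_requested = 0;

static void on_stop(int sig) { (void)sig; stop_requested = 1; }

/* Install the SIGUSR1 handler; children inherit it across fork(). */
int install_stop_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_stop;
    sa.sa_flags = SA_RESTART;   /* so wait() is restarted, not interrupted */
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Compare directory src with dst, its counterpart in the trash can.
 * Returns 1 if a conflict was found here or below (or a stop was
 * signaled), 0 if none was found, -1 if src cannot be opened. */
int find_conflict(const char *src, const char *dst)
{
    DIR *dir = opendir(src);
    struct dirent *e;
    int found = 0, children = 0, status;

    if (dir == NULL)
        return -1;
    while (!stop_requested && !found && (e = readdir(dir)) != NULL) {
        char s[4096], d[4096];
        struct stat ss, ds;
        int dst_exists;

        if (strcmp(e->d_name, ".") == 0 || strcmp(e->d_name, "..") == 0)
            continue;
        snprintf(s, sizeof s, "%s/%s", src, e->d_name);
        snprintf(d, sizeof d, "%s/%s", dst, e->d_name);
        if (lstat(s, &ss) != 0)     /* lstat: symbolic links are not followed */
            continue;
        dst_exists = (lstat(d, &ds) == 0);
        if (dst_exists && !(S_ISDIR(ss.st_mode) && S_ISDIR(ds.st_mode))) {
            found = 1;
            kill(0, SIGUSR1);       /* signal every process in our group */
        } else if (S_ISDIR(ss.st_mode)) {
            pid_t pid = fork();     /* one child per subdirectory */
            if (pid == 0)
                _exit(find_conflict(s, d) == 1 ? 1 : 0);
            if (pid > 0)
                children++;
        }
    }
    closedir(dir);
    while (children-- > 0)          /* collect each child's verdict */
        if (wait(&status) > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 1)
            found = 1;
    return found || stop_requested;
}
```

The original parent would call install_stop_handler() once, then find_conflict() on <dirname> and its counterpart in the trash can. The sketch stops at detection; continuing the search after (l)ist is left out.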

    If the user selects (l)ist, all of these processes will continue where they left off, searching for conflicting file and subdirectory names; however, now they will print the path and file names of all conflicts that they find, rather than signaling the rest of the processes to stop searching for conflicts. (Don't worry about the order in which these names are printed.) A process that had already found a conflict and signaled the rest of the processes to stop searching will print that name before continuing with its recursive search.

    When a process finishes working on its directory, it will wait for its children, then exit, except for the original parent rms, which will continue carrying out the duties of rms (moving the files and subdirectories or prompting the user, depending on whether conflicts are found or not).

    If the user selects (o)verwrite directory, (m)erge overwrite, (s)ave new, (u)pdate, (i)ndividual/intermingle, or (r)emove, the original parent rms will kill all of its searching descendants that may still be running, then carry out whichever of these options the user has selected by recursively creating a child for each subdirectory found.

    If the user selects (c)ancel, the original parent rms will kill all of its searching descendants that may still be running, then exit.
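
Killing the searchers can be sketched as follows. This is a simplified illustration with assumptions of its own: stop_children is a hypothetical name, the parent is assumed to have recorded its children's process IDs at fork time, and SIGTERM is an arbitrary choice of signal. It reaches direct children only; to reach deeper descendants, each child would have to forward the signal to its own children, or all of the processes would have to share a process group.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Send SIGTERM to each recorded child, then reap it so that no zombie
 * is left behind. Returns the number of children successfully reaped. */
int stop_children(const pid_t *pids, size_t n)
{
    size_t i;
    int reaped = 0;

    for (i = 0; i < n; i++)
        kill(pids[i], SIGTERM);     /* harmless if the child already exited */
    for (i = 0; i < n; i++)
        if (waitpid(pids[i], NULL, 0) == pids[i])
            reaped++;
    return reaped;
}
```

Signaling every child before waiting for any of them lets the children shut down in parallel rather than one at a time.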

    Error Checking

    Before anything else, all versions of rms should check to be sure that the directory from which they are to remove files is not the trash can or a subdirectory of the trash can. Similarly, rms -r should check to be sure that the trash can is not a subdirectory of the directory from which it is to remove files. Further, rms -r should not follow symbolic (soft) links.
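
One possible way to perform these containment checks is to canonicalize both paths and compare them component-wise, as in the C sketch below. The names is_within and safe_to_remove are hypothetical, and the use of realpath() with a NULL buffer (POSIX.1-2008) is an assumption about your platform. realpath() resolves symbolic links in the two paths being compared, which is what a containment test needs; it is separate from the rule that rms -r must not follow links while descending.

```c
#define _XOPEN_SOURCE 700
#include <stdlib.h>
#include <string.h>

/* Returns 1 if path is dir itself or lies beneath it, 0 otherwise.
 * Both arguments must already be canonical absolute paths. */
int is_within(const char *dir, const char *path)
{
    size_t n = strlen(dir);

    if (strcmp(dir, "/") == 0)      /* everything lies beneath the root */
        return 1;
    /* Require a component boundary so "/a/trash2" is not inside "/a/trash". */
    return strncmp(dir, path, n) == 0 && (path[n] == '\0' || path[n] == '/');
}

/* Returns 1 if it is safe to remove files from directory target,
 * 0 if it is not, -1 if $TRASH is unset or a path cannot be resolved. */
int safe_to_remove(const char *target, int recursive)
{
    const char *env = getenv("TRASH");
    char *trash, *dir;
    int ok;

    if (env == NULL)
        return -1;
    trash = realpath(env, NULL);    /* malloc'ed canonical paths */
    dir = realpath(target, NULL);
    if (trash == NULL || dir == NULL) {
        free(trash);
        free(dir);
        return -1;
    }
    /* The directory must not be the trash can or inside it; with -r the
     * trash can must also not be inside the directory. */
    ok = !is_within(trash, dir) && !(recursive && is_within(dir, trash));
    free(trash);
    free(dir);
    return ok;
}
```

A plain string-prefix test without the boundary check would wrongly treat a sibling directory such as trash2 as part of trash, which is why is_within looks at the character following the prefix.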



    PART II.

    Just as rms might be handy, some people might find a similar "move safe" command to be handy. For PART II, you will create mvs which will behave almost identically to rms. The main difference will be that mvs will be able to move files between two arbitrary (non-nested for mvs -r) directories, not just between an arbitrary directory and the trash can. Other minor differences, such as giving the message "File <filename> exists in destination directory ..." rather than "File <filename> exists in trash can ...", should be intuitive.



    PART III.

    Just as rms and mvs might be handy, some people might find a similar "copy safe" command to be handy. For PART III, you will create cps which will behave very similarly to mvs. The main difference will be that cps will not remove the files or subdirectories from the source directory. Other minor differences, such as not giving the (r)emove option, should be intuitive.
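
Because cps leaves the source in place, it cannot rely on rename() and has to copy file contents. A minimal C sketch of such a copy is given below. The name copy_file, the 8192-byte buffer, and the use of O_EXCL to refuse to overwrite an existing destination are all assumptions of this illustration; a real cps would prompt the user on a conflict instead of simply failing.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy the ordinary file src to dst without overwriting dst.
 * Returns 0 on success, -1 on error (errno is EEXIST if dst exists). */
int copy_file(const char *src, const char *dst)
{
    char buf[8192];
    struct stat st;
    ssize_t got;
    int in, out, rc = 0;

    in = open(src, O_RDONLY);
    if (in < 0)
        return -1;
    if (fstat(in, &st) != 0) {
        close(in);
        return -1;
    }
    /* O_EXCL makes open() fail if dst already exists; the permission
     * bits are taken from the source file. */
    out = open(dst, O_WRONLY | O_CREAT | O_EXCL, st.st_mode & 0777);
    if (out < 0) {
        close(in);
        return -1;
    }
    while (rc == 0 && (got = read(in, buf, sizeof buf)) != 0) {
        char *p = buf;

        if (got < 0) {              /* read error */
            rc = -1;
            break;
        }
        while (got > 0) {           /* write() may write less than asked */
            ssize_t put = write(out, p, (size_t)got);
            if (put < 0) {
                rc = -1;
                break;
            }
            p += put;
            got -= put;
        }
    }
    if (close(out) != 0)
        rc = -1;
    close(in);
    return rc;
}
```

The inner loop matters because write() is allowed to write fewer bytes than requested; ignoring its return value can silently truncate the copy.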

    Notes on this assignment

    All of the children created by rms, mvs, and cps should be created using fork.



    What to turn in.

    You will turn in both a hard copy and an electronic copy of your assignment. Please follow the instructions on how to send electronic copies. Do not send them to my email address.

    Both the hard copy and the electronic copy will contain a write-up and all source code you used in this project. The electronic copy will also contain the executable version of your code. The electronic copy of your write-up should not be in a proprietary format (such as MS Word); it should be either in plain ASCII text or in a portable format (such as Postscript or PDF). Your source code should be either in three files called rms.c, mvs.c, and cps.c (or rms.cxx, mvs.cxx, and cps.cxx) or in several files that can be compiled and linked using make. In the latter case, rms.c, mvs.c, and cps.c (or rms.cxx, mvs.cxx, and cps.cxx) will contain the function main(), and you will include your makefile, which will be named makefile. Your executables should be called rms, mvs, and cps.

    Your source code should be well structured and well commented. It should conform to good coding standards (e.g., no memory leaks).

    Your write-up will include 1/2 to 1 page (roughly 80 characters per line, 50 lines per page) explaining the data structures and algorithms used in your code. This page limitation does not include figures used in your explanation, which are encouraged and may take up any amount of space. (This explanation does not remove the requirement that your code be well commented.)



    Other

    You may write your program from scratch or may start from programs for which the source code is freely available on the web or through other sources (such as friends or student organizations). If you do not start from scratch, you must give a complete and accurate accounting of where all of your code came from and indicate which parts are original or changed, and which you got from which other source. Failure to give credit where credit is due is academic fraud and will be dealt with accordingly.

    As noted in the syllabus, you are required to work on this programming assignment in a group of at least two people. It is your responsibility to find other group members and work with them. The group should turn in only one (1) hard copy and one (1) electronic copy of the assignment. Both the electronic and hard copies should contain the names and student ID numbers of all group members. If your group composition changes during the course of working on this assignment (for example, a group of five splits into a group of two and a separate group of three), this must be clearly indicated in your write-up, including the names and student ID numbers of everyone involved.

    Each group member is required to contribute equally to each project, as far as is possible. You must thoroughly document which group members were involved in each part of the project. For example, if you have three functions in your program and one function was written by group member one, the second was written by group member two, and the third was written jointly and equally by group members three and four, both your write-up and the comments in your code must clearly indicate this division of labor.