Project 1 -- Standard I/O and Device Drivers

Due Tuesday, September 18

(Note that due date is later than originally listed in the class schedule.)

NOTE: This assignment, like the other projects in this class, is due at the beginning of the class period. If you are even a minute late, you lose 20%. If you are worried about being late, turn in your work ahead of time: submit it electronically, then give the hard copy to me or the TA during office hours, or slide it under my office door, within twenty-four hours after the time it is due. Do not send assignments to me through email or leave them in my departmental mailbox.

As discussed in class, there are several types of I/O devices. One axis that we can use to divide these devices into categories is character-oriented vs. block-oriented. Another axis that we can use is communication vs. storage.

Another topic we discussed in class was the use of the O/S to overlap device usage to maximize system throughput.

This assignment will investigate the use of two different devices and the effects of I/O overlap on system throughput.



The Assignment

First, write a program that reads in a file once and writes it to the standard output 1000 times, one character at a time. This program should get the name of this file from the command line. This program will be called out_test1, so the syntax for calling your program will be out_test1 pathname, where pathname is replaced by the absolute or relative pathname of the file the user wants to have read. This program should also record the total amount of time it takes the system to write out the file 1000 times.
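
The core of out_test1 might be sketched as follows, assuming the whole input file fits in memory. The helper names read_whole_file and timed_write are hypothetical, and timing with the C11 function timespec_get is only one possible choice, not something the assignment prescribes.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Read an entire file into a heap buffer, one character at a time.
 * Returns NULL on failure; the caller frees the buffer.
 * (Hypothetical helper, not part of the assignment text.) */
char *read_whole_file(const char *path, size_t *len_out)
{
    FILE *in = fopen(path, "rb");
    if (in == NULL)
        return NULL;

    size_t cap = 4096, len = 0;
    char *buf = malloc(cap);
    int c;
    while (buf != NULL && (c = getc(in)) != EOF) {
        if (len == cap) {                 /* grow the buffer when it is full */
            char *bigger = realloc(buf, cap * 2);
            if (bigger == NULL) {
                free(buf);
                buf = NULL;
                break;
            }
            buf = bigger;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }
    if (buf != NULL && ferror(in)) {      /* a read error, not a normal EOF */
        free(buf);
        buf = NULL;
    }
    fclose(in);
    *len_out = len;
    return buf;
}

/* Write buf to out `passes` times, one character at a time.
 * Returns elapsed wall-clock seconds, or -1.0 on error. */
double timed_write(const char *buf, size_t len, int passes, FILE *out)
{
    struct timespec start, end;
    if (timespec_get(&start, TIME_UTC) == 0)
        return -1.0;
    for (int p = 0; p < passes; p++)
        for (size_t i = 0; i < len; i++)
            if (putc(buf[i], out) == EOF)
                return -1.0;
    if (fflush(out) == EOF)               /* include buffered data in the timing */
        return -1.0;
    if (timespec_get(&end, TIME_UTC) == 0)
        return -1.0;
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

In main, the program would call read_whole_file(argv[1], &len) and then timed_write(buf, len, 1000, stdout). Whether to flush before stopping the clock is a design decision that affects what is being measured, so it is worth stating in the write-up.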

Second, modify out_test1 so that it runs through the writing process once, sleeps for 5 seconds, then runs again, sleeps for 5 seconds, etc., until it has run 100 times. Call this new program out_test2. As it runs, out_test2 should collect the time data for all 100 runs. On completion, out_test2 should have 100 sets of time data for 1000 writes of the file each, and it should write this data to standard error.
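
One way to structure the run/sleep/collect loop is sketched below. The names timed_runs and report_times are hypothetical. The sketch pauses only between consecutive runs, which is one reading of the description above, and it uses the POSIX sleep() function purely for pausing; all I/O still goes through C Standard I/O.

```c
#include <stdio.h>
#include <time.h>
#include <unistd.h>   /* sleep() is POSIX; used only to pause, not for I/O */

/* Perform `runs` timed runs. Each run writes buf `writes` times,
 * one character at a time, and its elapsed time is stored in times[].
 * Returns 0 on success, -1 on a write error. (Hypothetical helper.) */
int timed_runs(const char *buf, size_t len, int writes, int runs,
               unsigned pause_seconds, FILE *out, double *times)
{
    for (int r = 0; r < runs; r++) {
        struct timespec start, end;
        timespec_get(&start, TIME_UTC);
        for (int w = 0; w < writes; w++)
            for (size_t i = 0; i < len; i++)
                if (putc(buf[i], out) == EOF)
                    return -1;
        if (fflush(out) == EOF)
            return -1;
        timespec_get(&end, TIME_UTC);
        times[r] = (double)(end.tv_sec - start.tv_sec)
                 + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
        if (r + 1 < runs)                 /* no pause needed after the last run */
            sleep(pause_seconds);
    }
    return 0;
}

/* Print one line of timing data per run to the given stream. */
void report_times(FILE *err, const double *times, int runs)
{
    for (int r = 0; r < runs; r++)
        fprintf(err, "run %3d: %.6f s\n", r + 1, times[r]);
}
```

For out_test2 the call would look like timed_runs(buf, len, 1000, 100, 5, stdout, times) followed by report_times(stderr, times, 100). Storing the times in an array and printing them only at the end keeps the reporting itself out of the measured interval.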

Third, rewrite out_test1 so that it writes the contents of the file to a new file (rather than the standard output) 1000 times, one character at a time. Call this new program out_test3. The name of the new file will be given by the user on the command line. The syntax for out_test3 will be out_test3 input_pathname output_pathname, where input_pathname is the name of the file to be read in and output_pathname is the name of the new file to create. Have your program check to ensure that output_pathname refers to a new file -- don't allow out_test3 to overwrite an existing file.
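
The "new file only" check can be done within C Standard I/O, assuming a C11 library: the "x" flag in the fopen mode string makes the call fail if the file already exists. The wrapper name open_new_file is hypothetical.

```c
#include <stdio.h>

/* Open path for writing only if it does not already exist.
 * Returns NULL if the file exists or cannot be created.
 * Assumes a C11 library; the 'x' mode flag is not available before C11. */
FILE *open_new_file(const char *path)
{
    /* "wx": create for writing, fail instead of truncating an existing file */
    return fopen(path, "wx");
}
```

On an older library, one fallback is to try fopen(path, "r") first and refuse to continue if it succeeds, although that leaves a small window in which another process could create the file between the check and the open.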

Fourth, modify out_test3 so that it runs through the writing process once, sleeps for 5 seconds, then runs again, sleeps for 5 seconds, etc., until it has run 100 times. Call this new program out_test4. As it runs, out_test4 should collect the time data for all 100 runs. On completion, out_test4 should have 100 sets of time data for 1000 writes of the file each, and it should write this data to standard error. After each complete write of the file, out_test4 should back up to the start of the file to write over it again, so that the output file never grows larger than the input file. However, again have your program check to ensure that output_pathname refers to a new file the first time through -- don't allow out_test4 to overwrite an existing file on its first run.
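
Backing up to the start of the output file before each pass can be sketched with fseek, as below. The function name write_in_place is hypothetical, and the sketch assumes the stream was opened for writing on a seekable file.

```c
#include <stdio.h>

/* Write buf to out `passes` times, one character at a time, seeking
 * back to the start before each pass so the file never grows beyond
 * len bytes. Returns 0 on success, -1 on error. (Hypothetical helper.) */
int write_in_place(const char *buf, size_t len, int passes, FILE *out)
{
    for (int p = 0; p < passes; p++) {
        if (fseek(out, 0L, SEEK_SET) != 0)   /* back up to the start of the file */
            return -1;
        for (size_t i = 0; i < len; i++)
            if (putc(buf[i], out) == EOF)
                return -1;
    }
    return fflush(out) == EOF ? -1 : 0;
}
```

fseek is used rather than rewind because fseek reports failure through its return value, while rewind returns nothing and clears the stream's error indicator.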

Fifth, combine out_test2 and out_test4 into a new program, out_test5. This program should read in the file, then write out the contents of it to both the standard output and a file, one character at a time. It should send each character to both places -- this means it should write the first character to the standard output, then write that same character to the new file, then write the second character to standard out, then write that same character to the new file, etc. As with out_test2 and out_test4, out_test5 should run through the writing process once, sleep for 5 seconds, then run again, sleep for 5 seconds, etc., until it has run 100 times, should collect 100 sets of time data, and should write this data to standard error.
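
The character-by-character alternation between the two destinations can be sketched as follows. The function names are hypothetical; in the real program `first` would be stdout and `second` the newly created output file. A sequential variant is included only for contrast, since the next step asks for that ordering.

```c
#include <stdio.h>

/* Send each character to both streams before moving to the next one:
 * c0 -> first, c0 -> second, c1 -> first, c1 -> second, ...
 * Returns 0 on success, -1 on a write error. (Hypothetical helper.) */
int write_interleaved(const char *buf, size_t len, FILE *first, FILE *second)
{
    for (size_t i = 0; i < len; i++) {
        if (putc(buf[i], first) == EOF)
            return -1;
        if (putc(buf[i], second) == EOF)
            return -1;
    }
    return 0;
}

/* Write the complete buffer to first, then the complete buffer to second. */
int write_sequential(const char *buf, size_t len, FILE *first, FILE *second)
{
    for (size_t i = 0; i < len; i++)
        if (putc(buf[i], first) == EOF)
            return -1;
    for (size_t i = 0; i < len; i++)
        if (putc(buf[i], second) == EOF)
            return -1;
    return 0;
}
```

Both functions produce identical output on each stream; only the order in which the two devices are asked to do work differs, which is the variable the timing comparison is meant to expose.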

Sixth, modify out_test5 so that it writes the complete file to the standard output one character at a time, then writes the complete file to output_pathname one character at a time, then sleeps for 5 seconds, etc. Call this new program out_test6. It should still run 100 times, should still collect 100 sets of time data, and should still write this data to standard error.

Seventh, make predictions to answer the following questions. For each prediction, give a reason for the prediction you made.

Eighth, run your programs, collect the data they produce, and compare these empirical results to the predictions you made. If there are any differences between your predictions and your results, give reasons why you think things turned out the way they did.



Notes on this assignment

You won't be graded on how many predictions you got right. You will be graded on whether your reasons were sound, either at the prediction stage (for correct predictions) or the comparison stage (for incorrect predictions).

All of your input and output in this assignment should use C Standard I/O function calls, not POSIX system calls.



First notes on empirical testing.

Unfortunately, many of you will not have had any experience doing empirical experimentation. People tend to forget that computer science is an empirical science, as well as a mathematical discipline and an engineering field. For this reason I will provide you with some guidelines for how to test for statistical significance. I'll let you know when these guidelines are available. For now, though, work on getting your code functioning.

To minimize noise caused by network traffic and disk use by other students, you should consider using the local hard disk of the machine you are using. The /tmp directory should be on a local disk and accessible by everyone. Note that you should not plan to keep anything on /tmp long term. This includes your code (both source code and executable). These should be kept in your own directories. However, you can specify a filename on /tmp, both on the command line and for output redirection. You should make sure the names that you give your files on /tmp are unique (possibly including your username) so that you don't have problems with other files there by the same name.
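
If you would rather build the /tmp filename in a script or helper than type it by hand, a sketch like the following produces a per-user name. Everything here is illustrative: the function names, the "_<tag>.out" naming pattern, and the use of the USER environment variable (set by most Unix shells, but not guaranteed) are assumptions, not requirements.

```c
#include <stdio.h>
#include <stdlib.h>

/* Build "/tmp/<user>_<tag>.out" in dst.
 * Returns the length written, or -1 if the name does not fit.
 * (Hypothetical helper and naming pattern.) */
int build_tmp_name(char *dst, size_t cap, const char *user, const char *tag)
{
    int n = snprintf(dst, cap, "/tmp/%s_%s.out", user, tag);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Look up the username from the environment, with a fallback.
 * Relying on USER is an assumption about the login environment. */
const char *current_user(void)
{
    const char *u = getenv("USER");
    return (u != NULL && u[0] != '\0') ? u : "unknown";
}
```

The resulting name can then be passed as output_pathname on the command line or used as the target of output redirection.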



What to turn in.

You will turn in both a hard copy and an electronic copy of your assignment. You will be given instructions on how to send electronic copies. Do not send them to me through email.

Both the hard copy and the electronic copy will contain a write-up (see predictions and results, above) and all source code you used in collecting your results. The electronic copy will also contain the six executable versions of your code. The electronic copy of your write-up should not be in a proprietary format (such as MS Word); it should be either in plain ASCII text or in a portable format (such as Postscript or PDF). Your source code should be in a single file called out_testi.c and your executable code should be called out_testi, where i is 1 through 6 for the six versions.

Your source code should be well structured and well commented. It should conform to good coding standards (e.g., no memory leaks).

Besides the statistics and explanations mentioned above, your write-up will include 1/2 to 1 page (roughly 80 characters per line, 50 lines per page) explaining the data structures and algorithms used in your code. This page limitation does not include figures used in your explanation, which are encouraged and may take up any amount of space. (This explanation does not remove the requirement that your code be well commented.)



Other

You may write your program from scratch or may start from programs for which the source code is freely available on the web or through other sources (such as friends or student organizations). If you do not start from scratch, you must give a complete and accurate accounting of where all of your code came from and indicate which parts are original or changed, and which you got from which other source. Failure to give credit where credit is due is academic fraud and will be dealt with accordingly.

As noted in the syllabus, you are required to work on this programming assignment in a group of at least two people. It is your responsibility to find other group members and work with them. The group should turn in only one (1) hard copy and one (1) electronic copy of the assignment. Both the electronic and hard copies should contain the names and student ID numbers of all group members. If your group composition changes during the course of working on this assignment (for example, a group of five splits into a group of two and a separate group of three), this must be clearly indicated in your write-up, including the names and student ID numbers of everyone involved.

Some useful commands and system calls for you to consider using in this assignment (besides the ones covered in class) are: