Project 1 -- I/O and Devices

Electronic Copy Due Tuesday, September 18, 1:00 pm

Hard Copy Due Tuesday, September 18, 1:30 pm

NOTE: This assignment, like the other projects in this class, is due at a particular time, listed above. This means that if you are even a minute late, you lose 20%. If you are worried about potentially being late, turn in your project ahead of time. Do this by submitting it electronically before it is due and giving the hard copy to me during office hours or by sliding it under my office door before it is due. Do not send assignments to my personal email address. Do not leave hard copies in my departmental mailbox or attempt to give them to departmental staff (who cannot and will not accept them).

As discussed in class, disk I/O scheduling can have a dramatic impact on system performance. In particular, various disk scheduling algorithms can affect performance metrics including average response time, maximum response time, response time variance, and total system throughput. To understand these algorithms, we would like to visualize their behavior.



The Assignment

You are to write a program to graph the behavior of several disk scheduling algorithms we have covered in class.

  1. Part one of the assignment is to write a program called dsg (for Disk Schedule Graph) to graph the behavior for each disk scheduling algorithm on a given set of I/O requests.

    Scheduling Algorithms
    The scheduling algorithms to be covered by dsg are:
    • FCFS
    • SSTF
    • SCAN
    • LOOK
    • C-SCAN
    • C-LOOK
    • FSCAN
    • N-step SCAN
    For all algorithms, dsg should assume that the disk head starts on Track 0, Sector 0 when the first request comes in and that the disk never spins down (that is, it never stops spinning), but that the arm will stop moving if there are no requests to service at a given time.
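
    To illustrate how one of these policies chooses its next request, here is a minimal C++ sketch of the SSTF selection step only. The Request struct, its field names, and the function pick_sstf are hypothetical names chosen for this example, and the tie-breaking rule (earliest arrival wins) is an assumption, since the assignment does not specify one.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical request record; the field names are illustrative only.
struct Request {
    int track;
    int sector;
    int platter;
};

// Returns the index of the pending request whose track is closest to
// current_track (the SSTF choice). Ties go to the request that appears
// first in the vector, i.e. the one that arrived first; this tie-break
// is an assumption, not part of the assignment.
// Precondition: pending is not empty.
std::size_t pick_sstf(const std::vector<Request>& pending, int current_track) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < pending.size(); ++i) {
        int d_i = std::abs(pending[i].track - current_track);
        int d_best = std::abs(pending[best].track - current_track);
        if (d_i < d_best) {
            best = i;
        }
    }
    return best;
}
```

    The other algorithms differ mainly in this selection step (for example, FCFS always takes index 0), so keeping the selection in its own function may make it easier to share the rest of the simulation code.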

    Invocation
    dsg is invoked from the command line as follows:
    dsg -c CONFIGFILE -r REQUESTFILE
    dsg --config=CONFIGFILE --request=REQUESTFILE
    dsg reads from standard input if a single hyphen-minus (-) is given for either filename.
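
    One possible way to accept both the short and the long option forms is getopt_long. The sketch below assumes a platform that provides getopt_long in <getopt.h> (for example glibc, the BSDs, or macOS); the Options struct and parse_args function are hypothetical names, and the sketch only collects the two filenames without opening them.

```cpp
#include <getopt.h>
#include <string>

// Hypothetical holder for the two file names; "-" means standard input.
struct Options {
    std::string config_file;
    std::string request_file;
};

// Parses "-c FILE -r FILE" and "--config=FILE --request=FILE".
// Returns true only if both options were supplied and nothing was
// unrecognized.
bool parse_args(int argc, char* argv[], Options& out) {
    static const struct option long_opts[] = {
        {"config",  required_argument, nullptr, 'c'},
        {"request", required_argument, nullptr, 'r'},
        {nullptr, 0, nullptr, 0}
    };
    optind = 1;  // restart scanning so the function can be called again
    int ch;
    while ((ch = getopt_long(argc, argv, "c:r:", long_opts, nullptr)) != -1) {
        switch (ch) {
            case 'c': out.config_file = optarg; break;
            case 'r': out.request_file = optarg; break;
            default:  return false;  // unknown option or missing argument
        }
    }
    return !out.config_file.empty() && !out.request_file.empty();
}
```

    A caller would then compare each stored name with "-" to decide whether to read that file from standard input.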

    Configuration File
    The configuration file (CONFIGFILE) specifies information about the disk drive being modeled. Each line of the file specifies one value of this configuration, as follows:
    Line 1, number of tracks. (Integer)
    The disk has the number of tracks specified; default is 1024
    Line 2, number of sectors. (Integer)
    The disk has the number of sectors specified; default is 512
    Line 3, number of platters. (Integer)
    The disk has the number of platters specified; default is 8
    Line 4, maximum rotational-delay. (Integer)
    The disk has the maximum rotational delay specified; default is 4 ms
    Line 5, arm start time. (Integer)
    The arm of the disk has the movement start time specified; default is 4 ms
    Line 6, arm settle time. (Integer)
    The arm of the disk has the settling time specified; default is 4 ms
    Line 7, maximum arm move time. (Integer)
    The arm of the disk has the maximum move time specified; default is 8 ms
    Line 8, data-transfer-time. (Float)
    The time to transfer one sector’s worth of data (i.e., to read or write one sector); default is maximum rotational delay divided by number of sectors. If a value for data transfer time is specified that is less than the default value, the user is notified that the parameter value is invalid and dsg exits normally with a return value of 1.
    Line 9, algorithm. (String)
    The disk scheduling algorithm to use. (Recognized values are FCFS, SSTF, SCAN, LOOK, C-SCAN, C-LOOK, FSCAN, N-step); default is FCFS
    Line 10, N. (Integer)
    The size of N for N-step SCAN; default is 16. Note that this line will only be present if line 9 is N-step.
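
    The line-by-line layout above could be read roughly as follows. This is a sketch under stated assumptions, not a required design: the DiskConfig struct, read_field, and load_config are hypothetical names; an empty or missing line is assumed to mean "use the default"; and a negative data transfer time is used internally as a "not given" marker.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical in-memory form of CONFIGFILE, preloaded with the defaults
// listed in the assignment.
struct DiskConfig {
    int tracks = 1024;
    int sectors = 512;
    int platters = 8;
    int max_rot_delay_ms = 4;
    int arm_start_ms = 4;
    int arm_settle_ms = 4;
    int max_arm_move_ms = 8;
    double transfer_ms = -1.0;  // negative: not given, derive it below
    std::string algorithm = "FCFS";
    int n_step = 16;
};

// Reads one line and stores its first token in value. An empty, missing,
// or unparsable line leaves the default in place (an assumption about how
// defaults are requested).
template <typename T>
void read_field(std::istream& in, T& value) {
    std::string line;
    if (std::getline(in, line) && !line.empty()) {
        std::istringstream fields(line);
        T parsed{};
        if (fields >> parsed) {
            value = parsed;
        }
    }
}

// Returns 0 on success, or 1 if the data transfer time is smaller than
// the default (maximum rotational delay divided by number of sectors).
int load_config(std::istream& in, DiskConfig& cfg) {
    read_field(in, cfg.tracks);
    read_field(in, cfg.sectors);
    read_field(in, cfg.platters);
    read_field(in, cfg.max_rot_delay_ms);
    read_field(in, cfg.arm_start_ms);
    read_field(in, cfg.arm_settle_ms);
    read_field(in, cfg.max_arm_move_ms);
    read_field(in, cfg.transfer_ms);
    read_field(in, cfg.algorithm);
    if (cfg.algorithm == "N-step") {
        read_field(in, cfg.n_step);  // line 10 exists only for N-step
    }
    const double default_transfer =
        static_cast<double>(cfg.max_rot_delay_ms) / cfg.sectors;
    if (cfg.transfer_ms < 0.0) {
        cfg.transfer_ms = default_transfer;
    } else if (cfg.transfer_ms < default_transfer) {
        return 1;  // caller reports the invalid parameter and exits with 1
    }
    return 0;
}
```

    Because load_config takes a std::istream, the same function can be handed either an opened file or std::cin when the filename is "-".
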

    Request File
    The request file (REQUESTFILE) specifies the sequence of requests received. Each line specifies the requests received at a particular time, starting at time 0 and incrementing by 1 ms for each line including the first. The format for each request is
    t, s, p
    where t is the requested track, s is the requested sector, and p is the requested platter. If the requested track, sector, and/or platter is beyond the capacity of the drive as specified in the configuration, dsg should send an error message to standard output to notify the user of this fact and exit normally with a return value of 2. Multiple requests may be specified on a single line, indicating that they arrived within the same 1 ms time period. Such requests will be separated by a semicolon (;). Blank lines are allowed, indicating that no new requests arrived within that time period. Note that the period/full stop character (.) at the end of any line is the indicator to your program that the data is complete and the behavior of the system should be graphed.
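
    A single line of the request file might be parsed along these lines. The sketch is illustrative only: Request, parse_request_line, and in_range are hypothetical names, and treating track, sector, and platter numbers as 0-based is an assumption (suggested by the head starting on Track 0, Sector 0, but not stated outright).

```cpp
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical request record; the field names are illustrative only.
struct Request {
    int track;
    int sector;
    int platter;
};

// Parses one line such as "12, 3, 0; 40, 7, 1." and appends its requests
// to out. Sets done to true when the line ends with '.', the end-of-data
// marker. A blank line is valid and adds nothing. Returns false if any
// request on the line is malformed.
bool parse_request_line(std::string line, std::vector<Request>& out, bool& done) {
    done = false;
    // Drop trailing whitespace, then detect and strip the final period.
    while (!line.empty() &&
           std::isspace(static_cast<unsigned char>(line.back()))) {
        line.pop_back();
    }
    if (!line.empty() && line.back() == '.') {
        done = true;
        line.pop_back();
    }
    std::istringstream parts(line);
    std::string item;
    while (std::getline(parts, item, ';')) {
        if (item.find_first_not_of(" \t") == std::string::npos) {
            continue;  // nothing between separators
        }
        std::istringstream fields(item);
        Request r{};
        char c1 = 0, c2 = 0;
        if (!(fields >> r.track >> c1 >> r.sector >> c2 >> r.platter) ||
            c1 != ',' || c2 != ',') {
            return false;
        }
        out.push_back(r);
    }
    return true;
}

// True when the request fits the drive. Indices are assumed to be 0-based.
bool in_range(const Request& r, int tracks, int sectors, int platters) {
    return r.track >= 0 && r.track < tracks &&
           r.sector >= 0 && r.sector < sectors &&
           r.platter >= 0 && r.platter < platters;
}
```

    When in_range returns false, the caller would print the error message to standard output and exit with a return value of 2, as described above.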

    Graphs
    Your program should create two graphs, the sector graph and the track graph, as follows.

    Sector Graph
    The sector graph will plot the sector under the read/write head of the disk versus time, highlighting when data transfers are actually taking place. That is, the x-axis of the graph will show time in ms, while the y-axis will show the sector that is under the read/write head at each time. On many time steps, no data will actually be read or written; the disk is simply rotating into position under the head. These time steps should be plotted in the graph in a standard color such as black (assuming a white background). However, on other time steps, data will actually be read or written. These time steps should be plotted in the graph in a color that stands out noticeably, such as red.

    Track Graph
    The track graph will be similar to the sector graph in all ways, except that it will plot the track, rather than the sector, at which the read/write head of the disk is located versus time.

    For calculating y values for your graphs, you may assume that the rotational delay is simply a linear function of the number of sectors between the current sector under the read head and the sector to be read, with the range [0, R], where R is the maximum rotational delay of the disk specified in the configuration file. In contrast, seek time will be 0 if the track sought is the current track; otherwise it will consist of a fixed cost for start-up and settling (both specified in the configuration file) plus a linear function of the number of tracks between the track at which the arm is currently positioned and the track to which the arm will be moved, with the range [S/(t-1), S], where S is the maximum seek time of the disk (the maximum arm move time in the configuration file) and t is the number of tracks on the disk. Note that seek time should be calculated before rotational delay because the disk does not stop rotating while a seek takes place. Therefore, to know which sector is under the read head at the start of the rotational delay, you must first calculate the amount of rotation that takes place during the seek.
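
    The timing rules in the preceding paragraph can be written down as three small functions. This is a sketch under two assumptions that you should check against your own reading of the specification: S is taken to be the maximum arm move time from the configuration file, and one full rotation is taken to last R ms, so the computed delay for a gap of k sectors out of n is R*k/n. All function and parameter names are hypothetical.

```cpp
#include <cmath>
#include <cstdlib>

// Seek time in ms: 0 for the current track; otherwise start + settle plus
// a move cost that grows linearly from S/(t-1) for a one-track move to S
// for a move across all t-1 track gaps (S = max_move_ms, t = tracks).
double seek_time_ms(int from_track, int to_track, int tracks,
                    double start_ms, double settle_ms, double max_move_ms) {
    int distance = std::abs(to_track - from_track);
    if (distance == 0) {
        return 0.0;
    }
    return start_ms + settle_ms + max_move_ms * distance / (tracks - 1);
}

// Rotational position, measured in sectors (possibly fractional), after
// elapsed_ms have passed. Assumes one full rotation takes max_rot_ms.
double position_after(double start_pos, double elapsed_ms,
                      int sectors, double max_rot_ms) {
    double n = static_cast<double>(sectors);
    return std::fmod(start_pos + elapsed_ms * n / max_rot_ms, n);
}

// Rotational delay in ms until the start of target_sector reaches the
// head, given the current rotational position pos (in sectors).
double rotational_delay_ms(double pos, int target_sector,
                           int sectors, double max_rot_ms) {
    double n = static_cast<double>(sectors);
    double gap = std::fmod(target_sector - pos + n, n);
    return max_rot_ms * gap / n;
}
```

    As a check with the default configuration (1024 tracks, 4 ms start, 4 ms settle, 8 ms maximum move), seek_time_ms(0, 1023, 1024, 4, 4, 8) evaluates to 16 ms. Under the full-rotation assumption the platter turns exactly four times in that interval, so the sector that was under the head when the seek began is under it again when the seek ends.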

    Note that if you do not have sufficient familiarity with C/C++ graphics routines to generate such graphs within your code, you may choose instead to write out the (x,y) data for each graph (including an indication of whether each data point corresponds to a data transfer or not) to a file and use available graphing software to generate the graphs specified above. If you go this route, you must specify in your write-up (see below) what graphing software you used and must include a copy of your output file in your electronic submission (see below).

  2. Part two of the assignment is to test dsg and analyze what your testing demonstrates.

    Test Data
    Test data is a moderate-sized sequence (dozens or hundreds) of I/O requests to be made to a disk drive. You should create at least three sets of such test data, conforming to the request file specification given above. This data should be sufficient to test the performance of the various scheduling algorithms listed above under various conditions, including low, moderate, and high levels of proximity for subsequent I/O requests.

    Test Configurations
    In addition to the test data itself, you will need to create test configurations for disk drives on which to test your data. Configurations consist of the number of tracks, sectors, and platters on a drive, as well as the rotational delay, seek time, and data transfer time for that disk. You should create the test configurations in conjunction with the test data, so that all requests in each set of test data are valid on at least one disk configuration.

    Results
    Once you have created your test data and configuration sets, you should invoke dsg with each scheduling algorithm for each configuration on all of its valid data sets and graph the results you received for each.

    Analysis
    Once you have collected your graphs, you need to compare what you found to the performance expected for each algorithm, according to your text and the in-class discussion. This analysis should consider whether characteristics of the test data (such as proximity level) affect performance.


What to Turn In

You will turn in both a hard copy and an electronic copy of your assignment. Electronic copies must be submitted to the appropriate drop box in D2L for the course. Do not send them to my email address.

Both the hard copy and the electronic copy will contain a cover sheet documenting group membership and contributions (see below), your analysis document, all source code you created for dsg and a write-up of 1/2 to 1 page (roughly 80 characters per line, 50 lines per page) explaining the data structures and algorithms used in your code. This page limitation does not include figures used in your explanations, which are encouraged and may take up any amount of space. (The explanations do not remove the requirement that your code be well commented.)

The electronic copy will also contain an executable for dsg which should be called dsg, your test request files (clearly labeled), your test configuration files (clearly labeled), and your graphs (clearly labeled). If you chose to use third-party graphing software, you must include your (x,y) data files (clearly labeled).

Your source code should be well structured and well commented. It should conform to good coding standards (e.g., no memory leaks).



Other

You may write your program from scratch or may start from programs for which the source code is freely available on the web or through other sources (such as friends or student organizations). If you do not start from scratch, you must give a complete and accurate accounting of where all of your code came from and indicate which parts are original or changed, and which you got from which other source. Failure to give credit where credit is due is academic fraud and will be dealt with accordingly.

As noted in the syllabus, you are required to work on this programming assignment in a group of at least two people. It is your responsibility to find other group members and work with them. The group should turn in only one (1) hard copy and one (1) electronic copy of the assignment. Both the electronic and hard copies should contain the names and student ID numbers of all group members. If your group composition changes during the course of working on this assignment (for example, a group of five splits into a group of two and a separate group of three), this must be clearly indicated in your cover sheet (see below), including the names and student ID numbers of everyone involved and details of when the change occurred and who accomplished what before and after the change.

Each group member is required to contribute equally to each project, as far as is possible. Your cover sheet must thoroughly document which group members were involved in each part of the project. For example, if you have three functions in your program and one function was written by group member one, the second was written by group member two, and the third was written jointly and equally by group members three and four, your cover sheet must clearly indicate this division of labor.

Note that all personally identifying information (names, student ID numbers, 4x4s, etc.) must only be included on the cover sheet and nowhere else in the project materials.