Project 1 -- Standard I/O and POSIX I/O

Due Wednesday, September 29, 9:00 pm

(Note that due date is later than originally listed in the class schedule.)

NOTE: This assignment, like the other projects in this class, is due at a particular time, listed above. This means that if you are even a minute late, you lose 20%. If you are worried about potentially being late, turn in your project ahead of time. Do this by submitting it electronically before the due date (the electronic copy is what is due by the time given above) then giving the hard copy to me or the TA during office hours or by sliding it under my office door (the hard copy is due within twenty-four hours after the electronic copy is due). Do not send assignments to my personal email address or to the personal email address of the TA. Do not leave hard copies in my departmental mail box or attempt to give them to departmental staff (who cannot and will not accept them).

As discussed in class, the I/O routines of high-level languages need to be implemented in terms of the I/O routines of the operating system on which the code is to run. This assignment will investigate how this implementation takes place.

As further discussed in class, I/O can be a security risk, if not handled properly. One risk we discussed was that of buffer overflows, which often enter into our code through poorly designed input functions, such as gets(). This assignment will consider an alternate input function to handle this problem.

Also as discussed in class, using a global variable for errno can lead to confusing results, since errno can be set to one value when an error occurs and then overwritten by a later error before its value is checked. A check that comes too late (after errno has been modified) may therefore report a different error than the one the code's author intended. This assignment will consider an alternate way of handling this problem.
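As a rough illustration of the idea, the sketch below copies errno into a caller-supplied variable at the moment of failure, so a later failing call cannot overwrite it. The helper name open_readonly is hypothetical and is not one of the routines required by this assignment.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not part of the assignment): opens path read-only
 * and reports failure through *my_errno instead of the global errno. */
int open_readonly(const char *path, int *my_errno)
{
    assert(path != NULL && my_errno != NULL);
    int fd = open(path, O_RDONLY);
    if (fd == -1) {
        /* Capture errno right away, before any other call can change it. */
        *my_errno = errno;
    }
    return fd;
}
```

Because each call writes to its own variable, the caller can check the result of an earlier call at any later point without worrying about what happened in between.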

Finally, as discussed in class, allowing implementation details of our high-level languages to influence the functionality of our library routines can be confusing for users. For example, it may not be clear to users why they cannot read after writing or write after reading without using fseek or rewind in between. Therefore, in this assignment, we will not include this restriction. Instead, your routines should keep track of which operation was performed last and automatically take care of buffer flushing or filling so that reads and writes can be intermixed by users of your library without intermediate seeking.

The Assignment

You are to write several I/O routines, similar to those found in the ANSI C standard I/O library. These will be called:

  1. my_getchar
  2. my_fgetchar
  3. my_fungetchar
  4. my_gets
  5. my_fgets
  6. my_putchar
  7. my_fputchar
  8. my_puts
  9. my_fputs
  10. my_fread
  11. my_fwrite
  12. my_fopen
  13. my_fclose
  14. my_fseek
  15. my_fflush

To make it possible to write many of these routines, you will also define your own file type called MY_FILE. This file type will be used with all of the functions listed above that begin with "my_f". (Note that, to make our naming more consistent than that used in the C standard I/O library, all of our I/O functions that explicitly refer to files start with "my_f". Also note that there have been other minor changes in names for our I/O routines from their corresponding ANSI C routines, to make our names more internally consistent.)

We will see later in this course that it is possible to redirect input and output. In particular, we'll note things like redirecting standard output to a file. However, to simplify this assignment, you may assume that routines not starting with "my_f" do not use ordinary files but only the special files standard in and standard out. Moreover, you may make these unbuffered and call read() and write() directly for them, without needing to use your special file type MY_FILE. Further, you may assume that all routines starting with "my_f" will always use ordinary files. Finally, you do not need to be worried about the potential for data corruption caused by multiple processes or threads attempting to use the same file at the same time. (So, for example, where the introductory remarks at the beginning of this assignment mention intermixing reads and writes, these can be assumed to be in the same process, not across multiple processes.)

You will implement all of your I/O routines in terms of POSIX I/O routines and you will implement your file type as a structure or object with a POSIX file descriptor, a buffer of appropriate size to hold the input or output, a pointer to your place in this buffer, and any other data structures you determine to be appropriate for this type. In your writeup you will describe how you determined an appropriate size for your buffer as well as explain and justify any additional data structures included in your file type.
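A minimal sketch of what such a structure might look like follows. The buffer size of 4096, the last_op field, and the my_file_init helper are assumptions made for illustration; choosing and justifying the actual size and any extra fields is part of the assignment.

```c
#include <assert.h>
#include <stddef.h>

#define MY_BUFSIZE 4096  /* assumed size; the write-up must justify yours */

/* Which operation last used the buffer (hypothetical bookkeeping). */
enum my_last_op { MY_OP_NONE, MY_OP_READ, MY_OP_WRITE };

typedef struct {
    int fd;                        /* POSIX file descriptor from open() */
    unsigned char buf[MY_BUFSIZE]; /* holds buffered input or output */
    size_t pos;                    /* current place within buf */
    size_t len;                    /* valid bytes in buf when reading */
    enum my_last_op last_op;       /* supports intermixed reads/writes */
} MY_FILE;

/* Hypothetical helper: puts a MY_FILE into a known empty state. */
void my_file_init(MY_FILE *f, int fd)
{
    assert(f != NULL);
    f->fd = fd;
    f->pos = 0;
    f->len = 0;
    f->last_op = MY_OP_NONE;
}
```

Storing an index (pos) rather than a raw pointer is one design choice among several; a pointer into buf would satisfy the description equally well.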

Within your implementation of the "my_f" functions, you should be striving for maximum disk efficiency.

More details follow for each of the I/O functions that you are to implement. (Note that, for each function, my_errno is a local variable in your code whereas errno is the POSIX errno variable.)

int my_getchar(int *my_errno);

Similar to the ANSI C getchar. Reads one character from standard input and returns it as an unsigned char cast to an int, or EOF on end of file or error. If an error occurs while calling my_getchar, my_errno is set to the value of errno within the implementation of my_getchar.
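The following sketch shows one way the unbuffered version could be built on read(). The helper getchar_fd is a hypothetical name introduced so the logic can be exercised on any descriptor, and stdio.h is included only to obtain the EOF macro, not to call any Standard I/O function.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>   /* for the EOF macro only; no stdio calls are made */
#include <unistd.h>

/* Hypothetical helper: reads one byte from fd with a single read(). */
static int getchar_fd(int fd, int *my_errno)
{
    assert(my_errno != NULL);
    unsigned char c;
    ssize_t n = read(fd, &c, 1);
    if (n == 1) {
        return (int)c;        /* 0..255, so byte 0xFF is distinct from EOF */
    }
    if (n == -1) {
        *my_errno = errno;    /* error: record the cause for the caller */
    }
    return EOF;               /* end of file (n == 0) or error */
}

int my_getchar(int *my_errno)
{
    return getchar_fd(STDIN_FILENO, my_errno);
}
```

Reading into an unsigned char before converting to int is what keeps a data byte of value 255 from being mistaken for EOF.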

int my_fgetchar(MY_FILE *stream, int *my_errno);

Similar to the ANSI C fgetc. Reads one character from stream and returns it as an unsigned char cast to an int, or EOF on end of file or error. If an error occurs while calling my_fgetchar, my_errno is set to the value of errno within the implementation of my_fgetchar.

int my_fungetchar(int c, MY_FILE *stream, int *my_errno);

Similar to the ANSI C ungetc. Pushes c back onto stream where it is available for subsequent input operations. If an error occurs while calling my_fungetchar, my_errno is set to the value of errno within the implementation of my_fungetchar.

char *my_gets(char *buf, my_size_t size, int *my_errno);

Similar to the ANSI C gets. Reads up to size - 1 characters from standard input. Note that, unlike gets, my_gets does allow the programmer to specify the maximum size of the buffer into which the input will be placed. If an error occurs while calling my_gets, my_errno is set to the value of errno within the implementation of my_gets.
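The size limit can be enforced along the lines of the sketch below. Several details here are assumptions rather than requirements: my_size_t is taken to be equivalent to size_t, the newline is discarded as gets does, NULL is returned on error or when end of file is reached before any character is read, and gets_fd is a hypothetical helper that works on an arbitrary descriptor.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

typedef size_t my_size_t;   /* assumption: the handout does not define it */

/* Hypothetical helper: reads at most size - 1 characters from fd,
 * stopping at a newline (which is dropped) or end of file. */
static char *gets_fd(int fd, char *buf, my_size_t size, int *my_errno)
{
    assert(buf != NULL && my_errno != NULL);
    my_size_t i = 0;
    if (size == 0) {
        return NULL;                  /* no room even for the terminator */
    }
    while (i + 1 < size) {            /* always leave space for '\0' */
        char c;
        ssize_t n = read(fd, &c, 1);
        if (n == -1) { *my_errno = errno; return NULL; }
        if (n == 0) {                 /* end of file */
            if (i == 0) return NULL;
            break;
        }
        if (c == '\n') break;
        buf[i++] = c;
    }
    buf[i] = '\0';
    return buf;
}

char *my_gets(char *buf, my_size_t size, int *my_errno)
{
    return gets_fd(STDIN_FILENO, buf, size, my_errno);
}
```

However long the input line is, no more than size bytes of buf are ever written, which is the property gets cannot offer.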

char *my_fgets(char *buf, my_size_t size, MY_FILE *stream, int *my_errno);

Similar to the ANSI C fgets. Reads up to size - 1 characters from stream. If an error occurs while calling my_fgets, my_errno is set to the value of errno within the implementation of my_fgets.

int my_putchar(int c, int *my_errno);

Similar to the ANSI C putchar. Writes c to standard output, cast to an unsigned char. If an error occurs while calling my_putchar, my_errno is set to the value of errno within the implementation of my_putchar.

int my_fputchar(int c, MY_FILE *stream, int *my_errno);

Similar to the ANSI C putc. Writes c to stream, cast to an unsigned char. If an error occurs while calling my_fputchar, my_errno is set to the value of errno within the implementation of my_fputchar.

int my_puts(const char *buf, int *my_errno);

Similar to the ANSI C puts. Writes buf to standard output followed by a newline character. If an error occurs while calling my_puts, my_errno is set to the value of errno within the implementation of my_puts.

int my_fputs(const char *buf, MY_FILE *stream, int *my_errno);

Similar to the ANSI C fputs. Writes buf minus its concluding '\0' to stream. If an error occurs while calling my_fputs, my_errno is set to the value of errno within the implementation of my_fputs.

size_t my_fread(void *buf, size_t el_size, size_t num_el, MY_FILE *stream, int *my_errno);

Similar to the ANSI C fread. Reads up to num_el times el_size bytes of data from stream into buf. The concept here is that you can read a number of data elements corresponding to num_el, each of size el_size, into your buffer in a single read. If an error occurs while calling my_fread, my_errno is set to the value of errno within the implementation of my_fread.
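The element arithmetic can be sketched as follows. This version deliberately ignores buffering and works directly on a descriptor, so it is not a complete my_fread; read_elements is a hypothetical name, and returning the number of complete elements (as the ANSI C fread does) and reporting EOVERFLOW when el_size times num_el does not fit in a size_t are both assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical unbuffered core: reads up to num_el elements of el_size
 * bytes each from fd and returns how many COMPLETE elements arrived. */
size_t read_elements(int fd, void *buf, size_t el_size, size_t num_el,
                     int *my_errno)
{
    assert(buf != NULL && my_errno != NULL);
    if (el_size == 0 || num_el == 0) {
        return 0;
    }
    if (num_el > SIZE_MAX / el_size) {   /* product would overflow */
        *my_errno = EOVERFLOW;
        return 0;
    }
    size_t total = el_size * num_el;
    size_t got = 0;
    while (got < total) {                /* read() may return fewer bytes */
        ssize_t n = read(fd, (unsigned char *)buf + got, total - got);
        if (n == -1) { *my_errno = errno; break; }
        if (n == 0) break;               /* end of file */
        got += (size_t)n;
    }
    return got / el_size;                /* partial trailing element dropped */
}
```

For example, asking for three 4-byte elements from a source holding only 10 bytes yields a return value of 2.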

int my_fwrite(const void *buf, size_t el_size, size_t num_el, MY_FILE *stream, int *my_errno);

Similar to the ANSI C fwrite. Writes up to num_el times el_size bytes of data from buf to stream. The concept here is that you can write a number of data elements corresponding to num_el, each of size el_size, out to your file in a single write. If an error occurs while calling my_fwrite, my_errno is set to the value of errno within the implementation of my_fwrite.

MY_FILE *my_fopen(const char *path, const char *mode, int *my_errno);

Similar to the ANSI C fopen. Opens the file corresponding to the path name path for reading and/or writing at the beginning or end, depending on the value of mode. Returns a pointer to type MY_FILE which will be used by other "my_f" functions to access the file data. If an error occurs while calling my_fopen, my_errno is set to the value of errno within the implementation of my_fopen.
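One piece of my_fopen is translating the mode string into flags for POSIX open(). The sketch below assumes the six text-mode strings accepted by the ANSI C fopen; the helper name mode_to_flags and the choice of -1 for an unrecognised mode are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>

/* Hypothetical helper: returns open() flags for an fopen-style mode
 * string, or -1 if the mode is not recognised. */
int mode_to_flags(const char *mode)
{
    assert(mode != NULL);
    if (strcmp(mode, "r") == 0)  return O_RDONLY;
    if (strcmp(mode, "r+") == 0) return O_RDWR;
    if (strcmp(mode, "w") == 0)  return O_WRONLY | O_CREAT | O_TRUNC;
    if (strcmp(mode, "w+") == 0) return O_RDWR | O_CREAT | O_TRUNC;
    if (strcmp(mode, "a") == 0)  return O_WRONLY | O_CREAT | O_APPEND;
    if (strcmp(mode, "a+") == 0) return O_RDWR | O_CREAT | O_APPEND;
    return -1;
}
```

When the flags include O_CREAT, open() also needs a third argument giving the permission bits for a newly created file (0666 is a common choice, subject to the process umask).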

int my_fclose(MY_FILE *stream, int *my_errno);

Similar to the ANSI C fclose. Closes an open file. If an error occurs while calling my_fclose, my_errno is set to the value of errno within the implementation of my_fclose.

off_t my_fseek(MY_FILE *stream, off_t offset, int whence, int *my_errno);

Similar to the ANSI C fseek in its use of a file pointer rather than a file descriptor, but otherwise more similar to POSIX lseek in its use of types and return values. Repositions the file pointer to offset bytes past the position indicated by whence, which may take on the value SEEK_SET (indicating the start of the file), SEEK_CUR (indicating the current position in the file), or SEEK_END (indicating the end of the file). If an error occurs while calling my_fseek, my_errno is set to the value of errno within the implementation of my_fseek.

int my_fflush(MY_FILE *stream, int *my_errno);

Similar to the ANSI C fflush. Flushes the buffer corresponding to stream, writing any pending output out to the underlying file. If an error occurs while calling my_fflush, my_errno is set to the value of errno within the implementation of my_fflush.



Notes on this assignment

All of your implementations of input and output in this assignment should use POSIX system calls, not C Standard I/O function calls.



What to turn in

You will turn in both a hard copy and an electronic copy of your assignment. Please follow the instructions on how to send electronic copies. Do not send them to my email address.

Both the hard copy and the electronic copy will contain a write-up explaining your implementation of the assigned I/O library functions plus all source code you created for this implementation. Both the hard copy and the electronic copy should also contain a sample program demonstrating the functioning of your library routines and the writeup should explain the functioning of this program. The electronic copy will also contain the executable version of this sample program. The electronic copy of your write-up should not be in a proprietary format (such as MS Word); it should be either in plain ASCII text or in a portable format (such as Postscript or PDF). Your source code for the sample demonstration program should be in a single file called demo.c or demo.cxx and your executable code should be called demo.

Your source code should be well structured and well commented. It should conform to good coding standards (e.g., no memory leaks).

Your write-up will include 1/2 to 1 page (roughly 80 characters per line, 50 lines per page) explaining the data structures and algorithms used in your code. This page limitation does not include figures used in your explanation, which are encouraged and may take up any amount of space. (This explanation does not remove the requirement that your code be well commented.)



Other

You may write your program from scratch or may start from programs for which the source code is freely available on the web or through other sources (such as friends or student organizations). If you do not start from scratch, you must give a complete and accurate accounting of where all of your code came from and indicate which parts are original or changed, and which you got from which other source. Failure to give credit where credit is due is academic fraud and will be dealt with accordingly.

As noted in the syllabus, you are required to work on this programming assignment in a group of at least two people. It is your responsibility to find other group members and work with them. The group should turn in only one (1) hard copy and one (1) electronic copy of the assignment. Both the electronic and hard copies should contain the names and student ID numbers of all group members. If your group composition changes during the course of working on this assignment (for example, a group of five splits into a group of two and a separate group of three), this must be clearly indicated in your write-up, including the names and student ID numbers of everyone involved and details of when the change occurred and who accomplished what before and after the change.

Each group member is required to contribute equally to each project, as far as is possible. You must thoroughly document which group members were involved in each part of the project. For example, if you have three functions in your program and one function was written by group member one, the second was written by group member two, and the third was written jointly and equally by group members three and four, both your write-up and the comments in your code must clearly indicate this division of labor.