Second FAQ

for

Project 1 -- Standard I/O and Device Drivers

Q. I just had a question that I would like clarified. In project 1 you ask that the project reads in a file once. Does this mean that you want us to store the file in a data structure and not use the fgetc(FILE *in_file) command? Or did you mean that reading in a file once means making one fopen call?

A. You will need to use a data structure to hold the contents of the file when it is read in because you really will be reading the file just once. Take a look at the first question and answer in the first FAQ for this assignment. If we didn't use a data structure to hold the contents of the file, there is no way we could separate part 1 (a-c) from part 2 (a-c).

Whether you use fgetc or some other C Standard I/O routine to read in the file is up to you.
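As a rough illustration of reading the file just once, here is a minimal sketch that uses fgetc to copy the whole file into a buffer that grows as needed. The helper name read_whole_file and the starting buffer size are assumptions made up for this example, not requirements of the assignment; any data structure that holds the contents would serve.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read all of `path` into a heap buffer in one pass.
 * On success, returns the buffer (the caller frees it) and stores the
 * number of bytes read in *len_out. On failure, returns NULL. */
char *read_whole_file(const char *path, size_t *len_out)
{
    FILE *in_file = fopen(path, "r");
    if (in_file == NULL)
        return NULL;

    size_t cap = 4096;              /* starting size is an arbitrary choice */
    size_t len = 0;
    char *buf = malloc(cap);
    if (buf == NULL) {
        fclose(in_file);
        return NULL;
    }

    int c;                          /* int, not char, so EOF is detectable */
    while ((c = fgetc(in_file)) != EOF) {
        if (len == cap) {           /* buffer full: double its size */
            char *bigger = realloc(buf, cap * 2);
            if (bigger == NULL) {
                free(buf);
                fclose(in_file);
                return NULL;
            }
            buf = bigger;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }

    if (ferror(in_file)) {          /* EOF was caused by a read error */
        free(buf);
        fclose(in_file);
        return NULL;
    }

    fclose(in_file);
    *len_out = len;
    return buf;
}
```

Once the contents are in memory, each part of the project can work from the buffer instead of going back to the file.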

Q. I have a very elementary problem. Since this morning I have been trying to get onto the UNIX system. I have logged on using SSH with my login and password, but I don't know where to type my program or how to compile and execute it. My partner is out of town so I cannot ask him. Can you please explain it to me?

A. To get your program typed in, there are two basic approaches you can take.

If you are at home, you can use whatever editor(s) you are comfortable using and/or have at your disposal on your home machine to type in your program, save it as (plain, ASCII) text, then copy it to the CS UNIX machines using sftp, scp, or other such utilities.

If you are at school or home, you can use an editor on the CS UNIX machines to type in your program here directly. The two most common editors on UNIX machines are emacs and vi. Most intro to UNIX books, such as those I list on the class web pages, have sections or chapters on using one or both of these editors. It would be good for you to learn one of them. (Actually, vim, the improved version of vi, is widely available and probably a better choice than just vi.)

Which of these approaches you take is up to you and your choice should probably be based on things like your connection quality (speed and reliability) from home and how convenient it is for you to work at school.

(An approach somewhere between these two would be to type in your program at home, as with the first approach, select all of the text and copy it, then transfer it to the CS UNIX machines by starting up an editor on the CS UNIX machines in your ssh session and pasting into that editor. This would be reasonable if you are having a problem finding sftp, scp, or other secure copy programs.)

To compile programs, you can use the command cc or gcc. (On the CS systems, cc is actually just a link to gcc, so they are the same thing. On other systems, cc may be another C compiler.) The format is

% gcc <program_name>

where <program_name> is replaced with the actual name of the program you are compiling. So, for example

% gcc out_test1.c

would compile your first program. Note that the default name for the executable created will be a.out. You'll want to change the names of the executables to those names given in the assignment. You can either rename the file using mv (move) or you can use the -o flag with gcc. For example

% gcc -o out_test1 out_test1.c

would compile the code in out_test1.c and put the executable in out_test1.

To execute your program after it is compiled, you simply type in its name, just as if it were any other program on the system. For example

% out_test1

would run the executable we just created using gcc in the example above. There are two things to note, however. One is that the assignment says the program is supposed to get the name of the file from the command line, so we should add that when we run the program. Two is that smart people don't include the current working directory (.) in their PATH environment variable, so you'll need to use the path on the command line as well. For example

% ./out_test1 my_input_file

would run out_test1 in the current directory and pass the name my_input_file as the name of the file to read from.

To find out more about the editors and compilers available on the system, try commands such as man -k editor and man -k compiler.

Q. I have done all six phases using time(), of time_t datatype. Later, I came to know that we have to use times(), of clock_t datatype. I have tried a lot using the times() function. Alas, I'm getting vague answers for the time recorded.

Can you please help me sir. In general, I'm using the code below.


struct tms *buffer; 
clock_t start,end; 
double cpu_time; 
start = (clock_t) times(buffer); 
do{ 
....... 
....... // say i*i;
} 
end = (clock_t) times(buffer); 
cpu_time =((double)((end-start)/CLOCKS_PER_SEC));

But this is giving me an answer like 0.000002 just for calculating the square of a number 100000 times. I think that's incorrect. Is my approach wrong? If so, please help me out.

A. Before I answer your question, let me say that the problem with using the time() system call is that it measures time in seconds, which is too crude for what we are measuring. This is why I suggested using the times() system call instead. However, you don't "have to" use the times() system call either; it is simply a suggestion for a way to get the information you need with a fine enough grain to be useful. If you want to use another fine-grained time function, you may do so, as long as what you are measuring is "wall time" (that is, time in the world).

Okay, on to the question itself. The problem you are having here is that CLOCKS_PER_SEC doesn't have any relationship to the number of clock ticks as reported by the times() system call. CLOCKS_PER_SEC is meant to be used with the ANSI C clock() function, which returns processor time in units of 1/CLOCKS_PER_SEC seconds. (On POSIX (XSI) systems, CLOCKS_PER_SEC is defined to be 1,000,000, so those units are microseconds.) The number of clock ticks per second used by times() is a different value, which you can look up with sysconf(_SC_CLK_TCK).

What probably confused you is that both clock() and times() have a return data type of clock_t. Despite this, they are measuring and returning very different things (microseconds vs. clock ticks).

Don't worry about converting your results to seconds. You can keep them as clock ticks, since we are really only concerned about the relative speeds of the different versions.

(As you are cleaning up your code, note that casting the value returned by times() to clock_t type before assigning it to start and end is not necessary -- times() already returns clock_t type.)
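For reference, here is one way the timing could be sketched with times(). Two more things in the code from the question are worth noting: it passes an uninitialized struct tms pointer to times(), and it divides two integers before casting to double, which throws away the fraction. The sketch below passes the address of a real struct tms and keeps the result in clock ticks. The helper names ticks_for, ticks_to_seconds, and square_many_times are invented for this example, and error checking of times() is left out for brevity.

```c
#include <sys/times.h>
#include <unistd.h>

/* Example workload, modeled on the question: square a number 100000 times. */
void square_many_times(void)
{
    volatile long result = 0;   /* volatile so the loop is not optimized away */
    for (long i = 0; i < 100000; i++)
        result = i * i;
    (void)result;
}

/* Hypothetical helper: elapsed wall time, in clock ticks, taken by work(). */
clock_t ticks_for(void (*work)(void))
{
    struct tms buffer;          /* a real struct, not an uninitialized pointer */
    clock_t start = times(&buffer);
    work();
    clock_t end = times(&buffer);
    return end - start;
}

/* Optional conversion: times() ticks are counted per sysconf(_SC_CLK_TCK),
 * not per CLOCKS_PER_SEC. Cast before dividing to keep the fraction. */
double ticks_to_seconds(clock_t ticks)
{
    long ticks_per_sec = sysconf(_SC_CLK_TCK);
    return (double)ticks / (double)ticks_per_sec;
}
```

Calling ticks_for(square_many_times) may well return 0 or 1, since such a short loop can finish within a single clock tick. Timing a larger amount of work gives more meaningful numbers.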

Q. For phases 3, 4, and 5, do we really need to check whether the file to which the data is to be written already exists? I know of a flag, ofs::noreplace, in C++ to check that, but I don't know if there's anything like noreplace in C. How should I approach this problem?

A. Yes, you should check if the output file already exists. However, you can't do this correctly using the ANSI C Standard I/O library that we covered in class, so I'll let you get by with a "close but not quite right" solution for this assignment. This "close but not quite right" solution is a two stage process:

1. Try to open the file for reading using fopen(). If fopen() succeeds, then the file exists and shouldn't be written over -- your program should exit and give a non-zero return value. If fopen() fails, then the file does not exist and you can proceed to step 2.

2. Open the file for writing using fopen().

As an aside, ask yourself why this solution is not quite right. (We'll find the right way to solve this problem when we come to POSIX I/O.)
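The two-stage process above might be sketched as follows. The helper name open_new_for_writing is made up for this example, and the code is deliberately the "close but not quite right" version, not the correct solution.

```c
#include <stdio.h>

/* Hypothetical helper: open `path` for writing only if it does not appear
 * to exist already. Returns NULL if the file exists or cannot be opened. */
FILE *open_new_for_writing(const char *path)
{
    /* Stage 1: if the file can be opened for reading, it already exists. */
    FILE *probe = fopen(path, "r");
    if (probe != NULL) {
        fclose(probe);
        return NULL;            /* refuse to write over an existing file */
    }

    /* Stage 2: the file did not appear to exist, so open it for writing. */
    return fopen(path, "w");
}
```

A caller would typically print a message to stderr and exit with a non-zero value (for example, EXIT_FAILURE from <stdlib.h>) when this returns NULL.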

Q. Just making sure, do you consider HTML a portable format that is allowable for the report for our project? This seems like a rather obvious question, but I just want to make sure so my grade isn't docked if this is wrong.

A. Yes, I consider HTML a portable format and you can use it for your write-up for your project.

However, the write-up that you turn in should be self-contained. That is, there can be links to other files that are turned in at the same time (e.g., a main text body with links to a couple of .gif figures, all of which are turned in electronically at the same time), but there shouldn't be links to outside sources (e.g., the "write-up" that you turn in is just a link to the real write-up at <www.myisp.com/~me/my_proj1_write_up.html>).

Note that this is also true for other formats (such as PDF) that include the ability to embed URLs or other references.

Q. I've been writing my program in Visual C++ in Windows and compiling it with MS-DOS or the command prompt Windows has to offer. Everything seems to be working fine. Is that going to be a problem (not compiling it on UNIX, I mean), do you think? If it is, I was just going to write it here and then take it to the engineering lab and make sure it works on the UNIX system. That should be fine, right?

A. As mentioned in class, we will be using ANSI C and POSIX for this class. Windows NT does have a POSIX subsystem but most Windows operating systems (e.g., 95, 98) do not. Therefore, you will not be able to easily port your code written for these other Windows versions to the CS UNIX machines. If you are using one of those other Windows versions, you would probably be better off writing and debugging your code on a POSIX-compliant OS to begin with. If you are using Windows NT and if Visual C++ complies with ANSI C standards, you should be okay but I would still recommend leaving plenty of time for uploading and testing on the CS UNIX machines.

Q. The sleep function doesn't seem to be working on my computer. At first I thought I just needed the right 'include' statement, but now I'm thinking it's in a special library only UNIX has to offer. Is that true?

A. Please note that sleep() is not part of the ANSI C Standard Library; it is a call specified by POSIX (declared in <unistd.h>). Any POSIX-compliant OS must provide access to sleep(). If you can't run it on your OS, that is a good indication that your OS is not POSIX-compliant and won't work well for testing code for this course.

Q. Because the sleep function doesn't work, I was thinking I could write an infinite loop for five minutes (well, okay, maybe it wouldn't be classified as an infinite loop). Something like:

for (startTime=time(NULL);
timeElapsed!=5;timeElapsed=difftime(time(NULL),startTime));

or something like that. That should accomplish the same thing. But I'm thinking it's not the same, since sleeping actually implies doing nothing and in this case it is looping. Will I be able to use this, is the question.

A. You are right, a "busy wait" (which is what your loop is) is not the same thing as sleeping. If you want to use a "busy wait" for your code while it is in development, you may do so. For the final version that you turn in, however, you must use a true sleeping routine (i.e., one that doesn't waste CPU cycles).
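To make the contrast concrete, here is a small sketch of both approaches. The helper names busy_wait and sleep_fully are invented for this example. Note that this busy wait compares with < rather than != as in the question, so the loop cannot step past its target value and run forever.

```c
#include <time.h>
#include <unistd.h>

/* Busy wait (development only): spins until `seconds` have elapsed,
 * burning CPU cycles the whole time. */
void busy_wait(double seconds)
{
    time_t start = time(NULL);
    while (difftime(time(NULL), start) < seconds)
        ;                       /* nothing to do but check the clock again */
}

/* True sleep: the process is suspended and uses no CPU while it waits.
 * sleep() returns the number of unslept seconds if a signal handler
 * interrupts it, so loop until the full time has passed. */
void sleep_fully(unsigned int seconds)
{
    while (seconds > 0)
        seconds = sleep(seconds);
}
```

For the five-minute wait mentioned in the question, the final version would call something like sleep_fully(5 * 60).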

Q. It takes a long time for my code to run, especially with all these sleeps in it. Do I really have to sit around forever waiting for it to run?

A. Fortunately, no. You can log out and leave it running. You do this by "putting it in the background." We'll talk more about this later in the course, but the basic way to do it from the command line is to put an ampersand (&) at the end of the line. So, for example

% ./out_test3 my_input_file my_output_file &

would run out_test3 in the background, reading from my_input_file and writing to my_output_file. One difference that you will notice immediately is that the shell prompt reappears right away, rather than waiting for the program to complete. Now, you can do other things at the prompt, including logging out. When you come back later, the program may still be running or it may have completed.

If you decide later to cancel the running process, you can use the kill command, which is similar to the kill() system call that we discussed briefly in class. You simply need to know the PID of the process you want to stop.

You probably saw the PID given after the bracketed job number when you put the process in the background. If not, you can find it using ps, which we previously discussed in class. For example, I could use

% ps -ef | grep hougen

to get all of the processes that I own that are running on the system at this minute. I might find something like

hougen 10188  8372  0 16:36:17 pts/24  0:00 grep hougen
hougen  8372  8370  0 11:51:59 pts/24  0:00 -bash
hougen 10106     1  0 16:36:10 pts/24  0:00 out_test3

This shows that I am running a Bourne Again Shell (bash) and looking for a string using grep (a powerful tool you should get to know) and that out_test3 is still running, even though I logged out of that session. (Note that out_test3 has been adopted by init, just as we said in class that it would if its parent exited first.)

Now, I can send the termination signal SIGTERM to out_test3 by typing

% kill -s SIGTERM 10106

(Note that I could have shortened this to just % kill 10106, since SIGTERM is the default signal sent by kill, but I gave the longer version so that you could see the general form.)
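For comparison with the kill command, here is a minimal sketch of the kill() system call doing the same job from inside a C program. The function name terminate_child_demo is made up for this example, and the child process it creates stands in for a long-running program such as out_test3.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: start a child that only waits, send it SIGTERM with
 * kill(), and check how it ended. Returns 1 if the child was terminated
 * by SIGTERM, 0 otherwise. */
int terminate_child_demo(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return 0;                       /* fork failed */

    if (pid == 0) {                     /* child process */
        signal(SIGTERM, SIG_DFL);       /* make sure SIGTERM terminates us */
        for (;;)
            pause();                    /* wait until a signal arrives */
    }

    /* Parent: the same effect as typing "kill -s SIGTERM <pid>". */
    if (kill(pid, SIGTERM) != 0)
        return 0;

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return 0;

    return WIFSIGNALED(status) && WTERMSIG(status) == SIGTERM;
}
```

The first argument to kill() is the PID and the second is the signal, which is the reverse of the order used on the command line.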