Exam 2, Solution

Question 1: Memory Management (20 points)

A. What is compaction and why was it invented?

5 pts.

Compaction is a method of fighting external fragmentation by moving the data and text segments for all the processes to one end of the memory. This creates a large contiguous free section of memory which allows larger processes to be loaded. It was invented to maximize the efficient use of what was (and perhaps still is) very expensive system memory.
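The idea can be sketched in a few lines of C. This is only a toy model, assuming memory is a byte array and each allocated segment is described by a base offset and a size; the `segment` type and the `compact()` function are hypothetical names used for illustration, not part of any real OS.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor for one allocated segment in a toy memory. */
struct segment {
    size_t base;   /* current start offset within the memory array */
    size_t size;   /* length in bytes */
};

/*
 * Slide every segment toward offset 0 so that all free space ends up as
 * one contiguous hole at the high end. The segments must be sorted by
 * base and must not overlap. Returns the offset where the hole begins.
 */
size_t compact(unsigned char *memory, struct segment *segs, size_t nsegs)
{
    size_t next_free = 0;

    for (size_t i = 0; i < nsegs; i++) {
        if (segs[i].base != next_free) {
            /* memmove is safe even when source and destination overlap. */
            memmove(memory + next_free, memory + segs[i].base, segs[i].size);
            segs[i].base = next_free;   /* the process's base address changes */
        }
        next_free += segs[i].size;
    }
    return next_free;
}
```

With segments of 4 bytes at offset 4 and 8 bytes at offset 16, the call leaves them at offsets 0 and 4 and returns 12, so everything from offset 12 upward is one free region.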

B. Is compaction used with paging systems? Why or why not?

5 pts.

No, paging and compaction are not generally used together. Paging divides memory into fixed-size chunks called frames or pages. Paging systems suffer from internal fragmentation (because a program's address space or data segment may not end on a page boundary), not the external fragmentation that compaction can help reduce.
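The amount of internal fragmentation is simple arithmetic. The sketch below assumes a region is rounded up to whole pages; the function name and the 4096-byte page size in the usage note are illustrative assumptions.

```c
#include <stddef.h>

/*
 * Bytes left unused in the last page when `size` bytes are stored in
 * pages of `page_size` bytes each.
 */
size_t internal_fragmentation(size_t size, size_t page_size)
{
    size_t remainder = size % page_size;

    return remainder == 0 ? 0 : page_size - remainder;
}
```

For example, a 10000-byte region with 4096-byte pages occupies three pages and wastes 2288 bytes in the last one, while an 8192-byte region wastes nothing.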

C. What special hardware is required to make compaction feasible? Explain your answer.

5 pts.

A Relocation Register and a Relative Address Register are needed for compaction. These two registers make it possible for the OS to dynamically allocate memory for a process. The RR provides the base address for the process. The RAR is added to this base address to translate relative addresses into physical addresses.
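The translation step can be modeled as follows. This is a software sketch of what the hardware would do, assuming 32-bit addresses; `relocation_hw` and `translate()` are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical model of the relocation hardware described above. */
struct relocation_hw {
    uint32_t relocation_register;   /* RR: base physical address of the process */
};

/* Physical address = RR + relative address issued by the process. */
uint32_t translate(const struct relocation_hw *hw, uint32_t relative_address)
{
    return hw->relocation_register + relative_address;
}
```

If the OS moves a process from base 0x4000 to base 0x1000 during compaction, it only has to change the RR: relative address 0x10 maps to 0x4010 before the move and to 0x1010 after it, and the program itself is unchanged.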

D. Once all the hardware needed for compaction is present on a machine, what additional hardware is needed to make paging feasible? Explain your answer.

5 pts.

No additional hardware is needed for paging. This is because the RR and the RAR must already exist in a machine that is capable of compaction.

Question 2: Files (20 points)

In UNIX systems, three time values are stored for each file. These are the last time of access of the file data (atime), last time of modification of the file data (mtime), and last time of change of i-node status for the file (ctime). Users of UNIX systems may use the utime system call to request that the O/S set a file's atime or mtime values (or both) to any valid value of type time_t, but may not similarly request that the O/S set the value of ctime. Instead, ctime is only set automatically by the system to the current time when certain system calls are made, including utime.

A. List and explain one good reason for allowing users to arbitrarily set the atime and mtime values of their files.

5 pts.

(This question has many valid answers; this is only an example.)

One reason to allow a user to arbitrarily change atime and mtime is to allow the user to synchronize files between two locations. The user could copy a file from one system to another and then change the time values on the new copy to match those of the source file.
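A minimal sketch of that second step, using stat() to read the source file's times and utime() to apply them to the copy. The function name `copy_file_times()` is a hypothetical helper, and both files are assumed to be on systems the caller can reach by path.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <utime.h>

/*
 * Copy atime and mtime from src to dst.
 * Returns 0 on success, -1 on error (errno is set by stat or utime).
 */
int copy_file_times(const char *src, const char *dst)
{
    struct stat st;
    struct utimbuf times;

    if (stat(src, &st) == -1)
        return -1;

    times.actime = st.st_atime;    /* last access time of the source */
    times.modtime = st.st_mtime;   /* last modification time of the source */
    return utime(dst, &times);
}
```

Note that the call to utime() still causes the system to set the copy's ctime to the current time, as described in the question.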

(Other answers similarly need to explain that giving users control over their files is generally to their benefit.)

B. List and explain one good reason for not allowing users to arbitrarily set the ctime values of their files.

5 pts.

(This question has many valid answers; this is only an example.)

Security: By not allowing a user to arbitrarily change ctime, a system administrator or T.A. can determine when the file had last been changed, regardless of the atime or mtime values. In addition, ctime is meant to represent the last time that the i-node information was modified. Allowing arbitrary changes conflicts with this goal.
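The behavior can be observed directly. The sketch below backdates a file's atime and mtime with utime() and then reads back the ctime; `backdate_and_get_ctime()` is a hypothetical helper written for this illustration.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>
#include <utime.h>

/*
 * Set a file's atime and mtime to `when`, then report the ctime that the
 * system recorded. Returns the ctime, or (time_t)-1 on error.
 */
time_t backdate_and_get_ctime(const char *path, time_t when)
{
    struct utimbuf times = { .actime = when, .modtime = when };
    struct stat st;

    if (utime(path, &times) == -1)
        return (time_t)-1;
    if (stat(path, &st) == -1)
        return (time_t)-1;
    return st.st_ctime;   /* chosen by the system, not by the caller */
}
```

Even if `when` is a date decades in the past, the returned ctime reflects the moment utime() was called, so the backdating itself leaves a trace.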

(Other answers similarly need to explain that giving users complete control over their files is generally a bad idea.)

C. Should setting ctime values be atomic with the other action(s) of a system call that causes ctime to be changed? Explain your answer.

5 pts.

Yes, ctime is meant to represent the last time the i-node information for the file was changed. If another system call that would change ctime is called while the ctime is being set, we could end up with an inaccurate ctime. Further, if a system call that would change ctime is interrupted between setting ctime and its other action(s) (regardless of which happened first) and the process that called it exited, then ctime could be greatly different from what it should be. This defeats the purpose of ctime.

(By the way, the fact that part D of this question asks about another system call (besides utime) that needs to be atomic as well, should have been a give-away that utime needs to be atomic. If you realized that, then you just needed to provide the explanation.)

D. Give an example of another system call (besides utime) that modifies i-node information and which should do so atomically with the other action(s) that the system call takes. Explain why this should be atomic as well.

5 pts.

(This question has many valid answers; this is only an example.)

System calls link and unlink are two good examples. For example, suppose we have a file with only a single link. We decide to add a new link to the file and then remove the old link. As long as these operations are atomic, things will go as planned. If they were not atomic, the outcome would be unpredictable. We might wind up with a directory entry but no file, a file but no directory entry, a file with an incorrect link count, etc.
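The scenario above can be sketched as follows. The helper names `move_by_relinking()` and `link_count()` are hypothetical; the comments show the link count that the i-node must hold after each call, which is the bookkeeping that has to stay in step with the directory update.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Give a file a new name by adding a link and then removing the old one.
 * Returns 0 on success, -1 on error (errno is set by link or unlink).
 */
int move_by_relinking(const char *old_path, const char *new_path)
{
    if (link(old_path, new_path) == -1)   /* link count goes from 1 to 2 */
        return -1;
    if (unlink(old_path) == -1)           /* link count goes back to 1 */
        return -1;
    return 0;
}

/* Return the link count stored in the i-node for path, or -1 on error. */
long link_count(const char *path)
{
    struct stat st;

    if (stat(path, &st) == -1)
        return -1;
    return (long)st.st_nlink;
}
```

After a successful call, `link_count()` reports 1 for the new name and fails for the old one. If link() added the directory entry without incrementing the count, the later unlink() would drop the count to 0 and free a file that still has a name.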

(Other answers similarly need to give examples of system calls that modify i-node data, as many do, and explain what problem(s) might arise if this wasn't atomic with the system call's other action(s).)

Question 3: Directories (20 points)

While directories in UNIX systems are just files containing lists of i-node numbers and filenames, they are distinguished in their own i-nodes by a bit (in the mode bytes) that marks them as being directories. Why is this bit crucial for file system integrity? Explain what could happen to the directory structure if this bit were not present.

(This question has many valid answers; this is only an example.)

Because it is this bit that allows the OS to distinguish between other file types and directories, if it were not present, users could use system calls intended for ordinary files on directories. A user could, for example, open a directory using open() and write to it using write(), thereby creating all kinds of loops and cycles in what should be a directory tree. Or, the user could use these system calls to remove directory entries without the corresponding reduction of the link count in the i-node, thereby removing all references to these files without removing the files themselves.
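Both sides of this can be seen from user space. The sketch below tests the directory bit with the S_ISDIR macro on the mode returned by stat(), and checks that open() for writing is refused with EISDIR when the path is a directory. The function names are hypothetical helpers.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Return 1 if path names a directory, 0 if it names something else, -1 on error. */
int is_directory(const char *path)
{
    struct stat st;

    if (stat(path, &st) == -1)
        return -1;
    return S_ISDIR(st.st_mode) ? 1 : 0;
}

/* Return 1 if opening path for writing fails because it is a directory. */
int write_open_refused(const char *path)
{
    int fd = open(path, O_WRONLY);

    if (fd == -1)
        return errno == EISDIR;
    close(fd);   /* the open succeeded, so path was not protected this way */
    return 0;
}
```

For the current directory ".", both functions are expected to return 1. Without the bit, the OS would have no basis for the second check.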

(Other answers, such as that programs that recursively move through the file tree would be lost because they wouldn't know which directory entries referred to directories and which referred to other file types, are also valid. These answers need to demonstrate that you understand the role of this information in the file system by discussing the effect of its loss on other system calls or programs.)

Question 4: Permissions (20 points)

On UNIX systems, a file's access control list (ACL) could be stored in the file's i-node, in the file's data block(s), or in another block (or blocks) on the device where the file is stored.

(These questions have many valid answers; those given are only examples.)

A. Give one advantage of storing the ACL in the file's i-node over each of the other possibilities.

6 pts.

Because the i-node must be read in order to access a file (to locate it on the device, see its basic permissions, etc.), storing the ACL in the i-node would never require any extra reads to find out if the process has permission to access the file by virtue of the ACL. Storing the ACL in the data block would require an extra read for processes that don't have permission to access the file and storing the ACL in another block (or blocks) on the device where the file is stored would require an extra read unless the process had access to the file by virtue of the basic permissions.

B. Give one advantage of storing the ACL in the file's data block(s) over each of the other possibilities.

6 pts.

An advantage of storing the ACL in the file's data block(s) over storing it in the i-node is that each i-node must be the same size, meaning that we either have to make all of the i-nodes large even though most files will have very small ACLs or none at all, which wastes space, or that the ACLs will be very limited in size, which reduces the flexibility that they are supposed to provide.

An advantage of storing the ACL in the file's data block(s) over storing it in another block (or blocks) on the device where the file is stored is that the data block(s) are very likely to be read in when the file is accessed -- this means that, in this (common) case, there will be one fewer disk access than if the OS needed to first read another block for the ACL, then start reading the file.

C. Give one advantage of storing the ACL in another block (or blocks) on the device where the file is stored over each of the other possibilities.

6 pts.

An advantage of storing the ACL in another block (or blocks) on the device where the file is stored over storing it in the i-node is that each i-node must be the same size, meaning that we either have to make all of the i-nodes large even though most files will have very small ACLs or none at all, which wastes space, or that the ACLs will be very limited in size, which reduces the flexibility that they are supposed to provide.

An advantage of storing the ACL in another block (or blocks) on the device where the file is stored over storing the ACL in the file's data block(s) is that we don't have to worry about the complexity of using the same block for both the ACL and the data. This complexity would be reflected, for example, in where we start the data within a file: if we start it just after the ACL, then we need to move it when the ACL grows; if we leave space for the ACL to grow, then we are wasting space in the data block if the ACL doesn't grow; etc.

Question 5: Signals (20 points)

You find yourself working on a system that lacks the wait and waitpid system calls, yet you want to use wait(NULL) to cause a parent to do nothing until its child exits. (You don't care about the return value from wait(NULL) or removing zombies.) "You are only worried about the process coordination aspects of wait()." How could you get around this problem using the system calls you do have available to you? Be specific about what system calls you would use, in what order, and what other conditionals and control structures would be needed to implement this. (You do not need to write out the code itself, although you may do so if you would rather do this than explain things in words.)

[Note that the text above in red was written on the board and instructions were given orally to add it to your test where it is included above. The text above in blue was spoken aloud as additional information.]

When a child process exits, the OS will automatically send a SIGCHLD signal to its parent. By default this signal is ignored. However, if we establish a signal handler in the parent by using sigaction(), we can catch that signal. This will allow the parent to know when the child has exited.

We'll also need to have the parent "do nothing" until the child exits, just as would happen with wait(NULL). We can use pause() to accomplish this.

Using pause() in conjunction with catching the SIGCHLD signal means that the signal handling routine can be empty -- as long as the signal is caught after the parent calls pause(), simply returning from the signal handler will move us on to the next line in the code after pause().

The order will be:

1. The process will set up the signal handler -- create an empty function and put it in place with sigaction().

2. The process will fork().

3. The parent (the original process) will call pause() while the child carries out its task (e.g., calls an exec family system call, etc.).

(We really should account for the possibility that the SIGCHLD signal will be delivered before the parent pauses. Unfortunately, pause() gives us no way to make this atomic, so the best we can do with pause() and without wait() or waitpid() is to have the child sleep briefly to ensure that the parent has had a chance to call pause().)

(Some students tried to catch other signals, such as SIGUSR1 or SIGINT. This has two drawbacks. First, only children designed to work with this parent will send that signal, whereas we might want the child to exec some existing program. Second, if the child fails to send this signal before it exits (e.g., if it crashes), then the parent will never find out that the child has exited.)

(Some students also tried to have the parent use other methods of waiting besides pause(). The problem with most of these methods is that they produced busy-waits and/or caused the parent to wait much longer than needed to find out that the child had exited.)