Frequently Asked Questions - CS1323 Section 30, Spring 1997

Index to Questions
What to do if garbage collection fails
When I try to execute 'main' on the rainfall.dat file, I get the message, "Garbage collection fails to reclaim sufficient space." What causes this and how can I fix it?
Answer
Depending on how you described the computation in Project 5, it can take up a widely varying amount of space. If your program encounters a garbage collection failure, get out of Hugs and re-start it with this command:
hugs -h500000 yourProgramFile.lhs
That will give you five times the amount of space that you get with a normal invocation of Hugs.

Sorting a sequence of tuples
Is there any way to sort a sequence of tuples, [(String,Float)], using quicksort? Or do I just need to define my own tuple sorting function.
Answer
Use quickSortWith :: (a -> a -> Bool) -> [a] -> [a] As you can see from the type, it can sort any kind of sequence. In your invocation of quicksortWith, you provide, as the first argument, the function that compares two entries from the sequence to be sorted (tuples in this case) and delivers True iff the first argument should precede the second in the sorted sequence. For example, if you call your comparison function
firstTupleComesFirst :: (String, Float) -> (String, Float) -> Bool
then
firstTupleComesFirst ("abc", 1.3) ("def", 2.7) would be False,
assuming your criterion is that a tuple with a bigger Float component precedes one with a smaller Float component in a sorted sequence of tuples. (The quicksort function will also sort tuples of the type you're interested in, but it uses a default ordering, which probably won't conform to the ordering you're looking for.)

Normalization of data and standard deviations
What is data normalization? What is the standard deviation for a column of numbers?
Answer
Data normalization is a general term for alterations in data to make the data more amenable to accurate analysis for some purpose or other. The form that normalization takes varies with the nature of the data and the form of analysis.

In this case, each column of data represents several different measurements of the same parameter. These measured values vary more widely for some of these parameters than others, and the normalization seeks to make the variation in each column about the same as the variation in the others, but without disturbing ratios between pairs of measurements in the same column.

The statistic used in this case to calculate the variation in a collection of measurements (that is, a column of numbers) is "standard deviation," which is the square root of the average squared deviation from the mean of a collection of numbers. There is a function in NumericUtilities to compute this statistic, given a finite sequence of numbers. So, to compute the standard deviation of a column, package the numbers in the column as a sequence and apply the standard deviation function to it.

Note: There are two standard deviation functions in NumericUtilities. The one with "unbiased" in the name assumes that the measurements are samples from a Gaussian statistical distribution and adjusts the computation to avoid a bias towards underestimation that occurs (on the average) when the standard deviation of an infinite population is estimated from a finite collection of samples from that population.

Curried forms hard to read
When I'm reading a Haskell program, I find it difficult to decipher curried invocations unless I can clearly determine that the original form actually needs another argument. I can see how curried invocations eliminate a lot of excess variables floating around the program, but I have to keep a list of all the functions and their arguments to make it possible to understand the program when I try to read it. Is there a better way?
Answer
You're right to observe that there is no way to understand a curried invocation without knowing the type of the function. So, lists of functions and their types are needed. I don't believe there is any way around this.

You are also correct to observe that curried invocations avoid a clutter of variables. They do this by making it possible to to create functions without naming them.

For example, the curried invocation (take 3) denotes a function that delivers the three-element prefix of a sequence. Now, suppose you are defining a function that must extract a subsequence of its argument to make its computation. And, suppose you want the function to be abstract with respect to the subsequence it extracts. Your definition would look something like this (where a and/or b would be more specific types, or b could be a).

f :: ([a] -> [a]) -> [a] -> b
f extract xs = ... some formula involving extract and xs ...
If curried invocations weren't available to denote functions anonymously, you'd have to name a different extraction function for every different invocation of f:
take3 -- the function that delivers the three-element prefix
take5 -- five-element prefix
drop2 -- suffix without first two elements
evOth -- subsequence consisting of every other element
... etc, ad nauseam ...
So, curried invocations avoid this kind of clutter, at the price of requiring people who read Haskell programs to maintain a list of the types of the functions invoked in the program. Seems like a good trade-off to me, but you'll have a chance to see the trade-off for yourself when you write programs in C, C++, Visual Basic, Java, Ada, Fortran 90, PowerBuilder, Pascal, Cobol, PL/I, or any of myriad other languages whose expressions cannot deliver functions for which the program has not, at some point, specified a name.

Project 3B clarification
Should the function "justify" put spaces in between and around the words of the sequence or between each letter of the words?
   ws = ["One", "Two", "Three"]
   justify 16 ws = "One  Two  Three" (word space space word space space word)
   justify 16 ws = " One Two Three " (space word space word space word space)
   justify 16 ws = "O n e T w o  ..." (spaces between the letters of the strings)
Which should it be?
Answer
Put spaces between the words, as in the first example in the question, not at the beginning or end of the line or between letters within words.

The meaning of "predicate"
What part of the function does the word "predicate" refer to? This seems to be odd syntax. In English grammar it is the part of a sentence that contains the verb and associated words.
Answer
The sense of the word "predicate" when it is used to describe grammatical constructs in Haskell is the meaning listed first in Merriam Webster's Collegiate Dictionary, Tenth Edition (1993):
predicate
1 a: something that is affirmed or denied of the subject in a proposition in logic
b: a term designating a property or relation
The first argument of takeWhile, for example, is a predicate. That argument must be a logical function (a function that delivers a value of type Bool) that affirms or denies a property of its argument, which will be an element of the sequence supplied as the second argument of takeWhile. Other functions that have predicates as arguments include dropWhile, span, break, and iterate. A guard in a list comprehesion is also a predicate.

Here is a nitpick with the questioner's phrasing, but a nitpick with a point that is relevant to computing science: This questioner refers to this use of predicate as "odd syntax." In fact, the question has to do with the meaning of the word "predicate". This is a semantic question. A syntactic question might have brought up an issue related to the part of speech taken by the word "predicate." A noun of any meaning could be substituted for "predicate" in the sentences where it occurred without affecting the grammatical correctness of those sentences in standard English. Computing science has had much more success dealing with issues of syntax than with issues of semantics. Semantics seems to be a much more difficult issue to put into an effective scientific framework.

takeWhile, predicates, operator sections, and where clauses
I would like to use takeWhile to return a sequence of strings, from a sequence of strings. The predicate that I want to use is that I don't want the length of the strings to exceed a certain integer. I tried the following setup:
  lastFit :: Int -> [String] -> String
  lastFit n ws = takeWhile (n >= length) ws
and got the following response:
ERROR "d:\school\cs1323\proj3\tester.lhs" (line 13): Type error in application
*** expression     : takeWhile (n >= length) ws
*** term           : n >= length
*** type           : Bool
*** does not match : a -> Bool    
I take this to mean that the function wants a Boolean predicate and I'm giving it a function that generates one. Am I right? Can I fix this? How?
Answer
First, I hope what you want to do is to deliver a prefix of a sequence of strings (supplied as the second argument of takeWhile). That's what takeWhile does. It delivers all elements of a sequence up to, but not including, the first one that doesn't pass the test specified in the predicate (supplied as the first argument of takeWhile).

Your definition

  lastFit :: Int -> [String] -> [String]
  lastFit n ws = takeWhile (n >= length) ws
has the right idea, just not quite the right syntax. (Question: you did mean to deliver a sequence of strings, right? Not a single string, as indicated in the type specification part of your definition. If you meant to deliver a single string, I would guess from the name you've given the function that you mean it to be the last short one. To get this string, you can simply apply the function "last" to prefix that the above function is trying to deliver.)

The formula (n >= length) has something wrong with it. What's wrong is this: the right hand operand of the >= operator must be a number, but in your formula it's not a number. It's a function:

   length :: [a] -> Int
The length function delivers the number of elements in it argument, which must be a sequence (well, a string in this case, since that's the type you've given lastFit).

You've tried to create a predicate for takeWhile that accepts short strings and rejects long ones (n being the boundary between long and short). Here are a few ways to do that:

Way 1
Define a function in a where clause that takes the value True if its argument is a short string, and use that function as the predicate:

  lastFit n ws = takeWhile short ws
    where
    short w  =  n >= length w
Way 2
Compose the operator-section (n >=) with the function length:
   lastFit n ws = takeWhile ((n >=) . length) ws
This formula for the predicate says to first apply the length function to a string from the parameter ws (this delivers a number), then apply the operator-section (n >=) to that number. The operator-section will deliver True if n exceeds (or is equal to) its argument (which is the value delivered by length).

Way 3
Way 1 is equivalent ot Way 2. In fact, the definition of the function short in Way 1 could have taken the same form as the predicate in Way 2:

  lastFit n ws = takeWhile short ws
    where
    short = (n >=) . length
Way 4
The definition of short (in both Way 1 and Way 3) illustrates an important aspect of where clauses: that the formulas defining entities in where clauses may refer to parameters of the function whose definition contains the where clause. You don't have to use a where clause to define the predicate (even if you do want to have a name for the predicate, as in Ways 1 and 3, rather than leaving the predicate anonymous, as in Way 2). But, if you don't use a where clause, then you need to supply another argument to the predicate to tell it how short you want the sequences to be when lastFit includes them in the sequence of strings it delivers.
  lastFit n ws = takeWhile (shorterThan n) ws
 
  shorterThan :: Int -> String -> Bool
  shorterThan n w = n >= length w
In Way 4, the predicate in the definition of lastFit is denoted by a curried invocation of the function shorterThan.

You brought up a good example. There are lot's of good lessons in it.

Building assembly lines
What does it mean to put the composition operator (.) between two sequences? Isn't that what you're doing in a formula like
ownersOfAll carMakes = foldr1 (.) [ownersOf make | make <- carMakes]
Answer
Yes, the formula does put the composition operator between elements of a sequence, but those elements are not, themselves, sequence.

If, for examples, carMakes = [a, b, c, d], then
foldr1 (.) [ownersOf make | make <- carMakes] = (ownersOf a) . (ownersOf b) . (ownersOf c) . (ownersOf d)

The curried invocation (ownersOf x) delivers a function that extracts, from a sequence of owners, those who own a car of make x.

This construction is also used to build the function removePunctuation in the Haskell text. It's a powerful notion -- a way to build an assembly line of filters that, in principle at least, could be operating concurrently. This is the fundamental idea behind pipelined computing hardware - pipelines being one way to squeeze extra speed out of a given class of computing technology. The idea has been used in hardware design for over two decades. It reaches a massive level of implementation in vector-oriented supercomputers (e.g., Cray computers), and it's also one of the key ideas in RISC processors (most of the chips used as processing elements in workstations are RISC processors).

Getting started on the ownerZipcodes function
I've been thinking for three hours about how to define the ownerZipcodes function, and I still don't have any idea where to start. Can you give me a hint?
Answer
One step towards putting together the ownerZipcodes function is to write a function that removes certain elements from the car data sequence (the list whose elements are owner/address/... information). Namely, it must remove owners who don't own all of the cars in a particular list of car makes (e.g., ["Ford", "Honda"]). Suppose you could write a function that could remove those owners who didn't own car make x ("Ford", say) -- that is, a function that filters out the owners who don't own a car of make x. Then, you could construct a function that removes owners who don't own all of the cars in a list by following the strategy that the removePunctuation function in Lesson 6 uses to build a more selective filter from a bunch of less selective ones.

Downloading software from the Web to a Windows system
Hey! I did what you said, and I didn't get a copy of the SequenceUtilities.lhs file. What do I do now?
Answer
The original write-up for Individual Project 1 failed to take some deficiencies of Windows systems into account. The problem has to do with long file names. Additional instructions for downloading files from the CS1323h Web site on a Windows system have now been included in the write-up.

Reading email that resides on UC&TS from a foreign system.
My email resides on my OU account, but I'm using another computing system right now. How can I get my email?
Answer
Get into Netscape and click on the email icon. Then pull down the options menu, and choose the Mail and News Preferences option. On the preferences page, click on the servers tab.
Under the servers tab:
change the incoming server to pop.ou.edu
and change the user name to your ECN userID
Click the OK button, and you'll be up and running. Note: If you're doing this from your own computing system, you only have to do it once. However, if you're doing it in a lab, you'll have to do it every time you login (because lab computers erase information specific to your accounts when you log out).

Reading email that resides on ECN from a foreign system.
My email resides on my ECN account, but I'm using another computing system right now. How can I get my email?
Answer
Get into Netscape and click on the email icon. Then pull down the options menu, and choose the Mail and News Preferences option. On the preferences page, click on the servers tab.
Under the servers tab:
change the incoming server to mailhost.ecn.ou.edu
and change the user name to your ECN userID
Click the OK button, and you'll be up and running. Note: If you're doing this from your own computing system, you only have to do it once. However, if you're doing it in a lab, you'll have to do it every time you login (because lab computers erase information specific to your accounts when you log out).

Where are my grades?
You said I'd be able to use the "Individual Grades" hyperlink from the CS1323 Honors Welcome Page on the web to find a summary of my grades. They don't seem to be there. Why is that?
Answer
Your grade summary should be identified by the four-digit personal code that you supplied to the instructor. If your code doesn't appear in the list, send the instructor an email note containing your code.

Not getting email from instructor? Use .forward
You said I would be getting a lot of email from you, but I haven't gotten any. Why is that?
Answer
Maybe it's going to the wrong place. Did you let the instructor know where you wanted to receive email? If not, it's going to your ECN account. To receive it where you really want it, do this:

Or, maybe the instructor recorded the wrong address. Send him a note with your address, and ask him to check.


CS 1323 Section 30 - Fundamentals of Computer Programming - Spring 1997
Instructor: Rex Page (Email: page@ou.edu)
Up to: Welcome Page ~~~ Up to: University of Oklahoma ~~~ Go to: Next Semester ~~~ Go to: Previous Semester
Last Modified: Friday, 02-Dec-2005 11:43:34 CST