> module Project3A(linePackets) > where > import SequenceUtilities(takeUntil, prefixes) linePackets separates the supplied sequence of strings (words) into several sequences (packets) of words, each of which would form a line with length as close as possible to the supplied limit linePackets k [w1, w2, ..., wn] = [p1, p2, ..., pm] such that: 1. all supplied words are contained, in the same order, in the delivered packets -- that is: concat[p1, p2, ..., pm] = [w1, w2, ..., wn] 2. each delivered packet is as long as it can be without exceeding the supplied limit for line length -- that is: for any packet p except the last packet (that is, pm) length(unwords p) <= k and length(unwords(p ++ [w]) > k, where w is the first word in the next packet after p 3. exception: any word whose length exceeds the line length limit will comprise its own packet examples: linePackets 12 ["The", "purpose", "of", "computing"] = [["The", "purpose"], ["of computing"]] linePackets 6 ["is", "insight,", "not", "numbers."] = [["is"], ["insight,"], ["not"], ["numbers."]] linePackets 6 ["insight,", "not", "numbers."] = [["insight,"], ["not"], ["numbers."]] linePackets 14 (repeat "blah") = repeat["blah", "blah", "blah"] > linePackets :: Int -> [[a]] -> [[[a]]] > linePackets lengthLimit ws = > [packet | (packet, whatsLeft) <- takeUntil (null . snd) splits] > where > splits = iterate (glomNextPacket lengthLimit) firstSplit > firstSplit = glomNextPacket lengthLimit ([ ], ws) glomNextPacket splits the second component of its argument, which is a sequence of strings (words), into 1. its longest prefix that, when expanded by unwords, has a length that does not exceed a given limit and 2. the suffix associated with that prefix glomNextPacket ignores the first component of its argument examples: glomNextPacket 12 (dontCare, ["The", "purpose", "of", "computing"]) = [["The", "purpose"], ["of", "computing"]] glomNextPacket 6 (dontCare, ["is", "insight,", "not", "numbers."]) = [["is"], ["insight,", "not", "numbers."]] glomNextPacket 6 ["insight,", "not", "numbers."] = [["insight,"], ["not", "numbers."]] > glomNextPacket :: Int -> ([[a]], [[a]]) -> ([[a]], [[a]]) > glomNextPacket lengthLimit (previousPacket,ws) = (packet,restOfws) > where > packet = nextPacket lengthLimit (prefixes ws) > restOfws = drop (length packet) ws given a sequence of sequences of strings, nextPacket delivers the last of those sequences of strings that, when expanded by unwords, doesn't exceed a given length limit (or, if there is no such sequence, nextPacket delivers the first sequence in its argument) > nextPacket :: Int -> [[[a]]] -> [[a]] > nextPacket lengthLimit possiblePackets = > last([[ ]] ++ take 1 possiblePackets ++ > takeWhile lengthLimitNotExceeded (drop 1 possiblePackets)) > where > lengthLimitNotExceeded packet = > sum[length w | w <- packet] + length(packet)-1 <= lengthLimit Implementation Note: The intended use of linePackets would give it the following type: linePackets :: Int -> [String] -> [[String]] The (more general) declared type allows greater applicability. I don't see any need for this generality, but the only complication it causes in the implementation is the definition of the function lengthLimitNotExceeded, which could used the simpler formula (length . unwords) packet <= lengthLimit if the more restrictive type for linePackets had been used. A Property of the Function linePackets: if (1) ps = linePackets k ws and (2) p is an element of ps and (3) sum[length w | w <- p] + length(p) - 1 > k, then length(p) = 1 proof: From the definition of linePackets, it is apparent that each element of ps is the first component of a tuple delivered by (glomNextPacket k). Therefore, it is sufficient to show that if (p, _) = glomNextPacket k (_, [w1, w2, ...]) and length(unwords p) > k, then length(p) = 1 (the values of the components marked with an underscore are irrelevant). The definition of glomNextPacket says that p = nextPacket k (prefixes [w1, w2, ...]) and the specification of prefixes says that p = nextPacket k [[w1], [w1, w2], [w1, w2, w3], ...] The definition of nextPacket implies that p = nextPacket k [[w1], [w1, w2], [w1, w2, w3], ...] = last([[w1]] ++ wss) where each element ws in wss has the property that sum[length w | w <- ws] + length(ws) - 1 <= k The elements of [[w1]] ++ wss are [w1] and the elements of wss. Therefore, p, being the last of those elements, must either be [w1] or an element of wss. If the latter is the case, then sum[length w | w <- p] + length(p) - 1 <= k But, this violates assumption (3). Therefore, it must the the case that p = [w1], which means that length(p) = length[w1] = 1 because [w1] is a sequence with exactly one element, namely w1. QED