%
% $Header: /home/cvs/root/haskell-report/report/literate.verb,v 1.5 2002/12/02 14:53:30 simonpj Exp $
%
%**The Haskell 98 Report: Literate Comments
%**~header
\subsection{Literate comments}
\label{literate}
\index{literate comments}

The ``literate comment'' convention, first developed by Richard Bird
and Philip Wadler for Orwell, and inspired in turn by Donald Knuth's
``literate programming'', is an alternative style for encoding
\Haskell{} source code.
The literate style encourages comments by making them the default.
Literate \Haskell{} comes in more than one style; mixing styles in a
single file is permissible but not advisable.

The literate program text is split into lines; then, lines between
@\begin{code}@$\ldots$@\end{code}@ delimiters that are on lines of
their own (LaTeX style), as well as lines in which ``@>@'' is the
first character (Bird style), are treated as part of the program;
all other lines are comment.
Program lines that begin with ``@>@'' are translated by replacing the
leading ``@>@'' with a space.
Layout and comments apply exactly as described in
Appendix~\ref{syntax} in the resulting text.

To capture some cases where one omits an ``@>@'' by mistake, it is an
error for a program line to appear adjacent to a non-blank comment
line, where a line is taken as blank if it consists only of
whitespace.
(@\begin{code}@ and @\end{code}@ are considered blank lines for this
purpose.)

By convention, the style of comment is indicated by the file
extension, with ``@.hs@'' indicating a usual \Haskell{} file and
``@.lhs@'' indicating a literate \Haskell{} file.
(Normal comments are still permissible in the program sections of
literate \Haskell{} code.)
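
As an illustration of the Bird-style rule alone, the translation
might be sketched as follows.
This is a minimal sketch, not a definitive implementation: the names
@unlitBirdLine@ and @unlitBird@ are hypothetical, LaTeX-style blocks
and the adjacency check are deliberately left out, and it assumes
that each comment line is replaced by an empty line so that line
numbers are preserved.
\bprog
@
-- Hypothetical helpers; a sketch of the Bird-style rule only.
-- Assumption: comment lines become empty lines, so that the
-- output has exactly as many lines as the input.
unlitBirdLine :: String -> String
unlitBirdLine ('>' : rest) = ' ' : rest  -- program line: '>' becomes a space
unlitBirdLine _            = ""          -- any other line is comment

unlitBird :: String -> String
unlitBird = unlines . map unlitBirdLine . lines
@
\eprog
Under these assumptions, @unlitBird "A comment.\n\n> x = 1\n"@
evaluates to @"\n\n  x = 1\n"@: the comment line and the blank line
become empty lines, and the program line keeps its column positions
because the ``@>@'' is replaced by a space rather than removed.
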

Using the Bird style, a simple factorial program would be:
\bprog
@
This literate program prompts the user for a number
and prints the factorial of that number:

> main :: IO ()
> main = do putStr "Enter a number: "
>           l <- getLine
>           putStr "n!= "
>           print (fact (read l))

This is the factorial function.

> fact :: Integer -> Integer
> fact 0 = 1
> fact n = n * fact (n-1)
@
\eprog

Another program, in the LaTeX style:
\bprog
@
\documentstyle{article}

\begin{document}

\section{Introduction}

This is a trivial program that prints the first 20 factorials.

\begin{code}
main :: IO ()
main =  print [ (n, product [1..n]) | n <- [1..20]]
\end{code}

\end{document}
@
\eprog

It is permissible, but not advisable, to put a substantive literate
comment within a single lexical element, namely a block comment or a
string gap.
Implementations may accept such programs at their discretion and are
not required to diagnose them; accepting them keeps the translation
from literate to plain \Haskell{} parametric, while avoiding them
allows every literate comment to be translated to an ordinary
\Haskell{} comment regardless of its contents.

Note that string literals have no meaning at this stage of
translation; in the snippet
\bprog
@
foo = "hello\
\end{code}"
@
\eprog
the @\end{code}@ line is interpreted as such (and is in error because
of the text @"@ at its end).
However, the program is fine if the line does not begin with the
offending marker:
\bprog
@
foo = "hello\
  \end{code}"
@
\eprog

Note that although literate \Haskell{} files containing no code are
not expressly forbidden, they translate to the equivalent of
@module Main(main) where {}@, which is in error.

The following declarative syntax is overly permissive in that it
permits comment lines next to program lines.

Over multiple lines:
@@@
file     -> \{ literateCommentLine | birdProg | texBlock \}
texBlock -> beginCode \{texProg\} endCode
@@@

Over single lines:
[should bad encoding be forbidden?
literateCommentLine's definition seems to do so, but then so do
comment and ncomment's current definitions]
[I could use \{space | tab\} instead of \{any_{\langle{}graphic\rangle}\}]
[beginCode and endCode could have \{any_{\langle{}graphic\rangle}\} deleted]
@@@
literateCommentLine      -> \{any\}_{\langle{}birdProg | possibleLhsTexCmd\rangle}
blankLiterateCommentLine -> \{any_{\langle{}graphic\rangle}\}
birdProg                 -> @>@ \{any\}
texProg                  -> \{any\}_{\langle{}possibleLhsTexCmd\rangle}
possibleLhsTexCmd        -> (@\begin{code}@ | @\end{code}@) \{any\}
beginCode                -> @\begin{code}@ \{any_{\langle{}graphic\rangle}\}
endCode                  -> @\end{code}@ \{any_{\langle{}graphic\rangle}\}
@@@

A Haskell inductive/imperative-style definition:
\bprog
@
import Maybe (isJust, listToMaybe)
import Char (isSpace)

type LhsString = String
type HsString = String

literateCommentLine, emptyLiterateCommentLine, nonEmptyLiterateCommentLine,
 possibleLhsTexCmd, beginCode, endCode
   :: LhsString -> Bool
birdProg, texProg
   :: LhsString -> Maybe HsString

literateCommentLine l = not (isJust (birdProg l) || possibleLhsTexCmd l)
emptyLiterateCommentLine = all isSpace
nonEmptyLiterateCommentLine l = not (emptyLiterateCommentLine l
                                  || isJust (birdProg l)
                                  || possibleLhsTexCmd l)
birdProg ('>':l') = Just (' ':l')
birdProg _ = Nothing
texProg l = if possibleLhsTexCmd l then Nothing else Just l
possibleLhsTexCmd l = isJust (takeBegin l) || isJust (takeEnd l)
beginCode l = fmap (all isSpace) (takeBegin l) == Just True
endCode l = fmap (all isSpace) (takeEnd l) == Just True

takeBegin, takeEnd :: LhsString -> Maybe String
takeBegin = (`stripPrefix` "\\begin{code}")
takeEnd = (`stripPrefix` "\\end{code}")

-- I really need to get around to proposing this for the standard libraries
-- , says Ian Lynagh
stripPrefix :: Eq a => [a] -> [a] -> Maybe [a]
xs `stripPrefix` [] = Just xs
[] `stripPrefix` _ = Nothing
(x:xs) `stripPrefix` (y:ys) | x == y = xs `stripPrefix` ys
                            | otherwise = Nothing

badAdjacentLines :: LhsString -> LhsString -> Bool
badAdjacentLines l1 l2 = bad l1 l2 || bad l2 l1
  where bad a b = isJust (birdProg a) && nonEmptyLiterateCommentLine b

--[should I mind using pattern guards?]
--[except for errors, I always return as many lines as input,
-- to preserve line numbering and such; is there a good
-- higher-order function to express the following pattern?]
file, texBlock :: [LhsString] -> [HsString]
file [] = []
file (l:ls)
  | (fmap (badAdjacentLines l) (listToMaybe ls)) == Just True
                           = error "literate comment adjacent to code"
  | Just l' <- birdProg l  = l' : file ls
  | literateCommentLine l  = "" : file ls
  | beginCode l            = "" : texBlock ls
  | endCode l              = error "\\end{code} not in code block"
  | otherwise              = error "{code} followed in line by visible text"
texBlock (l:ls)
  | Just l' <- texProg l   = l' : texBlock ls
  | beginCode l            = error "\\begin{code} inside code block"
  | endCode l              = "" : file ls
  | otherwise              = error "{code} followed in line by visible text"
texBlock []                = error "\\end{code} expected"

translateFile :: LhsString -> HsString
translateFile = unlines . file . lines
@
\eprog

%**~footer