%
% $Header: /home/cvs/root/haskell-report/report/literate.verb,v 1.5 2002/12/02 14:53:30 simonpj Exp $
%
%**
The Haskell 98 Report: Literate Comments
%**~header
\subsection{Literate comments}
\label{literate}
\index{literate comments}
The ``literate comment''
convention, first developed by Richard Bird and Philip Wadler for
Orwell, and inspired in turn by Donald Knuth's ``literate
programming'', is an alternative style for encoding \Haskell{} source
code.
The literate style encourages comments by making them the default.
There are two styles of literate Haskell, which it is permissible
but not advisable to mix.
The program text is split into lines; then
lines between @\begin{code}@$\ldots$@\end{code}@ delimiters,
each on a line of its own (LaTeX style),
as well as lines in which ``@>@'' is the first character (Bird style),
are treated as part of the program; all other lines are comment.
The lines that are part of the program because they begin with ``@>@''
are translated by replacing the leading ``@>@'' with a space.
Layout and comments apply
exactly as described in Appendix~\ref{syntax} in the resulting text.
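As a rough illustration of the Bird-style rule only, the following sketch
translates a script line by line. The names @unlitBirdLine@ and @unlitBird@
are hypothetical and not part of the report; the sketch ignores LaTeX-style
blocks and performs no error checking. Replacing ``@>@'' with a space, rather
than deleting it, keeps every token in its original column, so layout is
unaffected.

```haskell
-- Hypothetical helper, not part of the report: translate one line
-- of a Bird-style literate script.
unlitBirdLine :: String -> String
unlitBirdLine ('>' : rest) = ' ' : rest  -- program line: '>' becomes a space
unlitBirdLine _            = ""          -- comment line: emptied, line count kept

-- Translate a whole Bird-style script, line by line.
unlitBird :: String -> String
unlitBird = unlines . map unlitBirdLine . lines
```

Emitting an empty line for each comment line preserves line numbering in the
translated text.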
To capture some cases where one omits an ``@>@'' by mistake, it is an
error for a program line to appear adjacent to a non-blank comment line,
where a line is taken as blank if it consists only of whitespace.
(@\begin{code}@ and @\end{code}@ are considered blank lines for this purpose.)
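The adjacency rule can be sketched as a check over neighbouring lines.
This is a minimal sketch that assumes a purely Bird-style script (it does not
treat @\begin{code}@ and @\end{code}@ specially); the names @LineKind@,
@classify@ and @adjacencyErrors@ are hypothetical.

```haskell
import Data.Char (isSpace)

-- Hypothetical classification of the lines of a Bird-style script.
data LineKind = Program | Blank | Comment
  deriving (Eq, Show)

classify :: String -> LineKind
classify ('>' : _) = Program
classify l
  | all isSpace l = Blank      -- only whitespace (or empty)
  | otherwise     = Comment    -- non-blank comment line

-- For each pair of neighbouring lines in which a program line touches a
-- non-blank comment line, report the 1-based number of the first line.
adjacencyErrors :: [String] -> [Int]
adjacencyErrors ls =
  [ n | (n, a, b) <- zip3 [1 ..] kinds (drop 1 kinds), clash a b ]
  where
    kinds = map classify ls
    clash Program Comment = True
    clash Comment Program = True
    clash _       _       = False
```

A script passes the check when the result is the empty list; separating
comment text from program lines with a blank line is always sufficient.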
By convention, the style of comment is indicated by the file
extension, with ``@.hs@'' indicating a usual Haskell file and
``@.lhs@'' indicating a literate Haskell file.
(Normal comments are still permissible in the program sections
of literate Haskell code.)
Using the Bird style, a
simple factorial program would be:
\bprog
@
This literate program prompts the user for a number
and prints the factorial of that number:

> main :: IO ()
> main = do putStr "Enter a number: "
>           l <- getLine
>           putStr "n!= "
>           print (fact (read l))

This is the factorial function.

> fact :: Integer -> Integer
> fact 0 = 1
> fact n = n * fact (n-1)
@
\eprog
Another program, in the LaTeX style:
\bprog
@
\documentstyle{article}
\begin{document}
\section{Introduction}
This is a trivial program that prints the first 20 factorials.
\begin{code}
main :: IO ()
main = print [ (n, product [1..n]) | n <- [1..20]]
\end{code}
\end{document}
@
\eprog
It is permissible, but not advisable, to put a substantive literate
comment within a single lexical element, namely a block comment or
a string gap.
It is permissible for the sake of parametricity in translating from literate
to plain \Haskell{}: implementations are not required to diagnose it.
It is not advisable because a literate comment should be translatable
to an ordinary \Haskell{} comment regardless of its contents.
Note that string literals are not meaningful at this point; in the snippet
\bprog
@
foo = "hello\
\end{code}"
@
\eprog
the @\end{code}@ line is interpreted as a delimiter (and is in error because of the
text @"@ at its end). However, it is fine if the line does not begin with
the marker:
\bprog
@
foo = "hello\
   \end{code}"
@
\eprog
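The point of the examples above is that only the start of the raw line
matters. A minimal sketch of that test, assuming a delimiter is recognised
purely by a prefix match on the line before any lexical analysis takes place;
@looksLikeEndCode@ is a hypothetical name.

```haskell
import Data.List (isPrefixOf)

-- Hypothetical helper: does this raw line begin with the \end{code} marker?
-- No lexing has happened yet, so whether the marker sits inside a string
-- literal or a string gap is irrelevant.
looksLikeEndCode :: String -> Bool
looksLikeEndCode = ("\\end{code}" `isPrefixOf`)
```

Under this test a line consisting of the marker followed by @"@ is taken as a
delimiter, whereas the same text preceded by whitespace is an ordinary
program line.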
Note that although literate Haskell files containing no code
are not expressly forbidden, they translate to the equivalent of
@module Main(main) where {}@, which is in error.
The following declarative syntax is overly permissive, in that it permits comment
lines next to program lines.
Over multiple lines:
@@@
file -> \{ literateCommentLine | birdProg | texBlock \}
texBlock -> beginCode \{texProg\} endCode
@@@
Over single lines:
[should bad encoding be forbidden? literateCommentLine's definition seems
to do so, but then so do comment and ncomment's current definitions]
[I could use \{space | tab\} instead of \{any_{\langle{}graphic\rangle}\}]
[beginCode and endCode could have \{any_{\langle{}graphic\rangle}\} deleted]
@@@
literateCommentLine -> \{any\}_{\langle{}birdProg | possibleLhsTexCmd\rangle}
blankLiterateCommentLine -> \{any_{\langle{}graphic\rangle}\}
birdProg -> @>@ \{any\}
texProg -> \{any\}_{\langle{}possibleLhsTexCmd\rangle}
possibleLhsTexCmd -> (@\begin{code}@ | @\end{code}@) \{any\}
beginCode -> @\begin{code}@ \{any_{\langle{}graphic\rangle}\}
endCode -> @\end{code}@ \{any_{\langle{}graphic\rangle}\}
@@@
A recursive \Haskell{} definition of the translation:
\bprog
@
import Maybe (isJust, listToMaybe)
import Char (isSpace)
type LhsString = String
type HsString = String
literateCommentLine, emptyLiterateCommentLine, nonEmptyLiterateCommentLine,
  possibleLhsTexCmd, beginCode, endCode :: LhsString -> Bool
birdProg, texProg :: LhsString -> Maybe HsString
literateCommentLine l = not (isJust (birdProg l) || possibleLhsTexCmd l)
emptyLiterateCommentLine = all isSpace
nonEmptyLiterateCommentLine l =
  not (emptyLiterateCommentLine l || isJust (birdProg l) || possibleLhsTexCmd l)
birdProg ('>':l') = Just (' ':l')
birdProg _ = Nothing
texProg l = if possibleLhsTexCmd l then Nothing else Just l
possibleLhsTexCmd l = isJust (takeBegin l) || isJust (takeEnd l)
beginCode l = fmap (all isSpace) (takeBegin l) == Just True
endCode l = fmap (all isSpace) (takeEnd l) == Just True
takeBegin, takeEnd :: LhsString -> Maybe String
takeBegin = (`stripPrefix` "\\begin{code}")
takeEnd = (`stripPrefix` "\\end{code}")
-- I really need to get around to proposing this for the standard libraries
-- , says Ian Lynagh
stripPrefix :: Eq a => [a] -> [a] -> Maybe [a]
xs `stripPrefix` [] = Just xs
[] `stripPrefix` _ = Nothing
(x:xs) `stripPrefix` (y:ys)
  | x == y    = xs `stripPrefix` ys
  | otherwise = Nothing
badAdjacentLines :: LhsString -> LhsString -> Bool
badAdjacentLines l1 l2 = bad l1 l2 || bad l2 l1
  where bad a b = isJust (birdProg a) && nonEmptyLiterateCommentLine b
--[should I mind using pattern guards?]
--[except for errors, I always return as many lines as input,
-- to preserve line numbering and such; is there a good
-- higher-order function to express the following pattern?]
file, texBlock :: [LhsString] -> [HsString]
file [] = []
file (l:ls)
  | (fmap (badAdjacentLines l) (listToMaybe ls)) == Just True
      = error "literate comment adjacent to code"
  | Just l' <- birdProg l = l' : file ls
  | literateCommentLine l = "" : file ls
  | beginCode l = "" : texBlock ls
  | endCode l = error "\\end{code} not in code block"
  | otherwise = error "{code} followed in line by visible text"
texBlock (l:ls)
  | Just l' <- texProg l = l' : texBlock ls
  | beginCode l = error "\\begin{code} inside code block"
  | endCode l = "" : file ls
  | otherwise = error "{code} followed in line by visible text"
texBlock [] = error "\\end{code} expected"
translateFile :: LhsString -> HsString
translateFile = unlines . file . lines
@
\eprog
%**~footer