(x )(y ) (x )(y z) (x z)(y ) (x z)(y z)
Modules
It’s about time to add support for modules: our last compiler is almost 2000 lines of code.
Party
Continuing our RPG analogy, we liken modules to parties of heroes. Even in single-player games, we often direct a group of specialists rather than one powerful multi-talented being. Perhaps we enjoy observing teamwork, or following gaming traditions.
But perhaps we also like parties for the same reasons we decompose a large problem into digestible subproblems. For example, we might only need to think about the intricacies of magic spells when controlling the wizard character.
As usual, our first stab has limitations. This party is just getting started.
-
To avoid dealing with files, our compiler reads the concatenation of all the modules on standard input.
-
To keep parser changes minimal, all symbols are exported, and all module imports are unqualified.
-
Fixity declarations must precede all occurrences of their corresponding operators in the standard input.
-
At most one module should declare FFI imports.
-
Cyclic module dependencies cause an infinite loop.
On the other hand, we proudly support import
statements anywhere in the file,
and multiple modules in the same file.
In fact, this is why fixity declarations must appear first in the input. GHC insists on one module per file with imports appearing before other declarations, hence its parser can process imports before reaching any expressions and determine the fixity of any operators that appear when it later reaches them. With our scheme, we may encounter an operator in an expression before learning its fixity, which confuses our simple parser. In a later compiler we’ll address this issue.
We tweak the parser to support module
and import
, and add a new field to
Neat
that hold the imports of a module. A module with no explicit module
declaration is taken to be the Main
module. Concatenation implies such a
module would have to appear first.
We add a new Link
field to the Extra
data type, which holds the module,
symbol, and type of a top-level variable defined in another module. During
inference, we replace a V
field with a Link
field if we find it is exported
from one of the imported modules.
We introduce a one-off Dep
monad because we lack monad transformers, and
would like a combination of the Either
and State
monads when finding the
dependencies of a definition.
Up until now, all symbols were global across a single file. As we Scott-encoded
ADTs and generated types and selector functions for typeclass methods, we
simply threw them on a big pile in a Neat
value being passed around.
Modules force us to be more careful.
We invent a special module "#" preloaded with built-in definitions required by the Haskell syntax we support:
-
The unit type and value
()
is part of the language. -
If expressions and guards require
Bool
,True
, andFalse
. -
Pairs are part of the language, even though suitably defined ADTs could be used instead (the IOCCC edition of our compiler does this to save room). Curiously, Haskell has no built-in type for the dual of pairs; requiring the programmer to define
Either
. -
Lists are famously built into Haskell.
-
String literals require lists.
-
We compile recursive let definitions with
fix
. -
Operations involving native integer types:
chr ord intAdd intMul
and so on. -
Primitives for IO monad methods.
-
The RTS reduces
fail#
on failed case matches.
Each module implicitly imports this special "#" module, so these built-in primitives are accessible to all.
This is a good time to mention that rewriting means:
-
Ranges become expressions involving
enumFromTo
andenumFrom
. -
Failed pattern matches are
undefined
. -
We need
pure
(forpure ()
) and>>=
to supportdo
notation. -
Patterns containing integer and character literals require
(==)
. -
List comprehensions are expressed in terms of
concatMap
andpure
.
None of these are built-in; they must be explicitly defined at the top level if
these language features are used. The last of these implies we must define an
Applicative
instance for lists. To remove these gotchas, we could define
low-level primitives as we do for the others.
Code generation now has two phases. The first corresponds to GHC incrementally
compiling a module: it resolves all locally defined symbols, and leaves Link
values indicating where to put addresses of symbols defined elsewhere. The
generated code is not position-independent; rather, for each module, we are
given the current heap pointer, and return an updated heap pointer.
The second phase replaces all Link
values with heap addresses, as all
entities are in the heap by this point.
Modules make us regret older expedient decisions regarding typeclasses.
We threw default method definitions in one data structure, and lumped together
method type signatures and instances in another. But now we might find a
typeclass definition in one module, and an instance of it in another, so our
code that searches imports for this information is messy. For example,
the fillSigs
helper raids other modules for the types of methods.
We had previously substituted the syntax trees for default method
implementations straight into instances. If we one day want incremental
compilation, then it is likely easier to compile a default implementation
once, then access it from other modules via a layer of indirection. With this
in mind, for each method foo
, we generate a method called {default}foo
whose body is the default method implementation of foo
if given, and
fail#
otherwise.
Since we toss dictionary selector functions on to a big pile of ordinary
definitions, to find the type of a method we add typeOfMethod
, whose logic is
similar to findImportSym
, but differs enough that we implement it
independently.
We modify the code to insert dictionaries one strongly-connected-component at a time rather than one function at a time. This is required to correctly compile mutually recursive functions that use typeclasses. Each function of the component may wind up calling any other, so it needs all the relevant dictionaries.
Up until now we had preserved topological order of the top-level definitions as they made their way through our compiler. We change the code generator so it no longer needs this precondition, so that we can store compiled functions and modules in maps rather than delicately manicured lists.
We introduce a single combinator to act as BK
which frequently occurs due to
Scott encoding.
If B K x y z = x y
is reduced individually, our virtual machine allocates
a new app-cell for K (x y)
, only to immediately rewrite it as I (x y)
,
which again must be reduced to yield x y
at last. The BK
combinator
avoids this needless song and dance.
A dedicated BK
combinator is also aesthetically pleasing. Consider some
three-argument combinator given x y z
. We can leave x
alone or apply it to
z
, and similarly for y
, and then apply the first thing to the second:
The last 3 are the B C S
combinators. The first one is BK
.
Smullyan appears not to have assigned a bird to this combinator, so we resort
to the clunky name BK
throughout our code.
The BK combinator makes it easy for optim
to rewrite B BK V
as CONS
.
We add the LEFT
combinator, which is equivalent to B BK T
and also arises
frequently in Scott encodings; indeed, the data constructor Left
compiles to
LEFT
. We add the KI combinator to shave off a few more reductions.
-- Modules. infixr 9 ! infixr 9 . infixl 7 * , `div` , `mod` infixl 6 + , - infixr 5 ++ infixl 4 <*> , <$> , <* , *> infix 4 == , /= , <= infixl 3 && , <|> infixl 2 || infixl 1 >> , >>= infixr 0 $ foreign import ccall "putchar_shim" putChar :: Char -> IO () foreign import ccall "getchar_shim" getChar :: IO Char foreign import ccall "eof_shim" isEOFInt :: IO Int foreign import ccall "getargcount" getArgCount :: IO Int foreign import ccall "getargchar" getArgChar :: Int -> Int -> IO Char libc = [r|#include<stdio.h> static int env_argc; int getargcount() { return env_argc; } static char **env_argv; int getargchar(int n, int k) { return env_argv[n][k]; } static int nextCh, isAhead; int eof_shim() { if (!isAhead) { isAhead = 1; nextCh = getchar(); } return nextCh == -1; } void exit(int); void putchar_shim(int c) { putchar(c); } int getchar_shim() { if (!isAhead) nextCh = getchar(); if (nextCh == -1) exit(1); isAhead = 0; return nextCh; } void errchar(int c) { fputc(c, stderr); } void errexit() { fputc('\n', stderr); } |] class Functor f where fmap :: (a -> b) -> f a -> f b class Applicative f where pure :: a -> f a (<*>) :: f (a -> b) -> f a -> f b class Monad m where return :: a -> m a (>>=) :: m a -> (a -> m b) -> m b (<$>) = fmap liftA2 f x y = f <$> x <*> y (>>) f g = f >>= \_ -> g class Eq a where (==) :: a -> a -> Bool instance Eq Int where (==) = intEq instance Eq Char where (==) = charEq ($) f x = f x id x = x const x y = x flip f x y = f y x (&) x f = f x class Ord a where (<=) :: a -> a -> Bool x <= y = case compare x y of LT -> True EQ -> True GT -> False compare :: a -> a -> Ordering compare x y = if x <= y then if y <= x then EQ else LT else GT instance Ord Int where (<=) = intLE instance Ord Char where (<=) = charLE data Ordering = LT | GT | EQ instance Ord a => Ord [a] where xs <= ys = case xs of [] -> True x:xt -> case ys of [] -> False y:yt -> if x <= y then if y <= x then xt <= yt else True else False compare xs ys = case xs of [] -> case ys of [] -> EQ _ -> LT x:xt -> case ys of [] -> GT y:yt -> if x <= y then if y <= x then compare xt yt else LT else GT data Maybe a = Nothing | Just a data Either a b = Left a | Right b fst (x, y) = x snd (x, y) = y uncurry f (x, y) = f x y first f (x, y) = (f x, y) second f (x, y) = (x, f y) not a = if a then False else True x /= y = not $ x == y (.) f g x = f (g x) (||) f g = if f then True else g (&&) f g = if f then g else False instance Eq a => Eq [a] where xs == ys = case xs of [] -> case ys of [] -> True _ -> False x:xt -> case ys of [] -> False y:yt -> x == y && xt == yt take 0 xs = [] take _ [] = [] take n (h:t) = h : take (n - 1) t maybe n j m = case m of Nothing -> n; Just x -> j x instance Functor Maybe where fmap f = maybe Nothing (Just . f) instance Applicative Maybe where pure = Just ; mf <*> mx = maybe Nothing (\f -> maybe Nothing (Just . f) mx) mf instance Monad Maybe where return = Just ; mf >>= mg = maybe Nothing mg mf instance Alternative Maybe where empty = Nothing ; x <|> y = maybe y Just x foldr c n = \case [] -> n; h:t -> c h $ foldr c n t length = foldr (\_ n -> n + 1) 0 mapM f = foldr (\a rest -> liftA2 (:) (f a) rest) (pure []) mapM_ f = foldr ((>>) . f) (pure ()) foldM f z0 xs = foldr (\x k z -> f z x >>= k) pure xs z0 instance Applicative IO where pure = ioPure ; (<*>) f x = ioBind f \g -> ioBind x \y -> ioPure (g y) instance Monad IO where return = ioPure ; (>>=) = ioBind instance Functor IO where fmap f x = ioPure f <*> x class Show a where showsPrec :: Int -> a -> String -> String showsPrec _ x = (show x++) show :: a -> String show x = shows x "" showList :: [a] -> String -> String showList = showList__ shows shows = showsPrec 0 showList__ _ [] s = "[]" ++ s showList__ showx (x:xs) s = '[' : showx x (showl xs) where showl [] = ']' : s showl (y:ys) = ',' : showx y (showl ys) showInt__ n | 0 == n = id | True = showInt__ (n`div`10) . (chr (48+n`mod`10):) instance Show () where show () = "()" instance Show Bool where show True = "True" show False = "False" instance Show a => Show [a] where showsPrec _ = showList instance Show Int where showsPrec _ n | 0 == n = ('0':) | 1 <= n = showInt__ n | 2 * n == 0 = ("-2147483648"++) | True = ('-':) . showInt__ (0 - n) showLitChar__ '\n' = ("\\n"++) showLitChar__ '\\' = ("\\\\"++) showLitChar__ c = (c:) instance Show Char where showsPrec _ '\'' = ("'\\''"++) showsPrec _ c = ('\'':) . showLitChar__ c . ('\'':) showList s = ('"':) . foldr (.) id (map go s) . ('"':) where go '"' = ("\\\""++) go c = showLitChar__ c instance (Show a, Show b) => Show (a, b) where showsPrec _ (a, b) = showParen True $ shows a . (',':) . shows b isEOF = (0 /=) <$> isEOFInt putStr = mapM_ putChar putStrLn = (>> putChar '\n') . putStr print = putStrLn . show getContents = isEOF >>= \b -> if b then pure [] else getChar >>= \c -> (c:) <$> getContents interact f = getContents >>= putStr . f getArgs = getArgCount >>= \n -> mapM (go 0) [1..n-1] where go k n = getArgChar n k >>= \c -> if ord c == 0 then pure [] else (c:) <$> go (k + 1) n error s = unsafePerformIO $ putStr s >> putChar '\n' >> exitSuccess undefined = error "undefined" foldr1 c l@(h:t) = maybe undefined id $ foldr (\x m -> Just $ maybe x (c x) m) Nothing l foldl f a bs = foldr (\b g x -> g (f x b)) (\x -> x) bs a foldl1 f (h:t) = foldl f h t elem k xs = foldr (\x t -> x == k || t) False xs find f xs = foldr (\x t -> if f x then Just x else t) Nothing xs (++) = flip (foldr (:)) concat = foldr (++) [] map = flip (foldr . ((:) .)) [] head (h:_) = h tail (_:t) = t isSpace c = elem (ord c) [32, 9, 10, 11, 12, 13, 160] instance Functor [] where fmap = map instance Applicative [] where pure = (:[]); f <*> x = concatMap (<$> x) f instance Monad [] where return = (:[]); (>>=) = flip concatMap concatMap = (concat .) . map lookup s = foldr (\(k, v) t -> if s == k then Just v else t) Nothing filter f = foldr (\x xs -> if f x then x:xs else xs) [] union xs ys = foldr (\y acc -> (if elem y acc then id else (y:)) acc) xs ys intersect xs ys = filter (\x -> maybe False (\_ -> True) $ find (x ==) ys) xs last (x:xt) = go x xt where go x xt = case xt of [] -> x; y:yt -> go y yt init (x:xt) = case xt of [] -> []; _ -> x : init xt intercalate sep = \case [] -> []; x:xt -> x ++ concatMap (sep ++) xt intersperse sep = \case [] -> []; x:xt -> x : foldr ($) [] (((sep:) .) . (:) <$> xt) all f = foldr (&&) True . map f any f = foldr (||) False . map f and = foldr (&&) True or = foldr (||) False zipWith f xs ys = case xs of [] -> []; x:xt -> case ys of [] -> []; y:yt -> f x y : zipWith f xt yt zip = zipWith (,) data State s a = State (s -> (a, s)) runState (State f) = f instance Functor (State s) where fmap f = \(State h) -> State (first f . h) instance Applicative (State s) where pure a = State (a,) (State f) <*> (State x) = State \s -> case f s of (g, s') -> first g $ x s' instance Monad (State s) where return a = State (a,) (State h) >>= f = State $ uncurry (runState . f) . h evalState m s = fst $ runState m s get = State \s -> (s, s) put n = State \s -> ((), n) either l r e = case e of Left x -> l x; Right x -> r x instance Functor (Either a) where fmap f e = either Left (Right . f) e instance Applicative (Either a) where pure = Right ef <*> ex = case ef of Left s -> Left s Right f -> either Left (Right . f) ex instance Monad (Either a) where return = Right ex >>= f = either Left f ex class Alternative f where empty :: f a (<|>) :: f a -> f a -> f a asum = foldr (<|>) empty (*>) = liftA2 \x y -> y (<*) = liftA2 \x y -> x many p = liftA2 (:) p (many p) <|> pure [] some p = liftA2 (:) p (many p) sepBy1 p sep = liftA2 (:) p (many (sep *> p)) sepBy p sep = sepBy1 p sep <|> pure [] between x y p = x *> (p <* y) -- Map. data Map k a = Tip | Bin Int k a (Map k a) (Map k a) instance Functor (Map k) where fmap f m = case m of Tip -> Tip Bin sz k x l r -> Bin sz k (f x) (fmap f l) (fmap f r) size m = case m of Tip -> 0 ; Bin sz _ _ _ _ -> sz node k x l r = Bin (1 + size l + size r) k x l r singleton k x = Bin 1 k x Tip Tip singleL k x l (Bin _ rk rkx rl rr) = node rk rkx (node k x l rl) rr doubleL k x l (Bin _ rk rkx (Bin _ rlk rlkx rll rlr) rr) = node rlk rlkx (node k x l rll) (node rk rkx rlr rr) singleR k x (Bin _ lk lkx ll lr) r = node lk lkx ll (node k x lr r) doubleR k x (Bin _ lk lkx ll (Bin _ lrk lrkx lrl lrr)) r = node lrk lrkx (node lk lkx ll lrl) (node k x lrr r) balance k x l r = f k x l r where f | size l + size r <= 1 = node | 5 * size l + 3 <= 2 * size r = case r of Tip -> node Bin sz _ _ rl rr -> if 2 * size rl + 1 <= 3 * size rr then singleL else doubleL | 5 * size r + 3 <= 2 * size l = case l of Tip -> node Bin sz _ _ ll lr -> if 2 * size lr + 1 <= 3 * size ll then singleR else doubleR | True = node insert kx x t = case t of Tip -> singleton kx x Bin sz ky y l r -> case compare kx ky of LT -> balance ky y (insert kx x l) r GT -> balance ky y l (insert kx x r) EQ -> Bin sz kx x l r insertWith f kx x t = case t of Tip -> singleton kx x Bin sy ky y l r -> case compare kx ky of LT -> balance ky y (insertWith f kx x l) r GT -> balance ky y l (insertWith f kx x r) EQ -> Bin sy kx (f x y) l r mlookup kx t = case t of Tip -> Nothing Bin _ ky y l r -> case compare kx ky of LT -> mlookup kx l GT -> mlookup kx r EQ -> Just y fromList = foldl (\t (k, x) -> insert k x t) Tip member k t = maybe False (const True) $ mlookup k t t ! k = maybe undefined id $ mlookup k t foldrWithKey f = go where go z t = case t of Tip -> z Bin _ kx x l r -> go (f kx x (go z r)) l toAscList = foldrWithKey (\k x xs -> (k,x):xs) [] keys = map fst . toAscList -- Syntax tree. data Type = TC String | TV String | TAp Type Type arr a b = TAp (TAp (TC "->") a) b data Extra = Basic String | Const Int | ChrCon Char | StrCon String | Link String String Qual data Pat = PatLit Ast | PatVar String (Maybe Pat) | PatCon String [Pat] data Ast = E Extra | V String | A Ast Ast | L String Ast | Pa [([Pat], Ast)] | Proof Pred data Constr = Constr String [Type] data Pred = Pred String Type data Qual = Qual [Pred] Type instance Eq Type where (TC s) == (TC t) = s == t (TV s) == (TV t) = s == t (TAp a b) == (TAp c d) = a == c && b == d _ == _ = False instance Eq Pred where (Pred s a) == (Pred t b) = s == t && a == b data Instance = Instance -- Type, e.g. Int for Eq Int. Type -- Dictionary name, e.g. "{Eq Int}" String -- Context. [Pred] -- Method definitions (Map String Ast) data Tycl = Tycl [String] [Instance] data Neat = Neat (Map String Tycl) -- | Top-level definitions [(String, Ast)] -- | Typed ASTs, ready for compilation, including ADTs and methods, -- e.g. (==), (Eq a => a -> a -> Bool, select-==) [(String, (Qual, Ast))] -- | Data constructor table. (Map String [Constr]) -- AdtTab -- | FFI declarations. [(String, Type)] -- | Exports. [(String, String)] -- | Module imports. [String] patVars = \case PatLit _ -> [] PatVar s m -> s : maybe [] patVars m PatCon _ args -> concat $ patVars <$> args fvPro bound expr = case expr of V s | not (elem s bound) -> [s] A x y -> fvPro bound x `union` fvPro bound y L s t -> fvPro (s:bound) t Pa vsts -> foldr union [] $ map (\(vs, t) -> fvPro (concatMap patVars vs ++ bound) t) vsts _ -> [] overFreePro s f t = case t of E _ -> t V s' -> if s == s' then f t else t A x y -> A (overFreePro s f x) (overFreePro s f y) L s' t' -> if s == s' then t else L s' $ overFreePro s f t' Pa vsts -> Pa $ map (\(vs, t) -> (vs, if any (elem s . patVars) vs then t else overFreePro s f t)) vsts beta s a t = case t of E _ -> t V v -> if s == v then a else t A x y -> A (beta s a x) (beta s a y) L v u -> if s == v then t else L v $ beta s a u showParen b f = if b then ('(':) . f . (')':) else f -- Parser. data ParserState = ParserState [(Char, (Int, Int))] String [Int] (Map String (Int, Assoc)) readme (ParserState x _ _ _) = x landin (ParserState _ x _ _) = x indents (ParserState _ _ x _) = x precs (ParserState _ _ _ x) = x putReadme x (ParserState _ a b c) = ParserState x a b c putLandin x (ParserState a _ b c) = ParserState a x b c modIndents f (ParserState a b x c) = ParserState a b (f x) c data Parser a = Parser (ParserState -> Either String (a, ParserState)) getParser (Parser p) = p instance Functor Parser where fmap f x = pure f <*> x instance Applicative Parser where pure x = Parser \inp -> Right (x, inp) (Parser f) <*> (Parser x) = Parser \inp -> do (fun, t) <- f inp (arg, u) <- x t pure (fun arg, u) instance Monad Parser where return = pure (Parser x) >>= f = Parser \inp -> do (a, t) <- x inp getParser (f a) t instance Alternative Parser where empty = bad "" x <|> y = Parser \inp -> either (const $ getParser y inp) Right $ getParser x inp getPrecs = Parser \st -> Right (precs st, st) putPrecs ps = Parser \(ParserState a b c _) -> Right ((), ParserState a b c ps) notFollowedBy p = do saved <- Parser \pasta -> Right (pasta, pasta) ret <- p *> pure (bad "") <|> pure (pure ()) Parser \_ -> Right ((), saved) ret parse f str = getParser f $ ParserState (rowcol str (1, 1)) [] [] $ singleton ":" (5, RAssoc) where rowcol s rc = case s of [] -> [] h:t -> (h, rc) : rowcol t (advanceRC (ord h) rc) advanceRC n (r, c) | n `elem` [10, 11, 12, 13] = (r + 1, 1) | n == 9 = (r, (c + 8)`mod`8) | True = (r, c + 1) indentOf pasta = case readme pasta of [] -> 1 (_, (_, c)):_ -> c ins c pasta = putLandin (c:landin pasta) pasta angle n pasta = case indents pasta of m:ms | m == n -> ins ';' pasta | n + 1 <= m -> ins '}' $ angle n $ modIndents tail pasta _ -> pasta curly n pasta = case indents pasta of m:ms | m + 1 <= n -> ins '{' $ modIndents (n:) pasta [] | 1 <= n -> ins '{' $ modIndents (n:) pasta _ -> ins '{' . ins '}' $ angle n pasta sat f = Parser \pasta -> case landin pasta of c:t -> if f c then Right (c, putLandin t pasta) else Left "unsat" [] -> case readme pasta of [] -> case indents pasta of [] -> Left "EOF" m:ms | m /= 0 && f '}' -> Right ('}', modIndents tail pasta) _ -> Left "unsat" (h, _):t | f h -> let p' = putReadme t pasta in case h of '}' -> case indents pasta of 0:ms -> Right (h, modIndents tail p') _ -> Left "unsat" '{' -> Right (h, modIndents (0:) p') _ -> Right (h, p') _ -> Left "unsat" char c = sat (c ==) rawSat f = Parser \pasta -> case readme pasta of [] -> Left "EOF" (h, _):t -> if f h then Right (h, putReadme t pasta) else Left "unsat" eof = Parser \pasta -> case pasta of ParserState [] [] _ _ -> Right ((), pasta) _ -> badpos pasta "want eof" comment = rawSat ('-' ==) *> some (rawSat ('-' ==)) *> (rawSat isNewline <|> rawSat (not . isSymbol) *> many (rawSat $ not . isNewline) *> rawSat isNewline) *> pure True spaces = isNewline <$> rawSat isSpace whitespace = do offside <- or <$> many (spaces <|> comment) Parser \pasta -> Right ((), if offside then angle (indentOf pasta) pasta else pasta) hexValue d | d <= '9' = ord d - ord '0' | d <= 'F' = 10 + ord d - ord 'A' | d <= 'f' = 10 + ord d - ord 'a' isNewline c = ord c `elem` [10, 11, 12, 13] isSymbol = (`elem` "!#$%&*+./<=>?@\\^|-~:") isSmall c = c <= 'z' && 'a' <= c || c == '_' small = sat isSmall large = sat \x -> (x <= 'Z') && ('A' <= x) hexit = sat \x -> (x <= '9') && ('0' <= x) || (x <= 'F') && ('A' <= x) || (x <= 'f') && ('a' <= x) digit = sat \x -> (x <= '9') && ('0' <= x) decimal = foldl (\n d -> 10*n + ord d - ord '0') 0 <$> some digit hexadecimal = foldl (\n d -> 16*n + hexValue d) 0 <$> some hexit nameTailChar = small <|> large <|> digit <|> char '\'' nameTailed p = liftA2 (:) p $ many nameTailChar escape = char '\\' *> (sat (`elem` "'\"\\") <|> char 'n' *> pure '\n' <|> char '0' *> pure (chr 0) <|> char 'x' *> (chr <$> hexadecimal)) tokOne delim = escape <|> rawSat (delim /=) charSeq = mapM char tokChar = between (char '\'') (char '\'') (tokOne '\'') quoteStr = between (char '"') (char '"') $ many $ many (charSeq "\\&") *> tokOne '"' quasiquoteStr = charSeq "[r|" *> quasiquoteBody quasiquoteBody = charSeq "|]" *> pure [] <|> (:) <$> rawSat (const True) <*> quasiquoteBody tokStr = quoteStr <|> quasiquoteStr integer = char '0' *> (char 'x' <|> char 'X') *> hexadecimal <|> decimal literal = lexeme . fmap E $ Const <$> integer <|> ChrCon <$> tokChar <|> StrCon <$> tokStr varish = lexeme $ nameTailed small bad s = Parser \pasta -> badpos pasta s badpos pasta s = Left $ loc $ ": " ++ s where loc = case readme pasta of [] -> ("EOF"++) (_, (r, c)):_ -> ("row "++) . shows r . (" col "++) . shows c varId = do s <- varish if elem s ["export", "case", "class", "data", "default", "deriving", "do", "else", "foreign", "if", "import", "in", "infix", "infixl", "infixr", "instance", "let", "module", "newtype", "of", "then", "type", "where", "_"] then bad $ "reserved: " ++ s else pure s varSymish = lexeme $ (:) <$> sat (\c -> isSymbol c && c /= ':') <*> many (sat isSymbol) varSym = lexeme $ do s <- varSymish if elem s ["..", "=", "\\", "|", "<-", "->", "@", "~", "=>"] then bad $ "reserved: " ++ s else pure s conId = lexeme $ nameTailed large conSymish = lexeme $ liftA2 (:) (char ':') $ many $ sat isSymbol conSym = do s <- conSymish if elem s [":", "::"] then bad $ "reserved: " ++ s else pure s special c = lexeme $ sat (c ==) comma = special ',' semicolon = special ';' lParen = special '(' rParen = special ')' lBrace = special '{' rBrace = special '}' lSquare = special '[' rSquare = special ']' backquote = special '`' lexeme f = f <* whitespace lexemePrelude = whitespace *> Parser \pasta -> case getParser (res "module" <|> (:[]) <$> char '{') pasta of Left _ -> Right ((), curly (indentOf pasta) pasta) Right _ -> Right ((), pasta) curlyCheck f = do Parser \pasta -> Right ((), modIndents (0:) pasta) r <- f Parser \pasta -> let pasta' = modIndents tail pasta in case readme pasta of [] -> Right ((), curly 0 pasta') ('{', _):_ -> Right ((), pasta') (_, (_, col)):_ -> Right ((), curly col pasta') pure r conOf (Constr s _) = s specialCase (h:_) = '|':conOf h mkCase t cs = (specialCase cs, ( Qual [] $ arr t $ foldr arr (TV "case") $ map (\(Constr _ ts) -> foldr arr (TV "case") ts) cs , E $ Basic "I")) mkStrs = snd . foldl (\(s, l) u -> ('@':s, s:l)) ("@", []) scottEncode _ ":" _ = E $ Basic "CONS" scottEncode vs s ts = foldr L (foldl (\a b -> A a (V b)) (V s) ts) (ts ++ vs) scottConstr t cs (Constr s ts) = (s, (Qual [] $ foldr arr t ts , scottEncode (map conOf cs) s $ mkStrs ts)) mkAdtDefs t cs = mkCase t cs : map (scottConstr t cs) cs mkFFIHelper n t acc = case t of TC s -> acc TAp (TC "IO") _ -> acc TAp (TAp (TC "->") x) y -> L (show n) $ mkFFIHelper (n + 1) y $ A (V $ show n) acc updateDcs cs dcs = foldr (\(Constr s _) m -> insert s cs m) dcs cs addAdt t cs (Neat tycl fs typed dcs ffis ffes ims) = Neat tycl fs (mkAdtDefs t cs ++ typed) (updateDcs cs dcs) ffis ffes ims emptyTycl = Tycl [] [] addClass classId v (sigs, defs) (Neat tycl fs typed dcs ffis ffes ims) = let vars = take (size sigs) $ show <$> [0..] selectors = zipWith (\var (s, t) -> (s, (Qual [Pred classId v] t, L "@" $ A (V "@") $ foldr L (V var) vars))) vars $ toAscList sigs defaults = map (\(s, t) -> if member s sigs then ("{default}" ++ s, t) else error $ "bad default method: " ++ s) $ toAscList defs Tycl ms is = maybe emptyTycl id $ mlookup classId tycl tycl' = insert classId (Tycl (keys sigs) is) tycl in if null ms then Neat tycl' (defaults ++ fs) (selectors ++ typed) dcs ffis ffes ims else error $ "duplicate class: " ++ classId addInstance classId ps ty ds (Neat tycl fs typed dcs ffis ffes ims) = let Tycl ms is = maybe emptyTycl id $ mlookup classId tycl tycl' = insert classId (Tycl ms $ Instance ty name ps (fromList ds):is) tycl name = '{':classId ++ (' ':shows ty "}") in Neat tycl' fs typed dcs ffis ffes ims addFFI foreignname ourname t (Neat tycl fs typed dcs ffis ffes ims) = let fn = A (E $ Basic "F") $ E $ Const $ length ffis in Neat tycl fs ((ourname, (Qual [] t, mkFFIHelper 0 t fn)) : typed) dcs ((foreignname, t):ffis) ffes ims addDefs ds (Neat tycl fs typed dcs ffis ffes ims) = Neat tycl (ds ++ fs) typed dcs ffis ffes ims addImport im (Neat tycl fs typed dcs ffis exs ims) = Neat tycl fs typed dcs ffis exs (im:ims) addExport e f (Neat tycl fs typed dcs ffis ffes ims) = Neat tycl fs typed dcs ffis ((e, f):ffes) ims parseErrorRule = Parser \pasta -> case indents pasta of m:ms | m /= 0 -> Right ('}', modIndents tail pasta) _ -> badpos pasta "missing }" res w@(h:_) = reservedSeq *> pure w <|> bad ("want \"" ++ w ++ "\"") where reservedSeq = if elem w ["let", "where", "do", "of"] then curlyCheck $ lexeme $ charSeq w *> notFollowedBy nameTailChar else lexeme $ charSeq w *> notFollowedBy (if isSmall h then nameTailChar else sat isSymbol) paren = between lParen rParen braceSep f = between lBrace (rBrace <|> parseErrorRule) $ foldr ($) [] <$> sepBy ((:) <$> f <|> pure id) semicolon maybeFix s x = if elem s $ fvPro [] x then A (V "fix") (L s x) else x nonemptyTails [] = [] nonemptyTails xs@(x:xt) = xs : nonemptyTails xt joinIsFail t = A (L "join#" t) (V "fail#") addLets ls x = foldr triangle x components where vs = fst <$> ls ios = foldr (\(s, dsts) (ins, outs) -> (foldr (\dst -> insertWith union dst [s]) ins dsts, insertWith union s dsts outs)) (Tip, Tip) $ map (\(s, t) -> (s, intersect (fvPro [] t) vs)) ls components = scc (\k -> maybe [] id $ mlookup k $ fst ios) (\k -> maybe [] id $ mlookup k $ snd ios) vs triangle names expr = let tnames = nonemptyTails names suball t = foldr (\(x:xt) t -> overFreePro x (const $ foldl (\acc s -> A acc (V s)) (V x) xt) t) t tnames insLams vs t = foldr L t vs in foldr (\(x:xt) t -> A (L x t) $ maybeFix x $ insLams xt $ suball $ maybe undefined joinIsFail $ lookup x ls) (suball expr) tnames data Assoc = NAssoc | LAssoc | RAssoc instance Eq Assoc where NAssoc == NAssoc = True LAssoc == LAssoc = True RAssoc == RAssoc = True _ == _ = False precOf s precTab = maybe 9 fst $ mlookup s precTab assocOf s precTab = maybe LAssoc snd $ mlookup s precTab opFold precTab f x xs = case xs of [] -> pure x (op, y):xt -> case find (\(op', _) -> assocOf op precTab /= assocOf op' precTab) xt of Nothing -> case assocOf op precTab of NAssoc -> case xt of [] -> pure $ f op x y y:yt -> bad "NAssoc repeat" LAssoc -> pure $ foldl (\a (op, y) -> f op a y) x xs RAssoc -> pure $ foldr (\(op, y) b -> \e -> f op e (b y)) id xs $ x Just y -> bad "Assoc clash" qconop = conSym <|> res ":" <|> between backquote backquote conId qconsym = conSym <|> res ":" op = qconsym <|> varSym <|> between backquote backquote (conId <|> varId) con = conId <|> paren qconsym var = varId <|> paren varSym tycon = do s <- conId pure $ if s == "String" then TAp (TC "[]") (TC "Char") else TC s aType = lParen *> ( rParen *> pure (TC "()") <|> (foldr1 (TAp . TAp (TC ",")) <$> sepBy1 _type comma) <* rParen) <|> tycon <|> TV <$> varId <|> (lSquare *> (rSquare *> pure (TC "[]") <|> TAp (TC "[]") <$> (_type <* rSquare))) bType = foldl1 TAp <$> some aType _type = foldr1 arr <$> sepBy bType (res "->") fixityDecl w a = do res w n <- lexeme integer os <- sepBy op comma precs <- getPrecs putPrecs $ foldr (\o m -> insert o (n, a) m) precs os fixity = fixityDecl "infix" NAssoc <|> fixityDecl "infixl" LAssoc <|> fixityDecl "infixr" RAssoc cDecls = first fromList . second fromList . foldr ($) ([], []) <$> braceSep cDecl cDecl = first . (:) <$> genDecl <|> second . (++) <$> defSemi genDecl = (,) <$> var <*> (res "::" *> _type) classDecl = res "class" *> (addClass <$> conId <*> (TV <$> varId) <*> (res "where" *> cDecls)) simpleClass = Pred <$> conId <*> _type scontext = (:[]) <$> simpleClass <|> paren (sepBy simpleClass comma) instDecl = res "instance" *> ((\ps cl ty defs -> addInstance cl ps ty defs) <$> (scontext <* res "=>" <|> pure []) <*> conId <*> _type <*> (res "where" *> braceDef)) letin = addLets <$> between (res "let") (res "in") braceDef <*> expr ifthenelse = (\a b c -> A (A (A (V "if") a) b) c) <$> (res "if" *> expr) <*> (res "then" *> expr) <*> (res "else" *> expr) listify = foldr (\h t -> A (A (V ":") h) t) (V "[]") alts = joinIsFail . Pa <$> braceSep ((\x y -> ([x], y)) <$> pat <*> guards "->") cas = flip A <$> between (res "case") (res "of") expr <*> alts lamCase = curlyCheck (res "case") *> alts lam = res "\\" *> (lamCase <|> liftA2 onePat (some apat) (res "->" *> expr)) flipPairize y x = A (A (V ",") x) y moreCommas = foldr1 (A . A (V ",")) <$> sepBy1 expr comma thenComma = comma *> ((flipPairize <$> moreCommas) <|> pure (A (V ","))) parenExpr = (&) <$> expr <*> (((\v a -> A (V v) a) <$> op) <|> thenComma <|> pure id) rightSect = ((\v a -> L "@" $ A (A (V v) $ V "@") a) <$> (op <|> (:"") <$> comma)) <*> expr section = lParen *> (parenExpr <* rParen <|> rightSect <* rParen <|> rParen *> pure (V "()")) maybePureUnit = maybe (V "pure" `A` V "()") id stmt = (\p x -> Just . A (V ">>=" `A` x) . onePat [p] . maybePureUnit) <$> pat <*> (res "<-" *> expr) <|> (\x -> Just . maybe x (\y -> (V ">>=" `A` x) `A` (L "_" y))) <$> expr <|> (\ds -> Just . addLets ds . maybePureUnit) <$> (res "let" *> braceDef) doblock = res "do" *> (maybePureUnit . foldr ($) Nothing <$> braceSep stmt) compQual = (\p xs e -> A (A (V "concatMap") $ onePat [p] e) xs) <$> pat <*> (res "<-" *> expr) <|> (\b e -> A (A (A (V "if") b) e) $ V "[]") <$> expr <|> addLets <$> (res "let" *> braceDef) sqExpr = between lSquare rSquare $ ((&) <$> expr <*> ( res ".." *> ( (\hi lo -> (A (A (V "enumFromTo") lo) hi)) <$> expr <|> pure (A (V "enumFrom")) ) <|> res "|" *> ((. A (V "pure")) . foldr (.) id <$> sepBy1 compQual comma) <|> (\t h -> listify (h:t)) <$> many (comma *> expr) ) ) <|> pure (V "[]") atom = ifthenelse <|> doblock <|> letin <|> sqExpr <|> section <|> cas <|> lam <|> (paren comma *> pure (V ",")) <|> V <$> (con <|> var) <|> literal aexp = foldl1 A <$> some atom withPrec precTab n p = p >>= \s -> if n == precOf s precTab then pure s else Parser $ const $ Left "" exprP n = if n <= 9 then getPrecs >>= \precTab -> exprP (succ n) >>= \a -> many ((,) <$> withPrec precTab n op <*> exprP (succ n)) >>= \as -> opFold precTab (\op x y -> A (A (V op) x) y) a as else aexp expr = exprP 0 gcon = conId <|> paren (qconsym <|> (:"") <$> comma) <|> (lSquare *> rSquare *> pure "[]") apat = PatVar <$> var <*> (res "@" *> (Just <$> apat) <|> pure Nothing) <|> flip PatVar Nothing <$> (res "_" *> pure "_") <|> flip PatCon [] <$> gcon <|> PatLit <$> literal <|> foldr (\h t -> PatCon ":" [h, t]) (PatCon "[]" []) <$> between lSquare rSquare (sepBy pat comma) <|> paren (foldr1 pairPat <$> sepBy1 pat comma <|> pure (PatCon "()" [])) where pairPat x y = PatCon "," [x, y] binPat f x y = PatCon f [x, y] patP n = if n <= 9 then getPrecs >>= \precTab -> patP (succ n) >>= \a -> many ((,) <$> withPrec precTab n qconop <*> patP (succ n)) >>= \as -> opFold precTab binPat a as else PatCon <$> gcon <*> many apat <|> apat pat = patP 0 maybeWhere p = (&) <$> p <*> (res "where" *> (addLets <$> braceDef) <|> pure id) guards s = maybeWhere $ res s *> expr <|> foldr ($) (V "join#") <$> some ((\x y -> case x of V "True" -> \_ -> y _ -> A (A (A (V "if") x) y) ) <$> (res "|" *> expr) <*> (res s *> expr)) onePat vs x = joinIsFail $ Pa [(vs, x)] defOnePat vs x = Pa [(vs, x)] opDef x f y rhs = [(f, defOnePat [x, y] rhs)] leftyPat p expr = case pvars of [] -> [] (h:t) -> let gen = '@':h in (gen, expr):map (\v -> (v, A (Pa [([p], V v)]) $ V gen)) pvars where pvars = filter (/= "_") $ patVars p def = liftA2 (\l r -> [(l, r)]) var (liftA2 defOnePat (many apat) $ guards "=") <|> (pat >>= \x -> opDef x <$> varSym <*> pat <*> guards "=" <|> leftyPat x <$> guards "=") coalesce = \case [] -> [] h@(s, x):t -> case t of [] -> [h] (s', x'):t' -> let f (Pa vsts) (Pa vsts') = Pa $ vsts ++ vsts' f _ _ = error "bad multidef" in if s == s' then coalesce $ (s, f x x'):t' else h:coalesce t defSemi = coalesce . concat <$> sepBy1 def (some semicolon) braceDef = concat <$> braceSep defSemi simpleType c vs = foldl TAp (TC c) (map TV vs) conop = conSym <|> between backquote backquote conId constr = (\x c y -> Constr c [x, y]) <$> aType <*> conop <*> aType <|> Constr <$> conId <*> many aType adt = addAdt <$> between (res "data") (res "=") (simpleType <$> conId <*> many varId) <*> sepBy constr (res "|") impDecl = addImport <$> (res "import" *> conId) topdecls = braceSep $ adt <|> classDecl <|> instDecl <|> res "foreign" *> ( res "import" *> var *> (addFFI <$> lexeme tokStr <*> var <*> (res "::" *> _type)) <|> res "export" *> var *> (addExport <$> lexeme tokStr <*> var) ) <|> addDefs <$> defSemi <|> fixity *> pure id <|> impDecl haskell = between lexemePrelude eof $ some $ (,) <$> (res "module" *> conId <* res "where" <|> pure "Main") <*> topdecls parseProgram s = fst <$> parse haskell s -- Primitives. primAdts = [ (TC "()", [Constr "()" []]) , (TC "Bool", [Constr "True" [], Constr "False" []]) , (TAp (TC "[]") (TV "a"), [Constr "[]" [], Constr ":" [TV "a", TAp (TC "[]") (TV "a")]]) , (TAp (TAp (TC ",") (TV "a")) (TV "b"), [Constr "," [TV "a", TV "b"]]) ] prims = let ro = E . Basic dyad s = TC s `arr` (TC s `arr` TC s) bin s = A (ro "Q") (ro s) in map (second (first $ Qual [])) $ [ ("intEq", (arr (TC "Int") (arr (TC "Int") (TC "Bool")), bin "EQ")) , ("intLE", (arr (TC "Int") (arr (TC "Int") (TC "Bool")), bin "LE")) , ("charEq", (arr (TC "Char") (arr (TC "Char") (TC "Bool")), bin "EQ")) , ("charLE", (arr (TC "Char") (arr (TC "Char") (TC "Bool")), bin "LE")) , ("fix", (arr (arr (TV "a") (TV "a")) (TV "a"), ro "Y")) , ("if", (arr (TC "Bool") $ arr (TV "a") $ arr (TV "a") (TV "a"), ro "I")) , ("chr", (arr (TC "Int") (TC "Char"), ro "I")) , ("ord", (arr (TC "Char") (TC "Int"), ro "I")) , ("ioBind", (arr (TAp (TC "IO") (TV "a")) (arr (arr (TV "a") (TAp (TC "IO") (TV "b"))) (TAp (TC "IO") (TV "b"))), ro "C")) , ("ioPure", (arr (TV "a") (TAp (TC "IO") (TV "a")), ro "V")) , ("primitiveError", (arr (TAp (TC "[]") (TC "Char")) (TV "a"), ro "ERR")) , ("newIORef", (arr (TV "a") (TAp (TC "IO") (TAp (TC "IORef") (TV "a"))), ro "NEWREF")) , ("readIORef", (arr (TAp (TC "IORef") (TV "a")) (TAp (TC "IO") (TV "a")), A (ro "T") (ro "READREF"))) , ("writeIORef", (arr (TAp (TC "IORef") (TV "a")) (arr (TV "a") (TAp (TC "IO") (TC "()"))), A (A (ro "R") (ro "WRITEREF")) (ro "B"))) , ("exitSuccess", (TAp (TC "IO") (TV "a"), ro "END")) , ("unsafePerformIO", (arr (TAp (TC "IO") (TV "a")) (TV "a"), A (A (ro "C") (A (ro "T") (ro "END"))) (ro "K"))) , ("join#", (TV "a", A (V "unsafePerformIO") (V "exitSuccess"))) , ("fail#", (TV "a", A (V "unsafePerformIO") (V "exitSuccess"))) ] ++ map (\(s, v) -> (s, (dyad "Int", bin v))) [ ("intAdd", "ADD") , ("intSub", "SUB") , ("intMul", "MUL") , ("intDiv", "DIV") , ("intMod", "MOD") , ("intQuot", "DIV") , ("intRem", "MOD") ] -- Conversion to De Bruijn indices. data LC = Ze | Su LC | Pass IntTree | La LC | App LC LC debruijn n e = case e of E x -> Pass $ Lf x V v -> maybe (Pass $ LfVar v) id $ foldr (\h found -> if h == v then Just Ze else Su <$> found) Nothing n A x y -> App (debruijn n x) (debruijn n y) L s t -> La (debruijn (s:n) t) -- Kiselyov bracket abstraction. data IntTree = Lf Extra | LfVar String | Nd IntTree IntTree data Sem = Defer | Closed IntTree | Need Sem | Weak Sem lf = Lf . Basic x ## y = case x of Defer -> case y of Defer -> Need $ Closed (Nd (Nd (lf "S") (lf "I")) (lf "I")) Closed d -> Need $ Closed (Nd (lf "T") d) Need e -> Need $ Closed (Nd (lf "S") (lf "I")) ## e Weak e -> Need $ Closed (lf "T") ## e Closed d -> case y of Defer -> Need $ Closed d Closed dd -> Closed $ Nd d dd Need e -> Need $ Closed (Nd (lf "B") d) ## e Weak e -> Weak $ Closed d ## e Need e -> case y of Defer -> Need $ Closed (lf "S") ## e ## Closed (lf "I") Closed d -> Need $ Closed (Nd (lf "R") d) ## e Need ee -> Need $ Closed (lf "S") ## e ## ee Weak ee -> Need $ Closed (lf "C") ## e ## ee Weak e -> case y of Defer -> Need e Closed d -> Weak $ e ## Closed d Need ee -> Need $ Closed (lf "B") ## e ## ee Weak ee -> Weak $ e ## ee babs t = case t of Ze -> Defer Su x -> Weak $ babs x Pass x -> Closed x La t -> case babs t of Defer -> Closed $ lf "I" Closed d -> Closed $ Nd (lf "K") d Need e -> e Weak e -> Closed (lf "K") ## e App x y -> babs x ## babs y nolam x = (\(Closed d) -> d) $ babs $ debruijn [] x optim t = case t of Nd x y -> go (optim x) (optim y) _ -> t where go (Lf (Basic "I")) q = q go p q@(Lf (Basic c)) = case c of "K" -> case p of Lf (Basic "B") -> lf "BK" _ -> Nd p q "I" -> case p of Lf (Basic r) -> case r of "C" -> lf "T" "B" -> lf "I" "K" -> lf "KI" _ -> Nd p q Nd p1 p2 -> case p1 of Lf (Basic "B") -> p2 Lf (Basic "R") -> Nd (lf "T") p2 _ -> Nd (Nd p1 p2) q _ -> Nd p q "T" -> case p of Nd (Lf (Basic "B")) (Lf (Basic r)) -> case r of "C" -> lf "V" "BK" -> lf "LEFT" _ -> Nd p q _ -> Nd p q "V" -> case p of Nd (Lf (Basic "B")) (Lf (Basic "BK")) -> lf "CONS" _ -> Nd p q _ -> Nd p q go p q = Nd p q app01 s x y = maybe (A (L s x) y) snd $ go x where go expr = case expr of E _ -> Just (False, expr) V v -> Just $ if s == v then (True, y) else (False, expr) A l r -> do (a, l') <- go l (b, r') <- go r if a && b then Nothing else pure (a || b, A l' r') L v t -> if v == s then Just (False, expr) else second (L v) <$> go t optiApp t = case t of A x y -> let x' = optiApp x y' = optiApp y in case x' of L s v -> app01 s v y' _ -> A x' y' L s x -> L s (optiApp x) _ -> t -- Pattern compiler. rewritePats dcs = \case [] -> pure $ V "join#" vsxs@((as0, _):_) -> case as0 of [] -> pure $ foldr1 (A . L "join#") $ snd <$> vsxs _ -> do let k = length as0 n <- get put $ n + k let vs@(vh:vt) = take k $ (`shows` "#") <$> [n..] cs <- flip mapM vsxs \(a:at, x) -> (a,) <$> foldM (\b (p, v) -> rewriteCase dcs v Tip [(p, b)]) x (zip at vt) flip (foldr L) vs <$> rewriteCase dcs vh Tip cs patEq lit b x y = A (L "join#" $ A (A (A (V "if") (A (A (V "==") lit) b)) x) $ V "join#") y rewriteCase dcs caseVar tab = \case [] -> flush $ V "join#" ((v, x):rest) -> go v x rest where rec = rewriteCase dcs caseVar go v x rest = case v of PatLit lit -> patEq lit (V caseVar) x <$> rec Tip rest >>= flush PatVar s m -> let x' = beta s (V caseVar) x in case m of Nothing -> A (L "join#" x') <$> rec Tip rest >>= flush Just v' -> go v' x' rest PatCon con args -> rec (insertWith (flip (.)) con ((args, x):) tab) rest flush onFail = case toAscList tab of [] -> pure onFail -- TODO: Check rest of `tab` lies in cs. (firstC, _):_ -> do let cs = maybe undefined id $ dcs firstC jumpTable <- mapM (\(Constr s ts) -> case mlookup s tab of Nothing -> pure $ foldr L (V "join#") $ const "_" <$> ts Just f -> rewritePats dcs $ f [] ) cs pure $ A (L "join#" $ foldl A (A (V $ specialCase cs) $ V caseVar) jumpTable) onFail secondM f (a, b) = (a,) <$> f b patternCompile dcs t = optiApp $ evalState (go t) 0 where go t = case t of E _ -> pure t V _ -> pure t A x y -> liftA2 A (go x) (go y) L s x -> L s <$> go x Pa vsxs -> mapM (secondM go) vsxs >>= rewritePats dcs -- Unification and matching. apply sub t = case t of TC v -> t TV v -> maybe t id $ lookup v sub TAp a b -> TAp (apply sub a) (apply sub b) (@@) s1 s2 = map (second (apply s1)) s2 ++ s1 occurs s t = case t of TC v -> False TV v -> s == v TAp a b -> occurs s a || occurs s b varBind s t = case t of TC v -> Right [(s, t)] TV v -> Right $ if v == s then [] else [(s, t)] TAp a b -> if occurs s t then Left "occurs check" else Right [(s, t)] ufail t u = Left $ ("unify fail: "++) . shows t . (" vs "++) . shows u $ "" mgu t u = case t of TC a -> case u of TC b -> if a == b then Right [] else ufail t u TV b -> varBind b t TAp a b -> ufail t u TV a -> varBind a u TAp a b -> case u of TC b -> ufail t u TV b -> varBind b t TAp c d -> mgu a c >>= unify b d unify a b s = (@@ s) <$> mgu (apply s a) (apply s b) merge s1 s2 = if all (\v -> apply s1 (TV v) == apply s2 (TV v)) $ map fst s1 `intersect` map fst s2 then Just $ s1 ++ s2 else Nothing match h t = case h of TC a -> case t of TC b | a == b -> Just [] _ -> Nothing TV a -> Just [(a, t)] TAp a b -> case t of TAp c d -> case match a c of Nothing -> Nothing Just ac -> case match b d of Nothing -> Nothing Just bd -> merge ac bd _ -> Nothing -- Type inference. instantiate' t n tab = case t of TC s -> ((t, n), tab) TV s -> case lookup s tab of Nothing -> let va = TV $ show n in ((va, n + 1), (s, va):tab) Just v -> ((v, n), tab) TAp x y -> let ((t1, n1), tab1) = instantiate' x n tab ((t2, n2), tab2) = instantiate' y n1 tab1 in ((TAp t1 t2, n2), tab2) instantiatePred (Pred s t) ((out, n), tab) = first (first ((:out) . Pred s)) (instantiate' t n tab) instantiate (Qual ps t) n = first (Qual ps1) $ fst $ instantiate' t n1 tab where ((ps1, n1), tab) = foldr instantiatePred (([], n), []) ps proofApply sub a = case a of Proof (Pred cl ty) -> Proof (Pred cl $ apply sub ty) A x y -> A (proofApply sub x) (proofApply sub y) L s t -> L s $ proofApply sub t _ -> a typeAstSub sub (t, a) = (apply sub t, proofApply sub a) infer typed loc ast csn@(cs, n) = case ast of E x -> Right $ case x of Const _ -> ((TC "Int", ast), csn) ChrCon _ -> ((TC "Char", ast), csn) StrCon _ -> ((TAp (TC "[]") (TC "Char"), ast), csn) Link im s q -> insta q V s -> maybe (Left $ "undefined: " ++ s) Right $ (\t -> ((t, ast), csn)) <$> lookup s loc <|> insta . fst <$> mlookup s typed A x y -> infer typed loc x (cs, n + 1) >>= \((tx, ax), csn1) -> infer typed loc y csn1 >>= \((ty, ay), (cs2, n2)) -> unify tx (arr ty va) cs2 >>= \cs -> Right ((va, A ax ay), (cs, n2)) L s x -> first (\(t, a) -> (arr va t, L s a)) <$> infer typed ((s, va):loc) x (cs, n + 1) where va = TV $ show n insta ty = ((ty1, foldl A ast (map Proof preds)), (cs, n1)) where (Qual preds ty1, n1) = instantiate ty n findInstance tycl qn@(q, n) p@(Pred cl ty) insts = case insts of [] -> let v = '*':show n in Right (((p, v):q, n + 1), V v) (modName, Instance h name ps _):rest -> case match h ty of Nothing -> findInstance tycl qn p rest Just subs -> foldM (\(qn1, t) (Pred cl1 ty1) -> second (A t) <$> findProof tycl (Pred cl1 $ apply subs ty1) qn1) (qn, if modName == "" then V name else E $ Link modName name undefined) ps findProof tycl pred@(Pred classId t) psn@(ps, n) = case lookup pred ps of Nothing -> findInstance tycl psn pred $ tycl classId Just s -> Right (psn, V s) prove tycl psn a = case a of Proof pred -> findProof tycl pred psn A x y -> prove tycl psn x >>= \(psn1, x1) -> second (A x1) <$> prove tycl psn1 y L s t -> second (L s) <$> prove tycl psn t _ -> Right (psn, a) data Dep a = Dep ([String] -> Either String ([String], a)) instance Functor Dep where fmap f = \(Dep mf) -> Dep \g -> do (g', x) <- mf g pure (g', f x) instance Applicative Dep where pure x = Dep \g -> Right (g, x) (Dep mf) <*> (Dep mx) = Dep \g -> do (g', f) <- mf g (g'', x) <- mx g' pure (g'', f x) addDep s = Dep \deps -> Right (if s `elem` deps then deps else s : deps, ()) badDep s = Dep $ const $ Left s runDep (Dep f) = f [] astLink typed locals imps mods ast = runDep $ go [] ast where go bound ast = case ast of V s | elem s bound -> pure ast | member s locals -> case findImportSym imps mods s of [] -> (if member s typed then pure () else addDep s) *> pure ast _ -> badDep $ "ambiguous: " ++ s | True -> case findImportSym imps mods s of [] -> badDep $ "missing: " ++ s [(im, t)] -> pure $ E $ Link im s t _ -> badDep $ "ambiguous: " ++ s A x y -> A <$> go bound x <*> go bound y L s t -> L s <$> go (s:bound) t _ -> pure ast depthFirstSearch = (foldl .) \relation st@(visited, sequence) vertex -> if vertex `elem` visited then st else second (vertex:) $ depthFirstSearch relation (vertex:visited, sequence) (relation vertex) spanningSearch = (foldl .) \relation st@(visited, setSequence) vertex -> if vertex `elem` visited then st else second ((:setSequence) . (vertex:)) $ depthFirstSearch relation (vertex:visited, []) (relation vertex) scc ins outs = spanning . depthFirst where depthFirst = snd . depthFirstSearch outs ([], []) spanning = snd . spanningSearch ins ([], []) forFree cond f bound t = case t of E _ -> t V s -> if (not $ s `elem` bound) && cond s then f t else t A x y -> A (rec bound x) (rec bound y) L s t' -> L s $ rec (s:bound) t' where rec = forFree cond f inferno tycl typed defmap syms = let loc = zip syms $ TV . (' ':) <$> syms principal (acc, (subs, n)) s = do expr <- maybe (Left $ "missing: " ++ s) Right (mlookup s defmap) ((t, a), (ms, n1)) <- infer typed loc expr (subs, n) cs <- unify (TV (' ':s)) t ms Right ((s, (t, a)):acc, (cs, n1)) gatherPreds (acc, psn) (s, (t, a)) = do (psn, a) <- prove tycl psn a pure ((s, (t, a)):acc, psn) in do (stas, (soln, _)) <- foldM principal ([], ([], 0)) syms stas <- pure $ second (typeAstSub soln) <$> stas (stas, (ps, _)) <- foldM gatherPreds ([], ([], 0)) $ second (typeAstSub soln) <$> stas let preds = fst <$> ps dicts = snd <$> ps applyDicts (s, (t, a)) = (s, (Qual preds t, foldr L (forFree (`elem` syms) (\t -> foldl A t $ V <$> dicts) [] a) dicts)) pure $ map applyDicts stas findImportSym imps mods s = concat [maybe [] (\(t, _) -> [(im, t)]) $ mlookup s qas | im <- imps, let qas = fst $ mods ! im] inferDefs tycl defs typed = do let insertUnique m (s, (_, t)) = case mlookup s m of Nothing -> case mlookup s typed of Nothing -> Right $ insert s t m _ -> Left $ "reserved: " ++ s _ -> Left $ "duplicate: " ++ s addEdges (sym, (deps, _)) (ins, outs) = (foldr (\dep es -> insertWith union dep [sym] es) ins deps, insertWith union sym deps outs) graph = foldr addEdges (Tip, Tip) defs defmap <- foldM insertUnique Tip defs let ins k = maybe [] id $ mlookup k $ fst graph outs k = maybe [] id $ mlookup k $ snd graph typeTab = fst <$> typed inferComponent typed syms = foldr (uncurry insert) typed <$> inferno tycl typed defmap syms foldM inferComponent typed $ scc ins outs $ keys defmap dictVars ps n = (zip ps $ map (('*':) . show) [n..], n + length ps) inferTypeclasses tycl typeOfMethod typed dcs linker ienv = foldM perClass typed $ toAscList ienv where perClass typed (classId, Tycl sigs insts) = foldM perInstance typed insts where perInstance typed (Instance ty name ps idefs) = do let dvs = map snd $ fst $ dictVars ps 0 perMethod s = do let rawExpr = maybe (V $ "{default}" ++ s) id $ mlookup s idefs expr <- snd <$> linker (patternCompile dcs rawExpr) (ta, (sub, n)) <- either (Left . (name++) . (" "++) . (s++) . (": "++)) Right $ infer typed [] expr ([], 0) let (tx, ax) = typeAstSub sub ta -- e.g. qc = Eq a => a -> a -> Bool -- We instantiate: Eq a1 => a1 -> a1 -> Bool. qc = typeOfMethod s (Qual [Pred _ headT] tc, n1) = instantiate qc n -- Mix the predicates `ps` with the type of `headT`, applying a -- substitution such as (a1, [a]) so the variable names match. -- e.g. Eq a => [a] -> [a] -> Bool Just subc = match headT ty (Qual ps2 t2, n2) = instantiate (Qual ps $ apply subc tc) n1 case match tx t2 of Nothing -> Left "class/instance type conflict" Just subx -> do ((ps3, _), tr) <- prove tycl (dictVars ps2 0) (proofApply subx ax) if length ps2 /= length ps3 then Left $ ("want context: "++) . (foldr (.) id $ shows . fst <$> ps3) $ name else pure tr ms <- mapM perMethod sigs pure $ insert name (Qual [] $ TC "DICTIONARY", flip (foldr L) dvs $ L "@" $ foldl A (V "@") ms) typed neatNew = Neat Tip [] [] Tip [] [] [] neatPrim = foldr (uncurry addAdt) (Neat Tip [] prims Tip [] [] []) primAdts typedAsts (Neat _ _ tas _ _ _ _) = tas typeclasses (Neat tcs _ _ _ _ _ _) = tcs dataCons (Neat _ _ _ dcs _ _ _) = dcs soloPrim = singleton "#" (fromList $ typedAsts neatPrim, ([], [])) tabulateModules mods = foldM ins (singleton "#" neatPrim) $ go <$> mods where go (name, prog) = (name, foldr ($) neatNew prog) ins tab (k, v) = case mlookup k tab of Nothing -> Right $ insert k v tab Just _ -> Left $ "duplicate module: " ++ k null xs = case xs of [] -> True _ -> False inferModule tab acc name = case mlookup name acc of Nothing -> do let Neat rawIenv defs typedList adtTab ffis ffes rawImps = tab ! name typed = fromList typedList fillSigs (cl, Tycl sigs is) = (cl,) $ case sigs of [] -> Tycl (findSigs cl) is _ -> Tycl sigs is findSigs cl = maybe (error $ "no sigs: " ++ cl) id $ find (not . null) [maybe [] (\(Tycl sigs _) -> sigs) $ mlookup cl $ typeclasses (tab ! im) | im <- imps] ienv = fromList $ fillSigs <$> toAscList rawIenv imps = "#":rawImps locals = fromList $ map (, ()) $ (fst <$> typedList) ++ (fst <$> defs) insts im (Tycl _ is) = (im,) <$> is classes im = if im == "" then ienv else typeclasses $ tab ! im tycl classId = concat [maybe [] (insts im) $ mlookup classId $ classes im | im <- "":imps] dcs s = foldr (<|>) (mlookup s adtTab) $ map (\im -> mlookup s $ dataCons $ tab ! im) imps typeOfMethod s = maybe undefined id $ foldr (<|>) (fst <$> mlookup s typed) [fmap fst $ lookup s $ typedAsts $ tab ! im | im <- imps] acc' <- foldM (inferModule tab) acc imps let linker = astLink typed locals imps acc' depdefs <- mapM (\(s, t) -> (s,) <$> linker (patternCompile dcs t)) defs typed <- inferDefs tycl depdefs typed typed <- inferTypeclasses tycl typeOfMethod typed dcs linker ienv Right $ insert name (typed, (ffis, ffes)) acc' Just _ -> Right acc untangle s = do tab <- parseProgram s >>= tabulateModules foldM (inferModule tab) soloPrim $ keys tab optiComb' (subs, combs) (s, lamb) = let gosub t = case t of LfVar v -> maybe t id $ lookup v subs Nd a b -> Nd (gosub a) (gosub b) _ -> t c = optim $ gosub $ nolam $ optiApp lamb combs' = combs . ((s, c):) in case c of Lf (Basic _) -> ((s, c):subs, combs') LfVar v -> if v == s then (subs, combs . ((s, Nd (lf "Y") (lf "I")):)) else ((s, gosub c):subs, combs') _ -> (subs, combs') optiComb lambs = ($[]) . snd $ foldl optiComb' ([], id) lambs instance Show Type where showsPrec _ = \case TC s -> (s++) TV s -> (s++) TAp (TAp (TC "->") a) b -> showParen True $ shows a . (" -> "++) . shows b TAp a b -> showParen True $ shows a . (' ':) . shows b instance Show Pred where showsPrec _ (Pred s t) = (s++) . (' ':) . shows t . (" => "++) instance Show Qual where showsPrec _ (Qual ps t) = foldr (.) id (map shows ps) . shows t instance Show Extra where showsPrec _ = \case Basic s -> (s++) Const i -> shows i ChrCon c -> shows c StrCon s -> shows s Link im s _ -> (im++) . ('.':) . (s++) instance Show Pat where showsPrec _ = \case PatLit e -> shows e PatVar s mp -> (s++) . maybe id ((('@':) .) . shows) mp PatCon s ps -> (s++) . foldr (.) id (((' ':) .) . shows <$> ps) showVar s@(h:_) = showParen (elem h ":!#$%&*+./<=>?@\\^|-~") (s++) instance Show Ast where showsPrec prec = \case E e -> shows e V s -> showVar s A x y -> showParen (1 <= prec) $ shows x . (' ':) . showsPrec 1 y L s t -> showParen True $ ('\\':) . (s++) . (" -> "++) . shows t Pa vsts -> ('\\':) . showParen True (foldr (.) id $ intersperse (';':) $ map (\(vs, t) -> foldr (.) id (intersperse (' ':) $ map (showParen True . shows) vs) . (" -> "++) . shows t) vsts) Proof p -> ("{Proof "++) . shows p . ("}"++) instance Show IntTree where showsPrec prec = \case LfVar s -> showVar s Lf extra -> shows extra Nd x y -> showParen (1 <= prec) $ showsPrec 0 x . (' ':) . showsPrec 1 y disasm (s, t) = (s++) . (" = "++) . shows t . (";\n"++) dumpWith dumper s = case untangle s of Left err -> err Right tab -> foldr ($) [] $ map (\(name, mod) -> ("module "++) . (name++) . ('\n':) . (foldr (.) id $ dumper mod)) $ toAscList tab dumpCombs (typed, _) = map disasm $ optiComb $ lambsList typed dumpLambs (typed, _) = map (\(s, (_, t)) -> (s++) . (" = "++) . shows t . ('\n':)) $ toAscList typed dumpTypes (typed, _) = map (\(s, (q, _)) -> (s++) . (" :: "++) . shows q . ('\n':)) $ toAscList typed -- Code generation. appCell (hp, bs) x y = (Right hp, (hp + 2, bs . (x:) . (y:))) enc tab mem = \case Lf n -> case n of Basic c -> (Right $ comEnum c, mem) Const c -> appCell mem (Right $ comEnum "NUM") $ Right c ChrCon c -> appCell mem (Right $ comEnum "NUM") $ Right $ ord c StrCon s -> enc tab mem $ foldr (\h t -> Nd (Nd (lf "CONS") (Lf $ ChrCon h)) t) (lf "K") s Link m s _ -> (Left (m, s), mem) LfVar s -> maybe (error $ "resolve " ++ s) (, mem) $ mlookup s tab Nd x y -> let (xAddr, mem') = enc tab mem x (yAddr, mem'') = enc tab mem' y in appCell mem'' xAddr yAddr asm hp0 combs = tabmem where tabmem = foldl (\(as, m) (s, t) -> let (p, m') = enc (fst tabmem) m t in (insert s p as, m')) (Tip, (hp0, id)) combs argList t = case t of TC s -> [TC s] TV s -> [TV s] TAp (TC "IO") (TC u) -> [TC u] TAp (TAp (TC "->") x) y -> x : argList y cTypeName (TC "()") = "void" cTypeName (TC "Int") = "int" cTypeName (TC "Char") = "int" ffiDeclare (name, t) = let tys = argList t in concat [cTypeName $ last tys, " ", name, "(", intercalate "," $ cTypeName <$> init tys, ");\n"] ffiArgs n t = case t of TC s -> ("", ((True, s), n)) TAp (TC "IO") (TC u) -> ("", ((False, u), n)) TAp (TAp (TC "->") x) y -> first (((if 3 <= n then ", " else "") ++ "num(" ++ shows n ")") ++) $ ffiArgs (n + 1) y ffiDefine n ffis = case ffis of [] -> id (name, t):xt -> let (args, ((isPure, ret), count)) = ffiArgs 2 t lazyn = ("lazy2(" ++) . shows (if isPure then count - 1 else count + 1) . (", " ++) cont tgt = if isPure then ("_I, "++) . tgt else ("app(arg("++) . shows (count + 1) . ("), "++) . tgt . ("), arg("++) . shows count . (")"++) longDistanceCall = (name++) . ("("++) . (args++) . ("); "++) . lazyn in ("case " ++) . shows n . (": " ++) . if ret == "()" then longDistanceCall . cont ("_K"++) . ("); break;"++) . ffiDefine (n - 1) xt else ("{u r = "++) . longDistanceCall . cont ("app(_NUM, r)" ++) . ("); break;}\n"++) . ffiDefine (n - 1) xt genMain n = "int main(int argc,char**argv){env_argc=argc;env_argv=argv;rts_reduce(" ++ shows n ");return 0;}\n" resolve bigmap (m, s) = either (resolve bigmap) id $ (bigmap ! m) ! s mayResolve bigmap (m, s) = mlookup m bigmap >>= fmap (either (resolve bigmap) id) . mlookup s lambsList typed = toAscList $ snd <$> typed codegenLocal (name, (typed, _)) (bigmap, (hp, f)) = (insert name localmap bigmap, (hp', f . memF)) where (localmap, (hp', memF)) = asm hp $ optiComb $ lambsList typed codegen mods = (bigmap, mem) where (bigmap, (_, memF)) = foldr codegenLocal (Tip, (128, id)) $ toAscList mods mem = either (resolve bigmap) id <$> memF [] getIOType (Qual [] (TAp (TC "IO") t)) = Right t getIOType q = Left $ "main : " ++ shows q "" ffcat (name, (_, (ffis, ffes))) (xs, ys) = (ffis ++ xs, ((name,) <$> ffes) ++ ys) compile s = either id id do mods <- untangle s let (bigmap, mem) = codegen mods (ffis, ffes) = foldr ffcat ([], []) $ toAscList mods mustType modName s = case mlookup s (fst $ mods ! modName) of Just (Qual [] t, _) -> t _ -> error "TODO: report bad exports" mayMain = do mainAddr <- mayResolve bigmap ("Main", "main") (mainType, _) <- mlookup "main" (fst $ mods ! "Main") pure (mainAddr, mainType) mainStr <- case mayMain of Nothing -> pure "" Just (a, q) -> do getIOType q pure $ genMain a pure $ ("#include<stdio.h>\n"++) . ("typedef unsigned u;\n"++) . ("enum{_UNDEFINED=0,"++) . foldr (.) id (map (\(s, _) -> ('_':) . (s++) . (',':)) comdefs) . ("};\n"++) . ("static const u prog[]={" ++) . foldr (.) id (map (\n -> shows n . (',':)) mem) . ("};\nstatic const u prog_size="++) . shows (length mem) . (";\n"++) . ("static u root[]={" ++) . foldr (\(modName, (_, ourName)) f -> shows (resolve bigmap (modName, ourName)) . (", " ++) . f) id ffes . ("0};\n" ++) . (preamble++) . (libc++) . (concatMap ffiDeclare ffis ++) . ("static void foreign(u n) {\n switch(n) {\n" ++) . ffiDefine (length ffis - 1) ffis . ("\n }\n}\n" ++) . runFun . foldr (.) id (zipWith (\(modName, (expName, ourName)) n -> ("EXPORT(f"++) . shows n . (", \""++) . (expName++) . ("\")\n"++) . genExport (arrCount $ mustType modName ourName) n) ffes [0..]) $ mainStr genExport m n = ("void f"++) . shows n . ("("++) . foldr (.) id (intersperse (',':) $ map (("u "++) .) xs) . ("){rts_reduce("++) . foldl (\s x -> ("app("++) . s . (",app(_NUM,"++) . x . ("))"++)) rt xs . (");}\n"++) where xs = map ((('x':) .) . shows) [0..m - 1] rt = ("root["++) . shows n . ("]"++) arrCount = \case TAp (TAp (TC "->") _) y -> 1 + arrCount y _ -> 0 -- Main VM loop. comdefsrc = [r| F x = "foreign(num(1));" Y x = x "sp[1]" Q x y z = z(y x) S x y z = x z(y z) B x y z = x (y z) BK x y z = x y C x y z = x z y R x y z = y z x V x y z = z x y T x y = y x K x y = "_I" x KI x y = "_I" y I x = "sp[1] = arg(1); sp++;" LEFT x y z = y x CONS x y z w = w x y NUM x y = y "sp[1]" ADD x y = "_NUM" "num(1) + num(2)" SUB x y = "_NUM" "num(1) - num(2)" MUL x y = "_NUM" "num(1) * num(2)" DIV x y = "_NUM" "num(1) / num(2)" MOD x y = "_NUM" "num(1) % num(2)" EQ x y = "num(1) == num(2) ? lazy2(2, _I, _K) : lazy2(2, _K, _I);" LE x y = "num(1) <= num(2) ? lazy2(2, _I, _K) : lazy2(2, _K, _I);" REF x y = y "sp[1]" NEWREF x y z = z ("_REF" x) y READREF x y z = z "num(1)" y WRITEREF x y z w = w "((mem[arg(2) + 1] = arg(1)), _K)" z END = "return;" ERR = "sp[1]=app(app(arg(1),_ERREND),_ERR2);sp++;" ERR2 = "lazy3(2, arg(1), _ERROUT, arg(2));" ERROUT = "errchar(num(1)); lazy2(2, _ERR, arg(2));" ERREND = "errexit(); return;" |] comb = (,) <$> conId <*> ((,) <$> many varId <*> (res "=" *> combExpr)) combExpr = foldl1 A <$> some (V <$> varId <|> E . StrCon <$> lexeme tokStr <|> paren combExpr) comdefs = case parse (lexemePrelude *> braceSep comb <* eof) comdefsrc of Left e -> error e Right (cs, _) -> cs comEnum s = maybe (error s) id $ lookup s $ zip (fst <$> comdefs) [1..] comName i = maybe undefined id $ lookup i $ zip [1..] (fst <$> comdefs) preamble = [r|#define EXPORT(f, sym, n) void f() asm(sym) __attribute__((export_name(sym))); void f(){rts_reduce(root[n]);} void *malloc(unsigned long); enum { FORWARD = 127, REDUCING = 126 }; enum { TOP = 1<<24 }; static u *mem, *altmem, *sp, *spTop, hp; static inline u isAddr(u n) { return n>=128; } static u evac(u n) { if (!isAddr(n)) return n; u x = mem[n]; while (isAddr(x) && mem[x] == _T) { mem[n] = mem[n + 1]; mem[n + 1] = mem[x + 1]; x = mem[n]; } if (isAddr(x) && mem[x] == _K) { mem[n + 1] = mem[x + 1]; x = mem[n] = _I; } u y = mem[n + 1]; switch(x) { case FORWARD: return y; case REDUCING: mem[n] = FORWARD; mem[n + 1] = hp; hp += 2; return mem[n + 1]; case _I: mem[n] = REDUCING; y = evac(y); if (mem[n] == FORWARD) { altmem[mem[n + 1]] = _I; altmem[mem[n + 1] + 1] = y; } else { mem[n] = FORWARD; mem[n + 1] = y; } return mem[n + 1]; default: break; } u z = hp; hp += 2; mem[n] = FORWARD; mem[n + 1] = z; altmem[z] = x; altmem[z + 1] = y; return z; } static void gc() { hp = 128; u di = hp; sp = altmem + TOP - 1; for(u *r = root; *r; r++) *r = evac(*r); *sp = evac(*spTop); while (di < hp) { u x = altmem[di] = evac(altmem[di]); di++; if (x != _NUM) altmem[di] = evac(altmem[di]); di++; } spTop = sp; u *tmp = mem; mem = altmem; altmem = tmp; } static inline u app(u f, u x) { mem[hp] = f; mem[hp + 1] = x; return (hp += 2) - 2; } static inline u arg(u n) { return mem[sp [n] + 1]; } static inline int num(u n) { return mem[arg(n) + 1]; } static inline void lazy2(u height, u f, u x) { u *p = mem + sp[height]; *p = f; *++p = x; sp += height - 1; *sp = f; } static void lazy3(u height,u x1,u x2,u x3){u*p=mem+sp[height];sp[height-1]=*p=app(x1,x2);*++p=x3;*(sp+=height-2)=x1;} |] runFun = ([r|static void run() { for(;;) { if (mem + hp > sp - 8) gc(); u x = *sp; if (isAddr(x)) *--sp = mem[x]; else switch(x) { |]++) . foldr (.) id (genComb <$> comdefs) . ([r| } } } void rts_init() { mem = malloc(TOP * sizeof(u)); altmem = malloc(TOP * sizeof(u)); hp = 128; for (u i = 0; i < prog_size; i++) mem[hp++] = prog[i]; spTop = mem + TOP - 1; } void rts_reduce(u n) { static u ready;if (!ready){ready=1;rts_init();} *(sp = spTop) = app(app(n, _UNDEFINED), _END); run(); } |]++) genArg m a = case a of V s -> ("arg("++) . (maybe undefined shows $ lookup s m) . (')':) E (StrCon s) -> (s++) A x y -> ("app("++) . genArg m x . (',':) . genArg m y . (')':) genArgs m as = foldl1 (.) $ map (\a -> (","++) . genArg m a) as genComb (s, (args, body)) = let argc = ('(':) . shows (length args) m = zip args [1..] in ("case _"++) . (s++) . (':':) . (case body of A (A x y) z -> ("lazy3"++) . argc . genArgs m [x, y, z] . (");"++) A x y -> ("lazy2"++) . argc . genArgs m [x, y] . (");"++) E (StrCon s) -> (s++) ) . ("break;\n"++) main = getArgs >>= \case "comb":_ -> interact $ dumpWith dumpCombs "lamb":_ -> interact $ dumpWith dumpLambs "type":_ -> interact $ dumpWith dumpTypes _ -> interact compile iterate f x = x : iterate f (f x) takeWhile _ [] = [] takeWhile p xs@(x:xt) | p x = x : takeWhile p xt | True = [] class Enum a where succ :: a -> a pred :: a -> a toEnum :: Int -> a fromEnum :: a -> Int enumFrom :: a -> [a] enumFromTo :: a -> a -> [a] instance Enum Int where succ = (+1) pred = (+(0-1)) toEnum = id fromEnum = id enumFrom = iterate succ enumFromTo lo hi = takeWhile (<= hi) $ enumFrom lo instance Enum Char where succ = chr . (+1) . ord pred = chr . (+(0-1)) . ord toEnum = chr fromEnum = ord enumFrom = iterate succ enumFromTo lo hi = takeWhile (<= hi) $ enumFrom lo (+) = intAdd (-) = intSub (*) = intMul div = intDiv mod = intMod
Multiparty
We put our changes to the test by splitting party.hs
into modules.
(We really should do the same for our "marginally" compiler, namely create
an intermediate stage that is exactly the same except we use indentation
instead of braces and semicolons. This would make it easier to compare against
its successor "methodically".)
cat Base0.hs Ast.hs Map.hs Parser.hs Kiselyov.hs Unify.hs RTS.hs Typer.hs party.hs
module Base where infixr 9 . infixl 7 * , `div` , `mod` infixl 6 + , - infixr 5 ++ infixl 4 <*> , <$> , <* , *> infix 4 == , /= , <= infixl 3 && , <|> infixl 2 || infixl 1 >> , >>= infixr 0 $ class Functor f where fmap :: (a -> b) -> f a -> f b class Applicative f where pure :: a -> f a (<*>) :: f (a -> b) -> f a -> f b class Monad m where return :: a -> m a (>>=) :: m a -> (a -> m b) -> m b (<$>) = fmap liftA2 f x y = f <$> x <*> y (>>) f g = f >>= \_ -> g class Eq a where (==) :: a -> a -> Bool instance Eq Int where (==) = intEq instance Eq Char where (==) = charEq ($) f x = f x id x = x const x y = x flip f x y = f y x (&) x f = f x class Ord a where (<=) :: a -> a -> Bool compare x y = if x <= y then if y <= x then EQ else LT else GT instance Ord Int where (<=) = intLE instance Ord Char where (<=) = charLE data Ordering = LT | GT | EQ instance Ord a => Ord [a] where xs <= ys = case xs of [] -> True x:xt -> case ys of [] -> False y:yt -> if x <= y then if y <= x then xt <= yt else True else False compare xs ys = case xs of [] -> case ys of [] -> EQ _ -> LT x:xt -> case ys of [] -> GT y:yt -> if x <= y then if y <= x then compare xt yt else LT else GT data Maybe a = Nothing | Just a data Either a b = Left a | Right b fst (x, y) = x snd (x, y) = y uncurry f (x, y) = f x y first f (x, y) = (f x, y) second f (x, y) = (x, f y) not a = if a then False else True x /= y = not $ x == y (.) f g x = f (g x) (||) f g = if f then True else g (&&) f g = if f then g else False instance Eq a => Eq [a] where xs == ys = case xs of [] -> case ys of [] -> True _ -> False x:xt -> case ys of [] -> False y:yt -> x == y && xt == yt take 0 xs = [] take _ [] = [] take n (h:t) = h : take (n - 1) t maybe n j m = case m of Nothing -> n; Just x -> j x instance Functor Maybe where fmap f = maybe Nothing (Just . f) instance Applicative Maybe where pure = Just ; mf <*> mx = maybe Nothing (\f -> maybe Nothing (Just . f) mx) mf instance Monad Maybe where return = Just ; mf >>= mg = maybe Nothing mg mf instance Alternative Maybe where empty = Nothing ; x <|> y = maybe y Just x foldr c n = \case [] -> n; h:t -> c h $ foldr c n t length = foldr (\_ n -> n + 1) 0 mapM f = foldr (\a rest -> liftA2 (:) (f a) rest) (pure []) mapM_ f = foldr ((>>) . f) (pure ()) foldM f z0 xs = foldr (\x k z -> f z x >>= k) pure xs z0 error = primitiveError undefined = error "undefined" foldr1 c l@(h:t) = maybe undefined id $ foldr (\x m -> Just $ maybe x (c x) m) Nothing l foldl f a bs = foldr (\b g x -> g (f x b)) (\x -> x) bs a foldl1 f (h:t) = foldl f h t elem k xs = foldr (\x t -> x == k || t) False xs find f xs = foldr (\x t -> if f x then Just x else t) Nothing xs (++) = flip (foldr (:)) concat = foldr (++) [] map = flip (foldr . ((:) .)) [] head (h:_) = h tail (_:t) = t isSpace c = elem (ord c) [32, 9, 10, 11, 12, 13, 160] instance Functor [] where fmap = map instance Applicative [] where pure = (:[]); f <*> x = concatMap (<$> x) f instance Monad [] where return = (:[]); (>>=) = flip concatMap concatMap = (concat .) . map lookup s = foldr (\(k, v) t -> if s == k then Just v else t) Nothing filter f = foldr (\x xs -> if f x then x:xs else xs) [] union xs ys = foldr (\y acc -> (if elem y acc then id else (y:)) acc) xs ys intersect xs ys = filter (\x -> maybe False (\_ -> True) $ find (x ==) ys) xs last (x:xt) = go x xt where go x xt = case xt of [] -> x; y:yt -> go y yt init (x:xt) = case xt of [] -> []; _ -> x : init xt intercalate sep = \case [] -> []; x:xt -> x ++ concatMap (sep ++) xt intersperse sep = \case [] -> []; x:xt -> x : foldr ($) [] (((sep:) .) . (:) <$> xt) all f = foldr (&&) True . map f any f = foldr (||) False . map f and = foldr (&&) True or = foldr (||) False zipWith f xs ys = case xs of [] -> []; x:xt -> case ys of [] -> []; y:yt -> f x y : zipWith f xt yt zip = zipWith (,) data State s a = State (s -> (a, s)) runState (State f) = f instance Functor (State s) where fmap f = \(State h) -> State (first f . h) instance Applicative (State s) where pure a = State (a,) (State f) <*> (State x) = State \s -> case f s of (g, s') -> first g $ x s' instance Monad (State s) where return a = State (a,) (State h) >>= f = State $ uncurry (runState . f) . h evalState m s = fst $ runState m s get = State \s -> (s, s) put n = State \s -> ((), n) either l r e = case e of Left x -> l x; Right x -> r x instance Functor (Either a) where fmap f e = either Left (Right . f) e instance Applicative (Either a) where pure = Right ef <*> ex = case ef of Left s -> Left s Right f -> either Left (Right . f) ex instance Monad (Either a) where return = Right ex >>= f = either Left f ex class Alternative f where empty :: f a (<|>) :: f a -> f a -> f a asum = foldr (<|>) empty (*>) = liftA2 \x y -> y (<*) = liftA2 \x y -> x many p = liftA2 (:) p (many p) <|> pure [] some p = liftA2 (:) p (many p) sepBy1 p sep = liftA2 (:) p (many (sep *> p)) sepBy p sep = sepBy1 p sep <|> pure [] between x y p = x *> (p <* y) showParen b f = if b then ('(':) . f . (')':) else f iterate f x = x : iterate f (f x) takeWhile _ [] = [] takeWhile p xs@(x:xt) | p x = x : takeWhile p xt | True = [] class Enum a where succ :: a -> a pred :: a -> a toEnum :: Int -> a fromEnum :: a -> Int enumFrom :: a -> [a] enumFromTo :: a -> a -> [a] instance Enum Int where succ = (+1) pred = (+(0-1)) toEnum = id fromEnum = id enumFrom = iterate succ enumFromTo lo hi = takeWhile (<= hi) $ enumFrom lo instance Enum Char where succ = chr . (+1) . ord pred = chr . (+(0-1)) . ord toEnum = chr fromEnum = ord enumFrom = iterate succ enumFromTo lo hi = takeWhile (<= hi) $ enumFrom lo (+) = intAdd (-) = intSub (*) = intMul div = intDiv mod = intMod instance (Eq a, Eq b) => Eq (a, b) where (a1, b1) == (a2, b2) = a1 == a2 && b1 == b2 instance (Ord a, Ord b) => Ord (a, b) where (a1, b1) <= (a2, b2) = a1 <= a2 && (not (a2 <= a1) || b1 <= b2) null xs = case xs of [] -> True _ -> False instance Applicative IO where pure = ioPure ; (<*>) f x = ioBind f \g -> ioBind x \y -> ioPure (g y) instance Monad IO where return = ioPure ; (>>=) = ioBind instance Functor IO where fmap f x = ioPure f <*> x class Show a where showsPrec :: Int -> a -> String -> String showsPrec _ x = (show x++) show :: a -> String show x = shows x "" showList :: [a] -> String -> String showList = showList__ shows shows = showsPrec 0 showList__ _ [] s = "[]" ++ s showList__ showx (x:xs) s = '[' : showx x (showl xs) where showl [] = ']' : s showl (y:ys) = ',' : showx y (showl ys) showInt__ n | 0 == n = id | True = showInt__ (n`div`10) . (chr (48+n`mod`10):) instance Show () where show () = "()" instance Show Bool where show True = "True" show False = "False" instance Show a => Show [a] where showsPrec _ = showList instance Show Int where showsPrec _ n | 0 == n = ('0':) | 1 <= n = showInt__ n | 2 * n == 0 = ("-2147483648"++) | True = ('-':) . showInt__ (0 - n) showLitChar__ '\n' = ("\\n"++) showLitChar__ '\\' = ("\\\\"++) showLitChar__ c = (c:) instance Show Char where showsPrec _ '\'' = ("'\\''"++) showsPrec _ c = ('\'':) . showLitChar__ c . ('\'':) showList s = ('"':) . foldr (.) id (map go s) . ('"':) where go '"' = ("\\\""++) go c = showLitChar__ c instance (Show a, Show b) => Show (a, b) where showsPrec _ (a, b) = showParen True $ shows a . (',':) . shows b
module Map where import Base infixl 9 ! data Map k a = Tip | Bin Int k a (Map k a) (Map k a) instance Functor (Map k) where fmap f m = case m of Tip -> Tip Bin sz k x l r -> Bin sz k (f x) (fmap f l) (fmap f r) size m = case m of Tip -> 0 ; Bin sz _ _ _ _ -> sz node k x l r = Bin (1 + size l + size r) k x l r singleton k x = Bin 1 k x Tip Tip singleL k x l (Bin _ rk rkx rl rr) = node rk rkx (node k x l rl) rr doubleL k x l (Bin _ rk rkx (Bin _ rlk rlkx rll rlr) rr) = node rlk rlkx (node k x l rll) (node rk rkx rlr rr) singleR k x (Bin _ lk lkx ll lr) r = node lk lkx ll (node k x lr r) doubleR k x (Bin _ lk lkx ll (Bin _ lrk lrkx lrl lrr)) r = node lrk lrkx (node lk lkx ll lrl) (node k x lrr r) balance k x l r = f k x l r where f | size l + size r <= 1 = node | 5 * size l + 3 <= 2 * size r = case r of Tip -> node Bin sz _ _ rl rr -> if 2 * size rl + 1 <= 3 * size rr then singleL else doubleL | 5 * size r + 3 <= 2 * size l = case l of Tip -> node Bin sz _ _ ll lr -> if 2 * size lr + 1 <= 3 * size ll then singleR else doubleR | True = node insert kx x t = case t of Tip -> singleton kx x Bin sz ky y l r -> case compare kx ky of LT -> balance ky y (insert kx x l) r GT -> balance ky y l (insert kx x r) EQ -> Bin sz kx x l r insertWith f kx x t = case t of Tip -> singleton kx x Bin sy ky y l r -> case compare kx ky of LT -> balance ky y (insertWith f kx x l) r GT -> balance ky y l (insertWith f kx x r) EQ -> Bin sy kx (f x y) l r mlookup kx t = case t of Tip -> Nothing Bin _ ky y l r -> case compare kx ky of LT -> mlookup kx l GT -> mlookup kx r EQ -> Just y fromList = foldl (\t (k, x) -> insert k x t) Tip member k t = maybe False (const True) $ mlookup k t t ! k = maybe undefined id $ mlookup k t foldrWithKey f = go where go z t = case t of Tip -> z Bin _ kx x l r -> go (f kx x (go z r)) l mapWithKey _ Tip = Tip mapWithKey f (Bin sx kx x l r) = Bin sx kx (f kx x) (mapWithKey f l) (mapWithKey f r) toAscList = foldrWithKey (\k x xs -> (k,x):xs) [] keys = map fst . toAscList elems = map snd . toAscList assocs = toAscList
-- Add `Show` instance. module Kiselyov where import Base import Ast -- Conversion to De Bruijn indices. data LC = Ze | Su LC | Pass IntTree | La LC | App LC LC debruijn n e = case e of E x -> Pass $ Lf x V v -> maybe (Pass $ LfVar v) id $ foldr (\h found -> if h == v then Just Ze else Su <$> found) Nothing n A x y -> App (debruijn n x) (debruijn n y) L s t -> La (debruijn (s:n) t) -- Kiselyov bracket abstraction. data IntTree = Lf Extra | LfVar String | Nd IntTree IntTree data Sem = Defer | Closed IntTree | Need Sem | Weak Sem instance Show IntTree where showsPrec prec = \case LfVar s -> showVar s Lf extra -> shows extra Nd x y -> showParen (1 <= prec) $ showsPrec 0 x . (' ':) . showsPrec 1 y lf = Lf . Basic x ## y = case x of Defer -> case y of Defer -> Need $ Closed (Nd (Nd (lf "S") (lf "I")) (lf "I")) Closed d -> Need $ Closed (Nd (lf "T") d) Need e -> Need $ Closed (Nd (lf "S") (lf "I")) ## e Weak e -> Need $ Closed (lf "T") ## e Closed d -> case y of Defer -> Need $ Closed d Closed dd -> Closed $ Nd d dd Need e -> Need $ Closed (Nd (lf "B") d) ## e Weak e -> Weak $ Closed d ## e Need e -> case y of Defer -> Need $ Closed (lf "S") ## e ## Closed (lf "I") Closed d -> Need $ Closed (Nd (lf "R") d) ## e Need ee -> Need $ Closed (lf "S") ## e ## ee Weak ee -> Need $ Closed (lf "C") ## e ## ee Weak e -> case y of Defer -> Need e Closed d -> Weak $ e ## Closed d Need ee -> Need $ Closed (lf "B") ## e ## ee Weak ee -> Weak $ e ## ee babs t = case t of Ze -> Defer Su x -> Weak $ babs x Pass x -> Closed x La t -> case babs t of Defer -> Closed $ lf "I" Closed d -> Closed $ Nd (lf "K") d Need e -> e Weak e -> Closed (lf "K") ## e App x y -> babs x ## babs y nolam x = (\(Closed d) -> d) $ babs $ debruijn [] x -- Optimizations. optim t = case t of Nd x y -> go (optim x) (optim y) _ -> t where go (Lf (Basic "I")) q = q go p q@(Lf (Basic c)) = case c of "K" -> case p of Lf (Basic "B") -> lf "BK" _ -> Nd p q "I" -> case p of Lf (Basic r) -> case r of "C" -> lf "T" "B" -> lf "I" "K" -> lf "KI" _ -> Nd p q Nd p1 p2 -> case p1 of Lf (Basic "B") -> p2 Lf (Basic "R") -> Nd (lf "T") p2 _ -> Nd (Nd p1 p2) q _ -> Nd p q "T" -> case p of Nd (Lf (Basic "B")) (Lf (Basic r)) -> case r of "C" -> lf "V" "BK" -> lf "LEFT" _ -> Nd p q _ -> Nd p q "V" -> case p of Nd (Lf (Basic "B")) (Lf (Basic "BK")) -> lf "CONS" _ -> Nd p q _ -> Nd p q go p q = Nd p q
(There are more files, which I’ll include if I get around to writing a tool to help show several source files in HTML. For now, see the git repo.)
GHC compatibility
The main obstacle to compiling our modules with GHC is the Prelude. We define
entities such as Monad
and (==)
from scratch, which breaks do
notation
for example because GHC always uses Prelude.Monad
.
We remove this obstacle by simply removing any overlap with the Prelude. We use
a stripped-down Base.hs
when testing with GHC, whereas our compilers really
use files like Base0.hs
.
This implies much of our Base
code is untested, but for this special case,
perhaps we can add a wrapper to test it on its own with GHC.
-- GHC-compatible version. module Base where import qualified Data.Char (chr, ord, isSpace) hide_prelude_here = hide_prelude_here chr = Data.Char.chr ord = Data.Char.ord isSpace = Data.Char.isSpace first f (x, y) = (f x, y) second f (x, y) = (x, f y) infixl 3 <|> instance Alternative Maybe where empty = Nothing ; x <|> y = maybe y Just x class Alternative f where empty :: f a (<|>) :: f a -> f a -> f a (&) x f = f x liftA2 f x y = f <$> x <*> y many p = liftA2 (:) p (many p) <|> pure [] some p = liftA2 (:) p (many p) sepBy1 p sep = liftA2 (:) p (many (sep *> p)) sepBy p sep = sepBy1 p sep <|> pure [] between x y p = x *> (p <* y) asum = foldr (<|>) empty find f xs = foldr (\x t -> if f x then Just x else t) Nothing xs intersect xs ys = filter (\x -> maybe False (\_ -> True) $ find (x ==) ys) xs union xs ys = foldr (\y acc -> (if elem y acc then id else (y:)) acc) xs ys intercalate sep = \case [] -> []; x:xt -> x ++ concatMap (sep ++) xt intersperse sep = \case [] -> []; x:xt -> x : foldr ($) [] (((sep:) .) . (:) <$> xt) foldM f z0 xs = foldr (\x k z -> f z x >>= k) pure xs z0 data State s a = State (s -> (a, s)) runState (State f) = f instance Functor (State s) where fmap f = \(State h) -> State (first f . h) instance Applicative (State s) where pure a = State (a,) (State f) <*> (State x) = State \s -> case f s of (g, s') -> first g $ x s' instance Monad (State s) where return a = State (a,) (State h) >>= f = State $ uncurry (runState . f) . h evalState m s = fst $ runState m s get = State \s -> (s, s) put n = State \s -> ((), n) integerSignList x f = f (x >= 0) $ go x where go 0 = [] go n = r : go q where (q, r) = divMod n $ 2^32 intFromWord = fromIntegral when x y = if x then y else pure () unless x y = if x then pure () else y
Another obstacle is our built-in support for quasi-quoted raw strings. We solve this by adding the line:
import_qq_here = import_qq_here
immediately after the import statements. Then we enable the C pre-processor and
define import_qq_here
to be import Text.RawString.QQ --
.
We perform similar tricks to hide Prelude
symbols we define in the System
module.
module RTS where import Base import Ast import Kiselyov import Map import Parser import_qq_here = import_qq_here libc = [r|#include<stdio.h> static int env_argc; int getargcount() { return env_argc; } static char **env_argv; int getargchar(int n, int k) { return env_argv[n][k]; } static int nextCh, isAhead; int eof_shim() { if (!isAhead) { isAhead = 1; nextCh = getchar(); } return nextCh == -1; } void exit(int); void putchar_shim(int c) { putchar(c); } int getchar_shim() { if (!isAhead) nextCh = getchar(); if (nextCh == -1) exit(1); isAhead = 0; return nextCh; } void errchar(int c) { fputc(c, stderr); } void errexit() { fputc('\n', stderr); } |] preamble = [r|#define EXPORT(f, sym, n) void f() asm(sym) __attribute__((export_name(sym))); void f(){rts_reduce(root[n]);} void *malloc(unsigned long); enum { FORWARD = 127, REDUCING = 126 }; enum { TOP = 1<<24 }; static u *mem, *altmem, *sp, *spTop, hp; static inline u isAddr(u n) { return n>=128; } static u evac(u n) { if (!isAddr(n)) return n; u x = mem[n]; while (isAddr(x) && mem[x] == _T) { mem[n] = mem[n + 1]; mem[n + 1] = mem[x + 1]; x = mem[n]; } if (isAddr(x) && mem[x] == _K) { mem[n + 1] = mem[x + 1]; x = mem[n] = _I; } u y = mem[n + 1]; switch(x) { case FORWARD: return y; case REDUCING: mem[n] = FORWARD; mem[n + 1] = hp; hp += 2; return mem[n + 1]; case _I: mem[n] = REDUCING; y = evac(y); if (mem[n] == FORWARD) { altmem[mem[n + 1]] = _I; altmem[mem[n + 1] + 1] = y; } else { mem[n] = FORWARD; mem[n + 1] = y; } return mem[n + 1]; default: break; } u z = hp; hp += 2; mem[n] = FORWARD; mem[n + 1] = z; altmem[z] = x; altmem[z + 1] = y; return z; } static void gc() { hp = 128; u di = hp; sp = altmem + TOP - 1; for(u *r = root; *r; r++) *r = evac(*r); *sp = evac(*spTop); while (di < hp) { u x = altmem[di] = evac(altmem[di]); di++; if (x != _NUM) altmem[di] = evac(altmem[di]); di++; } spTop = sp; u *tmp = mem; mem = altmem; altmem = tmp; } static inline u app(u f, u x) { mem[hp] = f; mem[hp + 1] = x; return (hp += 2) - 2; } static inline u arg(u n) { return mem[sp [n] + 1]; } static inline int num(u n) { return mem[arg(n) + 1]; } static inline void lazy2(u height, u f, u x) { u *p = mem + sp[height]; *p = f; *++p = x; sp += height - 1; *sp = f; } static void lazy3(u height,u x1,u x2,u x3){u*p=mem+sp[height];sp[height-1]=*p=app(x1,x2);*++p=x3;*(sp+=height-2)=x1;} |] -- Main VM loop. comdefsrc = [r| F x = "foreign(num(1));" Y x = x "sp[1]" Q x y z = z(y x) S x y z = x z(y z) B x y z = x (y z) BK x y z = x y C x y z = x z y R x y z = y z x V x y z = z x y T x y = y x K x y = "_I" x KI x y = "_I" y I x = "sp[1] = arg(1); sp++;" LEFT x y z = y x CONS x y z w = w x y NUM x y = y "sp[1]" ADD x y = "_NUM" "num(1) + num(2)" SUB x y = "_NUM" "num(1) - num(2)" MUL x y = "_NUM" "num(1) * num(2)" DIV x y = "_NUM" "num(1) / num(2)" MOD x y = "_NUM" "num(1) % num(2)" EQ x y = "num(1) == num(2) ? lazy2(2, _I, _K) : lazy2(2, _K, _I);" LE x y = "num(1) <= num(2) ? lazy2(2, _I, _K) : lazy2(2, _K, _I);" REF x y = y "sp[1]" NEWREF x y z = z ("_REF" x) y READREF x y z = z "num(1)" y WRITEREF x y z w = w "((mem[arg(2) + 1] = arg(1)), _K)" z END = "return;" ERR = "sp[1]=app(app(arg(1),_ERREND),_ERR2);sp++;" ERR2 = "lazy3(2, arg(1), _ERROUT, arg(2));" ERROUT = "errchar(num(1)); lazy2(2, _ERR, arg(2));" ERREND = "errexit(); return;" |] argList t = case t of TC s -> [TC s] TV s -> [TV s] TAp (TC "IO") (TC u) -> [TC u] TAp (TAp (TC "->") x) y -> x : argList y cTypeName (TC "()") = "void" cTypeName (TC "Int") = "int" cTypeName (TC "Char") = "int" ffiDeclare (name, t) = let tys = argList t in concat [cTypeName $ last tys, " ", name, "(", intercalate "," $ cTypeName <$> init tys, ");\n"] ffiArgs n t = case t of TC s -> ("", ((True, s), n)) TAp (TC "IO") (TC u) -> ("", ((False, u), n)) TAp (TAp (TC "->") x) y -> first (((if 3 <= n then ", " else "") ++ "num(" ++ shows n ")") ++) $ ffiArgs (n + 1) y ffiDefine n ffis = case ffis of [] -> id (name, t):xt -> let (args, ((isPure, ret), count)) = ffiArgs 2 t lazyn = ("lazy2(" ++) . shows (if isPure then count - 1 else count + 1) . (", " ++) cont tgt = if isPure then ("_I, "++) . tgt else ("app(arg("++) . shows (count + 1) . ("), "++) . tgt . ("), arg("++) . shows count . (")"++) longDistanceCall = (name++) . ("("++) . (args++) . ("); "++) . lazyn in ("case " ++) . shows n . (": " ++) . if ret == "()" then longDistanceCall . cont ("_K"++) . ("); break;"++) . ffiDefine (n - 1) xt else ("{u r = "++) . longDistanceCall . cont ("app(_NUM, r)" ++) . ("); break;}\n"++) . ffiDefine (n - 1) xt genMain n = "int main(int argc,char**argv){env_argc=argc;env_argv=argv;rts_reduce(" ++ shows n ");return 0;}\n" arrCount = \case TAp (TAp (TC "->") _) y -> 1 + arrCount y _ -> 0 genExport m n = ("void f"++) . shows n . ("("++) . foldr (.) id (intersperse (',':) $ map (("u "++) .) xs) . ("){rts_reduce("++) . foldl (\s x -> ("app("++) . s . (",app(_NUM,"++) . x . ("))"++)) rt xs . (");}\n"++) where xs = map ((('x':) .) . shows) [0..m - 1] rt = ("root["++) . shows n . ("]"++) genArg m a = case a of V s -> ("arg("++) . (maybe undefined shows $ lookup s m) . (')':) E (StrCon s) -> (s++) A x y -> ("app("++) . genArg m x . (',':) . genArg m y . (')':) genArgs m as = foldl1 (.) $ map (\a -> (","++) . genArg m a) as genComb (s, (args, body)) = let argc = ('(':) . shows (length args) m = zip args [1..] in ("case _"++) . (s++) . (':':) . (case body of A (A x y) z -> ("lazy3"++) . argc . genArgs m [x, y, z] . (");"++) A x y -> ("lazy2"++) . argc . genArgs m [x, y] . (");"++) E (StrCon s) -> (s++) ) . ("break;\n"++) comb = (,) <$> conId <*> ((,) <$> many varId <*> (res "=" *> combExpr)) combExpr = foldl1 A <$> some (V <$> varId <|> E . StrCon <$> lexeme tokStr <|> paren combExpr) comdefs = case parse (lexemePrelude *> braceSep comb <* eof) comdefsrc of Left e -> error e Right (cs, _) -> cs comEnum s = maybe (error s) id $ lookup s $ zip (fst <$> comdefs) [1..] comName i = maybe undefined id $ lookup i $ zip [1..] (fst <$> comdefs) runFun = ([r|static void run() { for(;;) { if (mem + hp > sp - 8) gc(); u x = *sp; if (isAddr(x)) *--sp = mem[x]; else switch(x) { |]++) . foldr (.) id (genComb <$> comdefs) . ([r| } } } void rts_init() { mem = malloc(TOP * sizeof(u)); altmem = malloc(TOP * sizeof(u)); hp = 128; for (u i = 0; i < sizeof(prog)/sizeof(*prog); i++) mem[hp++] = prog[i]; spTop = mem + TOP - 1; } void rts_reduce(u n) { static u ready;if (!ready){ready=1;rts_init();} *(sp = spTop) = app(app(n, _UNDEFINED), _END); run(); } |]++) resolve bigmap (m, s) = either (resolve bigmap) id $ (bigmap ! m) ! s mayResolve bigmap (m, s) = mlookup m bigmap >>= fmap (either (resolve bigmap) id) . mlookup s appCell (hp, bs) x y = (Right hp, (hp + 2, bs . (x:) . (y:))) enc tab mem = \case Lf n -> case n of Basic c -> (Right $ comEnum c, mem) Const c -> appCell mem (Right $ comEnum "NUM") $ Right c ChrCon c -> appCell mem (Right $ comEnum "NUM") $ Right $ ord c StrCon s -> enc tab mem $ foldr (\h t -> Nd (Nd (lf "CONS") (Lf $ ChrCon h)) t) (lf "K") s Link m s _ -> (Left (m, s), mem) LfVar s -> maybe (error $ "resolve " ++ s) (, mem) $ mlookup s tab Nd x y -> let (xAddr, mem') = enc tab mem x (yAddr, mem'') = enc tab mem' y in appCell mem'' xAddr yAddr asm hp0 combs = tabmem where tabmem = foldl (\(as, m) (s, t) -> let (p, m') = enc (fst tabmem) m t in (insert s p as, m')) (Tip, (hp0, id)) combs optiComb' (subs, combs) (s, lamb) = let gosub t = case t of LfVar v -> maybe t id $ lookup v subs Nd a b -> Nd (gosub a) (gosub b) _ -> t c = optim $ gosub $ nolam lamb combs' = combs . ((s, c):) in case c of Lf (Basic _) -> ((s, c):subs, combs') LfVar v -> if v == s then (subs, combs . ((s, Nd (lf "Y") (lf "I")):)) else ((s, gosub c):subs, combs') _ -> (subs, combs') optiComb lambs = ($[]) . snd $ foldl optiComb' ([], id) lambs lambsList typed = toAscList $ snd <$> typed codegenLocal (name, (typed, _)) (bigmap, (hp, f)) = (insert name localmap bigmap, (hp', f . memF)) where (localmap, (hp', memF)) = asm hp $ optiComb $ lambsList typed codegen mods = (bigmap, mem) where (bigmap, (_, memF)) = foldr codegenLocal (Tip, (128, id)) $ toAscList mods mem = either (resolve bigmap) id <$> memF [] getIOType (Qual [] (TAp (TC "IO") t)) = Right t getIOType q = Left $ "main : " ++ shows q "" ffcat (name, (_, (ffis, ffes))) (xs, ys) = (ffis ++ xs, ((name,) <$> ffes) ++ ys) compile mods = do let (bigmap, mem) = codegen mods (ffis, ffes) = foldr ffcat ([], []) $ toAscList mods mustType modName s = case mlookup s (fst $ mods ! modName) of Just (Qual [] t, _) -> t _ -> error "TODO: report bad exports" mayMain = do mainAddr <- mayResolve bigmap ("Main", "main") (mainType, _) <- mlookup "main" $ fst $ mods ! "Main" pure (mainAddr, mainType) mainStr <- case mayMain of Nothing -> pure "" Just (a, q) -> do getIOType q pure $ genMain a pure $ ("typedef unsigned u;\n"++) . ("enum{_UNDEFINED=0,"++) . foldr (.) id (map (\(s, _) -> ('_':) . (s++) . (',':)) comdefs) . ("};\n"++) . ("static const u prog[]={" ++) . foldr (.) id (map (\n -> shows n . (',':)) mem) . ("};\nstatic u root[]={" ++) . foldr (\(modName, (_, ourName)) f -> shows (resolve bigmap (modName, ourName)) . (", " ++) . f) id ffes . ("0};\n" ++) . (preamble++) . (libc++) . (concatMap ffiDeclare ffis ++) . ("static void foreign(u n) {\n switch(n) {\n" ++) . ffiDefine (length ffis - 1) ffis . ("\n }\n}\n" ++) . runFun . foldr (.) id (zipWith (\(modName, (expName, ourName)) n -> ("EXPORT(f"++) . shows n . (", \""++) . (expName++) . ("\")\n"++) . genExport (arrCount $ mustType modName ourName) n) ffes [0..]) $ mainStr
Our source now works with GHC with the following options:
:set "-Dhide_prelude_here=import Prelude hiding (getChar, putChar, getContents, putStr, putStrLn, interact) --" :set "-Dimport_qq_here=import Text.RawString.QQ --" :set -cpp -XQuasiQuotes :set -XBlockArguments -XLambdaCase -XTupleSections :set -XNoMonomorphismRestriction -XMonoLocalBinds
In the inn
subdirectory:
$ ghci -ghci-script compat.ghci party.hs ../stub.o
Here, the stub.o
has been created from stub.c
with clang -c
or similar.
We gave the nice filenames to GHC, which expects to find modules in files with matching names. Our compilers tolerate weird filename prefixes and suffixes because we can simply concatenate different files. An alternative is to manage different subdirectories containing the same filenames.
We can test later iterations with GHCi by symlinking appropriate versions of each file in a dedicated subdirectory.
Party1
Modules feel revolutionary. Our source becomes clearer, because modularization forces us to think about interdependencies, which guided refactoring so breaking up was less hard to do. And we can progress by making a small change to a small file, like our earliest compilers back in the day.
However, we face new challenges. Addressing the limitations listed above will require effort. Prepending a little wrapper no longer suffices for GHC interoperability. And how are we going to keep track of many versions of many files?
For now we answer the last question by tweaking an existing filename and
Makefile
rule. The module name remains the same but we concatenate a
different file.
For mutual let definitions we wrote code that traversed a syntax tree to substitute certain variables. An alternative is to build a syntax tree that describes this substitution. After all, lambda calculus is substitution incarnate. In other words, we rely more on dynamic rather than static semantics, a distinction that sometimes blurs because beta-reducing may occur during optimization.
One advantage of this approach is we can remove overFreePro
, a helper that
traverses over syntax trees before case expressions and pattern matches have
been transformed away.
We extend the parser to support named record fields in data type declarations such as:
data Foo = Foo { bar :: Int, baz :: String } | Qux
For accessors, we generate one function definition per field. For example:
bar = \case Foo bar baz -> bar
except at a lower level, exploiting our knowledge that our data types are Scott-encoded.
Record updates and initialization are more challenging. We need more than plain function definitions, and furthermore, we only have all valid field names after parsing. This means we ought to extend our syntax tree to hold lists of field bindings for record updates and initializations.
Instead of adding a new data constructor to our Ast
type, we invent two
basic combinators Basic "{="
and Basic "=}"
which act as delimiters
for a list of field bindings, where the A
data constructor acts like a cons.
An alternative is to use recursion schemes for our many variants of syntax
trees.
By pattern compilation, we know all the field names, so at this point we call
resolveFieldBinds
to transform, say:
x { bar = 42 }
into:
case x of \Foo {orig}bar {orig}baz -> Foo 42 {orig}baz
though again using a lower level representation since we know we’re
Scott-encoding the data types. The {orig}
added by our code to each variable
name guards against variable capture.
For record initializations, we only generate the right-hand side of the case
match and use undefined
for missing fields instead of {orig}
variables.
We implement deriving
for Eq
and Show
. It would be nice to automatically
derive Eq
for our primitive data types (unit, boolean, pairs, lists) but this
would require all programs to define the Eq
class.
Recall for data types, we maintain a map from a data constructor name to the
list of all data constructors of the same type, along with the types of any
field they may have. Even though we need to generate a unique and predictable
symbol per type to represent corresponding case expressions, the function
specialCase
simply builds this symbol from the first data constructor.
We barely modify this map for named fields. As a result, there’s no easy way
for findField
to look up relevant information based on a field name. We
inefficiently search linearly through possibly repeated entries. It may be
better to add a separate map for named fields, but it’s tedious to add fields
to the Neat
type when our current compiler lacks support for naming them!
Once again, a proto-chicken comes first.
To test with GHC, we create a new directory containing appropriately named symlinks to the desired versions of the modules. Incremental development means we only need to change a few symlinks at a time, but in the long run, we ought to automate symlinking from a given set of module files.
-- Record fields. -- Remove `overFreePro`. module Ast where import Base import Map data Type = TC String | TV String | TAp Type Type arr a b = TAp (TAp (TC "->") a) b data Extra = Basic String | Const Int | ChrCon Char | StrCon String | Link String String Qual data Pat = PatLit Ast | PatVar String (Maybe Pat) | PatCon String [Pat] data Ast = E Extra | V String | A Ast Ast | L String Ast | Pa [([Pat], Ast)] | Proof Pred data Constr = Constr String [(String, Type)] data Pred = Pred String Type data Qual = Qual [Pred] Type instance Eq Type where (TC s) == (TC t) = s == t (TV s) == (TV t) = s == t (TAp a b) == (TAp c d) = a == c && b == d _ == _ = False instance Eq Pred where (Pred s a) == (Pred t b) = s == t && a == b data Instance = Instance -- Type, e.g. Int for Eq Int. Type -- Dictionary name, e.g. "{Eq Int}" String -- Context. [Pred] -- Method definitions (Map String Ast) data Tycl = Tycl [String] [Instance] data Neat = Neat (Map String Tycl) -- | Top-level definitions [(String, Ast)] -- | Typed ASTs, ready for compilation, including ADTs and methods, -- e.g. (==), (Eq a => a -> a -> Bool, select-==) [(String, (Qual, Ast))] -- | Data constructor table. (Map String [Constr]) -- | FFI declarations. [(String, Type)] -- | Exports. [(String, String)] -- | Module imports. [String] patVars = \case PatLit _ -> [] PatVar s m -> s : maybe [] patVars m PatCon _ args -> concat $ patVars <$> args fvPro bound expr = case expr of V s | not (elem s bound) -> [s] A x y -> fvPro bound x `union` fvPro bound y L s t -> fvPro (s:bound) t Pa vsts -> foldr union [] $ map (\(vs, t) -> fvPro (concatMap patVars vs ++ bound) t) vsts _ -> [] beta s a t = case t of E _ -> t V v -> if s == v then a else t A x y -> A (beta s a x) (beta s a y) L v u -> if s == v then t else L v $ beta s a u instance Show Type where showsPrec _ = \case TC s -> (s++) TV s -> (s++) TAp (TAp (TC "->") a) b -> showParen True $ shows a . (" -> "++) . shows b TAp a b -> showParen True $ shows a . (' ':) . shows b instance Show Pred where showsPrec _ (Pred s t) = (s++) . (' ':) . shows t . (" => "++) instance Show Qual where showsPrec _ (Qual ps t) = foldr (.) id (map shows ps) . shows t instance Show Extra where showsPrec _ = \case Basic s -> (s++) Const i -> shows i ChrCon c -> shows c StrCon s -> shows s Link im s _ -> (im++) . ('.':) . (s++) instance Show Pat where showsPrec _ = \case PatLit e -> shows e PatVar s mp -> (s++) . maybe id ((('@':) .) . shows) mp PatCon s ps -> (s++) . foldr (.) id (((' ':) .) . shows <$> ps) showVar s@(h:_) = showParen (elem h ":!#$%&*+./<=>?@\\^|-~") (s++) instance Show Ast where showsPrec prec = \case E e -> shows e V s -> showVar s A x y -> showParen (1 <= prec) $ shows x . (' ':) . showsPrec 1 y L s t -> showParen True $ ('\\':) . (s++) . (" -> "++) . shows t Pa vsts -> ('\\':) . showParen True (foldr (.) id $ intersperse (';':) $ map (\(vs, t) -> foldr (.) id (intersperse (' ':) $ map (showParen True . shows) vs) . (" -> "++) . shows t) vsts) Proof p -> ("{Proof "++) . shows p . ("}"++) typedAsts (Neat _ _ tas _ _ _ _) = tas typeclasses (Neat tcs _ _ _ _ _ _) = tcs dataCons (Neat _ _ _ dcs _ _ _) = dcs typeVars = \case TC _ -> [] TV v -> [v] TAp x y -> typeVars x `union` typeVars y depthFirstSearch = (foldl .) \relation st@(visited, sequence) vertex -> if vertex `elem` visited then st else second (vertex:) $ depthFirstSearch relation (vertex:visited, sequence) (relation vertex) spanningSearch = (foldl .) \relation st@(visited, setSequence) vertex -> if vertex `elem` visited then st else second ((:setSequence) . (vertex:)) $ depthFirstSearch relation (vertex:visited, []) (relation vertex) scc ins outs = spanning . depthFirst where depthFirst = snd . depthFirstSearch outs ([], []) spanning = snd . spanningSearch ins ([], [])
-- Record fields. -- Deriving `Eq`, `Show`. module Parser where import Base import Ast import Map -- Parser. data ParserState = ParserState [(Char, (Int, Int))] String [Int] (Map String (Int, Assoc)) readme (ParserState x _ _ _) = x landin (ParserState _ x _ _) = x indents (ParserState _ _ x _) = x precs (ParserState _ _ _ x) = x putReadme x (ParserState _ a b c) = ParserState x a b c putLandin x (ParserState a _ b c) = ParserState a x b c modIndents f (ParserState a b x c) = ParserState a b (f x) c data Parser a = Parser (ParserState -> Either String (a, ParserState)) getParser (Parser p) = p instance Functor Parser where fmap f x = pure f <*> x instance Applicative Parser where pure x = Parser \inp -> Right (x, inp) (Parser f) <*> (Parser x) = Parser \inp -> do (fun, t) <- f inp (arg, u) <- x t pure (fun arg, u) instance Monad Parser where return = pure (Parser x) >>= f = Parser \inp -> do (a, t) <- x inp getParser (f a) t instance Alternative Parser where empty = bad "" x <|> y = Parser \inp -> either (const $ getParser y inp) Right $ getParser x inp getPrecs = Parser \st -> Right (precs st, st) putPrecs ps = Parser \(ParserState a b c _) -> Right ((), ParserState a b c ps) notFollowedBy p = do saved <- Parser \pasta -> Right (pasta, pasta) ret <- p *> pure (bad "") <|> pure (pure ()) Parser \_ -> Right ((), saved) ret parse f str = getParser f $ ParserState (rowcol str (1, 1)) [] [] $ singleton ":" (5, RAssoc) where rowcol s rc = case s of [] -> [] h:t -> (h, rc) : rowcol t (advanceRC (ord h) rc) advanceRC n (r, c) | n `elem` [10, 11, 12, 13] = (r + 1, 1) | n == 9 = (r, (c + 8)`mod`8) | True = (r, c + 1) indentOf pasta = case readme pasta of [] -> 1 (_, (_, c)):_ -> c ins c pasta = putLandin (c:landin pasta) pasta angle n pasta = case indents pasta of m:ms | m == n -> ins ';' pasta | n + 1 <= m -> ins '}' $ angle n $ modIndents tail pasta _ -> pasta curly n pasta = case indents pasta of m:ms | m + 1 <= n -> ins '{' $ modIndents (n:) pasta [] | 1 <= n -> ins '{' $ modIndents (n:) pasta _ -> ins '{' . ins '}' $ angle n pasta sat f = Parser \pasta -> case landin pasta of c:t -> if f c then Right (c, putLandin t pasta) else Left "unsat" [] -> case readme pasta of [] -> case indents pasta of [] -> Left "EOF" m:ms | m /= 0 && f '}' -> Right ('}', modIndents tail pasta) _ -> Left "unsat" (h, _):t | f h -> let p' = putReadme t pasta in case h of '}' -> case indents pasta of 0:ms -> Right (h, modIndents tail p') _ -> Left "unsat" '{' -> Right (h, modIndents (0:) p') _ -> Right (h, p') _ -> Left "unsat" char c = sat (c ==) rawSat f = Parser \pasta -> case readme pasta of [] -> Left "EOF" (h, _):t -> if f h then Right (h, putReadme t pasta) else Left "unsat" eof = Parser \pasta -> case pasta of ParserState [] [] _ _ -> Right ((), pasta) _ -> badpos pasta "want eof" comment = rawSat ('-' ==) *> some (rawSat ('-' ==)) *> (rawSat isNewline <|> rawSat (not . isSymbol) *> many (rawSat $ not . isNewline) *> rawSat isNewline) *> pure True spaces = isNewline <$> rawSat isSpace whitespace = do offside <- or <$> many (spaces <|> comment) Parser \pasta -> Right ((), if offside then angle (indentOf pasta) pasta else pasta) hexValue d | d <= '9' = ord d - ord '0' | d <= 'F' = 10 + ord d - ord 'A' | d <= 'f' = 10 + ord d - ord 'a' isNewline c = ord c `elem` [10, 11, 12, 13] isSymbol = (`elem` "!#$%&*+./<=>?@\\^|-~:") isSmall c = c <= 'z' && 'a' <= c || c == '_' small = sat isSmall large = sat \x -> (x <= 'Z') && ('A' <= x) hexit = sat \x -> (x <= '9') && ('0' <= x) || (x <= 'F') && ('A' <= x) || (x <= 'f') && ('a' <= x) digit = sat \x -> (x <= '9') && ('0' <= x) decimal = foldl (\n d -> 10*n + ord d - ord '0') 0 <$> some digit hexadecimal = foldl (\n d -> 16*n + hexValue d) 0 <$> some hexit nameTailChar = small <|> large <|> digit <|> char '\'' nameTailed p = liftA2 (:) p $ many nameTailChar escape = char '\\' *> (sat (`elem` "'\"\\") <|> char 'n' *> pure '\n' <|> char '0' *> pure (chr 0) <|> char 'x' *> (chr <$> hexadecimal)) tokOne delim = escape <|> rawSat (delim /=) charSeq = mapM char tokChar = between (char '\'') (char '\'') (tokOne '\'') quoteStr = between (char '"') (char '"') $ many $ many (charSeq "\\&") *> tokOne '"' quasiquoteStr = charSeq "[r|" *> quasiquoteBody quasiquoteBody = charSeq "|]" *> pure [] <|> (:) <$> rawSat (const True) <*> quasiquoteBody tokStr = quoteStr <|> quasiquoteStr integer = char '0' *> (char 'x' <|> char 'X') *> hexadecimal <|> decimal literal = lexeme . fmap E $ Const <$> integer <|> ChrCon <$> tokChar <|> StrCon <$> tokStr varish = lexeme $ nameTailed small bad s = Parser \pasta -> badpos pasta s badpos pasta s = Left $ loc $ ": " ++ s where loc = case readme pasta of [] -> ("EOF"++) (_, (r, c)):_ -> ("row "++) . shows r . (" col "++) . shows c varId = do s <- varish if elem s ["export", "case", "class", "data", "default", "deriving", "do", "else", "foreign", "if", "import", "in", "infix", "infixl", "infixr", "instance", "let", "module", "newtype", "of", "then", "type", "where", "_"] then bad $ "reserved: " ++ s else pure s varSymish = lexeme $ (:) <$> sat (\c -> isSymbol c && c /= ':') <*> many (sat isSymbol) varSym = lexeme $ do s <- varSymish if elem s ["..", "=", "\\", "|", "<-", "->", "@", "~", "=>"] then bad $ "reserved: " ++ s else pure s conId = lexeme $ nameTailed large conSymish = lexeme $ liftA2 (:) (char ':') $ many $ sat isSymbol conSym = do s <- conSymish if elem s [":", "::"] then bad $ "reserved: " ++ s else pure s special c = lexeme $ sat (c ==) comma = special ',' semicolon = special ';' lParen = special '(' rParen = special ')' lBrace = special '{' rBrace = special '}' lSquare = special '[' rSquare = special ']' backquote = special '`' lexeme f = f <* whitespace lexemePrelude = whitespace *> Parser \pasta -> case getParser (res "module" <|> (:[]) <$> char '{') pasta of Left _ -> Right ((), curly (indentOf pasta) pasta) Right _ -> Right ((), pasta) curlyCheck f = do Parser \pasta -> Right ((), modIndents (0:) pasta) r <- f Parser \pasta -> let pasta' = modIndents tail pasta in case readme pasta of [] -> Right ((), curly 0 pasta') ('{', _):_ -> Right ((), pasta') (_, (_, col)):_ -> Right ((), curly col pasta') pure r conOf (Constr s _) = s specialCase (h:_) = '|':conOf h mkCase t cs = (specialCase cs, ( Qual [] $ arr t $ foldr arr (TV "case") $ map (\(Constr _ sts) -> foldr arr (TV "case") $ snd <$> sts) cs , E $ Basic "I")) mkStrs = snd . foldl (\(s, l) u -> ('@':s, s:l)) ("@", []) scottEncode _ ":" _ = E $ Basic "CONS" scottEncode vs s ts = foldr L (foldl (\a b -> A a (V b)) (V s) ts) (ts ++ vs) scottConstr t cs (Constr s sts) = (s, (Qual [] $ foldr arr t ts , scottEncode (map conOf cs) s $ mkStrs ts)) : [(field, (Qual [] $ t `arr` ft, L s $ foldl A (V s) $ inj $ proj field)) | (field, ft) <- sts, field /= ""] where ts = snd <$> sts proj fd = foldr L (V fd) $ fst <$> sts inj x = map (\(Constr s' _) -> if s' == s then x else V "undefined") cs mkAdtDefs t cs = mkCase t cs : concatMap (scottConstr t cs) cs mkFFIHelper n t acc = case t of TC s -> acc TAp (TC "IO") _ -> acc TAp (TAp (TC "->") x) y -> L (show n) $ mkFFIHelper (n + 1) y $ A (V $ show n) acc updateDcs cs dcs = foldr (\(Constr s _) m -> insert s cs m) dcs cs addAdt t cs ders (Neat tycl fs typed dcs ffis ffes ims) = foldr derive ast ders where ast = Neat tycl fs (mkAdtDefs t cs ++ typed) (updateDcs cs dcs) ffis ffes ims derive "Eq" = addInstance "Eq" (mkPreds "Eq") t [("==", Pa $ map eqCase cs )] derive "Show" = addInstance "Show" (mkPreds "Show") t [("showsPrec", L "prec" $ Pa $ map showCase cs )] derive der = error $ "bad deriving: " ++ der prec0 = E $ Const 0 showCase (Constr con args) = let as = show <$> [1..length args] in ([PatCon con $ mkPatVar "" <$> as], case args of [] -> A (V "++") (E $ StrCon con) _ -> case con of ':':_ -> A (A (V "showParen") $ V "True") $ foldr1 (\f g -> A (A (V ".") f) g) [ A (A (V "showsPrec") prec0) (V "1") , A (V "++") (E $ StrCon $ ' ':con++" ") , A (A (V "showsPrec") prec0) (V "2") ] _ -> A (A (V "showParen") $ A (A (V "<=") prec0) $ V "prec") $ A (A (V ".") $ A (V "++") (E $ StrCon con)) $ foldr (\f g -> A (A (V ".") f) g) (L "x" $ V "x") $ map (\a -> A (A (V ".") (A (V ":") (E $ ChrCon ' '))) $ A (A (V "showsPrec") prec0) (V a)) as ) mkPreds classId = Pred classId . TV <$> typeVars t mkPatVar pre s = PatVar (pre ++ s) Nothing eqCase (Constr con args) = let as = show <$> [1..length args] in ([PatCon con $ mkPatVar "l" <$> as], Pa [ ([PatCon con $ mkPatVar "r" <$> as], foldr (\x y -> (A (A (V "&&") x) y)) (V "True") $ map (\n -> A (A (V "==") (V $ "l" ++ n)) (V $ "r" ++ n)) as) , ([PatVar "_" Nothing], V "False")]) emptyTycl = Tycl [] [] addClass classId v (sigs, defs) (Neat tycl fs typed dcs ffis ffes ims) = let vars = take (size sigs) $ show <$> [0..] selectors = zipWith (\var (s, t) -> (s, (Qual [Pred classId v] t, L "@" $ A (V "@") $ foldr L (V var) vars))) vars $ toAscList sigs defaults = map (\(s, t) -> if member s sigs then ("{default}" ++ s, t) else error $ "bad default method: " ++ s) $ toAscList defs Tycl ms is = maybe emptyTycl id $ mlookup classId tycl tycl' = insert classId (Tycl (keys sigs) is) tycl in if null ms then Neat tycl' (defaults ++ fs) (selectors ++ typed) dcs ffis ffes ims else error $ "duplicate class: " ++ classId addInstance classId ps ty ds (Neat tycl fs typed dcs ffis ffes ims) = let Tycl ms is = maybe emptyTycl id $ mlookup classId tycl tycl' = insert classId (Tycl ms $ Instance ty name ps (fromList ds):is) tycl name = '{':classId ++ (' ':shows ty "}") in Neat tycl' fs typed dcs ffis ffes ims addFFI foreignname ourname t (Neat tycl fs typed dcs ffis ffes ims) = let fn = A (E $ Basic "F") $ E $ Const $ length ffis in Neat tycl fs ((ourname, (Qual [] t, mkFFIHelper 0 t fn)) : typed) dcs ((foreignname, t):ffis) ffes ims addDefs ds (Neat tycl fs typed dcs ffis ffes ims) = Neat tycl (ds ++ fs) typed dcs ffis ffes ims addImport im (Neat tycl fs typed dcs ffis exs ims) = Neat tycl fs typed dcs ffis exs (im:ims) addExport e f (Neat tycl fs typed dcs ffis ffes ims) = Neat tycl fs typed dcs ffis ((e, f):ffes) ims parseErrorRule = Parser \pasta -> case indents pasta of m:ms | m /= 0 -> Right ('}', modIndents tail pasta) _ -> badpos pasta "missing }" res w@(h:_) = reservedSeq *> pure w <|> bad ("want \"" ++ w ++ "\"") where reservedSeq = if elem w ["let", "where", "do", "of"] then curlyCheck $ lexeme $ charSeq w *> notFollowedBy nameTailChar else lexeme $ charSeq w *> notFollowedBy (if isSmall h then nameTailChar else sat isSymbol) paren = between lParen rParen braceSep f = between lBrace (rBrace <|> parseErrorRule) $ foldr ($) [] <$> sepBy ((:) <$> f <|> pure id) semicolon maybeFix s x = if elem s $ fvPro [] x then A (V "fix") (L s x) else x nonemptyTails [] = [] nonemptyTails xs@(x:xt) = xs : nonemptyTails xt joinIsFail t = A (L "join#" t) (V "fail#") addLets ls x = foldr triangle x components where vs = fst <$> ls ios = foldr (\(s, dsts) (ins, outs) -> (foldr (\dst -> insertWith union dst [s]) ins dsts, insertWith union s dsts outs)) (Tip, Tip) $ map (\(s, t) -> (s, intersect (fvPro [] t) vs)) ls components = scc (\k -> maybe [] id $ mlookup k $ fst ios) (\k -> maybe [] id $ mlookup k $ snd ios) vs triangle names expr = let tnames = nonemptyTails names appem vs = foldl1 A $ V <$> vs suball expr = foldl A (foldr L expr $ init names) $ appem <$> init tnames redef tns expr = foldr L (suball expr) tns in foldr (\(x:xt) t -> A (L x t) $ maybeFix x $ redef xt $ maybe undefined joinIsFail $ lookup x ls) (suball expr) tnames data Assoc = NAssoc | LAssoc | RAssoc instance Eq Assoc where NAssoc == NAssoc = True LAssoc == LAssoc = True RAssoc == RAssoc = True _ == _ = False precOf s precTab = maybe 9 fst $ mlookup s precTab assocOf s precTab = maybe LAssoc snd $ mlookup s precTab opFold precTab f x xs = case xs of [] -> pure x (op, y):xt -> case find (\(op', _) -> assocOf op precTab /= assocOf op' precTab) xt of Nothing -> case assocOf op precTab of NAssoc -> case xt of [] -> pure $ f op x y y:yt -> bad "NAssoc repeat" LAssoc -> pure $ foldl (\a (op, y) -> f op a y) x xs RAssoc -> pure $ foldr (\(op, y) b -> \e -> f op e (b y)) id xs $ x Just y -> bad "Assoc clash" qconop = conSym <|> res ":" <|> between backquote backquote conId qconsym = conSym <|> res ":" op = qconsym <|> varSym <|> between backquote backquote (conId <|> varId) con = conId <|> paren qconsym var = varId <|> paren varSym tycon = do s <- conId pure $ if s == "String" then TAp (TC "[]") (TC "Char") else TC s aType = lParen *> ( rParen *> pure (TC "()") <|> (foldr1 (TAp . TAp (TC ",")) <$> sepBy1 _type comma) <* rParen) <|> tycon <|> TV <$> varId <|> (lSquare *> (rSquare *> pure (TC "[]") <|> TAp (TC "[]") <$> (_type <* rSquare))) bType = foldl1 TAp <$> some aType _type = foldr1 arr <$> sepBy bType (res "->") fixityDecl w a = do res w n <- lexeme integer os <- sepBy op comma precs <- getPrecs putPrecs $ foldr (\o m -> insert o (n, a) m) precs os fixity = fixityDecl "infix" NAssoc <|> fixityDecl "infixl" LAssoc <|> fixityDecl "infixr" RAssoc cDecls = first fromList . second fromList . foldr ($) ([], []) <$> braceSep cDecl cDecl = first . (:) <$> genDecl <|> second . (++) <$> defSemi genDecl = (,) <$> var <*> (res "::" *> _type) classDecl = res "class" *> (addClass <$> conId <*> (TV <$> varId) <*> (res "where" *> cDecls)) simpleClass = Pred <$> conId <*> _type scontext = (:[]) <$> simpleClass <|> paren (sepBy simpleClass comma) instDecl = res "instance" *> ((\ps cl ty defs -> addInstance cl ps ty defs) <$> (scontext <* res "=>" <|> pure []) <*> conId <*> _type <*> (res "where" *> braceDef)) letin = addLets <$> between (res "let") (res "in") braceDef <*> expr ifthenelse = (\a b c -> A (A (A (V "if") a) b) c) <$> (res "if" *> expr) <*> (res "then" *> expr) <*> (res "else" *> expr) listify = foldr (\h t -> A (A (V ":") h) t) (V "[]") alts = joinIsFail . Pa <$> braceSep ((\x y -> ([x], y)) <$> pat <*> guards "->") cas = flip A <$> between (res "case") (res "of") expr <*> alts lamCase = curlyCheck (res "case") *> alts lam = res "\\" *> (lamCase <|> liftA2 onePat (some apat) (res "->" *> expr)) flipPairize y x = A (A (V ",") x) y moreCommas = foldr1 (A . A (V ",")) <$> sepBy1 expr comma thenComma = comma *> ((flipPairize <$> moreCommas) <|> pure (A (V ","))) parenExpr = (&) <$> expr <*> (((\v a -> A (V v) a) <$> op) <|> thenComma <|> pure id) rightSect = ((\v a -> L "@" $ A (A (V v) $ V "@") a) <$> (op <|> (:"") <$> comma)) <*> expr section = lParen *> (parenExpr <* rParen <|> rightSect <* rParen <|> rParen *> pure (V "()")) maybePureUnit = maybe (V "pure" `A` V "()") id stmt = (\p x -> Just . A (V ">>=" `A` x) . onePat [p] . maybePureUnit) <$> pat <*> (res "<-" *> expr) <|> (\x -> Just . maybe x (\y -> (V ">>=" `A` x) `A` (L "_" y))) <$> expr <|> (\ds -> Just . addLets ds . maybePureUnit) <$> (res "let" *> braceDef) doblock = res "do" *> (maybePureUnit . foldr ($) Nothing <$> braceSep stmt) compQual = (\p xs e -> A (A (V "concatMap") $ onePat [p] e) xs) <$> pat <*> (res "<-" *> expr) <|> (\b e -> A (A (A (V "if") b) e) $ V "[]") <$> expr <|> addLets <$> (res "let" *> braceDef) sqExpr = between lSquare rSquare $ ((&) <$> expr <*> ( res ".." *> ( (\hi lo -> (A (A (V "enumFromTo") lo) hi)) <$> expr <|> pure (A (V "enumFrom")) ) <|> res "|" *> ((. A (V "pure")) . foldr (.) id <$> sepBy1 compQual comma) <|> (\t h -> listify (h:t)) <$> many (comma *> expr) ) ) <|> pure (V "[]") fbind = A <$> (V <$> var) <*> (res "=" *> expr) fBinds v = (do fbs <- between lBrace rBrace $ sepBy1 fbind comma pure $ A (E $ Basic "{=") $ foldr A (E $ Basic "=}") $ v:fbs ) <|> pure v atom = ifthenelse <|> doblock <|> letin <|> sqExpr <|> section <|> cas <|> lam <|> (paren comma *> pure (V ",")) <|> V <$> (con <|> var) <|> literal >>= fBinds aexp = foldl1 A <$> some atom withPrec precTab n p = p >>= \s -> if n == precOf s precTab then pure s else Parser $ const $ Left "" exprP n = if n <= 9 then getPrecs >>= \precTab -> exprP (succ n) >>= \a -> many ((,) <$> withPrec precTab n op <*> exprP (succ n)) >>= \as -> opFold precTab (\op x y -> A (A (V op) x) y) a as else aexp expr = exprP 0 gcon = conId <|> paren (qconsym <|> (:"") <$> comma) <|> (lSquare *> rSquare *> pure "[]") apat = PatVar <$> var <*> (res "@" *> (Just <$> apat) <|> pure Nothing) <|> flip PatVar Nothing <$> (res "_" *> pure "_") <|> flip PatCon [] <$> gcon <|> PatLit <$> literal <|> foldr (\h t -> PatCon ":" [h, t]) (PatCon "[]" []) <$> between lSquare rSquare (sepBy pat comma) <|> paren (foldr1 pairPat <$> sepBy1 pat comma <|> pure (PatCon "()" [])) where pairPat x y = PatCon "," [x, y] binPat f x y = PatCon f [x, y] patP n = if n <= 9 then getPrecs >>= \precTab -> patP (succ n) >>= \a -> many ((,) <$> withPrec precTab n qconop <*> patP (succ n)) >>= \as -> opFold precTab binPat a as else PatCon <$> gcon <*> many apat <|> apat pat = patP 0 maybeWhere p = (&) <$> p <*> (res "where" *> (addLets <$> braceDef) <|> pure id) guards s = maybeWhere $ res s *> expr <|> foldr ($) (V "join#") <$> some ((\x y -> case x of V "True" -> \_ -> y _ -> A (A (A (V "if") x) y) ) <$> (res "|" *> expr) <*> (res s *> expr)) onePat vs x = joinIsFail $ Pa [(vs, x)] defOnePat vs x = Pa [(vs, x)] opDef x f y rhs = [(f, defOnePat [x, y] rhs)] leftyPat p expr = case pvars of [] -> [] (h:t) -> let gen = '@':h in (gen, expr):map (\v -> (v, A (Pa [([p], V v)]) $ V gen)) pvars where pvars = filter (/= "_") $ patVars p def = liftA2 (\l r -> [(l, r)]) var (liftA2 defOnePat (many apat) $ guards "=") <|> (pat >>= \x -> opDef x <$> varSym <*> pat <*> guards "=" <|> leftyPat x <$> guards "=") coalesce = \case [] -> [] h@(s, x):t -> case t of [] -> [h] (s', x'):t' -> let f (Pa vsts) (Pa vsts') = Pa $ vsts ++ vsts' f _ _ = error "bad multidef" in if s == s' then coalesce $ (s, f x x'):t' else h:coalesce t defSemi = coalesce . concat <$> sepBy1 def (some semicolon) braceDef = concat <$> braceSep defSemi simpleType c vs = foldl TAp (TC c) (map TV vs) conop = conSym <|> between backquote backquote conId fieldDecl = (\vs t -> map (, t) vs) <$> sepBy1 var comma <*> (res "::" *> _type) constr = (\x c y -> Constr c [("", x), ("", y)]) <$> aType <*> conop <*> aType <|> Constr <$> conId <*> ( concat <$> between lBrace rBrace (fieldDecl `sepBy` comma) <|> map ("",) <$> many aType) dclass = conId _deriving = (res "deriving" *> ((:[]) <$> dclass <|> paren (dclass `sepBy` comma))) <|> pure [] adt = addAdt <$> between (res "data") (res "=") (simpleType <$> conId <*> many varId) <*> sepBy constr (res "|") <*> _deriving impDecl = addImport <$> (res "import" *> conId) topdecls = braceSep $ adt <|> classDecl <|> instDecl <|> res "foreign" *> ( res "import" *> var *> (addFFI <$> lexeme tokStr <*> var <*> (res "::" *> _type)) <|> res "export" *> var *> (addExport <$> lexeme tokStr <*> var) ) <|> addDefs <$> defSemi <|> fixity *> pure id <|> impDecl haskell = between lexemePrelude eof $ some $ (,) <$> (res "module" *> conId <* res "where" <|> pure "Main") <*> topdecls parseProgram s = fst <$> parse haskell s
-- Record fields. module Typer where import Base import Map import Ast import Parser import Unify app01 s x y = maybe (A (L s x) y) snd $ go x where go expr = case expr of E _ -> Just (False, expr) V v -> Just $ if s == v then (True, y) else (False, expr) A l r -> do (a, l') <- go l (b, r') <- go r if a && b then Nothing else pure (a || b, A l' r') L v t -> if v == s then Just (False, expr) else second (L v) <$> go t optiApp t = case t of A x y -> let x' = optiApp x y' = optiApp y in case x' of L s v -> app01 s v y' _ -> A x' y' L s x -> L s (optiApp x) _ -> t -- Pattern compiler. findCon dcs s = foldr (<|>) Nothing $ mlookup s <$> dcs rewritePats dcs = \case [] -> pure $ V "join#" vsxs@((as0, _):_) -> case as0 of [] -> pure $ foldr1 (A . L "join#") $ snd <$> vsxs _ -> do let k = length as0 n <- get put $ n + k let vs@(vh:vt) = take k $ (`shows` "#") <$> [n..] cs <- flip mapM vsxs \(a:at, x) -> (a,) <$> foldM (\b (p, v) -> rewriteCase dcs v Tip [(p, b)]) x (zip at vt) flip (foldr L) vs <$> rewriteCase dcs vh Tip cs patEq lit b x y = A (L "join#" $ A (A (A (V "if") (A (A (V "==") lit) b)) x) $ V "join#") y rewriteCase dcs caseVar tab = \case [] -> flush $ V "join#" ((v, x):rest) -> go v x rest where rec = rewriteCase dcs caseVar go v x rest = case v of PatLit lit -> patEq lit (V caseVar) x <$> rec Tip rest >>= flush PatVar s m -> let x' = beta s (V caseVar) x in case m of Nothing -> A (L "join#" x') <$> rec Tip rest >>= flush Just v' -> go v' x' rest PatCon con args -> rec (insertWith (flip (.)) con ((args, x):) tab) rest flush onFail = case toAscList tab of [] -> pure onFail -- TODO: Check rest of `tab` lies in cs. (firstC, _):_ -> do let cs = maybe undefined id $ findCon dcs firstC jumpTable <- mapM (\(Constr s ts) -> case mlookup s tab of Nothing -> pure $ foldr L (V "join#") $ const "_" <$> ts Just f -> rewritePats dcs $ f [] ) cs pure $ A (L "join#" $ foldl A (A (V $ specialCase cs) $ V caseVar) jumpTable) onFail findField dcs f = case [(con, fields) | tab <- dcs, (_, cons) <- toAscList tab, Constr con fields <- cons, (f', _) <- fields, f == f'] of [] -> error $ "no such field: " ++ f h:_ -> h resolveFieldBinds dcs t = go t where go t = case t of E _ -> t V _ -> t A (E (Basic "{=")) (A rawExpr fbsAst) -> let expr = go rawExpr fromAst t = case t of A (A (V f) body) rest -> (f, go body):fromAst rest E (Basic "=}") -> [] fbs@((firstField, _):_) = fromAst fbsAst (con, fields) = findField dcs firstField cs = maybe undefined id $ findCon dcs con newValue = foldl A (V con) [maybe (V $ "[old]"++f) id $ lookup f fbs | (f, _) <- fields] initValue = foldl A expr [maybe (V "undefined") id $ lookup f fbs | (f, _) <- fields] updater = foldr L newValue $ ("[old]"++) . fst <$> fields inj x = map (\(Constr con' _) -> if con' == con then x else V "undefined") cs allPresent = all (`elem` (fst <$> fields)) $ fst <$> fbs isCon = case expr of V (h:_) -> 'A' <= h && h <= 'Z' _ -> False in if allPresent then if isCon then initValue else foldl A (A (V $ specialCase cs) expr) $ inj updater else error "bad fields in update" A x y -> A (go x) (go y) L s x -> L s $ go x secondM f (a, b) = (a,) <$> f b patternCompile dcs t = optiApp $ resolveFieldBinds dcs $ evalState (go t) 0 where go t = case t of E _ -> pure t V _ -> pure t A x y -> liftA2 A (go x) (go y) L s x -> L s <$> go x Pa vsxs -> mapM (secondM go) vsxs >>= rewritePats dcs -- Type inference. instantiate' t n tab = case t of TC s -> ((t, n), tab) TV s -> case lookup s tab of Nothing -> let va = TV $ show n in ((va, n + 1), (s, va):tab) Just v -> ((v, n), tab) TAp x y -> let ((t1, n1), tab1) = instantiate' x n tab ((t2, n2), tab2) = instantiate' y n1 tab1 in ((TAp t1 t2, n2), tab2) instantiatePred (Pred s t) ((out, n), tab) = first (first ((:out) . Pred s)) (instantiate' t n tab) instantiate (Qual ps t) n = first (Qual ps1) $ fst $ instantiate' t n1 tab where ((ps1, n1), tab) = foldr instantiatePred (([], n), []) ps proofApply sub a = case a of Proof (Pred cl ty) -> Proof (Pred cl $ apply sub ty) A x y -> A (proofApply sub x) (proofApply sub y) L s t -> L s $ proofApply sub t _ -> a typeAstSub sub (t, a) = (apply sub t, proofApply sub a) infer typed loc ast csn@(cs, n) = case ast of E x -> Right $ case x of Const _ -> ((TC "Int", ast), csn) ChrCon _ -> ((TC "Char", ast), csn) StrCon _ -> ((TAp (TC "[]") (TC "Char"), ast), csn) Link im s q -> insta q V s -> maybe (Left $ "undefined: " ++ s) Right $ (\t -> ((t, ast), csn)) <$> lookup s loc <|> insta . fst <$> mlookup s typed A x y -> infer typed loc x (cs, n + 1) >>= \((tx, ax), csn1) -> infer typed loc y csn1 >>= \((ty, ay), (cs2, n2)) -> unify tx (arr ty va) cs2 >>= \cs -> Right ((va, A ax ay), (cs, n2)) L s x -> first (\(t, a) -> (arr va t, L s a)) <$> infer typed ((s, va):loc) x (cs, n + 1) where va = TV $ show n insta ty = ((ty1, foldl A ast (map Proof preds)), (cs, n1)) where (Qual preds ty1, n1) = instantiate ty n findInstance tycl qn@(q, n) p@(Pred cl ty) insts = case insts of [] -> let v = '*':show n in Right (((p, v):q, n + 1), V v) (modName, Instance h name ps _):rest -> case match h ty of Nothing -> findInstance tycl qn p rest Just subs -> foldM (\(qn1, t) (Pred cl1 ty1) -> second (A t) <$> findProof tycl (Pred cl1 $ apply subs ty1) qn1) (qn, if modName == "" then V name else E $ Link modName name undefined) ps findProof tycl pred@(Pred classId t) psn@(ps, n) = case lookup pred ps of Nothing -> findInstance tycl psn pred $ tycl classId Just s -> Right (psn, V s) prove tycl psn a = case a of Proof pred -> findProof tycl pred psn A x y -> prove tycl psn x >>= \(psn1, x1) -> second (A x1) <$> prove tycl psn1 y L s t -> second (L s) <$> prove tycl psn t _ -> Right (psn, a) data Dep a = Dep ([String] -> Either String ([String], a)) instance Functor Dep where fmap f = \(Dep mf) -> Dep \g -> do (g', x) <- mf g pure (g', f x) instance Applicative Dep where pure x = Dep \g -> Right (g, x) (Dep mf) <*> (Dep mx) = Dep \g -> do (g', f) <- mf g (g'', x) <- mx g' pure (g'', f x) addDep s = Dep \deps -> Right (if s `elem` deps then deps else s : deps, ()) badDep s = Dep $ const $ Left s runDep (Dep f) = f [] astLink typed locals imps mods ast = runDep $ go [] ast where go bound ast = case ast of V s | elem s bound -> pure ast | member s locals -> case findImportSym imps mods s of [] -> (if member s typed then pure () else addDep s) *> pure ast _ -> badDep $ "ambiguous: " ++ s | True -> case findImportSym imps mods s of [] -> badDep $ "missing: " ++ s [(im, t)] -> pure $ E $ Link im s t _ -> badDep $ "ambiguous: " ++ s A x y -> A <$> go bound x <*> go bound y L s t -> L s <$> go (s:bound) t _ -> pure ast forFree cond f bound t = case t of E _ -> t V s -> if (not $ s `elem` bound) && cond s then f t else t A x y -> A (rec bound x) (rec bound y) L s t' -> L s $ rec (s:bound) t' where rec = forFree cond f inferno tycl typed defmap syms = let loc = zip syms $ TV . (' ':) <$> syms principal (acc, (subs, n)) s = do expr <- maybe (Left $ "missing: " ++ s) Right (mlookup s defmap) ((t, a), (ms, n1)) <- infer typed loc expr (subs, n) cs <- unify (TV (' ':s)) t ms Right ((s, (t, a)):acc, (cs, n1)) gatherPreds (acc, psn) (s, (t, a)) = do (psn, a) <- prove tycl psn a pure ((s, (t, a)):acc, psn) in do (stas, (soln, _)) <- foldM principal ([], (Tip, 0)) syms stas <- pure $ second (typeAstSub soln) <$> stas (stas, (ps, _)) <- foldM gatherPreds ([], ([], 0)) $ second (typeAstSub soln) <$> stas let preds = fst <$> ps dicts = snd <$> ps applyDicts (s, (t, a)) = (s, (Qual preds t, foldr L (forFree (`elem` syms) (\t -> foldl A t $ V <$> dicts) [] a) dicts)) pure $ map applyDicts stas findImportSym imps mods s = concat [maybe [] (\(t, _) -> [(im, t)]) $ mlookup s qas | im <- imps, let qas = fst $ mods ! im] inferDefs tycl defs typed = do let insertUnique m (s, (_, t)) = case mlookup s m of Nothing -> case mlookup s typed of Nothing -> Right $ insert s t m _ -> Left $ "reserved: " ++ s _ -> Left $ "duplicate: " ++ s addEdges (sym, (deps, _)) (ins, outs) = (foldr (\dep es -> insertWith union dep [sym] es) ins deps, insertWith union sym deps outs) graph = foldr addEdges (Tip, Tip) defs defmap <- foldM insertUnique Tip defs let ins k = maybe [] id $ mlookup k $ fst graph outs k = maybe [] id $ mlookup k $ snd graph inferComponent typed syms = foldr (uncurry insert) typed <$> inferno tycl typed defmap syms foldM inferComponent typed $ scc ins outs $ keys defmap dictVars ps n = (zip ps $ map (('*':) . show) [n..], n + length ps) inferTypeclasses tycl typeOfMethod typed dcs linker ienv = foldM perClass typed $ toAscList ienv where perClass typed (classId, Tycl sigs insts) = foldM perInstance typed insts where perInstance typed (Instance ty name ps idefs) = do let dvs = map snd $ fst $ dictVars ps 0 perMethod s = do let rawExpr = maybe (V $ "{default}" ++ s) id $ mlookup s idefs expr <- snd <$> linker (patternCompile dcs rawExpr) (ta, (sub, n)) <- either (Left . (name++) . (" "++) . (s++) . (": "++)) Right $ infer typed [] expr (Tip, 0) let (tx, ax) = typeAstSub sub ta -- e.g. qc = Eq a => a -> a -> Bool -- We instantiate: Eq a1 => a1 -> a1 -> Bool. qc = typeOfMethod s (Qual [Pred _ headT] tc, n1) = instantiate qc n -- Mix the predicates `ps` with the type of `headT`, applying a -- substitution such as (a1, [a]) so the variable names match. -- e.g. Eq a => [a] -> [a] -> Bool Just subc = match headT ty (Qual ps2 t2, n2) = instantiate (Qual ps $ apply subc tc) n1 case match tx t2 of Nothing -> Left "class/instance type conflict" Just subx -> do ((ps3, _), tr) <- prove tycl (dictVars ps2 0) (proofApply subx ax) if length ps2 /= length ps3 then Left $ ("want context: "++) . (foldr (.) id $ shows . fst <$> ps3) $ name else pure tr ms <- mapM perMethod sigs pure $ insert name (Qual [] $ TC "DICTIONARY", flip (foldr L) dvs $ L "@" $ foldl A (V "@") ms) typed primAdts = [ (TC "()", [Constr "()" []]) , (TC "Bool", [Constr "True" [], Constr "False" []]) , (TAp (TC "[]") (TV "a"), [Constr "[]" [], Constr ":" $ map ("",) [TV "a", TAp (TC "[]") (TV "a")]]) , (TAp (TAp (TC ",") (TV "a")) (TV "b"), [Constr "," $ map ("",) [TV "a", TV "b"]]) ] prims = let ro = E . Basic dyad s = TC s `arr` (TC s `arr` TC s) bin s = A (ro "Q") (ro s) in map (second (first $ Qual [])) $ [ ("intEq", (arr (TC "Int") (arr (TC "Int") (TC "Bool")), bin "EQ")) , ("intLE", (arr (TC "Int") (arr (TC "Int") (TC "Bool")), bin "LE")) , ("charEq", (arr (TC "Char") (arr (TC "Char") (TC "Bool")), bin "EQ")) , ("charLE", (arr (TC "Char") (arr (TC "Char") (TC "Bool")), bin "LE")) , ("fix", (arr (arr (TV "a") (TV "a")) (TV "a"), ro "Y")) , ("if", (arr (TC "Bool") $ arr (TV "a") $ arr (TV "a") (TV "a"), ro "I")) , ("chr", (arr (TC "Int") (TC "Char"), ro "I")) , ("ord", (arr (TC "Char") (TC "Int"), ro "I")) , ("ioBind", (arr (TAp (TC "IO") (TV "a")) (arr (arr (TV "a") (TAp (TC "IO") (TV "b"))) (TAp (TC "IO") (TV "b"))), ro "C")) , ("ioPure", (arr (TV "a") (TAp (TC "IO") (TV "a")), ro "V")) , ("primitiveError", (arr (TAp (TC "[]") (TC "Char")) (TV "a"), ro "ERR")) , ("newIORef", (arr (TV "a") (TAp (TC "IO") (TAp (TC "IORef") (TV "a"))), ro "NEWREF")) , ("readIORef", (arr (TAp (TC "IORef") (TV "a")) (TAp (TC "IO") (TV "a")), A (ro "T") (ro "READREF"))) , ("writeIORef", (arr (TAp (TC "IORef") (TV "a")) (arr (TV "a") (TAp (TC "IO") (TC "()"))), A (A (ro "R") (ro "WRITEREF")) (ro "B"))) , ("exitSuccess", (TAp (TC "IO") (TV "a"), ro "END")) , ("unsafePerformIO", (arr (TAp (TC "IO") (TV "a")) (TV "a"), A (A (ro "C") (A (ro "T") (ro "END"))) (ro "K"))) , ("join#", (TV "a", A (V "unsafePerformIO") (V "exitSuccess"))) , ("fail#", (TV "a", A (V "unsafePerformIO") (V "exitSuccess"))) ] ++ map (\(s, v) -> (s, (dyad "Int", bin v))) [ ("intAdd", "ADD") , ("intSub", "SUB") , ("intMul", "MUL") , ("intDiv", "DIV") , ("intMod", "MOD") , ("intQuot", "DIV") , ("intRem", "MOD") ] neatImportPrim = Neat Tip [] [] Tip [] [] ["#"] tabulateModules mods = foldM ins Tip mods where go = foldr ($) neatImportPrim ins tab (k, prog) = case mlookup k tab of Nothing -> Right $ insert k (go prog) tab Just _ -> Left $ "duplicate module: " ++ k inferModule tab acc name = case mlookup name acc of Nothing -> do let Neat rawIenv defs typedList adtTab ffis ffes imps = tab ! name typed = fromList typedList fillSigs (cl, Tycl sigs is) = (cl,) $ case sigs of [] -> Tycl (findSigs cl) is _ -> Tycl sigs is findSigs cl = maybe (error $ "no sigs: " ++ cl) id $ find (not . null) [maybe [] (\(Tycl sigs _) -> sigs) $ mlookup cl $ typeclasses (tab ! im) | im <- imps] ienv = fromList $ fillSigs <$> toAscList rawIenv locals = fromList $ map (, ()) $ (fst <$> typedList) ++ (fst <$> defs) insts im (Tycl _ is) = (im,) <$> is classes im = if im == "" then ienv else typeclasses $ tab ! im tycl classId = concat [maybe [] (insts im) $ mlookup classId $ classes im | im <- "":imps] dcs = adtTab : map (dataCons . (tab !)) imps typeOfMethod s = maybe undefined id $ foldr (<|>) (fst <$> mlookup s typed) [fmap fst $ lookup s $ typedAsts $ tab ! im | im <- imps] acc' <- foldM (inferModule tab) acc imps let linker = astLink typed locals imps acc' depdefs <- mapM (\(s, t) -> (s,) <$> linker (patternCompile dcs t)) defs typed <- inferDefs tycl depdefs typed typed <- inferTypeclasses tycl typeOfMethod typed dcs linker ienv Right $ insert name (typed, (ffis, ffes)) acc' Just _ -> Right acc untangle s = do tab <- insert "#" neatPrim <$> (parseProgram s >>= tabulateModules) foldM (inferModule tab) Tip $ keys tab neatPrim = foldr (\(a, b) -> addAdt a b []) (Neat Tip [] prims Tip [] [] []) primAdts
We replace a list with a Map
for faster type unification.
-- Replace list with map. module Unify where import Base import Map import Ast apply sub t = case t of TC v -> t TV v -> maybe t id $ mlookup v sub TAp a b -> TAp (apply sub a) (apply sub b) (@@) s1 s2 = foldr (\(k, v) m -> insert k v m) (apply s1 <$> s2) $ toAscList s1 occurs s t = case t of TC v -> False TV v -> s == v TAp a b -> occurs s a || occurs s b varBind s t = case t of TC v -> Right $ singleton s t TV v -> Right $ if v == s then Tip else singleton s t TAp a b -> if occurs s t then Left "occurs check" else Right $ singleton s t ufail t u = Left $ ("unify fail: "++) . shows t . (" vs "++) . shows u $ "" mgu t u = case t of TC a -> case u of TC b -> if a == b then Right Tip else ufail t u TV b -> varBind b t TAp a b -> ufail t u TV a -> varBind a u TAp a b -> case u of TC b -> ufail t u TV b -> varBind b t TAp c d -> mgu a c >>= unify b d unify a b s = (@@ s) <$> mgu (apply s a) (apply s b) merge s1 s2 = foldM go s2 $ toAscList s1 where go subs (v, t) = case mlookup v s2 of Nothing -> Just $ insert v t subs Just _ | apply s1 (TV v) == apply s2 (TV v) -> Just subs | True -> Nothing match h t = case h of TC a -> case t of TC b | a == b -> Just Tip _ -> Nothing TV a -> Just $ singleton a t TAp a b -> case t of TAp c d -> case match a c of Nothing -> Nothing Just ac -> case match b d of Nothing -> Nothing Just bd -> merge ac bd _ -> Nothing
Party2
Recall we require a fixity declaration to precede the use of its corresponding operator, which forces us to concatenate module sources in a particular order. We remove this wart by adding a new phase. Once done, not only may we paste together modules in any order, but we may also declare fixities anywhere within a module.
During parsing, operators have the same precedence. When a chain of two or more
appear in a row, we abuse the syntax tree to store them in a right-associative
list, for example: [1 + 2, * 3, - 4, + 5]
.
For patterns, we use the list field of a PatCon
value; a made-up data
constructor "{+"
indicates the beginning of such a list. Expressions are
clumsier; we bookend chains with L "("
and V ")"
, and fashion a list out of
A
and V
nodes.
Later, once all fixity declarations are known, we traverse the syntax tree, and
we re-associate each specially marked infix chain. The algorithm starts with
the first binary infix expression, that is, two operands and one operator such
as 1 + 2
. For each operator and operand we add on the right, we walk down the
right spine of the current syntax tree until we reach a node of higher
precedence; leaf nodes are considered to have maximum precedence. Then we
insert the operator and operand at this point. We also check for illegal infix
operator conflicts.
The code is messy due to a couple of wrinkles. Firstly, we have two distinct ad hoc representations of lists for holding infix chains. Secondly, we temporarily store the AST being reshaped in one-off tree structures.
However, we’re still cheating: we maintain one giant fixity declaration table for all operators across all modules, which relies on operators being distinct. Also, we only allow top-level fixity declarations. We could add support for scoped fixity declarations with yet more ad hoc encodings that we later use to create scoped fixity lookup tables that override the global ones.
We fix the problem with foreign imports across multiple modules. In the lone-module days, we numbered the imports as we parsed the source. Now, the numbering must be consistent across all modules.
In the spirit of incremental compilation, we replace the number of an import with its name in the syntax tree, which we map to a number during our code generation that corresponds to linking.
We reuse the Link
data constructor for this. The special {foreign}
module
indicates the function name is foreign.
We also check for name conflicts among foreign imports and exports.
We take advantage of our new ability to derive Eq
and Show
instances,
and also name the fields of the Neat
data type.
The Haskell report describes layout parsing as a separate phase, and we had followed this closely in our first implementation of the feature.
It turns out to be less code to inline the various bits and pieces of the layout logic in the parser. We do wind up with a touch more complexity, as comments and whitespace must be parsed in a different code path, but it’s manageable.
For parsing one of let where do of
, and also \case
since we act as if the
LambdaCase
is enabled, the curlyCheck
function temporarily disables the
angle-bracket indentation rule by placing an indent value of 0 at the head of
the indents
list.
-- Use `deriving`. -- Change `isEOF` and `getChar` to behave more like Haskell's. module Base where infixr 9 . infixl 7 * , `div` , `mod` infixl 6 + , - infixr 5 ++ infixl 4 <*> , <$> , <* , *> infix 4 == , /= , <= , < , >= , > infixl 3 && , <|> infixl 2 || infixl 1 >> , >>= infixr 1 =<< infixr 0 $ class Functor f where fmap :: (a -> b) -> f a -> f b class Applicative f where pure :: a -> f a (<*>) :: f (a -> b) -> f a -> f b class Monad m where return :: a -> m a (>>=) :: m a -> (a -> m b) -> m b (<$>) = fmap liftA2 f x y = f <$> x <*> y (>>) f g = f >>= \_ -> g (=<<) = flip (>>=) class Eq a where (==) :: a -> a -> Bool instance Eq () where () == () = True instance Eq Bool where True == True = True False == False = True _ == _ = False instance (Eq a, Eq b) => Eq (a, b) where (a1, b1) == (a2, b2) = a1 == a2 && b1 == b2 instance Eq a => Eq [a] where xs == ys = case xs of [] -> case ys of [] -> True _ -> False x:xt -> case ys of [] -> False y:yt -> x == y && xt == yt instance Eq Int where (==) = intEq instance Eq Char where (==) = charEq ($) f x = f x id x = x const x y = x flip f x y = f y x (&) x f = f x class Ord a where (<=) :: a -> a -> Bool x <= y = case compare x y of LT -> True EQ -> True GT -> False compare :: a -> a -> Ordering compare x y = if x <= y then if y <= x then EQ else LT else GT instance Ord Int where (<=) = intLE instance Ord Char where (<=) = charLE data Ordering = LT | GT | EQ deriving (Eq, Show) instance Ord a => Ord [a] where xs <= ys = case xs of [] -> True x:xt -> case ys of [] -> False y:yt -> if x <= y then if y <= x then xt <= yt else True else False compare xs ys = case xs of [] -> case ys of [] -> EQ _ -> LT x:xt -> case ys of [] -> GT y:yt -> if x <= y then if y <= x then compare xt yt else LT else GT data Maybe a = Nothing | Just a deriving (Eq, Show) data Either a b = Left a | Right b deriving (Eq, Show) fst (x, y) = x snd (x, y) = y uncurry f (x, y) = f x y first f (x, y) = (f x, y) second f (x, y) = (x, f y) not a = if a then False else True x /= y = not $ x == y (.) f g x = f (g x) (||) f g = if f then True else g (&&) f g = if f then g else False take 0 xs = [] take _ [] = [] take n (h:t) = h : take (n - 1) t drop n xs | n <= 0 = xs drop _ [] = [] drop n (_:xs) = drop (n-1) xs splitAt n xs = (take n xs, drop n xs) maybe n j m = case m of Nothing -> n; Just x -> j x instance Functor Maybe where fmap f = maybe Nothing (Just . f) instance Applicative Maybe where pure = Just ; mf <*> mx = maybe Nothing (\f -> maybe Nothing (Just . f) mx) mf instance Monad Maybe where return = Just ; mf >>= mg = maybe Nothing mg mf instance Alternative Maybe where empty = Nothing ; x <|> y = maybe y Just x foldr c n = \case [] -> n; h:t -> c h $ foldr c n t length = foldr (\_ n -> n + 1) 0 mapM f = foldr (\a rest -> liftA2 (:) (f a) rest) (pure []) mapM_ f = foldr ((>>) . f) (pure ()) foldM f z0 xs = foldr (\x k z -> f z x >>= k) pure xs z0 when x y = if x then y else pure () unless x y = if x then pure () else y error = primitiveError undefined = error "undefined" foldr1 c l@(h:t) = maybe undefined id $ foldr (\x m -> Just $ maybe x (c x) m) Nothing l foldl f a bs = foldr (\b g x -> g (f x b)) (\x -> x) bs a foldl1 f (h:t) = foldl f h t elem k xs = foldr (\x t -> x == k || t) False xs find f xs = foldr (\x t -> if f x then Just x else t) Nothing xs (++) = flip (foldr (:)) concat = foldr (++) [] map = flip (foldr . ((:) .)) [] head (h:_) = h tail (_:t) = t xs!!0 = head xs xs!!n = tail xs!!(n - 1) replicate 0 _ = [] replicate n x = x : replicate (n - 1) x null [] = True null _ = False reverse = foldl (flip (:)) [] dropWhile _ [] = [] dropWhile p xs@(x:xt) | p x = dropWhile p xt | True = xs span _ [] = ([], []) span p xs@(x:xt) | p x = first (x:) $ span p xt | True = ([],xs) break p = span (not . p) isSpace c = elem (ord c) [32, 9, 10, 11, 12, 13, 160] words s = case dropWhile isSpace s of "" -> [] s' -> w : words s'' where (w, s'') = break isSpace s' instance Functor [] where fmap = map instance Applicative [] where pure = (:[]); f <*> x = concatMap (<$> x) f instance Monad [] where return = (:[]); (>>=) = flip concatMap concatMap = (concat .) . map lookup s = foldr (\(k, v) t -> if s == k then Just v else t) Nothing filter f = foldr (\x xs -> if f x then x:xs else xs) [] union xs ys = foldr (\y acc -> (if elem y acc then id else (y:)) acc) xs ys intersect xs ys = filter (\x -> maybe False (\_ -> True) $ find (x ==) ys) xs last (x:xt) = go x xt where go x xt = case xt of [] -> x; y:yt -> go y yt init (x:xt) = case xt of [] -> []; _ -> x : init xt intercalate sep = \case [] -> []; x:xt -> x ++ concatMap (sep ++) xt intersperse sep = \case [] -> []; x:xt -> x : foldr ($) [] (((sep:) .) . (:) <$> xt) all f = and . map f any f = or . map f and = foldr (&&) True or = foldr (||) False zipWith f xs ys = case xs of [] -> []; x:xt -> case ys of [] -> []; y:yt -> f x y : zipWith f xt yt zip = zipWith (,) data State s a = State (s -> (a, s)) runState (State f) = f instance Functor (State s) where fmap f = \(State h) -> State (first f . h) instance Applicative (State s) where pure a = State (a,) (State f) <*> (State x) = State \s -> let (g, s') = f s in first g $ x s' instance Monad (State s) where return a = State (a,) (State h) >>= f = State $ uncurry (runState . f) . h evalState m s = fst $ runState m s get = State \s -> (s, s) put n = State \s -> ((), n) either l r e = case e of Left x -> l x; Right x -> r x instance Functor (Either a) where fmap f e = either Left (Right . f) e instance Applicative (Either a) where pure = Right ef <*> ex = case ef of Left s -> Left s Right f -> either Left (Right . f) ex instance Monad (Either a) where return = Right ex >>= f = either Left f ex class Alternative f where empty :: f a (<|>) :: f a -> f a -> f a asum = foldr (<|>) empty (*>) = liftA2 \x y -> y (<*) = liftA2 \x y -> x many p = liftA2 (:) p (many p) <|> pure [] some p = liftA2 (:) p (many p) sepBy1 p sep = liftA2 (:) p (many (sep *> p)) sepBy p sep = sepBy1 p sep <|> pure [] between x y p = x *> (p <* y) showParen b f = if b then ('(':) . f . (')':) else f iterate f x = x : iterate f (f x) takeWhile _ [] = [] takeWhile p xs@(x:xt) | p x = x : takeWhile p xt | True = [] class Enum a where succ :: a -> a pred :: a -> a toEnum :: Int -> a fromEnum :: a -> Int enumFrom :: a -> [a] enumFromTo :: a -> a -> [a] instance Enum Int where succ = (+1) pred = (+(0-1)) toEnum = id fromEnum = id enumFrom = iterate succ enumFromTo lo hi = takeWhile (<= hi) $ enumFrom lo instance Enum Char where succ = chr . (+1) . ord pred = chr . (+(0-1)) . ord toEnum = chr fromEnum = ord enumFrom = iterate succ enumFromTo lo hi = takeWhile (<= hi) $ enumFrom lo (+) = intAdd (-) = intSub (*) = intMul div = intDiv mod = intMod instance (Ord a, Ord b) => Ord (a, b) where (a1, b1) <= (a2, b2) = a1 <= a2 && (not (a2 <= a1) || b1 <= b2) a < b = a <= b && a /= b a > b = b <= a && a /= b (>=) = flip(<=) instance Applicative IO where pure = ioPure ; (<*>) f x = ioBind f \g -> ioBind x \y -> ioPure (g y) instance Monad IO where return = ioPure ; (>>=) = ioBind instance Functor IO where fmap f x = ioPure f <*> x class Show a where showsPrec :: Int -> a -> String -> String showsPrec _ x = (show x++) show :: a -> String show x = shows x "" showList :: [a] -> String -> String showList = showList__ shows shows = showsPrec 0 showList__ _ [] s = "[]" ++ s showList__ showx (x:xs) s = '[' : showx x (showl xs) where showl [] = ']' : s showl (y:ys) = ',' : showx y (showl ys) showInt__ n | 0 == n = id | True = showInt__ (n`div`10) . (chr (48+n`mod`10):) instance Show () where show () = "()" instance Show Bool where show True = "True" show False = "False" instance Show a => Show [a] where showsPrec _ = showList instance Show Int where showsPrec _ n | 0 == n = ('0':) | 1 <= n = showInt__ n | 2 * n == 0 = ("-2147483648"++) | True = ('-':) . showInt__ (0 - n) showLitChar__ '\n' = ("\\n"++) showLitChar__ '\\' = ("\\\\"++) showLitChar__ c = (c:) instance Show Char where showsPrec _ '\'' = ("'\\''"++) showsPrec _ c = ('\'':) . showLitChar__ c . ('\'':) showList s = ('"':) . foldr (.) id (map go s) . ('"':) where go '"' = ("\\\""++) go c = showLitChar__ c instance (Show a, Show b) => Show (a, b) where showsPrec _ (a, b) = showParen True $ shows a . (',':) . shows b
-- FFI across multiple modules. -- Rewrite with named fields, deriving. module Ast where import Base import Map data Type = TC String | TV String | TAp Type Type deriving Eq arr a b = TAp (TAp (TC "->") a) b data Extra = Basic String | Const Int | ChrCon Char | StrCon String | Link String String Qual data Pat = PatLit Ast | PatVar String (Maybe Pat) | PatCon String [Pat] data Ast = E Extra | V String | A Ast Ast | L String Ast | Pa [([Pat], Ast)] | Proof Pred data Constr = Constr String [(String, Type)] data Pred = Pred String Type deriving Eq data Qual = Qual [Pred] Type instance Show Type where showsPrec _ = \case TC s -> (s++) TV s -> (s++) TAp (TAp (TC "->") a) b -> showParen True $ shows a . (" -> "++) . shows b TAp a b -> showParen True $ shows a . (' ':) . shows b instance Show Pred where showsPrec _ (Pred s t) = (s++) . (' ':) . shows t . (" => "++) instance Show Qual where showsPrec _ (Qual ps t) = foldr (.) id (map shows ps) . shows t instance Show Extra where showsPrec _ = \case Basic s -> (s++) Const i -> shows i ChrCon c -> shows c StrCon s -> shows s Link im s _ -> (im++) . ('.':) . (s++) instance Show Pat where showsPrec _ = \case PatLit e -> shows e PatVar s mp -> (s++) . maybe id ((('@':) .) . shows) mp PatCon s ps -> (s++) . foldr (.) id (((' ':) .) . shows <$> ps) showVar s@(h:_) = showParen (elem h ":!#$%&*+./<=>?@\\^|-~") (s++) instance Show Ast where showsPrec prec = \case E e -> shows e V s -> showVar s A x y -> showParen (1 <= prec) $ shows x . (' ':) . showsPrec 1 y L s t -> showParen True $ ('\\':) . (s++) . (" -> "++) . shows t Pa vsts -> ('\\':) . showParen True (foldr (.) id $ intersperse (';':) $ map (\(vs, t) -> foldr (.) id (intersperse (' ':) $ map (showParen True . shows) vs) . (" -> "++) . shows t) vsts) Proof p -> ("{Proof "++) . shows p . ("}"++) data Instance = Instance -- Type, e.g. Int for Eq Int. Type -- Dictionary name, e.g. "{Eq Int}" String -- Context. [Pred] -- Method definitions (Map String Ast) data Assoc = NAssoc | LAssoc | RAssoc deriving Eq data Neat = Neat { typeclasses :: Map String [String] , instances :: Map String [Instance] , topDefs :: [(String, Ast)] -- | Typed ASTs, ready for compilation, including ADTs and methods, -- e.g. (==), (Eq a => a -> a -> Bool, select-==) , typedAsts :: [(String, (Qual, Ast))] , dataCons :: Map String [Constr] , ffiImports :: Map String Type , ffiExports :: Map String String , moduleImports :: [String] , opFixity :: Map String (Int, Assoc) } neatEmpty = Neat Tip Tip [] [] Tip Tip Tip [] Tip patVars = \case PatLit _ -> [] PatVar s m -> s : maybe [] patVars m PatCon _ args -> concat $ patVars <$> args fvPro bound expr = case expr of V s | not (elem s bound) -> [s] A x y -> fvPro bound x `union` fvPro bound y L s t -> fvPro (s:bound) t Pa vsts -> foldr union [] $ map (\(vs, t) -> fvPro (concatMap patVars vs ++ bound) t) vsts _ -> [] beta s a t = case t of E _ -> t V v -> if s == v then a else t A x y -> A (beta s a x) (beta s a y) L v u -> if s == v then t else L v $ beta s a u typeVars = \case TC _ -> [] TV v -> [v] TAp x y -> typeVars x `union` typeVars y depthFirstSearch = (foldl .) \relation st@(visited, sequence) vertex -> if vertex `elem` visited then st else second (vertex:) $ depthFirstSearch relation (vertex:visited, sequence) (relation vertex) spanningSearch = (foldl .) \relation st@(visited, setSequence) vertex -> if vertex `elem` visited then st else second ((:setSequence) . (vertex:)) $ depthFirstSearch relation (vertex:visited, []) (relation vertex) scc ins outs = spanning . depthFirst where depthFirst = snd . depthFirstSearch outs ([], []) spanning = snd . spanningSearch ins ([], [])
-- FFI across multiple modules. -- Rewrite with named fields. -- Fix fixities after parsing. module Parser where import Base import Ast import Map -- Parser. data ParserState = ParserState { readme :: [(Char, (Int, Int))] , landin :: String , indents :: [Int] } data Parser a = Parser { getParser :: ParserState -> Either String (a, ParserState) } instance Functor Parser where fmap f x = pure f <*> x instance Applicative Parser where pure x = Parser \inp -> Right (x, inp) (Parser f) <*> (Parser x) = Parser \inp -> do (fun, t) <- f inp (arg, u) <- x t pure (fun arg, u) instance Monad Parser where return = pure (Parser x) >>= f = Parser \inp -> do (a, t) <- x inp getParser (f a) t instance Alternative Parser where empty = bad "" x <|> y = Parser \inp -> either (const $ getParser y inp) Right $ getParser x inp notFollowedBy p = do saved <- Parser \pasta -> Right (pasta, pasta) ret <- p *> pure (bad "") <|> pure (pure ()) Parser \_ -> Right ((), saved) ret parse f str = getParser f $ ParserState (rowcol str (1, 1)) [] [] where rowcol s rc = case s of [] -> [] h:t -> (h, rc) : rowcol t (advanceRC (ord h) rc) advanceRC n (r, c) | n `elem` [10, 11, 12, 13] = (r + 1, 1) | n == 9 = (r, (c + 8)`mod`8) | True = (r, c + 1) indentOf pasta = case readme pasta of [] -> 1 (_, (_, c)):_ -> c ins c pasta = pasta { landin = c:landin pasta } angle n pasta = case indents pasta of m:ms | m == n -> ins ';' pasta | n + 1 <= m -> ins '}' $ angle n pasta { indents = ms } _ -> pasta curly n pasta = case indents pasta of m:ms | m + 1 <= n -> ins '{' pasta { indents = n:m:ms } [] | 1 <= n -> ins '{' pasta { indents = [n] } _ -> ins '{' . ins '}' $ angle n pasta sat f = Parser \pasta -> case landin pasta of c:t -> if f c then Right (c, pasta { landin = t }) else Left "unsat" [] -> case readme pasta of [] -> case indents pasta of [] -> Left "EOF" m:ms | m /= 0 && f '}' -> Right ('}', pasta { indents = ms }) _ -> Left "unsat" (h, _):t | f h -> let p' = pasta { readme = t } in case h of '}' -> case indents pasta of 0:ms -> Right (h, p' { indents = ms }) _ -> Left "unsat" '{' -> Right (h, p' { indents = 0:indents p' }) _ -> Right (h, p') _ -> Left "unsat" char c = sat (c ==) rawSat f = Parser \pasta -> case readme pasta of [] -> Left "EOF" (h, _):t -> if f h then Right (h, pasta { readme = t }) else Left "unsat" eof = Parser \pasta -> case pasta of ParserState [] [] _ -> Right ((), pasta) _ -> badpos pasta "want eof" comment = rawSat ('-' ==) *> some (rawSat ('-' ==)) *> (rawSat isNewline <|> rawSat (not . isSymbol) *> many (rawSat $ not . isNewline) *> rawSat isNewline) *> pure True spaces = isNewline <$> rawSat isSpace whitespace = do offside <- or <$> many (spaces <|> comment) Parser \pasta -> Right ((), if offside then angle (indentOf pasta) pasta else pasta) hexValue d | d <= '9' = ord d - ord '0' | d <= 'F' = 10 + ord d - ord 'A' | d <= 'f' = 10 + ord d - ord 'a' isNewline c = ord c `elem` [10, 11, 12, 13] isSymbol = (`elem` "!#$%&*+./<=>?@\\^|-~:") isSmall c = c <= 'z' && 'a' <= c || c == '_' small = sat isSmall large = sat \x -> (x <= 'Z') && ('A' <= x) hexit = sat \x -> (x <= '9') && ('0' <= x) || (x <= 'F') && ('A' <= x) || (x <= 'f') && ('a' <= x) digit = sat \x -> (x <= '9') && ('0' <= x) decimal = foldl (\n d -> 10*n + ord d - ord '0') 0 <$> some digit hexadecimal = foldl (\n d -> 16*n + hexValue d) 0 <$> some hexit nameTailChar = small <|> large <|> digit <|> char '\'' nameTailed p = liftA2 (:) p $ many nameTailChar escape = char '\\' *> (sat (`elem` "'\"\\") <|> char 'n' *> pure '\n' <|> char '0' *> pure (chr 0) <|> char 'x' *> (chr <$> hexadecimal)) tokOne delim = escape <|> rawSat (delim /=) charSeq = mapM char tokChar = between (char '\'') (char '\'') (tokOne '\'') quoteStr = between (char '"') (char '"') $ many $ many (charSeq "\\&") *> tokOne '"' quasiquoteStr = charSeq "[r|" *> quasiquoteBody quasiquoteBody = charSeq "|]" *> pure [] <|> (:) <$> rawSat (const True) <*> quasiquoteBody tokStr = quoteStr <|> quasiquoteStr integer = char '0' *> (char 'x' <|> char 'X') *> hexadecimal <|> decimal literal = lexeme . fmap E $ Const <$> integer <|> ChrCon <$> tokChar <|> StrCon <$> tokStr varish = lexeme $ nameTailed small bad s = Parser \pasta -> badpos pasta s badpos pasta s = Left $ loc $ ": " ++ s where loc = case readme pasta of [] -> ("EOF"++) (_, (r, c)):_ -> ("row "++) . shows r . (" col "++) . shows c varId = do s <- varish if elem s ["export", "case", "class", "data", "default", "deriving", "do", "else", "foreign", "if", "import", "in", "infix", "infixl", "infixr", "instance", "let", "module", "newtype", "of", "then", "type", "where", "_"] then bad $ "reserved: " ++ s else pure s varSymish = lexeme $ (:) <$> sat (\c -> isSymbol c && c /= ':') <*> many (sat isSymbol) varSym = lexeme $ do s <- varSymish if elem s ["..", "=", "\\", "|", "<-", "->", "@", "~", "=>"] then bad $ "reserved: " ++ s else pure s conId = lexeme $ nameTailed large conSymish = lexeme $ liftA2 (:) (char ':') $ many $ sat isSymbol conSym = do s <- conSymish if elem s [":", "::"] then bad $ "reserved: " ++ s else pure s special c = lexeme $ sat (c ==) comma = special ',' semicolon = special ';' lParen = special '(' rParen = special ')' lBrace = special '{' rBrace = special '}' lSquare = special '[' rSquare = special ']' backquote = special '`' lexeme f = f <* whitespace lexemePrelude = whitespace *> Parser \pasta -> case getParser (res "module" <|> (:[]) <$> char '{') pasta of Left _ -> Right ((), curly (indentOf pasta) pasta) Right _ -> Right ((), pasta) curlyCheck f = do Parser \pasta -> Right ((), pasta { indents = 0:indents pasta }) r <- f Parser \pasta -> let pasta' = pasta { indents = tail $ indents pasta } in case readme pasta of [] -> Right ((), curly 0 pasta') ('{', _):_ -> Right ((), pasta') (_, (_, col)):_ -> Right ((), curly col pasta') pure r conOf (Constr s _) = s specialCase (h:_) = '|':conOf h mkCase t cs = (specialCase cs, ( Qual [] $ arr t $ foldr arr (TV "case") $ map (\(Constr _ sts) -> foldr arr (TV "case") $ snd <$> sts) cs , E $ Basic "I")) mkStrs = snd . foldl (\(s, l) u -> ('@':s, s:l)) ("@", []) scottEncode _ ":" _ = E $ Basic "CONS" scottEncode vs s ts = foldr L (foldl (\a b -> A a (V b)) (V s) ts) (ts ++ vs) scottConstr t cs (Constr s sts) = (s, (Qual [] $ foldr arr t ts , scottEncode (map conOf cs) s $ mkStrs ts)) : [(field, (Qual [] $ t `arr` ft, L s $ foldl A (V s) $ inj $ proj field)) | (field, ft) <- sts, field /= ""] where ts = snd <$> sts proj fd = foldr L (V fd) $ fst <$> sts inj x = map (\(Constr s' _) -> if s' == s then x else V "undefined") cs mkAdtDefs t cs = mkCase t cs : concatMap (scottConstr t cs) cs mkFFIHelper n t acc = case t of TC s -> acc TAp (TC "IO") _ -> acc TAp (TAp (TC "->") x) y -> L (show n) $ mkFFIHelper (n + 1) y $ A (V $ show n) acc updateDcs cs dcs = foldr (\(Constr s _) m -> insert s cs m) dcs cs addAdt t cs ders neat = foldr derive neat' ders where neat' = neat { typedAsts = mkAdtDefs t cs ++ typedAsts neat , dataCons = updateDcs cs $ dataCons neat } derive "Eq" = addInstance "Eq" (mkPreds "Eq") t [("==", Pa $ map eqCase cs )] derive "Show" = addInstance "Show" (mkPreds "Show") t [("showsPrec", L "prec" $ Pa $ map showCase cs )] derive der = error $ "bad deriving: " ++ der prec0 = E $ Const 0 showCase (Constr con args) = let as = show <$> [1..length args] in ([PatCon con $ mkPatVar "" <$> as], case args of [] -> A (V "++") (E $ StrCon con) _ -> case con of ':':_ -> A (A (V "showParen") $ V "True") $ foldr1 (\f g -> A (A (V ".") f) g) [ A (A (V "showsPrec") prec0) (V "1") , A (V "++") (E $ StrCon $ ' ':con++" ") , A (A (V "showsPrec") prec0) (V "2") ] _ -> A (A (V "showParen") $ A (A (V "<=") prec0) $ V "prec") $ A (A (V ".") $ A (V "++") (E $ StrCon con)) $ foldr (\f g -> A (A (V ".") f) g) (L "x" $ V "x") $ map (\a -> A (A (V ".") (A (V ":") (E $ ChrCon ' '))) $ A (A (V "showsPrec") prec0) (V a)) as ) mkPreds classId = Pred classId . TV <$> typeVars t mkPatVar pre s = PatVar (pre ++ s) Nothing eqCase (Constr con args) = let as = show <$> [1..length args] in ([PatCon con $ mkPatVar "l" <$> as], Pa [ ([PatCon con $ mkPatVar "r" <$> as], foldr (\x y -> (A (A (V "&&") x) y)) (V "True") $ map (\n -> A (A (V "==") (V $ "l" ++ n)) (V $ "r" ++ n)) as) , ([PatVar "_" Nothing], V "False")]) addClass classId v (sigs, defs) neat = if not $ member classId $ typeclasses neat then neat { typeclasses = insert classId (keys sigs) $ typeclasses neat , typedAsts = selectors ++ typedAsts neat , topDefs = defaults ++ topDefs neat } else error $ "duplicate class: " ++ classId where vars = take (size sigs) $ show <$> [0..] selectors = zipWith (\var (s, t) -> (s, (Qual [Pred classId v] t, L "@" $ A (V "@") $ foldr L (V var) vars))) vars $ toAscList sigs defaults = map (\(s, t) -> if member s sigs then ("{default}" ++ s, t) else error $ "bad default method: " ++ s) $ toAscList defs addInstance classId ps ty ds neat = neat { instances = insertWith (++) classId [Instance ty name ps (fromList ds)] $ instances neat } where name = '{':classId ++ (' ':shows ty "}") addForeignImport foreignname ourname t neat = let ffis = ffiImports neat in neat { typedAsts = (ourname, (Qual [] t, mkFFIHelper 0 t $ A (E $ Basic "F") $ A (E $ Basic "NUM") $ E $ Link "{foreign}" foreignname $ Qual [] t)) : typedAsts neat , ffiImports = insertWith (error $ "duplicate import: " ++ foreignname) foreignname t ffis } addForeignExport e f neat = neat { ffiExports = insertWith (error $ "duplicate export: " ++ e) e f $ ffiExports neat } addDefs ds neat = neat { topDefs = ds ++ topDefs neat } addImport im neat = neat { moduleImports = im:moduleImports neat } addFixities os prec neat = neat { opFixity = foldr (\o tab -> insert o prec tab) (opFixity neat) os } parseErrorRule = Parser \pasta -> case indents pasta of m:ms | m /= 0 -> Right ('}', pasta { indents = ms }) _ -> badpos pasta "missing }" res w@(h:_) = reservedSeq *> pure w <|> bad ("want \"" ++ w ++ "\"") where reservedSeq = if elem w ["let", "where", "do", "of"] then curlyCheck $ lexeme $ charSeq w *> notFollowedBy nameTailChar else lexeme $ charSeq w *> notFollowedBy (if isSmall h then nameTailChar else sat isSymbol) paren = between lParen rParen braceSep f = between lBrace (rBrace <|> parseErrorRule) $ foldr ($) [] <$> sepBy ((:) <$> f <|> pure id) semicolon maybeFix s x = if elem s $ fvPro [] x then A (V "fix") (L s x) else x nonemptyTails [] = [] nonemptyTails xs@(x:xt) = xs : nonemptyTails xt joinIsFail t = A (L "join#" t) (V "fail#") addLets ls x = foldr triangle x components where vs = fst <$> ls ios = foldr (\(s, dsts) (ins, outs) -> (foldr (\dst -> insertWith union dst [s]) ins dsts, insertWith union s dsts outs)) (Tip, Tip) $ map (\(s, t) -> (s, intersect (fvPro [] t) vs)) ls components = scc (\k -> maybe [] id $ mlookup k $ fst ios) (\k -> maybe [] id $ mlookup k $ snd ios) vs triangle names expr = let tnames = nonemptyTails names appem vs = foldl1 A $ V <$> vs suball expr = foldl A (foldr L expr $ init names) $ appem <$> init tnames redef tns expr = foldr L (suball expr) tns in foldr (\(x:xt) t -> A (L x t) $ maybeFix x $ redef xt $ maybe undefined joinIsFail $ lookup x ls) (suball expr) tnames qconop = conSym <|> res ":" <|> between backquote backquote conId qconsym = conSym <|> res ":" op = qconsym <|> varSym <|> between backquote backquote (conId <|> varId) con = conId <|> paren qconsym var = varId <|> paren varSym tycon = do s <- conId pure $ if s == "String" then TAp (TC "[]") (TC "Char") else TC s aType = lParen *> ( rParen *> pure (TC "()") <|> (foldr1 (TAp . TAp (TC ",")) <$> sepBy1 _type comma) <* rParen) <|> tycon <|> TV <$> varId <|> (lSquare *> (rSquare *> pure (TC "[]") <|> TAp (TC "[]") <$> (_type <* rSquare))) bType = foldl1 TAp <$> some aType _type = foldr1 arr <$> sepBy bType (res "->") fixityDecl w a = do res w n <- lexeme integer os <- sepBy op comma pure $ addFixities os (n, a) fixity = fixityDecl "infix" NAssoc <|> fixityDecl "infixl" LAssoc <|> fixityDecl "infixr" RAssoc cDecls = first fromList . second fromList . foldr ($) ([], []) <$> braceSep cDecl cDecl = first . (:) <$> genDecl <|> second . (++) <$> defSemi genDecl = (,) <$> var <*> (res "::" *> _type) classDecl = res "class" *> (addClass <$> conId <*> (TV <$> varId) <*> (res "where" *> cDecls)) simpleClass = Pred <$> conId <*> _type scontext = (:[]) <$> simpleClass <|> paren (sepBy simpleClass comma) instDecl = res "instance" *> ((\ps cl ty defs -> addInstance cl ps ty defs) <$> (scontext <* res "=>" <|> pure []) <*> conId <*> _type <*> (res "where" *> braceDef)) letin = addLets <$> between (res "let") (res "in") braceDef <*> expr ifthenelse = (\a b c -> A (A (A (V "if") a) b) c) <$> (res "if" *> expr) <*> (res "then" *> expr) <*> (res "else" *> expr) listify = foldr (\h t -> A (A (V ":") h) t) (V "[]") alts = joinIsFail . Pa <$> braceSep ((\x y -> ([x], y)) <$> pat <*> guards "->") cas = flip A <$> between (res "case") (res "of") expr <*> alts lamCase = curlyCheck (res "case") *> alts lam = res "\\" *> (lamCase <|> liftA2 onePat (some apat) (res "->" *> expr)) flipPairize y x = A (A (V ",") x) y moreCommas = foldr1 (A . A (V ",")) <$> sepBy1 expr comma thenComma = comma *> ((flipPairize <$> moreCommas) <|> pure (A (V ","))) parenExpr = (&) <$> expr <*> (((\v a -> A (V v) a) <$> op) <|> thenComma <|> pure id) rightSect = ((\v a -> L "@" $ A (A (V v) $ V "@") a) <$> (op <|> (:"") <$> comma)) <*> expr section = lParen *> (parenExpr <* rParen <|> rightSect <* rParen <|> rParen *> pure (V "()")) maybePureUnit = maybe (V "pure" `A` V "()") id stmt = (\p x -> Just . A (V ">>=" `A` x) . onePat [p] . maybePureUnit) <$> pat <*> (res "<-" *> expr) <|> (\x -> Just . maybe x (\y -> (V ">>=" `A` x) `A` (L "_" y))) <$> expr <|> (\ds -> Just . addLets ds . maybePureUnit) <$> (res "let" *> braceDef) doblock = res "do" *> (maybePureUnit . foldr ($) Nothing <$> braceSep stmt) compQual = (\p xs e -> A (A (V "concatMap") $ onePat [p] e) xs) <$> pat <*> (res "<-" *> expr) <|> (\b e -> A (A (A (V "if") b) e) $ V "[]") <$> expr <|> addLets <$> (res "let" *> braceDef) sqExpr = between lSquare rSquare $ ((&) <$> expr <*> ( res ".." *> ( (\hi lo -> (A (A (V "enumFromTo") lo) hi)) <$> expr <|> pure (A (V "enumFrom")) ) <|> res "|" *> ((. A (V "pure")) . foldr (.) id <$> sepBy1 compQual comma) <|> (\t h -> listify (h:t)) <$> many (comma *> expr) ) ) <|> pure (V "[]") fbind = A <$> (V <$> var) <*> (res "=" *> expr) fBinds v = (do fbs <- between lBrace rBrace $ sepBy1 fbind comma pure $ A (E $ Basic "{=") $ foldr A (E $ Basic "=}") $ v:fbs ) <|> pure v atom = ifthenelse <|> doblock <|> letin <|> sqExpr <|> section <|> cas <|> lam <|> (paren comma *> pure (V ",")) <|> V <$> (con <|> var) <|> literal >>= fBinds aexp = foldl1 A <$> some atom chain a = \case [] -> a A f b:rest -> case rest of [] -> A (A f a) b _ -> L "(" $ A (A (A f a) b) $ foldr A (V ")") rest _ -> error "unreachable" expr = chain <$> aexp <*> many (A <$> (V <$> op) <*> aexp) gcon = conId <|> paren (qconsym <|> (:"") <$> comma) <|> (lSquare *> rSquare *> pure "[]") apat = PatVar <$> var <*> (res "@" *> (Just <$> apat) <|> pure Nothing) <|> flip PatVar Nothing <$> (res "_" *> pure "_") <|> flip PatCon [] <$> gcon <|> PatLit <$> literal <|> foldr (\h t -> PatCon ":" [h, t]) (PatCon "[]" []) <$> between lSquare rSquare (sepBy pat comma) <|> paren (foldr1 pairPat <$> sepBy1 pat comma <|> pure (PatCon "()" [])) where pairPat x y = PatCon "," [x, y] patChain a = \case [] -> a PatCon f [b]:rest -> case rest of [] -> PatCon f [a, b] _ -> PatCon "{+" $ PatCon f [a, b] : rest _ -> error "unreachable" patAtom = PatCon <$> gcon <*> many apat <|> apat pat = patChain <$> patAtom <*> many (PatCon <$> qconop <*> ((:[]) <$> patAtom)) maybeWhere p = (&) <$> p <*> (res "where" *> (addLets <$> braceDef) <|> pure id) guards s = maybeWhere $ res s *> expr <|> foldr ($) (V "join#") <$> some ((\x y -> case x of V "True" -> \_ -> y _ -> A (A (A (V "if") x) y) ) <$> (res "|" *> expr) <*> (res s *> expr)) onePat vs x = joinIsFail $ Pa [(vs, x)] defOnePat vs x = Pa [(vs, x)] opDef x f y rhs = [(f, defOnePat [x, y] rhs)] leftyPat p expr = case pvars of [] -> [] (h:t) -> let gen = '@':h in (gen, expr):map (\v -> (v, A (Pa [([p], V v)]) $ V gen)) pvars where pvars = filter (/= "_") $ patVars p def = liftA2 (\l r -> [(l, r)]) var (liftA2 defOnePat (many apat) $ guards "=") <|> (pat >>= \x -> opDef x <$> varSym <*> pat <*> guards "=" <|> leftyPat x <$> guards "=") coalesce = \case [] -> [] h@(s, x):t -> case t of [] -> [h] (s', x'):t' -> let f (Pa vsts) (Pa vsts') = Pa $ vsts ++ vsts' f _ _ = error "bad multidef" in if s == s' then coalesce $ (s, f x x'):t' else h:coalesce t defSemi = coalesce . concat <$> sepBy1 def (some semicolon) braceDef = concat <$> braceSep defSemi simpleType c vs = foldl TAp (TC c) (map TV vs) conop = conSym <|> between backquote backquote conId fieldDecl = (\vs t -> map (, t) vs) <$> sepBy1 var comma <*> (res "::" *> _type) constr = (\x c y -> Constr c [("", x), ("", y)]) <$> aType <*> conop <*> aType <|> Constr <$> conId <*> ( concat <$> between lBrace rBrace (fieldDecl `sepBy` comma) <|> map ("",) <$> many aType) dclass = conId _deriving = (res "deriving" *> ((:[]) <$> dclass <|> paren (dclass `sepBy` comma))) <|> pure [] adt = addAdt <$> between (res "data") (res "=") (simpleType <$> conId <*> many varId) <*> sepBy constr (res "|") <*> _deriving impDecl = addImport <$> (res "import" *> conId) topdecls = braceSep $ adt <|> classDecl <|> instDecl <|> res "foreign" *> ( res "import" *> var *> (addForeignImport <$> lexeme tokStr <*> var <*> (res "::" *> _type)) <|> res "export" *> var *> (addForeignExport <$> lexeme tokStr <*> var) ) <|> addDefs <$> defSemi <|> fixity <|> impDecl haskell = between lexemePrelude eof $ some $ (,) <$> (res "module" *> conId <* res "where" <|> pure "Main") <*> topdecls parseProgram s = fst <$> parse haskell s
-- FFI across multiple modules. module Typer where import Base import Map import Ast import Parser import Unify app01 s x y = maybe (A (L s x) y) snd $ go x where go expr = case expr of E _ -> Just (False, expr) V v -> Just $ if s == v then (True, y) else (False, expr) A l r -> do (a, l') <- go l (b, r') <- go r if a && b then Nothing else pure (a || b, A l' r') L v t -> if v == s then Just (False, expr) else second (L v) <$> go t optiApp t = case t of A x y -> let x' = optiApp x y' = optiApp y in case x' of L s v -> app01 s v y' _ -> A x' y' L s x -> L s (optiApp x) _ -> t -- Pattern compiler. findCon dcs s = foldr (<|>) Nothing $ mlookup s <$> dcs rewritePats dcs = \case [] -> pure $ V "join#" vsxs@((as0, _):_) -> case as0 of [] -> pure $ foldr1 (A . L "join#") $ snd <$> vsxs _ -> do let k = length as0 n <- get put $ n + k let vs = take k $ (`shows` "#") <$> [n..] cs <- flip mapM vsxs \(a:at, x) -> (a,) <$> foldM (\b (p, v) -> rewriteCase dcs v Tip [(p, b)]) x (zip at $ tail vs) flip (foldr L) vs <$> rewriteCase dcs (head vs) Tip cs patEq lit b x y = A (L "join#" $ A (A (A (V "if") (A (A (V "==") lit) b)) x) $ V "join#") y rewriteCase dcs caseVar tab = \case [] -> flush $ V "join#" ((v, x):rest) -> go v x rest where rec = rewriteCase dcs caseVar go v x rest = case v of PatLit lit -> flush =<< patEq lit (V caseVar) x <$> rec Tip rest PatVar s m -> let x' = beta s (V caseVar) x in case m of Nothing -> flush =<< A (L "join#" x') <$> rec Tip rest Just v' -> go v' x' rest PatCon con args -> rec (insertWith (flip (.)) con ((args, x):) tab) rest flush onFail = case toAscList tab of [] -> pure onFail -- TODO: Check rest of `tab` lies in cs. (firstC, _):_ -> do let cs = maybe undefined id $ findCon dcs firstC jumpTable <- mapM (\(Constr s ts) -> case mlookup s tab of Nothing -> pure $ foldr L (V "join#") $ const "_" <$> ts Just f -> rewritePats dcs $ f [] ) cs pure $ A (L "join#" $ foldl A (A (V $ specialCase cs) $ V caseVar) jumpTable) onFail findField dcs f = case [(con, fields) | tab <- dcs, (_, cons) <- toAscList tab, Constr con fields <- cons, (f', _) <- fields, f == f'] of [] -> error $ "no such field: " ++ f h:_ -> h resolveFieldBinds dcs t = go t where go t = case t of E _ -> t V _ -> t A (E (Basic "{=")) (A rawExpr fbsAst) -> let expr = go rawExpr fromAst t = case t of A (A (V f) body) rest -> (f, go body):fromAst rest E (Basic "=}") -> [] fbs@((firstField, _):_) = fromAst fbsAst (con, fields) = findField dcs firstField cs = maybe undefined id $ findCon dcs con newValue = foldl A (V con) [maybe (V $ "[old]"++f) id $ lookup f fbs | (f, _) <- fields] initValue = foldl A expr [maybe (V "undefined") id $ lookup f fbs | (f, _) <- fields] updater = foldr L newValue $ ("[old]"++) . fst <$> fields inj x = map (\(Constr con' _) -> if con' == con then x else V "undefined") cs allPresent = all (`elem` (fst <$> fields)) $ fst <$> fbs isCon = case expr of V (h:_) -> 'A' <= h && h <= 'Z' _ -> False in if allPresent then if isCon then initValue else foldl A (A (V $ specialCase cs) expr) $ inj updater else error "bad fields in update" A x y -> A (go x) (go y) L s x -> L s $ go x fixFixity precs t = case t of E _ -> t V _ -> t A x y -> A (go x) (go y) L s b | s == "(" -> infixer precs $ go b | True -> L s $ go b Pa vsxs -> Pa $ map (\(ps, a) -> (pgo <$> ps, go a)) vsxs where go = fixFixity precs pgo = patFixFixity precs data OpTree = OpLeaf Ast | OpNode String Ast OpTree infixer precs (A (A (A (V s) x) y) t) = go (OpNode s x (OpLeaf y)) t where go acc = \case A (A (V s) z) rest -> go (ins s z acc) rest V ")" -> decode acc _ -> error "unreachable" ins s z t = case t of OpNode s' x y | isStronger precs s s' -> OpNode s' x (ins s z y) | True -> OpNode s (decode t) (OpLeaf z) OpLeaf x -> OpNode s x (OpLeaf z) decode = \case OpNode f x y -> A (A (V f) x) (decode y) OpLeaf x -> x isStronger precs s s' = if prec <= prec' then if prec == prec' then if assoc == assoc' then case assoc of LAssoc -> False RAssoc -> True NAssoc -> error $ "adjacent NAssoc: " ++ s ++ " vs " ++ s' else error $ "assoc mismatch: " ++ s ++ " vs " ++ s' else False else True where (prec, assoc) = findPrec s (prec', assoc') = findPrec s' findPrec s = if s == ":" then (5, RAssoc) else maybe defPrec id $ mlookup s precs defPrec = (9, LAssoc) patFixFixity precs p = case p of PatLit _ -> p PatVar s m -> PatVar s $ go <$> m PatCon "{+" args -> patFixer precs args PatCon con args -> PatCon con $ go <$> args where go = patFixFixity precs data PopTree = PopLeaf Pat | PopNode String Pat PopTree patFixer precs (PatCon f [a, b]:rest) = go seed rest where seed = PopNode f a (PopLeaf b) go acc = \case [] -> decode acc PatCon s [z]:rest -> go (ins s z acc) rest ins s z t = case t of PopNode s' x y -> case isStronger precs s s' of True -> PopNode s' x $ ins s z y False -> PopNode s (decode t) (PopLeaf z) PopLeaf x -> PopNode s x (PopLeaf z) decode = \case PopNode f x y -> PatCon f [x, decode y] PopLeaf x -> x secondM f (a, b) = (a,) <$> f b patternCompile dcs t = optiApp $ resolveFieldBinds dcs $ evalState (go t) 0 where go t = case t of E _ -> pure t V _ -> pure t A x y -> liftA2 A (go x) (go y) L s x -> L s <$> go x Pa vsxs -> mapM (secondM go) vsxs >>= rewritePats dcs -- Type inference. instantiate' t n tab = case t of TC s -> ((t, n), tab) TV s -> case lookup s tab of Nothing -> let va = TV $ show n in ((va, n + 1), (s, va):tab) Just v -> ((v, n), tab) TAp x y -> let ((t1, n1), tab1) = instantiate' x n tab ((t2, n2), tab2) = instantiate' y n1 tab1 in ((TAp t1 t2, n2), tab2) instantiatePred (Pred s t) ((out, n), tab) = first (first ((:out) . Pred s)) (instantiate' t n tab) instantiate (Qual ps t) n = first (Qual ps1) $ fst $ instantiate' t n1 tab where ((ps1, n1), tab) = foldr instantiatePred (([], n), []) ps proofApply sub a = case a of Proof (Pred cl ty) -> Proof (Pred cl $ apply sub ty) A x y -> A (proofApply sub x) (proofApply sub y) L s t -> L s $ proofApply sub t _ -> a typeAstSub sub (t, a) = (apply sub t, proofApply sub a) infer typed loc ast csn@(cs, n) = case ast of E x -> Right $ case x of Const _ -> ((TC "Int", ast), csn) ChrCon _ -> ((TC "Char", ast), csn) StrCon _ -> ((TAp (TC "[]") (TC "Char"), ast), csn) Link im s q -> insta q V s -> maybe (Left $ "undefined: " ++ s) Right $ (\t -> ((t, ast), csn)) <$> lookup s loc <|> insta . fst <$> mlookup s typed A x y -> infer typed loc x (cs, n + 1) >>= \((tx, ax), csn1) -> infer typed loc y csn1 >>= \((ty, ay), (cs2, n2)) -> unify tx (arr ty va) cs2 >>= \cs -> Right ((va, A ax ay), (cs, n2)) L s x -> first (\(t, a) -> (arr va t, L s a)) <$> infer typed ((s, va):loc) x (cs, n + 1) where va = TV $ show n insta ty = ((ty1, foldl A ast (map Proof preds)), (cs, n1)) where (Qual preds ty1, n1) = instantiate ty n findInstance tycl qn@(q, n) p@(Pred cl ty) insts = case insts of [] -> let v = '*':show n in Right (((p, v):q, n + 1), V v) (modName, Instance h name ps _):rest -> case match h ty of Nothing -> findInstance tycl qn p rest Just subs -> foldM (\(qn1, t) (Pred cl1 ty1) -> second (A t) <$> findProof tycl (Pred cl1 $ apply subs ty1) qn1) (qn, if modName == "" then V name else E $ Link modName name undefined) ps findProof tycl pred@(Pred classId t) psn@(ps, n) = case lookup pred ps of Nothing -> findInstance tycl psn pred $ tycl classId Just s -> Right (psn, V s) prove tycl psn a = case a of Proof pred -> findProof tycl pred psn A x y -> prove tycl psn x >>= \(psn1, x1) -> second (A x1) <$> prove tycl psn1 y L s t -> second (L s) <$> prove tycl psn t _ -> Right (psn, a) data Dep a = Dep ([String] -> Either String ([String], a)) instance Functor Dep where fmap f = \(Dep mf) -> Dep \g -> do (g', x) <- mf g pure (g', f x) instance Applicative Dep where pure x = Dep \g -> Right (g, x) (Dep mf) <*> (Dep mx) = Dep \g -> do (g', f) <- mf g (g'', x) <- mx g' pure (g'', f x) addDep s = Dep \deps -> Right (if s `elem` deps then deps else s : deps, ()) badDep s = Dep $ const $ Left s runDep (Dep f) = f [] astLink typed locals imps mods ast = runDep $ go [] ast where go bound ast = case ast of V s | elem s bound -> pure ast | member s locals -> case findImportSym imps mods s of [] -> (if member s typed then pure () else addDep s) *> pure ast _ -> badDep $ "ambiguous: " ++ s | True -> case findImportSym imps mods s of [] -> badDep $ "missing: " ++ s [(im, t)] -> pure $ E $ Link im s t _ -> badDep $ "ambiguous: " ++ s A x y -> A <$> go bound x <*> go bound y L s t -> L s <$> go (s:bound) t _ -> pure ast forFree cond f bound t = case t of E _ -> t V s -> if (not $ s `elem` bound) && cond s then f t else t A x y -> A (rec bound x) (rec bound y) L s t' -> L s $ rec (s:bound) t' where rec = forFree cond f inferno tycl typed defmap syms = let loc = zip syms $ TV . (' ':) <$> syms principal (acc, (subs, n)) s = do expr <- maybe (Left $ "missing: " ++ s) Right (mlookup s defmap) ((t, a), (ms, n1)) <- infer typed loc expr (subs, n) cs <- unify (TV (' ':s)) t ms Right ((s, (t, a)):acc, (cs, n1)) gatherPreds (acc, psn) (s, (t, a)) = do (psn, a) <- prove tycl psn a pure ((s, (t, a)):acc, psn) in do (stas, (soln, _)) <- foldM principal ([], (Tip, 0)) syms stas <- pure $ second (typeAstSub soln) <$> stas (stas, (ps, _)) <- foldM gatherPreds ([], ([], 0)) $ second (typeAstSub soln) <$> stas let preds = fst <$> ps dicts = snd <$> ps applyDicts (s, (t, a)) = (s, (Qual preds t, foldr L (forFree (`elem` syms) (\t -> foldl A t $ V <$> dicts) [] a) dicts)) pure $ map applyDicts stas findImportSym imps mods s = concat [maybe [] (\(t, _) -> [(im, t)]) $ mlookup s qas | im <- imps, let qas = fst $ mods ! im] inferDefs tycl defs typed = do let insertUnique m (s, (_, t)) = case mlookup s m of Nothing -> case mlookup s typed of Nothing -> Right $ insert s t m _ -> Left $ "reserved: " ++ s _ -> Left $ "duplicate: " ++ s addEdges (sym, (deps, _)) (ins, outs) = (foldr (\dep es -> insertWith union dep [sym] es) ins deps, insertWith union sym deps outs) graph = foldr addEdges (Tip, Tip) defs defmap <- foldM insertUnique Tip defs let ins k = maybe [] id $ mlookup k $ fst graph outs k = maybe [] id $ mlookup k $ snd graph inferComponent typed syms = foldr (uncurry insert) typed <$> inferno tycl typed defmap syms foldM inferComponent typed $ scc ins outs $ keys defmap dictVars ps n = (zip ps $ map (('*':) . show) [n..], n + length ps) inferTypeclasses tycl typeOfMethod typed dcs precs linker iMap mergedSigs = foldM inferInstance typed [(classId, inst) | (classId, insts) <- toAscList iMap, inst <- insts] where inferInstance typed (classId, Instance ty name ps idefs) = let dvs = map snd $ fst $ dictVars ps 0 perMethod s = do let rawExpr = maybe (V $ "{default}" ++ s) id $ mlookup s idefs expr <- snd <$> linker (patternCompile dcs $ fixFixity precs rawExpr) (ta, (sub, n)) <- either (Left . (name++) . (" "++) . (s++) . (": "++)) Right $ infer typed [] expr (Tip, 0) let (tx, ax) = typeAstSub sub ta -- e.g. qc = Eq a => a -> a -> Bool -- We instantiate: Eq a1 => a1 -> a1 -> Bool. qc = typeOfMethod s (Qual [Pred _ headT] tc, n1) = instantiate qc n -- Mix the predicates `ps` with the type of `headT`, applying a -- substitution such as (a1, [a]) so the variable names match. -- e.g. Eq a => [a] -> [a] -> Bool Just subc = match headT ty (Qual ps2 t2, n2) = instantiate (Qual ps $ apply subc tc) n1 case match tx t2 of Nothing -> Left "class/instance type conflict" Just subx -> do ((ps3, _), tr) <- prove tycl (dictVars ps2 0) (proofApply subx ax) if length ps2 /= length ps3 then Left $ ("want context: "++) . (foldr (.) id $ shows . fst <$> ps3) $ name else pure tr in do ms <- mapM perMethod $ maybe (error $ "missing class: " ++ classId) id $ mlookup classId mergedSigs pure $ insert name (Qual [] $ TC "DICTIONARY", flip (foldr L) dvs $ L "@" $ foldl A (V "@") ms) typed primAdts = [ (TC "()", [Constr "()" []]) , (TC "Bool", [Constr "True" [], Constr "False" []]) , (TAp (TC "[]") (TV "a"), [Constr "[]" [], Constr ":" $ map ("",) [TV "a", TAp (TC "[]") (TV "a")]]) , (TAp (TAp (TC ",") (TV "a")) (TV "b"), [Constr "," $ map ("",) [TV "a", TV "b"]]) ] prims = let ro = E . Basic dyad s = TC s `arr` (TC s `arr` TC s) bin s = A (ro "Q") (ro s) in map (second (first $ Qual [])) $ [ ("intEq", (arr (TC "Int") (arr (TC "Int") (TC "Bool")), bin "EQ")) , ("intLE", (arr (TC "Int") (arr (TC "Int") (TC "Bool")), bin "LE")) , ("charEq", (arr (TC "Char") (arr (TC "Char") (TC "Bool")), bin "EQ")) , ("charLE", (arr (TC "Char") (arr (TC "Char") (TC "Bool")), bin "LE")) , ("fix", (arr (arr (TV "a") (TV "a")) (TV "a"), ro "Y")) , ("if", (arr (TC "Bool") $ arr (TV "a") $ arr (TV "a") (TV "a"), ro "I")) , ("chr", (arr (TC "Int") (TC "Char"), ro "I")) , ("ord", (arr (TC "Char") (TC "Int"), ro "I")) , ("ioBind", (arr (TAp (TC "IO") (TV "a")) (arr (arr (TV "a") (TAp (TC "IO") (TV "b"))) (TAp (TC "IO") (TV "b"))), ro "C")) , ("ioPure", (arr (TV "a") (TAp (TC "IO") (TV "a")), ro "V")) , ("primitiveError", (arr (TAp (TC "[]") (TC "Char")) (TV "a"), ro "ERR")) , ("newIORef", (arr (TV "a") (TAp (TC "IO") (TAp (TC "IORef") (TV "a"))), ro "NEWREF")) , ("readIORef", (arr (TAp (TC "IORef") (TV "a")) (TAp (TC "IO") (TV "a")), A (ro "T") (ro "READREF"))) , ("writeIORef", (arr (TAp (TC "IORef") (TV "a")) (arr (TV "a") (TAp (TC "IO") (TC "()"))), A (A (ro "R") (ro "WRITEREF")) (ro "B"))) , ("exitSuccess", (TAp (TC "IO") (TV "a"), ro "END")) , ("unsafePerformIO", (arr (TAp (TC "IO") (TV "a")) (TV "a"), A (A (ro "C") (A (ro "T") (ro "END"))) (ro "K"))) , ("join#", (TV "a", A (V "unsafePerformIO") (V "exitSuccess"))) , ("fail#", (TV "a", A (V "unsafePerformIO") (V "exitSuccess"))) ] ++ map (\(s, v) -> (s, (dyad "Int", bin v))) [ ("intAdd", "ADD") , ("intSub", "SUB") , ("intMul", "MUL") , ("intDiv", "DIV") , ("intMod", "MOD") , ("intQuot", "QUOT") , ("intRem", "REM") , ("intXor", "XOR") , ("intAnd", "AND") , ("intOr", "OR") ] tabulateModules mods = snd <$> foldM ins (Tip, Tip) mods where go precs = foldr ($) neatEmpty{moduleImports = ["#"], opFixity = precs} ins (accprecs, tab) (k, prog) = case mlookup k tab of Nothing -> let v = go accprecs prog in Right (opFixity v, insert k v tab) Just _ -> Left $ "duplicate module: " ++ k slowUnionWith f x y = foldr go x $ toAscList y where go (k, v) m = insertWith f k v m inferModule tab acc name = case mlookup name acc of Nothing -> do let Neat mySigs iMap defs typedList adtTab ffis ffes imps precs = tab ! name typed = fromList typedList mergedSigs = foldr (slowUnionWith const) Tip $ mySigs : map (typeclasses . (tab !)) imps mergedInstances = foldr (slowUnionWith (++)) Tip [fmap (map (im,)) m | (im, m) <- ("", iMap) : map (\im -> (im, instances $ tab ! im)) imps] locals = fromList $ map (, ()) $ (fst <$> typedList) ++ (fst <$> defs) tycl classId = maybe [] id $ mlookup classId mergedInstances dcs = adtTab : map (dataCons . (tab !)) imps typeOfMethod s = maybe undefined id $ foldr (<|>) (fst <$> mlookup s typed) [fmap fst $ lookup s $ typedAsts $ tab ! im | im <- imps] acc' <- foldM (inferModule tab) acc imps let linker = astLink typed locals imps acc' depdefs <- mapM (\(s, t) -> (s,) <$> linker (patternCompile dcs $ fixFixity precs t)) defs typed <- inferDefs tycl depdefs typed typed <- inferTypeclasses tycl typeOfMethod typed dcs precs linker iMap mergedSigs Right $ insert name (typed, (ffis, ffes)) acc' Just _ -> Right acc untangle s = do tab <- insert "#" neatPrim <$> (parseProgram s >>= tabulateModules) foldM (inferModule tab) Tip $ keys tab neatPrim = foldr (\(a, b) -> addAdt a b []) (Neat Tip Tip [] prims Tip Tip Tip [] Tip) primAdts
-- FFI across multiple modules. -- Change `div` and `mod` to round down instead towards zero for `Int`. module RTS where import Base import Ast import Kiselyov import Map import Parser import_qq_here = import_qq_here libc = [r|#include<stdio.h> static int env_argc; int getargcount() { return env_argc; } static char **env_argv; int getargchar(int n, int k) { return env_argv[n][k]; } static int nextCh, isAhead; int eof_shim() { if (!isAhead) { isAhead = 1; nextCh = getchar(); } return nextCh == -1; } void exit(int); void putchar_shim(int c) { putchar(c); } int getchar_shim() { if (!isAhead) nextCh = getchar(); if (nextCh == -1) exit(1); isAhead = 0; return nextCh; } void errchar(int c) { fputc(c, stderr); } void errexit() { fputc('\n', stderr); } |] preamble = [r|#define EXPORT(f, sym) void f() asm(sym) __attribute__((export_name(sym))); void *malloc(unsigned long); enum { FORWARD = 127, REDUCING = 126 }; enum { TOP = 1<<24 }; static u *mem, *altmem, *sp, *spTop, hp; static inline u isAddr(u n) { return n>=128; } static u evac(u n) { if (!isAddr(n)) return n; u x = mem[n]; while (isAddr(x) && mem[x] == _T) { mem[n] = mem[n + 1]; mem[n + 1] = mem[x + 1]; x = mem[n]; } if (isAddr(x) && mem[x] == _K) { mem[n + 1] = mem[x + 1]; x = mem[n] = _I; } u y = mem[n + 1]; switch(x) { case FORWARD: return y; case REDUCING: mem[n] = FORWARD; mem[n + 1] = hp; hp += 2; return mem[n + 1]; case _I: mem[n] = REDUCING; y = evac(y); if (mem[n] == FORWARD) { altmem[mem[n + 1]] = _I; altmem[mem[n + 1] + 1] = y; } else { mem[n] = FORWARD; mem[n + 1] = y; } return mem[n + 1]; default: break; } u z = hp; hp += 2; mem[n] = FORWARD; mem[n + 1] = z; altmem[z] = x; altmem[z + 1] = y; return z; } static void gc() { hp = 128; u di = hp; sp = altmem + TOP - 1; for(u *r = root; *r; r++) *r = evac(*r); *sp = evac(*spTop); while (di < hp) { u x = altmem[di] = evac(altmem[di]); di++; if (x != _NUM) altmem[di] = evac(altmem[di]); di++; } spTop = sp; u *tmp = mem; mem = altmem; altmem = tmp; } static inline u app(u f, u x) { mem[hp] = f; mem[hp + 1] = x; return (hp += 2) - 2; } static inline u arg(u n) { return mem[sp [n] + 1]; } static inline int num(u n) { return mem[arg(n) + 1]; } static inline void lazy2(u height, u f, u x) { u *p = mem + sp[height]; *p = f; *++p = x; sp += height - 1; *sp = f; } static void lazy3(u height,u x1,u x2,u x3){u*p=mem+sp[height];sp[height-1]=*p=app(x1,x2);*++p=x3;*(sp+=height-2)=x1;} |] -- Main VM loop. comdefsrc = [r| F x = "foreign(num(1));" Y x = x "sp[1]" Q x y z = z(y x) S x y z = x z(y z) B x y z = x (y z) BK x y z = x y C x y z = x z y R x y z = y z x V x y z = z x y T x y = y x K x y = "_I" x KI x y = "_I" y I x = "sp[1] = arg(1); sp++;" LEFT x y z = y x CONS x y z w = w x y NUM x y = y "sp[1]" ADD x y = "_NUM" "num(1) + num(2)" SUB x y = "_NUM" "num(1) - num(2)" MUL x y = "_NUM" "num(1) * num(2)" QUOT x y = "_NUM" "num(1) / num(2)" REM x y = "_NUM" "num(1) % num(2)" DIV x y = "_NUM" "div(num(1), num(2))" MOD x y = "_NUM" "mod(num(1), num(2))" XOR x y = "_NUM" "num(1) ^ num(2)" AND x y = "_NUM" "num(1) & num(2)" OR x y = "_NUM" "num(1) | num(2)" EQ x y = "num(1) == num(2) ? lazy2(2, _I, _K) : lazy2(2, _K, _I);" LE x y = "num(1) <= num(2) ? lazy2(2, _I, _K) : lazy2(2, _K, _I);" REF x y = y "sp[1]" NEWREF x y z = z ("_REF" x) y READREF x y z = z "num(1)" y WRITEREF x y z w = w "((mem[arg(2) + 1] = arg(1)), _K)" z END = "return;" ERR = "sp[1]=app(app(arg(1),_ERREND),_ERR2);sp++;" ERR2 = "lazy3(2, arg(1), _ERROUT, arg(2));" ERROUT = "errchar(num(1)); lazy2(2, _ERR, arg(2));" ERREND = "errexit(); return;" |] argList t = case t of TC s -> [TC s] TV s -> [TV s] TAp (TC "IO") (TC u) -> [TC u] TAp (TAp (TC "->") x) y -> x : argList y _ -> [t] cTypeName (TC "()") = "void" cTypeName (TC "Int") = "int" cTypeName (TC "Char") = "int" cTypeName _ = "int" ffiDeclare (name, t) = let tys = argList t in (concat [cTypeName $ last tys, " ", name, "(", intercalate "," $ cTypeName <$> init tys, ");\n"]++) ffiArgs n t = case t of TAp (TC "IO") u -> ("", ((False, u), n)) TAp (TAp (TC "->") _) y -> first (((if 3 <= n then ", " else "") ++ "num(" ++ shows n ")") ++) $ ffiArgs (n + 1) y _ -> ("", ((True, t), n)) needsNum t = case t of TC "Int" -> True TC "Char" -> True _ -> False ffiDefine n (name, t) = ("case " ++) . shows n . (": " ++) . if ret == TC "()" then longDistanceCall . cont ("_K"++) . ("); break;"++) else ("{u r = "++) . longDistanceCall . cont ((if needsNum ret then "app(_NUM, r)" else "r") ++) . ("); break;}\n"++) where (args, ((isPure, ret), count)) = ffiArgs 2 t lazyn = ("lazy2(" ++) . shows (if isPure then count - 1 else count + 1) . (", " ++) cont tgt = if isPure then ("_I, "++) . tgt else ("app(arg("++) . shows (count + 1) . ("), "++) . tgt . ("), arg("++) . shows count . (")"++) longDistanceCall = (name++) . ("("++) . (args++) . ("); "++) . lazyn genMain n = "int main(int argc,char**argv){env_argc=argc;env_argv=argv;rts_reduce(" ++ shows n ");return 0;}\n" arrCount = \case TAp (TAp (TC "->") _) y -> 1 + arrCount y _ -> 0 genExport m n = ("void f"++) . shows n . ("("++) . foldr (.) id (intersperse (',':) $ map (("u "++) .) xs) . ("){rts_reduce("++) . foldl (\s x -> ("app("++) . s . (",app(_NUM,"++) . x . ("))"++)) rt xs . (");}\n"++) where xs = map ((('x':) .) . shows) [0..m - 1] rt = ("root["++) . shows n . ("]"++) genArg m a = case a of V s -> ("arg("++) . (maybe undefined shows $ lookup s m) . (')':) E (StrCon s) -> (s++) A x y -> ("app("++) . genArg m x . (',':) . genArg m y . (')':) genArgs m as = foldl1 (.) $ map (\a -> (","++) . genArg m a) as genComb (s, (args, body)) = let argc = ('(':) . shows (length args) m = zip args [1..] in ("case _"++) . (s++) . (':':) . (case body of A (A x y) z -> ("lazy3"++) . argc . genArgs m [x, y, z] . (");"++) A x y -> ("lazy2"++) . argc . genArgs m [x, y] . (");"++) E (StrCon s) -> (s++) ) . ("break;\n"++) comb = (,) <$> conId <*> ((,) <$> many varId <*> (res "=" *> combExpr)) combExpr = foldl1 A <$> some (V <$> varId <|> E . StrCon <$> lexeme tokStr <|> paren combExpr) comdefs = case parse (lexemePrelude *> braceSep comb <* eof) comdefsrc of Left e -> error e Right (cs, _) -> cs comEnum s = maybe (error s) id $ lookup s $ zip (fst <$> comdefs) [1..] comName i = maybe undefined id $ lookup i $ zip [1..] (fst <$> comdefs) runFun = ([r| static int div(int a, int b) { int q = a/b; return q - (((u)(a^b)) >> 31)*(q*b!=a); } static int mod(int a, int b) { int r = a%b; return r + (((u)(a^b)) >> 31)*(!!r)*b; } static void run() { for(;;) { if (mem + hp > sp - 8) gc(); u x = *sp; if (isAddr(x)) *--sp = mem[x]; else switch(x) { |]++) . foldr (.) id (genComb <$> comdefs) . ([r| } } } void rts_init() { mem = malloc(TOP * sizeof(u)); altmem = malloc(TOP * sizeof(u)); hp = 128; for (u i = 0; i < sizeof(prog)/sizeof(*prog); i++) mem[hp++] = prog[i]; spTop = mem + TOP - 1; } void rts_reduce(u n) { static u ready;if (!ready){ready=1;rts_init();} *(sp = spTop) = app(app(n, _UNDEFINED), _END); run(); } |]++) resolve bigmap (m, s) = either (resolve bigmap) id $ (bigmap ! m) ! s mayResolve bigmap (m, s) = mlookup m bigmap >>= fmap (either (resolve bigmap) id) . mlookup s appCell (hp, bs) x y = (Right hp, (hp + 2, bs . (x:) . (y:))) enc tab mem = \case Lf n -> case n of Basic c -> (Right $ comEnum c, mem) Const c -> appCell mem (Right $ comEnum "NUM") $ Right c ChrCon c -> appCell mem (Right $ comEnum "NUM") $ Right $ ord c StrCon s -> enc tab mem $ foldr (\h t -> Nd (Nd (lf "CONS") (Lf $ ChrCon h)) t) (lf "K") s Link m s _ -> (Left (m, s), mem) LfVar s -> maybe (error $ "resolve " ++ s) (, mem) $ mlookup s tab Nd x y -> let (xAddr, mem') = enc tab mem x (yAddr, mem'') = enc tab mem' y in appCell mem'' xAddr yAddr asm hp0 combs = tabmem where tabmem = foldl (\(as, m) (s, t) -> let (p, m') = enc (fst tabmem) m t in (insert s p as, m')) (Tip, (hp0, id)) combs rewriteCombs tab = optim . go where go = \case LfVar v -> let t = follow [v] v in case t of Lf (Basic _) -> t LfVar w -> if v == w then Nd (lf "Y") (lf "I") else t _ -> LfVar v Nd a b -> Nd (go a) (go b) t -> t follow seen v = case tab ! v of LfVar w | w `elem` seen -> LfVar $ last seen | True -> follow (w:seen) w t -> t codegenLocal (name, (typed, _)) (bigmap, (hp, f)) = (insert name localmap bigmap, (hp', f . memF)) where rawCombs = optim . nolam . snd <$> typed combs = toAscList $ rewriteCombs rawCombs <$> rawCombs (localmap, (hp', memF)) = asm hp combs codegen ffis mods = (bigmap', mem) where (bigmap, (_, memF)) = foldr codegenLocal (Tip, (128, id)) $ toAscList mods bigmap' = (resolveGlobal <$>) <$> bigmap mem = resolveGlobal <$> memF [] ffiIndex = fromList $ zip (keys ffis) [0..] resolveGlobal = \case Left (m, s) -> if m == "{foreign}" then ffiIndex ! s else resolveGlobal $ (bigmap ! m) ! s Right n -> n getIOType (Qual [] (TAp (TC "IO") t)) = Right t getIOType q = Left $ "main : " ++ show q compile mods = do let ffis = foldr (\(k, v) m -> insertWith (error $ "duplicate import: " ++ k) k v m) Tip $ concatMap (toAscList . fst . snd) $ elems mods (bigmap, mem) = codegen ffis mods ffes = foldr (\(expName, v) m -> insertWith (error $ "duplicate export: " ++ expName) expName v m) Tip [ (expName, (addr, argcount)) | (modName, (_, (_, ffes))) <- toAscList mods , (expName, ourName) <- toAscList ffes , let addr = maybe (error $ "missing: " ++ ourName) id $ mlookup ourName $ bigmap ! modName , let argcount = arrCount $ mustType modName ourName ] mustType modName s = case mlookup s $ fst $ mods ! modName of Just (Qual [] t, _) -> t _ -> error "TODO: report bad exports" mayMain = do mainAddr <- mlookup "main" =<< mlookup "Main" bigmap (mainType, _) <- mlookup "main" $ fst $ mods ! "Main" pure (mainAddr, mainType) mainStr <- case mayMain of Nothing -> pure "" Just (a, q) -> do getIOType q pure $ genMain a pure $ ("typedef unsigned u;\n"++) . ("enum{_UNDEFINED=0,"++) . foldr (.) id (map (\(s, _) -> ('_':) . (s++) . (',':)) comdefs) . ("};\n"++) . ("static const u prog[]={" ++) . foldr (.) id (map (\n -> shows n . (',':)) mem) . ("};\nstatic u root[]={" ++) . foldr (.) id (map (\(addr, _) -> shows addr . (',':)) $ elems ffes) . ("0};\n" ++) . (preamble++) . (libc++) . foldr (.) id (ffiDeclare <$> toAscList ffis) . ("static void foreign(u n) {\n switch(n) {\n" ++) . foldr (.) id (zipWith ffiDefine [0..] $ toAscList ffis) . ("\n }\n}\n" ++) . runFun . foldr (.) id (zipWith (\(expName, (_, argcount)) n -> ("EXPORT(f"++) . shows n . (", \""++) . (expName++) . ("\")\n"++) . genExport argcount n) (toAscList ffes) [0..]) $ mainStr
-- FFI across multiple modules. module Main where import Base import Map import Ast import RTS import Typer import Kiselyov import System hide_prelude_here' = hide_prelude_here' dumpWith dumper s = case untangle s of Left err -> err Right tab -> foldr ($) [] $ map (\(name, mod) -> ("module "++) . (name++) . ('\n':) . (foldr (.) id $ dumper mod)) $ toAscList tab dumpLambs (typed, _) = map (\(s, (_, t)) -> (s++) . (" = "++) . shows t . ('\n':)) $ toAscList typed dumpTypes (typed, _) = map (\(s, (q, _)) -> (s++) . (" :: "++) . shows q . ('\n':)) $ toAscList typed dumpCombs (typed, _) = map go combs where rawCombs = optim . nolam . snd <$> typed combs = toAscList $ rewriteCombs rawCombs <$> rawCombs go (s, t) = (s++) . (" = "++) . shows t . (";\n"++) main = getArgs >>= \case "comb":_ -> interact $ dumpWith dumpCombs "lamb":_ -> interact $ dumpWith dumpLambs "type":_ -> interact $ dumpWith dumpTypes _ -> interact \s -> either id id $ untangle s >>= compile