Lovely as a Tree

Of the many ways to draw binary trees, computer scientists often choose level-based layout, that is, nodes have the same y-coordinate if and only if they belong to the same level. By convention, deeper levels appear below shallower levels, and levels are evenly spaced.

The Reingold-Tilford algorithm efficiently produces some of the prettiest pictures of this style:

  • Left nodes appear to the left of right nodes.

  • Parent nodes are centered above their children.

  • No edges cross.

  • Drawing the reverse of a tree results in its mirror image.

  • A subtree appears the same no matter where it occurs on the tree.

  • The width of the drawing is minimal.

I found a summary of level-based algorithms for drawing trees, which is nice apart from a few errors of varying severity. It seems "Figure 7" should be "Figure 8" in some places; "right" should be "left" in one sentence; "n/3" should be "n/4"; "Principle 6" is misleading because the algorithm of Buchheim et al. merely opportunistically spreads out the siblings between two others at a certain point in the algorithm without moving any other siblings. (Insisting all siblings be evenly spaced results in wider drawings.)

See also the chapter on drawing general trees from the Handbook of Graph Drawing and Visualization.

Preliminaries

Let’s walk through an implementation of the Reingold-Tilford algorithm. We need a bunch of imports:

{-# LANGUAGE CPP #-}
#ifdef __HASTE__
import Control.Monad
import Haste
import Haste.DOM
import Haste.Events
import Haste.Foreign
import Haste.JSString (pack)
import System.Random
#else
import System.Environment
import System.IO
#endif
import Control.Arrow
import Data.Tree
import Text.Parsec

We use parser combinators to read a simple language that lets us easily supply trees to our program. The language is modeled on the notation for composing functions. The right-associative operator "." connects subtrees, and any alphanumeric string is a leaf.

data Expr = Fun String | Com Expr Expr
expr :: Parsec String () Expr
expr = spaces >> atm `chainr1` com where
  atm = fun <|> between (sp $ char '(') (sp $ char ')') expr
  fun = sp $ Fun <$> many1 alphaNum
  com = const Com <$> sp (char '.')
  sp = (<* spaces)

The naive approach

The idea behind the algorithm is simple and elegant. We build the drawing bottom-up. The y-coordinate of a node is implicit, as it is determined by the level. We only need to pick good x-coordinates.

The base case places a leaf node at x = 0.

simpleDraw :: Expr -> Tree (Double, String)
simpleDraw (Fun s) = Node (0, s) []

For the inductive step, we first find the contours of the two subtrees. The right contour of a tree is the sequence of the x-coordinates of the rightmost node in each level, and the left contour is the same for the leftmost nodes.

simpleDraw (Com l r) = Node (m, "") xs where
  [ll, rr]       = simpleDraw <$> [l, r]
  [lCont, rCont] = [fst . head <$> levels rr, fst . last <$> levels ll]

We compare the right contour of the left subtree layout against the left contour of the right subtree layout. We intend to indent one of them so the minimum distance between the contours is 1 unit.

  d              = 1 - minimum (zipWith (-) lCont rCont)

To ensure the drawing’s leftmost point has x = 0, we indent the right subtree when they are too close, and the left subtree when they are too far apart.

  xs | d >= 0    = [                  ll, first (+ d) <$> rr]
     | otherwise = [first (+(-d)) <$> ll,                 rr]

Lastly, the parent is given the x-coordinate halfway between its children.

  m              = (sum $ fst . rootLabel <$> xs) / 2

Though concise, the running time of the above code is quadratic in the number of nodes because:

  1. Many nodes might be indented many times.

  2. Tracing a contour requires traversing the whole tree.

Thanks to lazy evaluation and zipWith, we only completely traverse the shorter of the two subtrees. This behaviour becomes important when we improve our algorithm.

From quadratic to linear

To fix the first issue, we record indent values in subtree roots. A final top-down pass recursively applies the indents cumulatively to compute the x-coordinates of each node. In essence, we are using:

fmap (x +) . fmap (y +) = fmap ((x + y) +)

To fix the second issue, we maintain auxiliary edges on the tree so we can follow contours without traversing its entirety.

Instead of Data.Tree we define a custom RT tree data structure. The shift field records the indent amount for the first optimization, and the link field may hold an auxiliary edge for the second optimization. Because of our first optimization, we need to record an indent modifier with each link. We parameterize so the node data structure can hold data of any given type.

data RT a = RT { xpos :: Int
               , shift :: Int
               , hold :: a
               , link :: Maybe (Int, RT a)
               , kids :: [RT a]
               }

The following applies all indents of an RT tree to produce a Data.Tree representing the final drawing:

addx :: Int -> RT a -> Tree (Int, a)
addx i rt = Node (xpos rt + shift rt + i, hold rt) $
  addx (i + shift rt) <$> kids rt

As for the link field, observe we can mostly figure out the right contour by starting at the root and recursively following the last child. Trouble arises when we reach a leaf but there are still more levels to go.

To solve this problem, for rightmost leaf nodes except those in the last level, we set link to point at the rightmost node of the next level. When first placing a node, it either lies on the deepest level or is guaranteed to be an internal node, so link is initially Nothing.

When combining 2 subtrees, the weave function traverses the rightmost children and links of both subtrees until it bottoms out on at least one of them. Then if needed, it follows at most one more edge or link on the other subtree to create a new link. It also stores the difference between the total indent of the link destination and that of the link source so we can update the indent value accordingly when following links.

Left contours are similarly handled. In fact, to avoid code duplication, our contour function takes an f argument that should be head for left contours and last for right contours, and our weave function calls a helper function with id or reverse depending on whether it’s acting on the left or the right side of the subtrees.

We could optimize further. For example, we know only one link on the shallower subtree needs adjustment. However, the asymptotic time complexity is unaffected.

Typical implementations modify data in place, but as our code is pure, we create new RT nodes instead.

contour :: ([RT a] -> RT a) -> (Int, RT a) -> [Int]
contour f (acc, rt) = h : case kids rt of
  [] -> maybe [] (contour f . first (+ acc')) (link rt)
  ks -> contour f (acc', f ks)
  where acc' = acc + shift rt
        h    = acc'+ xpos rt

weave :: RT a -> RT a -> [RT a]
weave l r = [weave' id (0, l) (0, r), weave' reverse (0, r) (0, l)]

weave' :: ([RT a] -> [RT a]) -> (Int, RT a) -> (Int, RT a) -> RT a
weave' f (accL, l) (accR, r)
  | Nothing      <- follow = l
  | Just (n, x)  <- link l = l { link = Just (n, weave' f (n + accL', x) y) }
  | (k:ks)   <- f $ kids l = l { kids = f $ weave' f (accL', k) y : ks }
  | otherwise              = l { link = first (+(-accL')) <$> follow }
  where
    accL' = accL + shift l
    accR' = accR + shift r
    follow | (k:_) <- f $ kids r = Just (accR', k)
           | otherwise           = first (accR' +) <$> link r
    Just y = follow

This time, we want a configurable minimum gap between siblings, as well as integral x-coordinates. So if necessary, we bump up the indent value so the average of the x values of the sibling roots is a whole number.

We also change our API to take any Tree a instead of our Expr data structure.

padding :: Int  -- Minimum horizontal gap between nodes.
padding = 50

placeRT :: Tree a -> RT a
placeRT (Node a [])     = RT 0 0 a Nothing []
placeRT (Node a [l, r]) = RT m 0 a Nothing xs where
  [ll, rr] = placeRT <$> [l, r]
  g = padding - minimum (zipWith (-)
    (contour head (0, rr)) (contour last (0, ll)))
  s = xpos ll + xpos rr
  gap = abs g + mod (abs g + s) 2  -- Adjust so midpoint is whole number.
  m = div (s + gap) 2
  xs = if g >= 0 then weave ll                 rr { shift = gap }
                 else weave ll { shift = gap } rr
placeRT _ = error "full binary trees only please"

drawRT :: Tree a -> Tree (Int, a)
drawRT = addx 0 . placeRT

drawExpr :: Expr -> Tree (Int, String)
drawExpr = drawRT . fromExpr where
  fromExpr (Fun s)   = Node s []
  fromExpr (Com l r) = Node "" $ fromExpr <$> [l, r]

Web version

Each time the "Draw!" button is clicked, we read the tree in the input text area and draw it.

For some reason that is probably important to some committee, JavaScript’s createElement creates elements that can be added to the SVG, but on my browser they are never rendered even though they appear in the DOM. Instead, we must call createElementNS to create SVG elements. Haste lacks a wrapper for this function so we define our own.

#ifdef __HASTE__
newElemSVG :: String -> IO Elem
newElemSVG = ffi $ pack $
  "(x => document.createElementNS('http://www.w3.org/2000/svg', x))"

We add a function that determines the maximum x- and y-coordinates of a tree drawing so we can set an appropriate viewBox for the SVG element.

maxXY :: Tree (Int, a) -> (Int, Int)
maxXY t = (maximum xs, length xs) where xs = fst . last <$> levels t

For the random tree feature, we have a function that generates a random expression with a given number of leaf nodes. We shall give it a random length whose expected value is 12.

randomExpr :: RandomGen g => g -> Int -> String
randomExpr g 1 = pure $ ['a'..'z'] !! mod (fst $ next g) 26
randomExpr g n = concat ["(", randomExpr gl m, ".", randomExpr gr (n - m), ")"]
  where (r, g1)  = next g
        (gl, gr) = split g1
        m        = 1 + mod r (n - 1)

The rest of the code is mostly tedious web stuff.

draw :: Elem -> Int -> Tree (Int, String) -> IO ()
draw soil y (Node (x, s) ks) = do
  forM_ ks $ \(Node (x2, _) _) -> appendChild soil =<< newElemSVG "line" `with`
    [ attr "x1" =: show x
    , attr "y1" =: show (40*y)
    , attr "x2" =: show x2
    , attr "y2" =: show (40*(y + 1))
    , attr "stroke" =: "black"
    ]
  if null ks then appendChild soil =<< newElemSVG "rect" `with`
    [ attr "x" =: show (x - 12)
    , attr "y" =: show (40*y - 12)
    , attr "width"  =: "24"
    , attr "height" =: "24"
    , attr "fill"   =: "white"
    , attr "stroke" =: "black"
    ]
  else appendChild soil =<< newElemSVG "circle" `with`
    [ attr "r"  =: "3"
    , attr "cx" =: show x
    , attr "cy" =: show (40*y)
    ]
  e <- newElemSVG "text" `with`
    [ attr "x" =: show x
    , attr "y" =: show (40*y)
    , attr "text-anchor" =: "middle"
    , attr "alignment-baseline" =: "central"
    ]
  setProp e "textContent" s
  appendChild soil e
  mapM_ (draw soil $ y + 1) ks

main :: IO ()
main = withElems ["soil", "input", "drawB", "randB"] $
    \[soil, input, drawB, randB] -> do
  let
    drawInput = do
      s <- getProp input "value"
      case parse expr "" s of
        Left err -> alert $ pack $ "Parse error: " ++ show err
        Right t -> do
          clearChildren soil
          let
            drawing = drawExpr t
            (x, y) = maxXY drawing
          setAttr soil "viewBox" $
            "-15 -15 " ++ show (x + 30) ++ " " ++ show (40*y + 30)
          draw soil 0 drawing
  drawInput
  void $ drawB `onEvent` Click $ const drawInput
  void $ randB `onEvent` Click $ const $ do
    g <- getStdGen
    let
      (gn, g') = split g
      (ge, g'') = split g'
      n = 1 + length (takeWhile (> 0) $ randomRs (0 :: Int, 10) gn)
      s = randomExpr ge n
    setProp input "value" s
    drawInput
    setStdGen g''

Command-line version

We read the first argument as a tree and print an SVG drawing of it. If no arguments are provided then a default tree is used.

#else
render :: Int -> Tree (Int, String) -> String
render depth (Node (x, s) ks) = concat $
  ((\(Node (x2, _) _) -> concat
    [ "<line x1='",  show $ x + x0, "' y1='", show $ depth*40 + y0
    , "' x2='", show $ x2 + x0, "' y2='", show $ (depth + 1)*40 + y0
    , "' stroke='black'/>"
    ]) <$> ks) ++
  (if null ks then
    [ "<rect x='", show $ x - 12 + x0, "' y='", show $ depth*40 - 12 + y0
    , "' width='24' height='24' stroke='black' fill='white'/>\n"
    ]
  else
    [ "<circle r='3' cx='", show $ x + x0, "' cy='", show $ depth*40 + y0
    , "' stroke='black'/>\n"
    ]) ++
  (if null s then [] else
    [ "<text text-anchor='middle' alignment-baseline='central'"
    , " x='", show $ x + x0, "' y='", show $ depth*40 + y0
    , "'>", s, "</text>\n"
    ]) ++
  (render (depth + 1) <$> ks)
  where (x0, y0) = (15, 15)

main :: IO ()
main = do
  args <- getArgs
  let
    (simpleMode, args1) = case args of
      ("-s":t) -> (True, t)
      _        -> (False, args)
    s = case args1 of (h:_) -> h
                      _     -> "(((1.2.3.4).5).(x.y)).(a.(b.((c.d).e).f))"
  case parse expr "" s of
    Left err -> hPutStrLn stderr $ show err
    Right t  -> putStrLn $ concat
      [ "<svg xmlns='http://www.w3.org/2000/svg'>"
      , render 0 $ if simpleMode
        then first (round . (50*)) <$> simpleDraw t
        else drawExpr t
      , "</svg>"
      ]
#endif

Ben Lynn blynn@cs.stanford.edu 💡