Nested and parallel loops: a story of scope and control

Introduction

The reason for existence of generators is to write loops. Generators were developed in mid-1970s as an ultimate abstraction of iteration. This page demonstrates one insight of the generator abstraction: the correspondence between the binding scope of a loop variable and the control scope (the evaluation context) of its bound generator. This is the story of nested and parallel loops, inspired by Olin Shiver's ICFP 2005 paper `The Anatomy of a Loop'. Generators make Olin Shiver's story more vivid.

The generators in our story come from the formula yield = exceptions + non-determinism. We will mostly use the Haskell implementation because of more revealing types; porting to OCaml or other languages is straightforward. The basic generator interface defines one (abstract) type GenT e m a and two functions, for yielding and iterating -- along with the observation function.

     -- Basic interface
     
     type GenT e m a  -- abstract
     instance Monad m => Monad (GenT e m)
     instance MonadTrans (GenT e)
     
     type LoopBody e m = e -> m ()
     
     yield    :: Monad m => e -> GenT e m ()
     for_loop :: (MonadTrans t, Monad m, Monad (t m)) => GenT e (t m) () -> LoopBody e (t m) -> t m ()
     
     observe :: (MonadTrans t, Monad m) => t m a -> m a

Gen e m a is the type of computations that yield values of the type e in the monad m, eventually returning the value of the type a. The particular implementation of generators used here requires the monad m be MonadPlus. Likewise the monad transformer t that appears in the signatures of for_loop and observe must be LogicT. Other implementations of the basic generator interface lift these requirements.

The primitive generator yield (along with the fact that Gen e m is a monad) lets us write other generators. In this article, we will use two. One is the generator that enumerates values within a given range. Compared to the familiar Enum a => enumFromTo :: a -> a -> [a], the generator does not create any list, even lazily. Rather, it yields elements as they are being enumerated:

     enumFromToG :: (MonadPlus m, Enum a) => a -> a -> GenT a m ()
     enumFromToG xbeg xend | fromEnum xbeg <= fromEnum xend = 
        yield xbeg >> enumFromToG (succ xbeg) xend
     enumFromToG xbeg xend = return ()

The second sample generator yields labels from an in-order traversal of a tree with integer-labeled leaves. We have seen this generator many times in an earlier page on generators.

     type Label = Int
     in_order2 :: (MonadIO m, MonadPlus m) => Tree -> GenT Label m ()

To perform the iteration, we connect a generator with the consumer of the generated values, that is, with a loop body. The sample loop:

     sl1 = for_loop (enumFromToG 'a' 'e') $ \i -> lift . print $ i

corresponds to the for-loop in Pascal

     for i:='a' to 'e' do writeln(i)

The second sample loop

     for_loop (in_order2 (make_full_tree 3)) $ \i -> lift . print $ i

cannot be idiomatically expressed in Pascal. It can be written however, roughly in the same form, in CLU, which had generators. To execute the overall iterative computation we have to observe it, in a base monad such as IO.

The basic interface is an abstraction for simple generators, such as those in CLU, which suffice for many cases. Parallel loops however need more powerful generators, described by the advanced interface below. The basic interface was the abstraction of the advanced one. The advanced interface exposes the implementation of generators as yield = exceptions + non-determinism.

     -- Advanced interface
     type GenT e m a = MonadPlus m => ErrorT e m a    -- now the concrete type
     
     yield :: MonadPlus m => e -> GenT e m ()
     yield x = raise x `mplus` return ()
     
     for_loop :: (LogicT t, MonadPlus (t m), Monad m) => GenT e (t m) () -> LoopBody e (t m) -> t m ()
     for_loop gen body = iter Nothing $ catchError' gen >>= body
     
     catchError' :: MonadPlus m => GenT e m () -> m e
     
     iter:: (LogicT t, MonadPlus (t m), Monad m) => Maybe Int -> t m () -> t m ()
     
     msplit :: (LogicT t, MonadPlus (t m), Monad m) =>
         t m a -> t m (Maybe (a, t m a))

Since generators combine exceptions (the ErrorT monad) and non-determinism (the LogicT monad), they inherit the operations of those monads. Specifically, catchError', an instance of catch, obtains the value yielded by a generator. The combinator iter, a particular case of bagOfN, takes a non-deterministic computation t m () and executes all of its choices, one-by-one. If the first argument of iter is Just n, only the first n choices are executed and the rest are discarded. Effectively, iter `runs' a non-deterministic computation, actualizing all its non-deterministic choices. For example, desugaring the first sample loop gives the following advanced code:

     iter Nothing $ do
            i <- catchError' $ enumFromToG 'a' 'e'
            lift . print $ i

The loop body non-deterministically obtains the value yielded by the generator, and prints it. The iter Nothing line -- the loop header, so to speak -- makes the body run through all choices, that is, all generated values.

References

Olin Shivers: The Anatomy of a Loop. A story of scope and control
ICFP 2005

Generators: yield = exceptions + non-determinism
The detailed description of the generators used on this page

GenLoop.hs [5K]
The complete code for this article

Nested Scopes and Nested Loops

Our story starts with putting one loop within the body of another. For example, we nest the two sample loops:

     nl1 = observe $ 
            for_loop (in_order2 (make_full_tree 3)) $ \i ->
              for_loop (enumFromToG 'a' 'e') $ \j ->  
                lift . print $ (i,j)

The print-out confirms -- although we have never doubted -- that for each value of i, j ranges over all its possible values.

Since the inner j-loop is within the i-loop body, the variable i scopes over the entire inner loop, including its generator. That is, the inner loop generator may refer to the outer-loop variable, making loop dependence obvious:

     nl2 = observe $ 
            for_loop (in_order2 (make_full_tree 3)) $ \i ->
              for_loop (enumFromToG i 3) $ \j ->  
               lift . print $ (i,j)

The binding scope of a loop variable matches the scope of iteration of the corresponding loop.

The story picks up as we desugar the nested loop nl1 and rearrange the code using equational reasoning. We obtain a different way of writing nested loops, now using the advanced interface:

     nl1' = observe $ 
             iter Nothing $ do               -- loop header
               i <- catchError' (in_order2 (make_full_tree 3))
               j <- catchError' (enumFromToG 'a' 'e')
               lift . print $ (i,j)          -- loop body

which we will write in the abbreviated form as

     nl1' = observe $ 
             iter Nothing $ do
               i <- e1
               j <- e2
               body

The binding and control scopes are a bit harder to see now. Recall that the Haskell notation

     do
       i <- e1
       j <- e2
       body

is a syntactic sugar for

     e1 >>= (\i -> e2 >>= (\j -> body))

exposing that i scopes over body and over e2 -- as was the case in the original nl1. The repeated do-bindings corresponds to what other languages call `nested let' or `repeated let':

     let i = e1 in
     let j = e2 in
     body

The control scope also becomes clear if we look at the evaluation context of the first generator e1:

     do
       i <- []
       j <- e2
       body

The context includes the generator e2. The non-deterministic expression e1 replicates its context; it plugs the first yielded value into the first context replica, the second yielded value into the second replica, etc. Each replica has the copy of e2, the generator for the j-loop, which runs anew in each replicated copy. The body is thus executed at each point in the Cartesian product of the domain of i and the domain of j. The loop nl1' indeed has the same behavior as the original nl1. The binding scope of a loop variable and the control scope -- the evaluation context of the generator -- still match.

Desugaring of nl2 re-iterates the point that the second generator is within the binding scope of the loop variable i associated with the first generator.

     nl2' = observe $ 
             iter Nothing $ do               -- loop header
               i <- catchError' (in_order2 (make_full_tree 3))
               j <- catchError' (enumFromToG i 3)
               lift . print $ (i,j)          -- loop body

The binding scope is lexical and hence is easier to see than the control scope. Since the two scopes match, we can tell which generator holds control over which by examining the binding scopes of their loop variables.

Although the desugared form of for_loop is less pretty to look at, it is more insightful. There is only one binding form for loop variable: do-binding, which is just the lambda-binding. Lambda provides binding and generators provide iteration. The binding scope of a loop variable matches the control scope of the iteration.

Parallel Bindings and Parallel Loops

We have seen that nested let- or do-bindings with generators as binding expressions correspond to nested loops. Many programming languages offer so-called parallel bindings, such as the following OCaml form:

     let i = e1 and j = e2 in body

Although i still scopes over body it no longer scopes over e2. If e2 contains an occurrence of i, it would remain free. The following two expressions should help see the difference in binding scope:

     let i = 1 in
       let i = i + 1 in let j = i + i in (i,j)
     
     let i = 1 in
       let i = i + 1 and j = i + i in (i,j)

The former, with all nested lets, evaluates to (2,4); the latter evaluates to (2,2).

The bindings are called parallel because e1 and e2 no longer have any data dependency and may be evaluated in any order: e2 can be evaluated without waiting for the value to be bound to i. The expressions e1 and e2 may even be evaluated in parallel, and the program should not -- ought not to -- notice.

Our story comes to a decisive point. Can we realize parallel bindings, first in Haskell, to maintain not only data independence of e1 and e2 but also their control independence? Can we use parallel bindings to implement parallel loops?

Parallel let such as let i = e1 and j = e2 in body may be regarded, in the first approximation, as a syntactic sugar for (fun (i,j) -> body) (e1,e2). The Haskell implementation starts off easy: pair e1 e2 >>= \ (i,j) -> body. What remains is to determine the zip-like combinator to pair-up two generators. The naive implementation:

     seq2G :: Monad m => m t -> m t1 -> m (t, t1)
     seq2G e1 e2 = do
       i <- e1
       j <- e2
       return (i,j)

alas is not satisfactory: e2 is in the evaluation context of e1, and hence control-dependent on e1. Somehow we have to exclude e1 and e2 from the evaluation contexts of each other, we have to isolate non-deterministic effects of e1 and e2 and prevent them from affecting each other.

Fortunately, LogicT lets us run a non-deterministic computation in an isolated `sand-box'.

     msplit :: (LogicT t, MonadPlus (t m), Monad m) =>
         t m a -> t m (Maybe (a, t m a))

That is, msplit e1 runs e1 in the sandbox and reports the outcome: Nothing if e1 fails or Just (x,e1') if e1 produced x; the computation e1' represents the remaining non-deterministic choices. Technically speaking, msplitreifies the non-determinism effect as a Maybe (a, t m a) value. This sandbox isolation is exactly what is needed for the parallel combinator:

     par2G :: (LogicT t, MonadPlus (t m), Monad m) =>
     	  t m t1 -> t m t2 -> t m (t1, t2)
     par2G e1 e2 = do
       iR <- msplit e1
       jR <- msplit e2
       case (iR,jR) of
        (Just (i,e1'), Just (j,e2')) -> return (i,j) `mplus` par2G e1' e2'
        _                            -> mzero

We determine the first choice of e1 and, separately, the first choice of e2, and pair them up as the first choice of the par2G e1 e2.

The story concludes with the parallel loop

     pl1' = observe $ 
             iter Nothing $ do              -- loop header
               (i,j) <- par2G (catchError' (in_order2 (make_full_tree 3)))
                              (catchError' (enumFromToG 'a' 'e'))
               lift . print $ (i,j)         -- loop body

Lexically, the two generator expressions are not in scope of either loop variable binding. The print-out shows the two generators indeed run in parallel. Neither of the two generators is in the control scope of the other. Binding and control scopes match again.

Our pairing combinator par2G does not report which of the two generators finished first. We describe a more general combinator below.

Version

The current version is July 2012.

References

GenLoop.hs [5K]
The complete code for this article

Running generators side-by-side

The designers of CLU deliberately restricted the expressiveness of iterators (i.e., generators) to implement them efficiently on the single stack: ``a yield effectively calls the loop body, which returns to the iterator when it is finished.'' The loop body invoked by yield may contain a nested loop, with an inner generator. The loop body may also yield, that is, invoke the body of some outer loop. The CLU design thus permits nesting of iterators and loops; however, it does not support parallel loops, running iterators side-by-side. Many other languages with generators have made the same design trade-off.

The Haskell and OCaml generators derived on this page effectively run on the spaghetti stack, one may say, which they inherit from the implementation of non-determinism. The Haskell and OCaml generators hence can run side-by-side. Recall that our generators can be represented as -- reified into -- a lazy stream (see msplit). Writing a zipWith-like function to process two streams in parallel is straightforward then.

We demonstrate the parallel composition of Haskell and OCaml generators on the famous ``same-fringe problem'', determining if two binary trees have the same fringe, that is, yield the same sequence of labels when traversed in a particular order. Crucially, we must stop the traversal as soon as the mismatch is found, returning it. To make clear the early traversal termination, we print the labels as we traverse the tree. The printing operation makes the traversal effectful and hence precludes lazy lists.

We solve the problem by defining the parallel loop -- the zipWith-like combinator for streams. We show the Haskell code first. To let the zipping function deal with the streams of unequal size, we make its argument types to be Maybe a, with Nothing signifying stream's end:

     zipWithL :: (Monad m, LogicT t, MonadPlus (t m)) =>
                 (Maybe a -> Maybe b -> c) -> t m a -> t m b -> t m c
     zipWithL f m1 m2 = do
      r1 <- msplit m1
      r2 <- msplit m2
      case (r1,r2) of
       (Nothing, Nothing)    -> return $ f Nothing Nothing
       (Just (v1,_),Nothing) -> return $ f (Just v1) Nothing
       (Nothing,Just (v2,_)) -> return $ f Nothing (Just v2)
       (Just (v1,t1),Just (v2,t2)) -> return (f (Just v1) (Just v2))
                                      `mplus` zipWithL f t1 t2

The function same_fringe below takes two generator expressions and runs them side-by-side, generating the stream of mismatches. The one-element stream of (Nothing,Nothing) denotes the complete match.

     same_fringe :: (Monad m, LogicT t, MonadPlus (t m), Eq a) =>
     	        ErrorT a (t m) () -> ErrorT a (t m) () -> t m (Maybe a, Maybe a)
     same_fringe m1 m2 = zipWithL (,) (catchError' m1) (catchError' m2) >>= check
      where
      check r@(Nothing,_) = return r
      check r@(_,Nothing) = return r
      check r@(Just x,Just y) | x /= y = return r
      check _             = mzero

The following function compares the in-order sequences of labels of two trees, returning the first mismatch.

     sfringe_test :: MonadIO m => Tree -> Tree -> m (Maybe Label, Maybe Label)
     sfringe_test t1 t2 = observe $ same_fringe (in_order2 t1) (in_order2 t2)

The generator in_order2, described earlier on this page, prints the trace of the traversal. The trace reveals that the two trees are indeed traversed side-by-side; the traversal stops as soon as the first mismatch is found. We may also print, say, the first three mismatches.

The OCaml implementation is the same. We define a parallel loop over two generators:

     let for_loop_2 p gen1 gen2 body : unit =
      let rec loop = function
        | (Nil,Nil)         -> body None None
        | (Cons (v1,_),Nil) -> body (Some v1) None
        | (Nil,Cons (v2,_)) -> body None (Some v2)
        | (Cons (v1,t1), Cons (v2,t2)) -> 
     	body (Some v1) (Some v2); loop (t1(),t2())
      in loop (msplit p gen1, msplit p gen2)

employing it to write the generator for mismatches of two other generators gen1 and gen2:

     let pmismatch : ('a option * 'a option) stream Delimcc.prompt = new_prompt ()
     let same_fringe p gen1 gen2 =
      for_loop_2 p gen1 gen2 (fun x y ->
        match (x,y) with
        | (None,_) | (_,None) as r              -> yield pmismatch r
        | (Some x,Some y) as r when not (x = y) -> yield pmismatch r
        | _  -> ())

We may run same_fringe as any other generator, to obtain the first mismatch or the first n mismatches.

Version

The current version is June 2011.

References

GenCode.hs [15K]
The complete code for the Haskell examples.

test_gen.ml [7K]
The complete code for the OCaml examples.

Last updated July 4, 2012

This site's top page is http://okmij.org/ftp/

oleg-at-okmij.org
Your comments, problem reports, questions are very welcome!

Converted from HSXML by HSXML->HTML