This chapter introduces the concurrency mechanisms of the Concurrent ML library, or CML for short (see [CML]). The concurrency model is based on a collection of threads that communicate by sending messages rather than by sharing access to variables. CML does not use the "kernel" threads of the operating system. Instead, its implementation is based on coroutines, although a timer mechanism triggers pre-emptive scheduling of the threads. The coroutine mechanism is in turn based on the idea of continuations.
The CML library adds a collection of modules containing concurrent operations and also replaces some of the Basis and Utility library modules with thread-safe versions. Some reference documentation is provided with the library in HTML and PostScript formats. The textbook on CML is [Reppy]. This chapter covers the concepts in enough detail to carry on with the project in Part II.
There is a bug in the version of CML distributed with SML/NJ 110.0.7. To run all of these examples successfully you will need a copy from the CML home page dated later than 14 Jan 2001. Appendix C explains how to get it.
In the section called Tail Recursion as Iteration in Chapter 2 and the section called Tail Recursion for Finite State Machines in Chapter 2 we saw how a function call can be equivalent to, and as light-weight as, the goto of a language like C. Turning our viewpoint around, we can represent any transfer of control as a call to some function. In the simple flow-chart of Figure 6-1(a) there are implicit transfers of control from one block to the next. Data values are also passed implicitly. For example, the value of x is passed to the addition in the second block.
The transfer from the first to the second block can be modeled as a function call by considering everything that happens from the second block onwards as being the computation of a function. This function will be passed the value of x as an argument. The function is said to continue the execution of the program after the first block. This continuation function is shown in Figure 6-1(b) as the function C. The assignment to x can be reduced to a call to C passing the value 3 as the argument x of the continuation. The transfers of control inside the function C can be further decomposed into calls to continuation functions.
Since this function C represents all the rest of the execution of the program, it never returns, so any call to it must be a tail call. These are the essential characteristics of a continuation: a tail call is made to a continuation to continue the rest of the execution of the program, passing as arguments all the values that will be used by the rest of the program. A continuation function is often passed in as an argument to a function to give greater flexibility in the choice of direction for the program.
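The idea can be sketched in SML as follows. This is only an illustration: the body of the second block is an assumption (here it simply adds 1 to x), since the actual arithmetic is in the figure, and the name program is hypothetical. In a real program C would never return. In this toy the value of the last expression stands for "the end of the program".

```sml
(* C is the continuation of the first block: everything that happens
 * after x has been assigned.  The addition of 1 is an assumed
 * stand-in for the second block of the flow-chart. *)
fun C (x: int): int =
let
    val y = x + 1
in
    y       (* nothing is left to do, so this is the program's result *)
end

(* The first block.  The assignment "x = 3" has become a tail call
 * that passes 3 as the argument x of the continuation. *)
fun program () = C 3
```

Note that program does nothing after calling C. All of the remaining work lives inside the continuation.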
Continuations were introduced in the study of the semantics of programming languages. A notation called Denotational Semantics was developed in the late 1960s and 1970s by Scott and Strachey, among others, to describe formally the semantics of programming languages. See [Allison] for an introduction. Denotational semantics was based on lambda calculus, which gave it a functional style. This created the problem of how to represent the control-flow of an imperative language. Continuations were invented to model control-flow in lambda calculus. They are a primitive notion that can be used to model any flow of control, including long-distance transfers such as raising exceptions. For example, an if statement in C has a then part and an else part, and control is transferred to one of them. This can be modeled by having two continuations. The first contains the execution of the then part followed by the rest of the program. The second contains the execution of the else part followed by the rest of the program. The predicate of the if statement chooses which continuation to call.
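The if statement with its two continuations might be sketched like this. The names ifThenElse, classify and rest are hypothetical, chosen only for this illustration.

```sml
(* The predicate chooses which of the two continuations to call.
 * Both calls are tail calls. *)
fun ifThenElse (pred: bool) (thenK: unit -> 'a) (elseK: unit -> 'a): 'a =
    if pred then thenK() else elseK()

(* rest stands for the rest of the program after the if statement.
 * Each branch continuation does its own work and then carries on
 * with the rest. *)
fun classify (n: int) (rest: string -> 'a): 'a =
    ifThenElse (n < 0)
        (fn () => rest "negative")         (* then part + the rest *)
        (fn () => rest "non-negative")     (* else part + the rest *)
```

For example, classify ~5 (fn s => s ^ "!") selects the first continuation, which passes "negative" on to the rest of the program.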
At any point in the execution of the program we can define a current continuation. This is a hypothetical continuation function that represents the rest of the execution after that point in the program. If we could capture the current continuation as a real function we would obtain a snapshot of the execution of the program at that point. Some programming languages provide this feature, called call with current continuation or call/cc. Scheme provides it, for example, and so does SML/NJ. The call/cc operation captures the current continuation, from the point after the call/cc, and passes it as a function value to a function that you provide. Your function can do anything with the continuation, including storing it for later use. If the program later calls the continuation, execution resumes from just after the call/cc.
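Here is a small sketch of call/cc in SML/NJ, using the callcc and throw functions of the SMLofNJ.Cont structure. The function firstNegative is a hypothetical example. It uses the captured continuation only to escape early from a traversal, which is one of the tamer uses.

```sml
structure K = SMLofNJ.Cont

(* Return the first negative number in the list, if there is one.
 * k is the continuation of the callcc expression, that is, the
 * return from firstNegative.  Throwing to k abandons the rest of
 * the traversal and resumes after the callcc with the given value. *)
fun firstNegative (xs: int list): int option =
    K.callcc (fn k =>
        let
            fun check x = if x < 0 then K.throw k (SOME x) else ()
        in
            List.app check xs;
            NONE            (* reached only if nothing was thrown *)
        end)
```

If the function passed to callcc returns normally, its result becomes the value of the callcc expression, just as if the continuation had been called with it.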
When you continue a program by calling a continuation function you will probably be continuing the execution of a number of called functions which have piled up on the call stack. The continued execution will include returning from called functions to functions higher up the call stack. So the contents of the call stack are an essential part of the snapshot represented by the continuation. As an extra complication, you can call a continuation more than once, just as you can call any function more than once. This results in the mind-bending possibility of rerunning parts of your program, possibly restarting at arbitrary locations in the middle of functions. A function that you called once could return twice. So when call/cc makes its snapshot it needs, at least in principle, to save a copy of the call stack. This can be expensive. Some implementations of Scheme do exactly this.
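The following sketch shows a continuation being called more than once. The names saved, returns and demo are hypothetical. The continuation is stored in a reference and then thrown to repeatedly, so the single callcc expression returns three times. Values in the heap, such as the counter, are not rolled back when execution restarts.

```sml
structure K = SMLofNJ.Cont

val saved : int K.cont option ref = ref NONE
val returns = ref 0     (* how many times the callcc has returned *)

fun demo () : int =
    let
        (* Returns 1 the first time.  Every later throw to the saved
         * continuation makes it return again with a new value. *)
        val n = K.callcc (fn k => (saved := SOME k; 1))
    in
        returns := !returns + 1;
        if n < 3
        then (case !saved of
                  SOME k => K.throw k (n + 1)   (* restart from the callcc *)
                | NONE   => ())
        else ();
        !returns
    end
```

Calling demo() once yields 3, because the code after the callcc ran with n equal to 1, 2 and 3 in turn.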
The implementation of call/cc in SML/NJ is much simpler because the language itself is implemented using continuations. The compiler identifies all continuations in the program before generating machine code for each continuation. See [Appel2] for the details. The call stack is maintained in the heap, so it does not need to be destroyed as functions return, unlike the stack in the C language. It therefore costs nothing to save the call stack when a call/cc is done, since it is already in the heap. It is only necessary to retain a pointer to the top of the call stack so that the garbage collector does not reclaim it.
With call/cc being almost zero cost in SML/NJ you can use it frequently for all sorts of tricks. But unless you are careful you will end up with code that is as much spaghetti as the worst Fortran or assembly language program. I won't use call/cc directly in this book. Instead, its use will be confined to a stereotyped pattern provided by CML, which uses call/cc to implement switching execution between threads. The stereotyped pattern is the coroutine.