In this section I present an alternative implementation of semaphores using the M-variables that were introduced in the section called Synchronous Variables. This implementation is closer to the traditional implementation in languages like Java, where multiple threads cooperate as peers to guarantee the safety of the critical section. This contrasts with the implementation presented in the section called Semaphores, which relies on a central manager thread.
The simplest case is the mutex, which protects a critical section of code so that no more than one thread can execute it at a time. A mutex is a binary semaphore, that is, one with a count of only 0 or 1. The resource is the critical section and only one copy is available for use. Here is an implementation of a mutex using an M-variable.
    structure Mutex: MUTEX =
    struct
        structure SV = SyncVar

        type Mutex = bool SV.mvar

        fun create() = SV.mVarInit true

        fun lock mutex func =
        (
            SV.mTake mutex;
            let
                val r = func()
            in
                SV.mPut(mutex, true);
                r
            end
            handle x => (SV.mPut(mutex, true); raise x)
        )
    end
The M-variable either holds a value or it is empty. It doesn't matter what that value is. I've used a bool. The critical section is represented by the body of a function that is passed as an argument. The function doesn't call the acquire and release operations itself. This ensures that every acquire is matched by a release.
When a thread calls the lock function it attempts to take the value out of the mutex. If it succeeds then it can go on to run the argument function. It puts the value back into the mutex to release the lock. Other threads that call lock at the same time will block since the mutex is already empty. The lock function must be careful to release the mutex if the argument function raises an exception.
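The exception case can be checked with a small sketch. It assumes SML/NJ with the CML library loaded, and it repeats the lock function locally so that it stands alone. The names demo and main are hypothetical. The first critical section raises an exception; if the handler has released the mutex, the second call can still take it, and main should evaluate to (~1, 42).

```sml
(* Assumes SML/NJ with the CML library loaded,
   for example via CM.make "$cml/cml.cm". *)
structure SV = SyncVar

(* Local copy of the lock function from the Mutex structure,
   repeated here so that the sketch stands alone. *)
fun lock mutex func =
    (SV.mTake mutex;
     let val r = func () in SV.mPut (mutex, true); r end
     handle x => (SV.mPut (mutex, true); raise x))

(* Hypothetical demonstration: the first critical section raises an
   exception, the second must still be able to take the lock. *)
fun demo () =
    let
        val mutex = SV.mVarInit true
        val first = lock mutex (fn () => raise Fail "oops")
                    handle Fail _ => ~1
        val second = lock mutex (fn () => 42)
    in
        (first, second)
    end

(* CML operations must run under the CML scheduler. *)
fun main () =
    let
        val result = ref (0, 0)
    in
        ignore (RunCML.doit (fn () => (result := demo ();
                                       RunCML.shutdown OS.Process.success),
                             NONE));
        !result
    end
```

If the handler were omitted, the second call to lock would block for ever on the empty M-variable.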
Next I would like to generalise this for counting semaphores. At first glance you might try to use an M-variable with an integer value containing the number of available copies of the resource. When the count drops to zero you would leave the M-variable empty so that acquirers are forced to block when they read the count. But this creates a problem for the release operation. The release mustn't block on an empty M-variable, and it can't test whether the M-variable is empty without a race condition or without using some other mutex around the M-variable.
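The difficulty can be seen in a sketch of this rejected design. The names naiveAcquire and naiveRelease are hypothetical, and the code is deliberately flawed: it behaves when only one thread releases at a time, but the comments mark where concurrent releasers can collide. It assumes SML/NJ with the CML library loaded and must be called from a thread running under RunCML.doit.

```sml
(* Assumes SML/NJ with the CML library loaded. *)
structure SV = SyncVar

(* Rejected design: an empty M-variable stands for a count of zero. *)
fun naiveAcquire (rsrc : int SV.mvar) =
    let
        val n = SV.mTake rsrc       (* blocks while the count is zero *)
    in
        (* Leave the M-variable empty when the last copy is taken. *)
        if n > 1 then SV.mPut (rsrc, n - 1) else ()
    end

(* The release can't use mTake, which would block on an empty
   M-variable, so it polls instead. *)
fun naiveRelease (rsrc : int SV.mvar) =
    case SV.mTakePoll rsrc of
        SOME n =>
            (* Race: the M-variable is empty at this point, so a second
               releaser can poll it, see NONE and fill it first. *)
            SV.mPut (rsrc, n + 1)
      | NONE =>
            (* Race: a second releaser may also have seen NONE and
               filled the M-variable between the poll and this mPut. *)
            SV.mPut (rsrc, 1)
```

In either race the losing mPut finds the M-variable already full and raises the SyncVar.Put exception, so one of the releases is lost.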
In the following implementation I don't let the M-variable become empty. Instead I introduce a condition variable for acquirers to block on. The design is similar to the basic Java implementation which looks something like the following (see [Holub]).
    public synchronized void acquire() throws InterruptedException
    {
        while (count_ <= 0) { wait(); }
        count_--;
    }

    public synchronized void release()
    {
        if (count_++ == 0) { notify(); }
    }
The methods are synchronised to protect access to the count. If the count is zero then the calling thread is blocked and put onto a wait queue for the semaphore. The release method sends a notification to wake one of the waiting threads. It only does this when the count increments from zero since that is when there are acquirers waiting.
For the CML implementation I've reverted to having separate acquire and release functions, rather than the argument function of the mutex above. This is because there must be multiple sections of code in different threads using the acquired resources. I've also ignored time-outs to simplify the code. Here is the definition of the semaphore.
    structure Sema: SEMAPHORE =
    struct
        structure SV = SyncVar

        datatype Sema = Sema of {
            rsrc:   int SV.mvar,      (* count of resources avail *)
            cond:   unit CML.chan     (* signals a resource is avail *)
            }

        fun new n =
        (
            Sema {
                rsrc = SV.mVarInit(Int.max(0, n)),
                cond = CML.channel()
                }
        )
I use a channel to send notifications to waiting acquirers. This allows multiple outstanding notifications and multiple waiting acquirers, with each notification waking one acquirer. Here is the acquire function.
        fun acquire (sema as Sema {rsrc, cond}) =
        let
            val n = SV.mTake rsrc
        in
            if n = 0
            then
                (SV.mPut(rsrc, n); CML.recv cond; acquire sema)
            else
                SV.mPut(rsrc, n-1)
        end
The decrement of the count is synchronised by taking it out of the M-variable. Any other thread trying to acquire will be forced to wait on the M-variable. If the count is zero then the zero is put back into the M-variable to release it and the acquirer blocks waiting for a notification on the channel. When it is notified it tries to acquire the semaphore again. Here is the release function.
        fun release (Sema {rsrc, cond}) =
        let
            fun notify() = ignore(CML.spawn(fn() => CML.send(cond, ())))
            val n = SV.mTake rsrc
        in
            SV.mPut(rsrc, n+1);
            if n = 0 then notify() else ()
        end
Again it takes the count out of the M-variable for synchronisation and puts the incremented value back in. If the count was zero then it sends a notification on the channel. This is done with an auxiliary thread so that the release does not block. The notifications are queued by simply leaving auxiliary threads waiting for the opportunity to send.
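With separate acquire and release functions nothing forces a caller to pair them. As a sketch only, a caller could recover the guarantee that the mutex gave with a small wrapper. The name withResource is hypothetical and the wrapper is not part of the Sema structure. It takes the acquire and release functions as arguments so that it stands alone.

```sml
(* Hypothetical helper, not part of the Sema structure.
   It runs func between an acquire and a release, and releases
   even when func raises an exception. *)
fun withResource (acquire, release) sema func =
    let
        val () = acquire sema
        val r = func () handle x => (release sema; raise x)
    in
        release sema;
        r
    end
```

Assuming that the SEMAPHORE signature exports acquire and release, a call might look like withResource (Sema.acquire, Sema.release) sema (fn () => useResource()), where useResource stands for the caller's own code.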
An essential property of a correct implementation is that there be no waiting acquirers while the count is greater than zero. But proving this is rather tricky. All the different interleavings of steps over multiple acquirers and releasers must be considered. For example, what happens if one or more releases happen in between the mPut and recv cond in the acquire function? It appears to work correctly but I'm not certain. If I allowed time-outs it would be worse. If a waiting acquirer disappeared because of a time-out there would be an excess of notifications. Would the semaphore still work correctly?
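One way to explore such questions is to step through an interleaving with a small sequential model. The following is a sketch under stated assumptions, not a proof about the CML code. The state type and the function wake are hypothetical, and acquire and release here are pure model functions, not the ones in the Sema structure. The model treats each mTake to mPut bracket as one atomic step, and it lets a woken acquirer retry immediately.

```sml
(* A simplified sequential model, an assumption of this sketch.
     count   - the value held in the rsrc M-variable
     waiting - acquirers blocked in CML.recv cond
     pending - auxiliary threads blocked in CML.send *)
type state = {count: int, waiting: int, pending: int}

val start : state = {count = 0, waiting = 0, pending = 0}

(* An acquirer takes the count. At zero it puts it back and waits. *)
fun acquire ({count, waiting, pending} : state) : state =
    if count = 0
    then {count = 0, waiting = waiting + 1, pending = pending}
    else {count = count - 1, waiting = waiting, pending = pending}

(* A release increments the count and notifies only if it was zero. *)
fun release ({count, waiting, pending} : state) : state =
    {count = count + 1,
     waiting = waiting,
     pending = if count = 0 then pending + 1 else pending}

(* One notification reaches one waiter, which then retries acquire. *)
fun wake ({count, waiting, pending} : state) : state =
    if waiting > 0 andalso pending > 0
    then acquire {count = count,
                  waiting = waiting - 1,
                  pending = pending - 1}
    else {count = count, waiting = waiting, pending = pending}

(* Two acquirers block, two releases arrive before either acquirer
   wakes, then the single notification is delivered. *)
val final = wake (release (release (acquire (acquire start))))
```

In this model final is {count = 1, waiting = 1, pending = 0}. Only the first release found the count at zero, so only one notification was sent. If the model is faithful, this suggests an interleaving in which one acquirer is left waiting while the count is greater than zero, until some later release happens to find the count at zero again.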
I'm more confident that the implementation in the section called Semaphores is correct. The protocol for dealing with the count and the waiting acquirers is implemented within a sequential manager thread. This is much easier to reason about. This is the strength of the CML paradigm. Within the boundary of the manager thread of a concurrent object, the interactions between client threads are kept strictly sequential and are therefore much easier to understand.