Monday, September 29, 2008

More A Sink Kronos programming

Today, we'll look at the implementation behind some of the code presented in this post. Let's start by looking at the implementation of the IAsyncResult interface. Because there are two pairs of BeginInvoke/EndInvoke, one that just calls a procedure and one that is designed to call a function, we'll start with a base implementation that will be shared between these two patterns. Here's the declaration for the TBaseAsyncResult class that forms the core of the IAsyncResult implementation:

  TBaseAsyncResult = class;

IGetAsyncResult = interface
['{E833C81B-0F16-4A67-BE7C-48E21A784B1A}']
function GetAsyncResult: TBaseAsyncResult;

property AsyncResult: TBaseAsyncResult read GetAsyncResult;
end;

TBaseAsyncResult = class(TInterfacedObject, IAsyncResult, IGetAsyncResult)
private
FCompleted: Boolean;
FSynchronous: Boolean;
FAsyncHandle: TSimpleEvent;
FInvokingThread: Cardinal;
procedure Complete;
procedure DoAsyncDispatch;
{ IGetAsyncResult }
function GetAsyncResult: TBaseAsyncResult;
{ IAsyncResult }
function GetAsyncHandleObject: THandleObject;
function GetCompletedSynchronously: Boolean;
function GetIsCompleted: Boolean;

property AsyncHandleObject: THandleObject read GetAsyncHandleObject;
protected
FInvokingException: TObject;
procedure AsyncDispatch; virtual; abstract;
function Invoke: IAsyncResult;
procedure InitDispatch;
procedure WaitForCompletion;
public
destructor Destroy; override;
end;

Let me explain the IGetAsyncResult interface. Since you cannot merely typecast an interface instance back to an object, we needed some way to get back to the actual implementation. Using this internal-only interface we can simply perform and "as" cast on an IAsyncResult variable or parameter to get access to the underlying implementation. If an exception is raised due to an invalid cast, most likely someone passed in another implementation of IAsyncResult. The the rule is that you can only use the result of BeginInvoke, with the corresponding EndInvoke. Calling the wrong EndInvoke method will cause an invalid type cast exception to be raised.


We could have done all of this by returning an object instance (say a pure abstract TAsyncResult type) and eliminated the interface altogether. However, that would have meant that the BeginInvoke caller would now be responsible for destroying the instance, it would also have actually complicated the implementation because There would now have to be two object instances because the lifetime of the instance returned by BeginInvoke() would have to specifically be independent from the actual asynchronous call. Otherwise, the destructor would have to block until the call completed. Another reason is that we'll re-use this interface in other asynchronous scenarios.


Let's cover the Invoke and WaitForCompletion methods first:

function TBaseAsyncResult.Invoke: IAsyncResult;
begin
FInvokingThread := GetCurrentThreadId;
Result := Self;
_AddRef();
TThread.Queue(nil, DoAsyncDispatch);
end;

procedure TBaseAsyncResult.WaitForCompletion;
var
LException: TObject;
begin
if not FCompleted then
ASyncHandleObject.WaitFor(INFINITE);
LException := InterlockedExchangePointer(Pointer(FInvokingException), nil);
if LException <> nil then
raise LException;
end;

The intended usage of the Invoke method is to be called immediately upon successful construction, preferably within the same expression:

  Result := TAsyncProcResult.Create(AProc).Invoke;

We don't schedule, or queue, the procedure until the Invoke call because that causes a race-condition. Because the queued procedure, DoAsyncDispatch, is a normal method pointer, it doesn't serve as an interface reference. There is a chance that the BeginInvoke caller could schedule a call, then immediately drop the IAsyncResult (or simply go out of scope) reference which would have freed the instance prematurely. We could have done some of the Invoke work in the constructor, but we'd also have to deal with the way TInterfacedObject does its construction (another topic, look in System.pas).


Invoke, first gets the invoking thread so we know which thread scheduled the call. We'll use this later to determine whether or not TThread.Queue made a direct method call because BeginInvoke was called from the main thread and is the basis for the CompletedSynchronously property on IAsyncResult. Next is assigns the Result. Here the compiler will perform a static cast of "Self" to get the interface and then call _AddRef(). The caller is guaranteed a valid reference now. Next we manually call _AddRef() because as indicated in the previous paragraph, the caller is free to drop the IAsyncResult reference at any point. By doing the _AddRef() before the call is queued, we can now guarantee that when DoAsyncDispatch is called, it is still a valid instance. Finally the call is queued and the function returns.


WaitForCompletion, serves a dual-fold purpose. The first is as the name implies, it merely blocks the caller until the queued method is completed. The other purpose is to properly propagate any exception that may have occurred during the call on the main thread. We'll see how this is handled when we look at DoAsyncDispatch() and Complete() next.

procedure TBaseAsyncResult.DoAsyncDispatch;
begin
FSynchronous := FInvokingThread = GetCurrentThreadId;
try
try
AsyncDispatch;
except
FInvokingException := AcquireExceptionObject
end;
finally
Complete;
_Release;
end;
end;

procedure TBaseAsyncResult.Complete;
begin
System.TMonitor.Enter(Self);
try
FCompleted := True;
if FAsyncHandle <> nil then
FAsyncHandle.SetEvent;
finally
System.TMonitor.Exit(Self);
end;
end;

The first thing that DoAsyncDispatch does is to determine whether or not this method was called synchronously. Again, this can happen if the TThread.Queue call above decided to directly execute the method. Then it proceeds to do the actual work of the call by calling AsyncDispatch, which is an abstract virtual method. The intent here is that only descendants should ever be instantiated and they must override AsyncDispatch. Here is also where we'll trap all exceptions. The AquireExceptionObject function is from the System unit which will find the current exception object from the raise list and return a reference to it. It will also drop that reference from the raise list which will mean it will no longer be automatically destroyed. When an except block is exited normally or another object instance is raised (raise; is a special case), the existing exception object instance is automatically freed. This prevents that instance from being freed and allows us to hang on to a reference for use in WaitForCompletion where we can raise it again.


The next call is to Complete which encapsulates the steps necessary to safely set the completed state and optionally release any callers blocked in WaitForCompletion. Here is uses the new "monitor" support available on any object instance for locking. Now back in DoAsyncDispatch, the last call is to _Release. This undoes the manual _AddRef call Invoke because we now know the TThread queue doesn't have a reference to this instance anymore. It is also entirely possible that this _Release call could cause this instance to be freed. This is safe because there is no other code following this call that accesses any instance fields. Only the local stack and exception frames are cleaned up and control is returned to the caller.


Here are the remaining methods of TBaseAsyncResult. They should be self-explanatory:

destructor TBaseAsyncResult.Destroy;
begin
FAsyncHandle.Free;
FInvokingException.Free;
inherited;
end;

function TBaseAsyncResult.GetAsyncHandleObject: THandleObject;
begin
if FAsyncHandle = nil then
begin
System.TMonitor.Enter(Self);
try
if FAsyncHandle = nil then
begin
FAsyncHandle := TSimpleEvent.Create();
if FCompleted then
FAsyncHandle.SetEvent;
end;
finally
System.TMonitor.Exit(Self);
end;
end;
Result := FAsyncHandle;
end;

function TBaseAsyncResult.GetAsyncResult: TBaseAsyncResult;
begin
Result := Self;
end;

function TBaseAsyncResult.GetCompletedSynchronously: Boolean;
begin
Result := FSynchronous and FCompleted;
end;

function TBaseAsyncResult.GetIsCompleted: Boolean;
begin
Result := FCompleted;
end;

Now that the base functionality is defined, we can create the specialized descendant classes that will provide the specific functionality for the procedure and function call cases. We'll start with the simple procedure call first:

type
TAsyncProcedureResult = class sealed (TBaseAsyncResult)
private
FAsyncProcedure: TProc;
protected
procedure AsyncDispatch; override;
constructor Create(const AAsyncProcedure: TProc);
end;

As you can imagine, this one is pretty simple now. All the "meat" of the functionality is in TBaseAsyncResult. All this class needs to do is to add an instance field for a place to store the TProc reference, and override the AsyncDispatch virtual method. There is some tricky code here :-):

constructor TAsyncProcedureResult.Create(const AAsyncProcedure: TProc);
begin
inherited Create;
FAsyncProcedure := AAsyncProcedure;
end;

procedure TAsyncProcedureResult.AsyncDispatch;
begin
FAsyncProcedure();
end;

The function call version gets a little more interesting and is made possible through the judicious application of a generic type:

  TAsyncFunctionResult<T> = class sealed (TBaseAsyncResult)
private
FAsyncFunction: TFunc<T>;
FRetVal: T;
protected
procedure AsyncDispatch; override;
function GetRetVal: T;
constructor Create(const AAsyncFunction: TFunc<T>);
end;

Here we've created a generic type from a non-generic ancestor. The reason is that we don't know what the result type will be. Like the TAsyncProcedureResult type, we need an instance field to hold on to the TFunc<T> reference. For this class, however, we also need to save off the result of the function call, thus the FRetVal: T; field. We also need to have a way to get this value, so I've added the GetRetVal: T; function. The reason for the function call rather than directly accessing the field is to ensure that the caller blocks until the actual async call is completed by calling WaitForCompletion.

constructor TAsyncFunctionResult<T>.Create(const AAsyncFunction: TFunc<T>);
begin
inherited Create;
FAsyncFunction := AAsyncFunction;
end;

procedure TAsyncFunctionResult<T>.AsyncDispatch;
begin
FRetVal := FAsyncFunction();
end;

function TAsyncFunctionResult<T>.GetRetVal: T;
begin
WaitForCompletion;
Result := FRetVal;
end;

The final step is to put all of this together. Now that the core functionality is handled, the implementation of BeginInvoke/EndInvoke becomes very simple:

function TControlHelper.BeginInvoke(const AProc: TProc): IAsyncResult;
begin
Result := TAsyncProcedureResult.Create(AProc).Invoke;
end;

function TControlHelper.BeginInvoke<T>(const AFunc: TFunc<T>): IAsyncResult;
begin
Result := TAsyncFunctionResult<T>.Create(AFunc).Invoke;
end;

procedure TControlHelper.EndInvoke(const ASyncResult: IAsyncResult);
begin
(AsyncResult as IGetAsyncResult).AsyncResult.WaitForCompletion;
end;

function TControlHelper.EndInvoke<T>(const AsyncResult: IAsyncResult): T;
begin
Result := ((AsyncResult as IGetAsyncResult).AsyncResult as TAsyncFunctionResult<T>).GetRetVal;
end;

Finally, I need to explain the little "gotcha" with the usage example from my last post. Understanding how anonymous methods "capture" the enclosing context will serve to protect you from potentially confusing runtime errors, especially when dealing with asynchronous code. When the compiler "captures" the context it does so by capturing variables, not values. This is an important and critical distinction. It essentially means that you still have to work to protect variables being accessed by multiple threads. In the example code, the SR: TSearchRec; variable is used both inside and outside the anonymous method. The outer context is looping and updating the SR structure, and the inner anonymous method context is reading the value. Because of the asynchronous nature of the whole BeginInvoke/EndInvoke thing, they are now running completely independent of one another. Because of the very thing we're trying to accomplish, this causes a race on the SR variable.  The example implicitly synchronizes access to SR by not allowing FindNext() to execute until after the FForm.EndInvoke(AR); call. In this case, it essentially "re-serializes" the code and doesn't really demonstrate truly async code. Next time I'll present a technique that allows you to capture the values instead of the variables so that we don't actually "capture" SR and can "stack" lots of asynchronous calls. We'll ignore the exception propagation thing for now.

Friday, September 26, 2008

A Sink Programming.

In this post I demonstrated how you can use an Anonymous Method to "synchronize" a background thread with the main UI thread. However there are times where you don't want to block execution of the thread but still want to have something happen in the main UI thread, asynchronously. For several releases of Delphi there has been the Queue() method on TThread. This allows you to schedule a TThreadMethod (just like Synchronize) to execute on the main UI thread. The difference is that Synchronize will block the caller until the UI thread completes the call, while Queue will return immediately. The problem with the Queue() method is that there is no simple way to know when the "queued" event is done executing. Queue() really only works in a "queue it and forget it" scenario. It is still a form of Async programming has limited usefulness. New to D2009, a new Queue() overload was added just like Synchronize() that takes "reference to procedure" type (the underlying type to which you can assign an Anonymous Method). Let's see if we can use the Queue() method as the underlying mechanism for doing some "A Sink" programming :-).

Since the point of this exercise is to create something that we can use to asynchronously run some code on the UI thread and also know when it is complete, we need something that will represent a particular scheduled event. For this we'll declare an interface:

type
IAsyncResult = interface
function GetAsyncHandleObject: THandleObject;
function GetCompletedSynchronously: Boolean;
function GetIsCompleted: Boolean;
property AsyncHandleObject: THandleObject read GetAsyncHandleObject;
property CompletedSynchronously: Boolean read GetCompletedSynchronously;
property IsCompleted: Boolean read GetIsCompleted;
end;

The next thing I'm going to do is a trick I'm going to use to "extend" the existing framework using a class helper. This allows us to syntactically add something to the framework we will eventually fold into the class itself. If you looked closely we did this for the Windows Vista extensions to VCL for D2007, then in D2009, all those added methods and properties were folded directly into the classes themselves and the helpers were removed. Any consumers of the D2007 code could remain unchanged. But, I digress...

  TControlHelper = class helper for TControl
public
function BeginInvoke(const AProc: TProc): IAsyncResult; overload;
function BeginInvoke<T>(const AFunc: TFunc<T>): IAsyncResult; overload;
procedure EndInvoke(const ASyncResult: IAsyncResult); overload;
function EndInvoke<T>(const AsyncResult: IAsyncResult): T; overload;
end;

The non-generic versions of the methods are used for normal procedure calls where you're only interested in calling some method that doesn't have a return value. The non-generic EndInvoke() method simply blocks until the method call is complete and returns. If the call is complete before calling EndInvoke, then it will return immediately. If you call BeginInvoke from the UI thread, then the Proc is executed synchronously, and CompletedSynchronously returns true. The generic versions of these methods allow you to asynchronously call a function and get the return value once it becomes available. Using the AsyncHandleObject property on IAsyncResult, you can dispatch several different async calls with BeginInvoke() and then wait for one or all of them to complete using the new class function THandlObject.WaitForMultiple() in SyncObjs.pas. Another thing is that if an exception is raised during the execution of the Proc, it will be caught, saved off and then re-raised when you call EndInvoke(). So you should always call EndInvoke() at some point.


So that's an overview of the exposed interfaces. In this example, this is the execute method of a TThread descendant that searches a disk folder for all the files and fills up a list box the results. An edit box on the form holds the path to the folder, which is read asynchronously. Yes, a regular "Synchronize" could have been used in this case, but I needed a quick little example:

procedure TBeginInvokeTestForm.TSearchThread.Execute;
var
SR: TSearchRec;
SH: Integer;
AR: IAsyncResult;
begin
if not Terminated then
begin
AR := FForm.BeginInvoke<string>(TFunc<string>(function: string
begin
Result := FForm.Edit1.Text;
end));
FFolder := FForm.EndInvoke<string>(AR);
SH := FindFirst(IncludeTrailingPathDelimiter(FFolder) + '*.*', faAnyFile, SR);
while (SH = 0) and not Terminated do
begin
//Sleep(10); // this makes the background thread go a little slower.
AR := FForm.BeginInvoke(procedure
begin
if not Terminated then
FForm.ListBox1.Items.Add(SR.Name);
end);
FForm.EndInvoke(AR);
SH := FindNext(SR);
end;
end;
end;

This shows both the usage of the generic and non-generic BeginInvoke/EndInvoke functions. In the next post we'll start to look at the implementation behind all of this. We'll also start to look at how this same pattern can be used for async IO using I/O Completion ports and a simple thread pool. There is also a little "gotcha" in the above code using this technique that I'll explain.


Yes, this this looks remarkably like what is available in .NET. Hey they co-opted a bunch of ideas from Delphi and VCL... why not return the favor :-).


NOTE: The TFunc<string>() cast in the first BeginInvoke<string>() call is to work around an overload resolution bug that is slated to be fixed in a future service pack.

Thursday, September 25, 2008

News Flash! Someone has already invented the wheel!

I'm sure most, if not all, of you have heard the phrase "Why reinvent the wheel?" Another one I'm very fond of is "Standing on the shoulders of giants." The latter was often heard coming from Distinguished Engineer, Anders Hejlsberg during his tenure at Borland. So, why reinvent the wheel when there are giant shoulders to stand on ;-)?

If you've been following along for the last year or so, I've been working on something I loosely refer to as the Delphi Parallel Library. While I've not been blogging about the DPL for a while, I have been going back to it from time to time as I gain more knowledge and more information into the fascinating subject of parallelism and task dispatching. There is a veritable wellspring of information flooding the internet on this subject, so keeping up on it has been a daunting task. One excellent source of information is Joe Duffy the lead for the Task Parallel Libary team at Microsoft. Another is an interesting project at MIT called Cilk (that's pronounced "silk"). There is a spin-off company that is set to deliver on the work started from the Cilk project, called Cilk Arts. There is the Thread Building Blocks library from Intel. Finally, there are many smaller upstart projects, open source and otherwise out there. The point is that this is a hot topic.

One of the interesting facets about all of the projects I mentioned above, is that none of these libraries are really about "threading." They're about breaking down iterative algorithms into smaller and smaller chunks of work or "tasks" and then working on ways to execute as many of these tasks in parallel on separate cores. They're about scalability and performance. They try and minimize the overhead associated with traditional threading models. As I get deeper into creating a Delphi Parallel Library, the plan is to cull together some of the key concepts from all the existing research and libraries out there and use them as a "blueprint" for the DPL. I want to take these concepts and apply them in a "Delphi-like" manner that would be familiar and intuitive to the Delphi programmer while leveraging as much of the newer Delphi language features such as Generics and Anonymous methods. Joe Duffy's series about creating a Work Stealing Queue and combining it with a thread pool in which forms the basis for the .NET TPL is an excellent source of information. This whole notion of distributed work queues and work stealing is also something that the Cilk project has also done. A lot of the current research on parallelism also relies heavily on the latest in lock-free programming and algorithms. An excellent source for various lock-free lists, stacks and queues for the Delphi & C# folks is Julian Bucknall's blog.

The best way to learn is to actually do. While playing with some of these ideas and concepts, I've found that I really need to have an easy way to do some simple performance measuring. In .NET, there is an interesting little class, System.Diagnostics.Stopwatch. As I begin to lay the foundation, some of these little utility classes will be heavily used. Why not provide a stopwatch gizmo for Delphi? While the stopwatch is just a thin wrapper around the QueryPerformanceCounter() and QueryPerformanceFrequency() APIs, it provides a neatly encapsulated way to start, stop, reset, for measuring elapsed time. When I looked at the various use-cases for the stopwatch, it became clear that implementing this thing as a Delphi class would be require more housekeeping overhead than I wanted. Instead, I opted for doing this as a record with methods. Sorry, no Generics or Anonymous Methods in this one :-). Here's a Delphi version of the declaration:

  TStopwatch = record
strict private
FElapsed: Int64;
FRunning: Boolean;
FStartTimeStamp: Int64;
function GetElapsedMilliseconds: Int64;
function GetElapsedTicks: Int64;
function GetElapsedSeconds: Double;
class procedure InitStopwatchType; static;
public
class function Create: TStopwatch; static;
class function GetTimeStamp: Int64; static;
procedure Reset;
procedure Start;
class function StartNew: TStopwatch; static;
procedure Stop;
property ElapsedMilliseconds: Int64 read GetElapsedMilliseconds;
property ElapsedTicks: Int64 read GetElapsedTicks;
property ElapsedSeconds: Double read GetElapsedSeconds;
property IsRunning: Boolean read FRunning;
public class var Frequency: Int64;
public class var IsHighResolution: Boolean;
end;

Here's a simple use case using Barry's little benchmarker class (only the actual benchmarking functions):

class function TBenchmarker.Benchmark(const Code: TProc; Iterations,
Warmups: Integer): Double;
var
SW: TStopwatch;
i: Integer;
begin
for i := 1 to Warmups do
Code;

SW := TStopwatch.StartNew;
for i := 1 to Iterations do
Code;
SW.Stop;

Result := SW.ElapsedSeconds / Iterations;
end;

function TBenchmarker.Benchmark<T>(const Name: string; const Code: TFunc<T>): T;
var
SW: TStopwatch;
i: Integer;
begin
for i := 1 to FWarmups do
Result := Code;

SW := TStopwatch.StartNew;
for i := 1 to FIterations do
Result := Code;
SW.Stop;

FReportSink(Name, SW.ElapsedSeconds / Iterations - FOverhead);
end;

It's minor, yes, but it does reduce some of the thinking about calculating the time interval and having to declare a start and stop local variable. Also, since it is a record that is allocated on the stack, there are no heap allocations and there is no need to do any cleanup. Although, I admit that it would be nice to have a type constructor in order to better handle the initialization of the IsHighResolution and Frequency class variables.


And now the implementation:

class function TStopwatch.Create: TStopwatch;
begin
InitStopwatchType;
Result.Reset;
end;

function TStopwatch.GetElapsedMilliseconds: Int64;
begin
Result := GetElapsedTicks div (Frequency div 1000);
end;

function TStopwatch.GetElapsedSeconds: Double;
begin
Result := ElapsedTicks / Frequency;
end;

function TStopwatch.GetElapsedTicks: Int64;
begin
Result := FElapsed;
if FRunning then
Result := Result + GetTimeStamp - FStartTimeStamp;
end;

class function TStopwatch.GetTimeStamp: Int64;
begin
if IsHighResolution then
QueryPerformanceCounter(Result)
else
Result := GetTickCount * 1000;
end;

class procedure TStopwatch.InitStopwatchType;
begin
if Frequency = 0 then
begin
IsHighResolution := QueryPerformanceFrequency(Frequency);
if not IsHighResolution then
Frequency := 1000;
end;
end;

procedure TStopwatch.Reset;
begin
FElapsed := 0;
FRunning := False;
FStartTimeStamp := 0;
end;

procedure TStopwatch.Start;
begin
if not FRunning then
begin
FStartTimeStamp := GetTimeStamp;
FRunning := True;
end;
end;

class function TStopwatch.StartNew: TStopwatch;
begin
InitStopwatchType;
Result.Reset;
Result.Start;
end;

procedure TStopwatch.Stop;
begin
if FRunning then
begin
FElapsed := FElapsed + GetTimeStamp - FStartTimeStamp;
FRunning := False;
end;
end;

Now here's a question I've been unable to get a definite answer to: When would QueryPerformanceCounter and QueryPerformanceFrequency ever fail on a standard Windows system? Does this only occur for things like WinCE, Embedded NT, or systems with strange HAL layers? Even though I've taken into account that, according to the documentation, it could fail, in practice when is that?

Another "MacGyver" moment

Or, "More fun with Generics and Anonymous methods".

I'll just leave it up to you whether or not these utility functions are useful, but here they are:

type
Obj = class
class procedure Lock(O: TObject; Proc: TProc); static;
class procedure Using<T: class>(O: T; Proc: TProc<T>); static;
end;

class procedure Obj.Lock(O: TObject; Proc: TProc);
begin
TMonitor.Enter(O);
try
Proc();
finally
TMonitor.Exit(O);
end;
end;

class procedure Obj.Using<T>(O: T; Proc: TProc<T>);
begin
try
Proc(O);
finally
O.Free;
end;
end;

While very contrived, here's how you could use Obj.Using():

procedure TForm1.Button1Click(Sender: TObject);
begin
Obj.Using<TStringList>(TStringList.Create, procedure (List: TStringList)
begin
List.Add('One');
List.Add('Two');
List.Add('Three');
List.Add('Four');
ListBox1.Items := List;
end);
end;

And here's how you could use Obj.Lock():

procedure TMyObject.Process;
begin
Obj.Lock(Self, procedure
begin
//code executing within critical section
end);
end; 

Thursday, September 18, 2008

A "Nullable" Post

Solving problems can be fun and challenging. What is really challenging is solving a problem using only the tools at hand. During the development of Delphi 2009, there were several language features that were dropped in favor of generics and anonymous methods. One such feature was what we called "managed records." This allowed for some very interesting things where you could now implement a constructor, destructor and several assignment class operators on a record that would be automatically called when the record came into scope, went out of scope, and was assigned to something or something was assigned to it (like another instance of itself). Early in the development cycle, I cobbled together a generic record type that would implement a Nullable<T> type. I modeled it after the Nullable<T> type present in the .NET framework. So in Delphi, it looked like this:

type
Nullable<T> = record
private
FValue: T;
FHasValue: Boolean;
function GetValue: T;
public
constructor Create(AValue: T);
function GetValueOrDefault: T; overload;
function GetValueOrDefault(Default: T): T; overload;
property HasValue: Boolean read FHasValue;
property Value: T read GetValue;
class operator Implicit(Value: Nullable<T>): T;
class operator Implicit(Value: T): Nullable<T>;
class operator Explicit(Value: Nullable<T>): T;
end;

However, it suffered from one major flaw that managed records would have cleanly solved. Mainly, that just declaring a variable of type Nullable<T> where T is some concrete type, would leave the resulting structure uninitialized.  Namely, the FHasValue field could contain random garbage, which would make it return false positives for the HasValue property. Managed records would have solved this because I could instruct the compiler to call my special constructor when a variable of this type came into scope, like when program flow entered a method with one of the local variables declared as Nullable<sometype>; Well, alas... that didn't happen so I figured I'd have to push the notion of having a Nullable<T> type to the next release... maybe :-).


During the 1980's I enjoyed watching a television show called "MacGyver." The premise was that the main character used his own clever ingenuity to get out of a jam using only the tools and items he had at his disposal at the moment. As sort of a pop-cultural icon, this MacGyver character would concoct some of the most unlikely solutions to get away from the "bad guys" each week. The jokes and hyperbole surrounding this was fun, "MacGyver saves the day by shutting down the critical nuclear reactor core with only a paper clip and plastic bandage!" Even though a lot of what this character did defied common-sense and logic, the premise was that using a little ingenuity, you can solve seemingly impossible problems.


In true MacGyver fashion I figured I'd dust off my Nullable<T> type and see if there is something about the available tools I can leverage to accomplish the above task. Part of what spurred this on was this post by Barry Kelly about implementing a "smart-pointer" in Delphi 2009. In this post his clever use of an interface reference to "catch" the in-to-and-out-of-scope transitions sparked an idea.


Let's go back to the key problem with the above type, all we really care about is that when a variable of type Nullable<sometype> is declared, we need to guarantee that the FHasValue field is initialized to a known value that we can interpret as "False." In the case where the above is embedded into a class, the above declaration would work fine because we know that all data areas of a class instance are specifically initialized to 0. This would mean that a field declared as FField: Nullable<Integer>; would initially be considered Null, meaning no value has ever been assigned to it. Even though the FValue field is also 0, that is a legitimate value so we cannot just say that '0' means Null. The real problem happens when we declare a local variable as Nullable<sometype>. Whatever happens to be on the stack at that location is what the values will be. Hmmm... this is different from interfaces, strings, variants, and dynamic arrays, which are always guaranteed to be initialized to a known value (zero) upon entry to the method. This is to ensure "exception safety" which I discussed a long time ago right here.


In the Nullable<T> case, all we really care about is knowing whether or not the FValue field was ever assigned. Enter interfaces; In this case we want the FHasValue field to always be initialized to a known state. Any field or variable declared as some interface type, the compiler ensures will be initialized to nil. Let's tweak the record above using the interface "paperclip."

type
Nullable<T> = record
private
FValue: T;
FHasValue: IInterface;
...
function GetHasValue: Boolean;
public
constructor Create(AValue: T);
...
property HasValue: Boolean read GetHasValue;
...
end;

So I've changed the FHasValue field to be an IInterface and changed the HasValue property to now call a getter method since HasValue is still a Boolean and FHasValue is an interface. All GetHasValue does is to return whether or not FHasValue is nil. Then we change the constructor to assign something to FHasValue, namely just create an instance of TInterfacedObject and assign the implemented IInterface to the field.

constructor Nullable<T>.Create(const AValue: T);
begin
FValue := AValue;
FHasValue := TInterfacedObject.Create;
end;

Now when you assign one Nullable<T> to another Nullable<T>, the compiler generates the proper code to ensure that the FHasValue field is copied properly and the the lifetime of the TInterfacedObject is properly managed. In other words, we don't leak the object.


But can this be made a little more efficient? After all, if we're "MacGyver-ing" this up, might as well go all the way. If we pull another trick out of MacGyver's bag-of-tricks (say a "plastic bandage"), we can eliminate the worry about leaking the object and get a slight performance gain to boot. This trick is used down inside Generics.Defaults.pas for creating a singleton interface. All we really care about is that FHasValue is properly initialized to nil and when we assign to the Nullable<T>, it gets properly set to a valid non-nil value. The compiler will still generate the AddRef and Release calls, but we can now be more efficient about it.


Down in the implementation section we declare the following:

function NopAddref(inst: Pointer): Integer; stdcall;
begin
Result := -1;
end;

function NopRelease(inst: Pointer): Integer; stdcall;
begin
Result := -1;
end;

function NopQueryInterface(inst: Pointer; const IID: TGUID; out Obj): HResult; stdcall;
begin
Result := E_NOINTERFACE;
end;

const
FlagInterfaceVTable: array[0..2] of Pointer =
(
@NopQueryInterface,
@NopAddref,
@NopRelease
);

FlagInterfaceInstance: Pointer = @FlagInterfaceVTable;

And in the Nullable<T> constructor, just change it to:

constructor Nullable<T>.Create(const AValue: T);
begin
FValue := AValue;
FHasValue := IInterface(@FlagInterfaceInstance);
end;

And now all instances of Nullable<T> will use the same "fake interface" instance, which will never leak since it isn't even on the heap. It is also a little faster since there isn't any bus-locking "Interlocked" calls to handle a reference-count since that is not even needed in this case. Here's the whole thing in all it's "MacGyver" gory... er, uh, glory :-):

unit Foo;

interface

uses Generics.Defaults, SysUtils;

type
Nullable<T> = record
private
FValue: T;
FHasValue: IInterface;
function GetValue: T;
function GetHasValue: Boolean;
public
constructor Create(AValue: T);
function GetValueOrDefault: T; overload;
function GetValueOrDefault(Default: T): T; overload;
property HasValue: Boolean read GetHasValue;
property Value: T read GetValue;

class operator NotEqual(ALeft, ARight: Nullable<T>): Boolean;
class operator Equal(ALeft, ARight: Nullable<T>): Boolean;

class operator Implicit(Value: Nullable<T>): T;
class operator Implicit(Value: T): Nullable<T>;
class operator Explicit(Value: Nullable<T>): T;
end;

procedure SetFlagInterface(var Intf: IInterface);

implementation

function NopAddref(inst: Pointer): Integer; stdcall;
begin
Result := -1;
end;

function NopRelease(inst: Pointer): Integer; stdcall;
begin
Result := -1;
end;

function NopQueryInterface(inst: Pointer; const IID: TGUID; out Obj): HResult; stdcall;
begin
Result := E_NOINTERFACE;
end;

const
FlagInterfaceVTable: array[0..2] of Pointer =
(
@NopQueryInterface,
@NopAddref,
@NopRelease
);

FlagInterfaceInstance: Pointer = @FlagInterfaceVTable;

procedure SetFlatInterface(var Intf: IInterface);
begin
Intf := IInterface(@FlagInterfaceInstance);
end;

{ Nullable<T> }

constructor Nullable<T>.Create(AValue: T);
begin
FValue := AValue;
SetFlagInterface(FHasValue);
end;

class operator Nullable<T>.Equal(ALeft, ARight: Nullable<T>): Boolean;
var
Comparer: IEqualityComparer<T>;
begin
if ALeft.HasValue and ARight.HasValue then
begin
Comparer := TEqualityComparer<T>.Default;
Result := Comparer.Equals(ALeft.Value, ARight.Value);
end else
Result := ALeft.HasValue = ARight.HasValue;
end;

class operator Nullable<T>.Explicit(Value: Nullable<T>): T;
begin
Result := Value.Value;
end;

function Nullable<T>.GetHasValue: Boolean;
begin
Result := FHasValue <> nil;
end;

function Nullable<T>.GetValue: T;
begin
if not HasValue then
raise Exception.Create('Invalid operation, Nullable type has no value');
Result := FValue;
end;

function Nullable<T>.GetValueOrDefault: T;
begin
if HasValue then
Result := FValue
else
Result := Default(T);
end;

function Nullable<T>.GetValueOrDefault(Default: T): T;
begin
if not HasValue then
Result := Default
else
Result := FValue;
end;

class operator Nullable<T>.Implicit(Value: Nullable<T>): T;
begin
Result := Value.Value;
end;

class operator Nullable<T>.Implicit(Value: T): Nullable<T>;
begin
Result := Nullable<T>.Create(Value);
end;

class operator Nullable<T>.NotEqual(const ALeft, ARight: Nullable<T>): Boolean;
var
Comparer: IEqualityComparer<T>;
begin
if ALeft.HasValue and ARight.HasValue then
begin
Comparer := TEqualityComparer<T>.Default;
Result := not Comparer.Equals(ALeft.Value, ARight.Value);
end else
Result := ALeft.HasValue <> ARight.HasValue;
end;

end.

Here's a little test that shows how it works:

procedure TestNullable;
var
NullInt: Nullable<Integer>;
NullInt2: Nullable<Integer>;
I: Integer;
begin
try
I := NullInt; // Exception raised here because NullInt was never assigned a value;
except
Writeln('Success');
end;
if not NullInt.HasValue then // Check if the FHasValue field is actually nil
NullInt := 10; // Exercise an Implicit operator
NullInt2 := NullInt; // Non-Null Nullable assigned to a Null Nullable makes it non-Null
if NullInt2 = 10 then // Exercise the Equal() class operator and the Implicit operator
Writeln('Success');
end;

begin
TestNullable;
end.

Now, in true "Raymond Chen" fashion, I'd like to relieve many of you from the compulsive need to post the following comments :-).


Pre-emptive snarky comments:


If you would just implement non-reference counted interfaces like any "modern" language (aka, C#, Java, etc..), you wouldn't have to resort to those ugly hacks to fool the compiler.


Ok, so why didn't you just implement this with the "?" syntax like C# and build it in the first place?

Monday, September 8, 2008

Retrofitting a classic

When Delphi 2 was released targeting the 32bit Windows API there were some new-to-Delphi features of the operating system that opened up some new possibilities; Pre-emptive multi-tasking and multi-threading. Coupled with this "new" concept of a "thread," Delphi introduced the new TThread class that was an abstract base class from which one would derived in order to "wrap" an operating system thread. Along with this new functionality, a rather contrived demo was also introduced that showcased this overall notion of "multi-threaded" programming by visually representing the speed differences between three different sorting algorithms, the Bubble-Sort, the Selection-Sort and the Quick-Sort. That demo has remained virtually unchanged ever since. One thing this demo also showed was a way to update the UI by "synchronizing" the sorting thread with the main, or UI thread. This ensured that any UI updates occurred only on the UI thread and only when the UI thread was ready. Even though this is not the best technique in terms of performance, it is safe.

This synchronization was accomplished through the use of the Synchronize method on TThread which took a parameterless method as the parameter which would then block the calling thread, switch to the main, or UI thread, call the method and then return. Since this method took no parameters it was often very tedious to pass information from the running thread over to the UI thread. As the thread demo showed, this involved communicating the parameter data with the foreground by storing this data in fields on the TThread descendant instance. This worked fine even though it was cumbersome.

Fast-forward to now. Delphi 2009 was recently announced and in fact just went "gold" yesterday, Sunday, September 7th, 2008 at around 4pm PDT. One of the more exciting language features to be included in this release aside from generics, is anonymous methods or more accurately, closures. The reason we call them anonymous methods is two-fold. The corresponding concept in .NET is also called an anonymous methods and in C++Builder a method pointer is already declared using the extended __closure keyword syntax. Rather than introduce confusion among both Delphi and C++Builder customers we opted for the "Anonymous Method" moniker. However, for those computer science purists, you can most certainly think of them as true closures, but I digress :-). Because of this new-fangled anonymous method thingy, a new Synchronize overload was added to TThread that now takes a parameterless anonymous method. Here's the old code in the thread demo that would update the UI:

{ Since DoVisualSwap uses a VCL component (i.e., the TPaintBox) it should never
be called directly by this thread. DoVisualSwap should be called by passing
it to the Synchronize method which causes DoVisualSwap to be executed by the
main VCL thread, avoiding multi-thread conflicts. See VisualSwap for an
example of calling Synchronize. }

procedure TSortThread.DoVisualSwap;
begin
with FBox do
begin
Canvas.Pen.Color := clBtnFace;
PaintLine(Canvas, FI, FA);
PaintLine(Canvas, FJ, FB);
Canvas.Pen.Color := clRed;
PaintLine(Canvas, FI, FB);
PaintLine(Canvas, FJ, FA);
end;
end;

{ VisusalSwap is a wrapper on DoVisualSwap making it easier to use. The
parameters are copied to instance variables so they are accessable
by the main VCL thread when it executes DoVisualSwap }

procedure TSortThread.VisualSwap(A, B, I, J: Integer);
begin
FA := A;
FB := B;
FI := I;
FJ := J;
Synchronize(DoVisualSwap);
end;

Notice that it takes two methods, the one called from within the thread and then the one that is "synchronized" with the UI thread. You need to declared these methods on the class, which can sometimes be tedious. Also, this code requires manual assignment of the instance fields from the parameters. What if you could just pass in the code to synchronize along with the local state? With anonymous methods this is easy. Here's the above code changed to use an inlined anonymous method:

procedure TSortThread.VisualSwap(A, B, I, J: Integer);
begin
Synchronize(procedure
begin
with FBox do
begin
Canvas.Pen.Color := clBtnFace;
PaintLine(Canvas, I, A);
PaintLine(Canvas, J, B);
Canvas.Pen.Color := clRed;
PaintLine(Canvas, I, B);
PaintLine(Canvas, J, A);
end;
end);
end;

By using an anonymous method, I've eliminated the DoVisualSwap method and removed the need for the FA, FB, FI, and FJ instance fields. Once you get used to the new syntax, this code is much easier to understand an use.

Wednesday, September 3, 2008

Multicast Events - the finale

In my previous two posts I presented a technique using the new generics language feature of Delphi 2009 to create a typesafe multicast event. In the previous post, I showed how you can create a TMulticastEvent<T> instance and assign it to an event handler for an existing event on a TComponent derived type. Using the existing FreeNotification mechanism, you didn't need to worry about explicitly freeing the multicast event object. What if one of the components in the sink event handlers in the multicast event list was freed? The good thing is that the FreeNotification mechanism works both ways. We can leverage this functionality again to handle cleanup from the other direction.

In order to implement the complete cleanup for the TComponentMulticastEvent<T>, we need to know when an event handler was added and when one was removed. To do this I added these two virtual methods to the base TMulticastEvent class (the base non-generic version). There are also helper functions, RemoveInstanceReferences() and IndexOfInstance() that can be used in descendants to remove all event handlers that refer to a specific object instance and check if a specific instance is being referenced within the list.

  TMulticastEvent = class
...
strict protected
procedure EventAdded(const AMethod: TMethod); virtual;
procedure EventRemoved(const AMethod: TMethod); virtual;
protected
procedure RemoveInstanceReferences(const Instance: TObject);
function IndexOfInstance(const Instance: TObject): Integer;
...
end;

They're not marked abstract because the immediate descendant, TMulticastEvent<T> doesn't need to and should not be forced to override them. They just do nothing in the base class. In the corresponding Add and Remove methods on TMulticastEvent, these virtual methods are then called with event just added or just removed. Now we override the EventAdded and EventRemoved methods in the TComponentMulticastEvent<T> class:

  TComponentMulticastEvent<T> = class(TMulticastEvent<T>)
...
private
FSink: TNotificationSink;
strict protected
procedure EventAdded(const AMethod: TMethod); override;
procedure EventRemoved(const AMethod: TMethod); override;
...
end;

We also need to hold a reference to the internal notification sink class in order to use its FreeNotification mechanism. Here's the implementation of these methods:

procedure TComponentMulticastEvent<T>.EventAdded(const AMethod: TMethod);
begin
inherited;
if TObject(AMethod.Data) is TComponent then
FSink.FreeNotification(TComponent(AMethod.Data));
end;

procedure TComponentMulticastEvent<T>.EventRemoved(const AMethod: TMethod);
begin
inherited;
if (TObject(AMethod.Data) is TComponent) and (IndexOfInstance(TObject(AMethod.Data)) < 0) then
FSink.RemoveFreeNotification(TComponent(AMethod.Data));
end;

And then the Notification on the private TNotificationSink class:

procedure TComponentMulticastEvent<T>.TNotificationSink.Notification(AComponent: TComponent;
Operation: TOperation);
begin
inherited;
if Operation = opRemove then
if AComponent = FOwnerComp then
Free
else
FEvent.RemoveInstanceReferences(AComponent);
end;

In the EventRemoved method we call IndexOfInstance() to ensure that there aren't multiple references to the same instance in the list before we remove the free notification hook. This is because FreeNotification will add the instance to its internal list only once.


So there you go, a multicast event that also performs full auto-cleanup for both the source and the sink instances. If the source instance goes away, the multicast event instance is automatically cleaned up. Likewise, if one of the sink event handlers' instances go away, it will automatically be removed from the list so there are no stale references. Of course, I'll remind the reader that with this implementation, it only works for TComponent derived instances. For non-TComponent derived instances, you can still use a descendant of TMulticastEvent<T> in a manner similar to TComponentMulticastEvent<T> mixed with, for instance, a technique described here.