The Oracle at Delphi

Thursday, September 25, 2008

News Flash! Someone has already invented the wheel!

I'm sure most, if not all, of you have heard the phrase "Why reinvent the wheel?" Another one I'm very fond of is "Standing on the shoulders of giants." The latter was often heard coming from Distinguished Engineer, Anders Hejlsberg during his tenure at Borland. So, why reinvent the wheel when there are giant shoulders to stand on ;-)?

If you've been following along for the last year or so, I've been working on something I loosely refer to as the Delphi Parallel Library. While I've not been blogging about the DPL for a while, I have been going back to it from time to time as I gain more knowledge and more information into the fascinating subject of parallelism and task dispatching. There is a veritable wellspring of information flooding the internet on this subject, so keeping up on it has been a daunting task. One excellent source of information is Joe Duffy the lead for the Task Parallel Libary team at Microsoft. Another is an interesting project at MIT called Cilk (that's pronounced "silk"). There is a spin-off company that is set to deliver on the work started from the Cilk project, called Cilk Arts. There is the Thread Building Blocks library from Intel. Finally, there are many smaller upstart projects, open source and otherwise out there. The point is that this is a hot topic.

One of the interesting facets about all of the projects I mentioned above, is that none of these libraries are really about "threading." They're about breaking down iterative algorithms into smaller and smaller chunks of work or "tasks" and then working on ways to execute as many of these tasks in parallel on separate cores. They're about scalability and performance. They try and minimize the overhead associated with traditional threading models. As I get deeper into creating a Delphi Parallel Library, the plan is to cull together some of the key concepts from all the existing research and libraries out there and use them as a "blueprint" for the DPL. I want to take these concepts and apply them in a "Delphi-like" manner that would be familiar and intuitive to the Delphi programmer while leveraging as much of the newer Delphi language features such as Generics and Anonymous methods. Joe Duffy's series about creating a Work Stealing Queue and combining it with a thread pool in which forms the basis for the .NET TPL is an excellent source of information. This whole notion of distributed work queues and work stealing is also something that the Cilk project has also done. A lot of the current research on parallelism also relies heavily on the latest in lock-free programming and algorithms. An excellent source for various lock-free lists, stacks and queues for the Delphi & C# folks is Julian Bucknall's blog.

The best way to learn is to actually do. While playing with some of these ideas and concepts, I've found that I really need to have an easy way to do some simple performance measuring. In .NET, there is an interesting little class, System.Diagnostics.Stopwatch. As I begin to lay the foundation, some of these little utility classes will be heavily used. Why not provide a stopwatch gizmo for Delphi? While the stopwatch is just a thin wrapper around the QueryPerformanceCounter() and QueryPerformanceFrequency() APIs, it provides a neatly encapsulated way to start, stop, reset, for measuring elapsed time. When I looked at the various use-cases for the stopwatch, it became clear that implementing this thing as a Delphi class would be require more housekeeping overhead than I wanted. Instead, I opted for doing this as a record with methods. Sorry, no Generics or Anonymous Methods in this one :-). Here's a Delphi version of the declaration:

  TStopwatch = record
  strict private
    FElapsed: Int64;
    FRunning: Boolean;
    FStartTimeStamp: Int64;
    function GetElapsedMilliseconds: Int64;
    function GetElapsedTicks: Int64;
    function GetElapsedSeconds: Double;
    class procedure InitStopwatchType; static;
  public
    class function Create: TStopwatch; static;
    class function GetTimeStamp: Int64; static;
    procedure Reset;
    procedure Start;
    class function StartNew: TStopwatch; static;
    procedure Stop;
    property ElapsedMilliseconds: Int64 read GetElapsedMilliseconds;
    property ElapsedTicks: Int64 read GetElapsedTicks;
    property ElapsedSeconds: Double read GetElapsedSeconds;
    property IsRunning: Boolean read FRunning;
  public class var Frequency: Int64;
  public class var IsHighResolution: Boolean;
  end;

Here's a simple use case using Barry's little benchmarker class (only the actual benchmarking functions):

class function TBenchmarker.Benchmark(const Code: TProc; Iterations,
  Warmups: Integer): Double;
var
  SW: TStopwatch;
  i: Integer;
begin
  for i := 1 to Warmups do
    Code;

  SW := TStopwatch.StartNew;
  for i := 1 to Iterations do
    Code;
  SW.Stop;

  Result := SW.ElapsedSeconds / Iterations;
end;

function TBenchmarker.Benchmark<T>(const Name: string; const Code: TFunc<T>): T;
var
  SW: TStopwatch;
  i: Integer;
begin
  for i := 1 to FWarmups do
    Result := Code;
  
  SW := TStopwatch.StartNew;
  for i := 1 to FIterations do
    Result := Code;
  SW.Stop;
  
  FReportSink(Name, SW.ElapsedSeconds / Iterations - FOverhead);
end;

It's minor, yes, but it does reduce some of the thinking about calculating the time interval and having to declare a start and stop local variable. Also, since it is a record that is allocated on the stack, there are no heap allocations and there is no need to do any cleanup. Although, I admit that it would be nice to have a type constructor in order to better handle the initialization of the IsHighResolution and Frequency class variables.

And now the implementation:

class function TStopwatch.Create: TStopwatch;
begin
  InitStopwatchType;
  Result.Reset;
end;

function TStopwatch.GetElapsedMilliseconds: Int64;
begin
  Result := GetElapsedTicks div (Frequency div 1000);
end;

function TStopwatch.GetElapsedSeconds: Double;
begin
  Result := ElapsedTicks / Frequency;
end;

function TStopwatch.GetElapsedTicks: Int64;
begin
  Result := FElapsed;
  if FRunning then
    Result := Result + GetTimeStamp - FStartTimeStamp;
end;

class function TStopwatch.GetTimeStamp: Int64;
begin
  if IsHighResolution then
    QueryPerformanceCounter(Result)
  else
    Result := GetTickCount * 1000;
end;

class procedure TStopwatch.InitStopwatchType;
begin
  if Frequency = 0 then
  begin
    IsHighResolution := QueryPerformanceFrequency(Frequency);
    if not IsHighResolution then
      Frequency := 1000;
  end;
end;

procedure TStopwatch.Reset;
begin
  FElapsed := 0;
  FRunning := False;
  FStartTimeStamp := 0;
end;

procedure TStopwatch.Start;
begin
  if not FRunning then
  begin
    FStartTimeStamp := GetTimeStamp;
    FRunning := True;
  end;
end;

class function TStopwatch.StartNew: TStopwatch;
begin
  InitStopwatchType;
  Result.Reset;
  Result.Start;
end;

procedure TStopwatch.Stop;
begin
  if FRunning then
  begin
    FElapsed := FElapsed + GetTimeStamp - FStartTimeStamp;
    FRunning := False;
  end;
end;

Now here's a question I've been unable to get a definite answer to: When would QueryPerformanceCounter and QueryPerformanceFrequency ever fail on a standard Windows system? Does this only occur for things like WinCE, Embedded NT, or systems with strange HAL layers? Even though I've taken into account that, according to the documentation, it could fail, in practice when is that?

Another "MacGyver" moment

Or, "More fun with Generics and Anonymous methods".

I'll just leave it up to you whether or not these utility functions are useful, but here they are:

type
  Obj = class
    class procedure Lock(O: TObject; Proc: TProc); static;
    class procedure Using<T: class>(O: T; Proc: TProc<T>); static;
  end;

class procedure Obj.Lock(O: TObject; Proc: TProc);
begin
  TMonitor.Enter(O);
  try
    Proc();
  finally
    TMonitor.Exit(O);
  end;
end;

class procedure Obj.Using<T>(O: T; Proc: TProc<T>);
begin
  try
    Proc(O);
  finally
    O.Free;
  end;
end;

While very contrived, here's how you could use Obj.Using():

procedure TForm1.Button1Click(Sender: TObject);
begin
  Obj.Using<TStringList>(TStringList.Create, procedure (List: TStringList)
    begin
      List.Add('One');
      List.Add('Two');
      List.Add('Three');
      List.Add('Four');
      ListBox1.Items := List;
    end);
end;

And here's how you could use Obj.Lock():

procedure TMyObject.Process;
begin
  Obj.Lock(Self, procedure
    begin
      //code executing within critical section
    end);
end;

Thursday, September 18, 2008

A "Nullable" Post

Solving problems can be fun and challenging. What is really challenging is solving a problem using only the tools at hand. During the development of Delphi 2009, there were several language features that were dropped in favor of generics and anonymous methods. One such feature was what we called "managed records." This allowed for some very interesting things where you could now implement a constructor, destructor and several assignment class operators on a record that would be automatically called when the record came into scope, went out of scope, and was assigned to something or something was assigned to it (like another instance of itself). Early in the development cycle, I cobbled together a generic record type that would implement a Nullable<T> type. I modeled it after the Nullable<T> type present in the .NET framework. So in Delphi, it looked like this:

type
  Nullable<T> = record
  private
    FValue: T;
    FHasValue: Boolean;
    function GetValue: T;
  public
    constructor Create(AValue: T);
    function GetValueOrDefault: T; overload;
    function GetValueOrDefault(Default: T): T; overload;
    property HasValue: Boolean read FHasValue;
    property Value: T read GetValue;
    class operator Implicit(Value: Nullable<T>): T;
    class operator Implicit(Value: T): Nullable<T>;
    class operator Explicit(Value: Nullable<T>): T;
  end;

However, it suffered from one major flaw that managed records would have cleanly solved. Mainly, that just declaring a variable of type Nullable<T> where T is some concrete type, would leave the resulting structure uninitialized. Namely, the FHasValue field could contain random garbage, which would make it return false positives for the HasValue property. Managed records would have solved this because I could instruct the compiler to call my special constructor when a variable of this type came into scope, like when program flow entered a method with one of the local variables declared as Nullable<sometype>; Well, alas... that didn't happen so I figured I'd have to push the notion of having a Nullable<T> type to the next release... maybe :-).

During the 1980's I enjoyed watching a television show called "MacGyver." The premise was that the main character used his own clever ingenuity to get out of a jam using only the tools and items he had at his disposal at the moment. As sort of a pop-cultural icon, this MacGyver character would concoct some of the most unlikely solutions to get away from the "bad guys" each week. The jokes and hyperbole surrounding this was fun, "MacGyver saves the day by shutting down the critical nuclear reactor core with only a paper clip and plastic bandage!" Even though a lot of what this character did defied common-sense and logic, the premise was that using a little ingenuity, you can solve seemingly impossible problems.

In true MacGyver fashion I figured I'd dust off my Nullable<T> type and see if there is something about the available tools I can leverage to accomplish the above task. Part of what spurred this on was this post by Barry Kelly about implementing a "smart-pointer" in Delphi 2009. In this post his clever use of an interface reference to "catch" the in-to-and-out-of-scope transitions sparked an idea.

Let's go back to the key problem with the above type, all we really care about is that when a variable of type Nullable<sometype> is declared, we need to guarantee that the FHasValue field is initialized to a known value that we can interpret as "False." In the case where the above is embedded into a class, the above declaration would work fine because we know that all data areas of a class instance are specifically initialized to 0. This would mean that a field declared as FField: Nullable<Integer>; would initially be considered Null, meaning no value has ever been assigned to it. Even though the FValue field is also 0, that is a legitimate value so we cannot just say that '0' means Null. The real problem happens when we declare a local variable as Nullable<sometype>. Whatever happens to be on the stack at that location is what the values will be. Hmmm... this is different from interfaces, strings, variants, and dynamic arrays, which are always guaranteed to be initialized to a known value (zero) upon entry to the method. This is to ensure "exception safety" which I discussed a long time ago right here.

In the Nullable<T> case, all we really care about is knowing whether or not the FValue field was ever assigned. Enter interfaces; In this case we want the FHasValue field to always be initialized to a known state. Any field or variable declared as some interface type, the compiler ensures will be initialized to nil. Let's tweak the record above using the interface "paperclip."

type
  Nullable<T> = record
  private
    FValue: T;
    FHasValue: IInterface;
    ...
    function GetHasValue: Boolean;
  public
    constructor Create(AValue: T);
    ...
    property HasValue: Boolean read GetHasValue;
    ...
  end;

So I've changed the FHasValue field to be an IInterface and changed the HasValue property to now call a getter method since HasValue is still a Boolean and FHasValue is an interface. All GetHasValue does is to return whether or not FHasValue is nil. Then we change the constructor to assign something to FHasValue, namely just create an instance of TInterfacedObject and assign the implemented IInterface to the field.

constructor Nullable<T>.Create(const AValue: T);
begin
  FValue := AValue;
  FHasValue := TInterfacedObject.Create;
end;

Now when you assign one Nullable<T> to another Nullable<T>, the compiler generates the proper code to ensure that the FHasValue field is copied properly and the the lifetime of the TInterfacedObject is properly managed. In other words, we don't leak the object.

But can this be made a little more efficient? After all, if we're "MacGyver-ing" this up, might as well go all the way. If we pull another trick out of MacGyver's bag-of-tricks (say a "plastic bandage"), we can eliminate the worry about leaking the object and get a slight performance gain to boot. This trick is used down inside Generics.Defaults.pas for creating a singleton interface. All we really care about is that FHasValue is properly initialized to nil and when we assign to the Nullable<T>, it gets properly set to a valid non-nil value. The compiler will still generate the AddRef and Release calls, but we can now be more efficient about it.

Down in the implementation section we declare the following:

function NopAddref(inst: Pointer): Integer; stdcall;
begin
  Result := -1;
end;

function NopRelease(inst: Pointer): Integer; stdcall;
begin
  Result := -1;
end;

function NopQueryInterface(inst: Pointer; const IID: TGUID; out Obj): HResult; stdcall;
begin
  Result := E_NOINTERFACE;
end;

const
  FlagInterfaceVTable: array[0..2] of Pointer =
  (
    @NopQueryInterface,
    @NopAddref,
    @NopRelease
  );

  FlagInterfaceInstance: Pointer = @FlagInterfaceVTable;

And in the Nullable<T> constructor, just change it to:

constructor Nullable<T>.Create(const AValue: T);
begin
  FValue := AValue;
  FHasValue := IInterface(@FlagInterfaceInstance);
end;

And now all instances of Nullable<T> will use the same "fake interface" instance, which will never leak since it isn't even on the heap. It is also a little faster since there isn't any bus-locking "Interlocked" calls to handle a reference-count since that is not even needed in this case. Here's the whole thing in all it's "MacGyver" gory... er, uh, glory :-):

unit Foo;

interface

uses Generics.Defaults, SysUtils;

type
  Nullable<T> = record
  private
    FValue: T;
    FHasValue: IInterface;
    function GetValue: T;
    function GetHasValue: Boolean;
  public
    constructor Create(AValue: T);
    function GetValueOrDefault: T; overload;
    function GetValueOrDefault(Default: T): T; overload;
    property HasValue: Boolean read GetHasValue;
    property Value: T read GetValue;

    class operator NotEqual(ALeft, ARight: Nullable<T>): Boolean;
    class operator Equal(ALeft, ARight: Nullable<T>): Boolean;

    class operator Implicit(Value: Nullable<T>): T;
    class operator Implicit(Value: T): Nullable<T>;
    class operator Explicit(Value: Nullable<T>): T;
  end;

procedure SetFlagInterface(var Intf: IInterface);

implementation

function NopAddref(inst: Pointer): Integer; stdcall;
begin
  Result := -1;
end;

function NopRelease(inst: Pointer): Integer; stdcall;
begin
  Result := -1;
end;

function NopQueryInterface(inst: Pointer; const IID: TGUID; out Obj): HResult; stdcall;
begin
  Result := E_NOINTERFACE;
end;

const
  FlagInterfaceVTable: array[0..2] of Pointer =
  (
    @NopQueryInterface,
    @NopAddref,
    @NopRelease
  );

  FlagInterfaceInstance: Pointer = @FlagInterfaceVTable;

procedure SetFlatInterface(var Intf: IInterface);
begin
  Intf := IInterface(@FlagInterfaceInstance);
end;

{ Nullable<T> }

constructor Nullable<T>.Create(AValue: T);
begin
  FValue := AValue;
  SetFlagInterface(FHasValue);
end;

class operator Nullable<T>.Equal(ALeft, ARight: Nullable<T>): Boolean;
var
  Comparer: IEqualityComparer<T>;
begin
  if ALeft.HasValue and ARight.HasValue then
  begin
    Comparer := TEqualityComparer<T>.Default;
    Result := Comparer.Equals(ALeft.Value, ARight.Value);
  end else
    Result := ALeft.HasValue = ARight.HasValue;
end;

class operator Nullable<T>.Explicit(Value: Nullable<T>): T;
begin
  Result := Value.Value;
end;

function Nullable<T>.GetHasValue: Boolean;
begin
  Result := FHasValue <> nil;
end;

function Nullable<T>.GetValue: T;
begin
  if not HasValue then
    raise Exception.Create('Invalid operation, Nullable type has no value');
  Result := FValue;
end;

function Nullable<T>.GetValueOrDefault: T;
begin
  if HasValue then
    Result := FValue
  else
    Result := Default(T);
end;

function Nullable<T>.GetValueOrDefault(Default: T): T;
begin
  if not HasValue then
    Result := Default
  else
    Result := FValue;
end;

class operator Nullable<T>.Implicit(Value: Nullable<T>): T;
begin
  Result := Value.Value;
end;

class operator Nullable<T>.Implicit(Value: T): Nullable<T>;
begin
  Result := Nullable<T>.Create(Value);
end;

class operator Nullable<T>.NotEqual(const ALeft, ARight: Nullable<T>): Boolean;
var
  Comparer: IEqualityComparer<T>;
begin
  if ALeft.HasValue and ARight.HasValue then
  begin
    Comparer := TEqualityComparer<T>.Default;
    Result := not Comparer.Equals(ALeft.Value, ARight.Value);
  end else
    Result := ALeft.HasValue <> ARight.HasValue;
end;

end.

Here's a little test that shows how it works:

procedure TestNullable;
var
  NullInt: Nullable<Integer>;
  NullInt2: Nullable<Integer>;
  I: Integer;
begin
  try
    I := NullInt; // Exception raised here because NullInt was never assigned a value;
  except
    Writeln('Success');
  end;
  if not NullInt.HasValue then // Check if the FHasValue field is actually nil
    NullInt := 10;     // Exercise an Implicit operator
  NullInt2 := NullInt; // Non-Null Nullable assigned to a Null Nullable makes it non-Null
  if NullInt2 = 10 then // Exercise the Equal() class operator and the Implicit operator
    Writeln('Success');
end;

begin
  TestNullable;
end.

Now, in true "Raymond Chen" fashion, I'd like to relieve many of you from the compulsive need to post the following comments :-).

Pre-emptive snarky comments:

If you would just implement non-reference counted interfaces like any "modern" language (aka, C#, Java, etc..), you wouldn't have to resort to those ugly hacks to fool the compiler.

Ok, so why didn't you just implement this with the "?" syntax like C# and build it in the first place?

Monday, September 8, 2008

Retrofitting a classic

When Delphi 2 was released targeting the 32bit Windows API there were some new-to-Delphi features of the operating system that opened up some new possibilities; Pre-emptive multi-tasking and multi-threading. Coupled with this "new" concept of a "thread," Delphi introduced the new TThread class that was an abstract base class from which one would derived in order to "wrap" an operating system thread. Along with this new functionality, a rather contrived demo was also introduced that showcased this overall notion of "multi-threaded" programming by visually representing the speed differences between three different sorting algorithms, the Bubble-Sort, the Selection-Sort and the Quick-Sort. That demo has remained virtually unchanged ever since. One thing this demo also showed was a way to update the UI by "synchronizing" the sorting thread with the main, or UI thread. This ensured that any UI updates occurred only on the UI thread and only when the UI thread was ready. Even though this is not the best technique in terms of performance, it is safe.

This synchronization was accomplished through the use of the Synchronize method on TThread which took a parameterless method as the parameter which would then block the calling thread, switch to the main, or UI thread, call the method and then return. Since this method took no parameters it was often very tedious to pass information from the running thread over to the UI thread. As the thread demo showed, this involved communicating the parameter data with the foreground by storing this data in fields on the TThread descendant instance. This worked fine even though it was cumbersome.

Fast-forward to now. Delphi 2009 was recently announced and in fact just went "gold" yesterday, Sunday, September 7th, 2008 at around 4pm PDT. One of the more exciting language features to be included in this release aside from generics, is anonymous methods or more accurately, closures. The reason we call them anonymous methods is two-fold. The corresponding concept in .NET is also called an anonymous methods and in C++Builder a method pointer is already declared using the extended __closure keyword syntax. Rather than introduce confusion among both Delphi and C++Builder customers we opted for the "Anonymous Method" moniker. However, for those computer science purists, you can most certainly think of them as true closures, but I digress :-). Because of this new-fangled anonymous method thingy, a new Synchronize overload was added to TThread that now takes a parameterless anonymous method. Here's the old code in the thread demo that would update the UI:

{ Since DoVisualSwap uses a VCL component (i.e., the TPaintBox) it should never
  be called directly by this thread.  DoVisualSwap should be called by passing
  it to the Synchronize method which causes DoVisualSwap to be executed by the
  main VCL thread, avoiding multi-thread conflicts. See VisualSwap for an
  example of calling Synchronize. }

procedure TSortThread.DoVisualSwap;
begin
  with FBox do
  begin
    Canvas.Pen.Color := clBtnFace;
    PaintLine(Canvas, FI, FA);
    PaintLine(Canvas, FJ, FB);
    Canvas.Pen.Color := clRed;
    PaintLine(Canvas, FI, FB);
    PaintLine(Canvas, FJ, FA);
  end;
end;

{ VisusalSwap is a wrapper on DoVisualSwap making it easier to use.  The
  parameters are copied to instance variables so they are accessable
  by the main VCL thread when it executes DoVisualSwap }

procedure TSortThread.VisualSwap(A, B, I, J: Integer);
begin
  FA := A;
  FB := B;
  FI := I;
  FJ := J;
  Synchronize(DoVisualSwap);
end;

Notice that it takes two methods, the one called from within the thread and then the one that is "synchronized" with the UI thread. You need to declared these methods on the class, which can sometimes be tedious. Also, this code requires manual assignment of the instance fields from the parameters. What if you could just pass in the code to synchronize along with the local state? With anonymous methods this is easy. Here's the above code changed to use an inlined anonymous method:

procedure TSortThread.VisualSwap(A, B, I, J: Integer);
begin
  Synchronize(procedure
   begin
     with FBox do
     begin
       Canvas.Pen.Color := clBtnFace;
       PaintLine(Canvas, I, A);
       PaintLine(Canvas, J, B);
       Canvas.Pen.Color := clRed;
       PaintLine(Canvas, I, B);
       PaintLine(Canvas, J, A);
     end;
   end);
end;

By using an anonymous method, I've eliminated the DoVisualSwap method and removed the need for the FA, FB, FI, and FJ instance fields. Once you get used to the new syntax, this code is much easier to understand an use.

Wednesday, September 3, 2008

Multicast Events - the finale

In my previous two posts I presented a technique using the new generics language feature of Delphi 2009 to create a typesafe multicast event. In the previous post, I showed how you can create a TMulticastEvent<T> instance and assign it to an event handler for an existing event on a TComponent derived type. Using the existing FreeNotification mechanism, you didn't need to worry about explicitly freeing the multicast event object. What if one of the components in the sink event handlers in the multicast event list was freed? The good thing is that the FreeNotification mechanism works both ways. We can leverage this functionality again to handle cleanup from the other direction.

In order to implement the complete cleanup for the TComponentMulticastEvent<T>, we need to know when an event handler was added and when one was removed. To do this I added these two virtual methods to the base TMulticastEvent class (the base non-generic version). There are also helper functions, RemoveInstanceReferences() and IndexOfInstance() that can be used in descendants to remove all event handlers that refer to a specific object instance and check if a specific instance is being referenced within the list.

  TMulticastEvent = class
    ...
  strict protected
    procedure EventAdded(const AMethod: TMethod); virtual;
    procedure EventRemoved(const AMethod: TMethod); virtual;
  protected
    procedure RemoveInstanceReferences(const Instance: TObject);
    function IndexOfInstance(const Instance: TObject): Integer;
    ...
  end;

They're not marked abstract because the immediate descendant, TMulticastEvent<T> doesn't need to and should not be forced to override them. They just do nothing in the base class. In the corresponding Add and Remove methods on TMulticastEvent, these virtual methods are then called with event just added or just removed. Now we override the EventAdded and EventRemoved methods in the TComponentMulticastEvent<T> class:

  TComponentMulticastEvent<T> = class(TMulticastEvent<T>)
    ...
  private
    FSink: TNotificationSink;
  strict protected
    procedure EventAdded(const AMethod: TMethod); override;
    procedure EventRemoved(const AMethod: TMethod); override;
    ...
  end;

We also need to hold a reference to the internal notification sink class in order to use its FreeNotification mechanism. Here's the implementation of these methods:

procedure TComponentMulticastEvent<T>.EventAdded(const AMethod: TMethod);
begin
  inherited;
  if TObject(AMethod.Data) is TComponent then
    FSink.FreeNotification(TComponent(AMethod.Data));
end;

procedure TComponentMulticastEvent<T>.EventRemoved(const AMethod: TMethod);
begin
  inherited;
  if (TObject(AMethod.Data) is TComponent) and (IndexOfInstance(TObject(AMethod.Data)) < 0) then
    FSink.RemoveFreeNotification(TComponent(AMethod.Data));
end;

And then the Notification on the private TNotificationSink class:

procedure TComponentMulticastEvent<T>.TNotificationSink.Notification(AComponent: TComponent;
  Operation: TOperation);
begin
  inherited;
  if Operation = opRemove then
    if AComponent = FOwnerComp then
      Free
    else
      FEvent.RemoveInstanceReferences(AComponent);
end;

In the EventRemoved method we call IndexOfInstance() to ensure that there aren't multiple references to the same instance in the list before we remove the free notification hook. This is because FreeNotification will add the instance to its internal list only once.

So there you go, a multicast event that also performs full auto-cleanup for both the source and the sink instances. If the source instance goes away, the multicast event instance is automatically cleaned up. Likewise, if one of the sink event handlers' instances go away, it will automatically be removed from the list so there are no stale references. Of course, I'll remind the reader that with this implementation, it only works for TComponent derived instances. For non-TComponent derived instances, you can still use a descendant of TMulticastEvent<T> in a manner similar to TComponentMulticastEvent<T> mixed with, for instance, a technique described here.

Monday, August 25, 2008

Multicast Events - the cleanup

In my last post, I introduced a multicast event that uses generics to address the problem of needing to manually declare and implement a new multicaster for each unique event type. If you remember, I referred to some other posts that presented a technique for doing automatic cleanup of both the multicast event object itself and automatic removal of listeners. While the technique presented is indeed very generic and will work for nearly all instances, my only critique is that it carries a fairly heavy "contract" in order to fully realize its potential. This set all the creaky wheels moving as I tried to come up with something a little less "contract" heavy. This solution has an assumption that most events and event handlers are on TComponent derived classes. Yes, there are plenty of folks out there that use method pointers on object instances of types other than TComponent. I'm using the 80/20 rule. This will be highly useful to 80% of you out there and not as useful for the remaining 20%.

One of the main problems of a "bolt-on" multicast event class in Delphi for Win32 is proper cleanup. With normal single-cast events there is but one sender and one listener. The end-points are already presumed to be lifetime managed. For multicast events, no longer is there a point-to-point connection, but rather an intermediate "agent" that serves to translate a single event into "n" events for "n" listeners. In a nice garbage collected run-time environment, this isn't a problem because this intermediary will be properly cleanup in due time. This is not meant to be a ding against the inherent non-GC Delphi/Win32 environment but merely that we need to be a little more creative. The good thing here is that once we've established this solution, you can return to your "regularly scheduled programming" and no longer worry about this cleanup.

Let's start with a little review. TComponent and its derivatives carry the notion of "ownership" for the exact reason of not being able to rely on a GC. This model was established from the very first introduction of Delphi and the VCL. Any component that has an "owner" will be automatically cleaned up when its owner was cleaned up. In order for this to operate effectively, there was a need for some mechanism to let everyone involved know when an instance was going away. This is the reason for the Notification virtual method on TComponent. Every component that is being cleaned up, or more accurately, being removed from the list of "owned" components is sent broadcast to all the other owned components through this Notification method with the "opRemove" operation enumeration value. Initially, this notification only worked for components that shared a common "owner," which worked well for Delphi 1 and 2. In Delphi 3, form-inheritance and form-linking where introduced which allowed cross-form/datamodule references between components. For instance, the TTable component on a datamodule could now be referenced from a TDataSource component on a form. This presented a problem because the datamodule and form could have very different lifetimes and the previously mentioned notification mechanism just doesn't work. This is when the FreeNotification() method was introduced on TComponent. This allows any component to insert themselves into the "free notification" list of another component in order to "link" them together and know when each other is going away. Now proper cleanup (such as setting references to nil) can be done even for components that do not share the same owner. We can use this knowledge to devise a solution for properly cleaning up a multicast event object.

I created the following generic descendant of TMulticastEvent<T>:

type
  TComponentMulticastEvent<T> = class(TMulticastEvent<T>)
  private type
    TNotificationSink = class(TComponent)
    private
      FEvent: TMulticastEvent;
      FOwnerComp: TComponent;
    protected
      procedure Notification(AComponent: TComponent; Operation: TOperation); override;
    public
      constructor Create(AOwner: TComponent; AEvent: TMulticastEvent); reintroduce;
      destructor Destroy; override;
    end;
  public
    constructor Create(AOwner: TComponent);
  end;

Since the TMulticastEvent ancestor isn't a TComponent derivative I needed a way to "listen in" on the notifications sent between TComponents. This is why there is a private component derived from TComponent which provides the link between the multicast event instance and the component notifications. I could have simply made this private component owned by the component and not even worried about the Notification method because all owned components are properly cleaned up. However, the intent is that this internal component needs to be invisible (relatively speaking) and not show up in the list of owned components. A lot of code out there iterates through the list of owned components and performs some operation on each one. Don't want to risk breaking that code. So, when this private component is instantiated, it isn't "owned" by any component but by using FreeNotification, it will still know when the "owner" component is freed.

In order to make it a little easier to instantiate these multicast events I added the following public class static methods to the base non-generic TMulticastEvent class:

type
  TMulticastEvent = class
    ...
  public
    ...
    class function MulticastEvent<T>(const AMethod: T): TMulticastEvent<T>; static;
    class function Create<T>(AComponent: TComponent): T; overload; static;
    class function Create<T>: T; overload; static;
  end;

The first one we'll come back to in a moment. The two overloaded Create methods will create a TComponentMulticastEvent<T> or simply a TMulticastEvent<T>, respectively. If you'll notice that the multicast event instance is not being returned, but rather a value of the method pointer type itself is returned. This is the Invoke property. You can now simply do this:

procedure TForm1.Form1Create(Sender: TObject);
begin
  Button1.OnClick := TMulticastEvent.Create<TNotifyEvent>(Button1);
end;

Notice how that looks remarkably similar to a normal object instantiation? Cool. Uh... waitaminnit. How do I add listeners? I cannot call Add, or Remove now. One solution would be to first create the event, add the listeners, then assign the Invoke property to the OnClick event like this:

procedure TForm1.Form1Create(Sender: TObject);
var
  Click: TMulticastEvent<TNotifyEvent>;
begin
  Click := TComponentMulticastEvent<TNotifyEvent>.Create(Button1);
  Click.Add(Button1Click);
  Click.Add(Button2Click);
  Button1.OnClick := Click.Invoke;
end;

Yes, that would work, but only if you know up front what the listeners would be and that the list won't ever change once it is setup. The beauty of multicast events is that they're dynamic and can change at run-time. Remember the first class static method listed above? This is where that cute little method comes into play. Let's redo the above code using that method:

procedure TForm1.Form1Create(Sender: TObject);
begin
  Button1.OnClick := TMulticastEvent.Create<TNotifyEvent>(Button1);
  TMulticastEvent.MulticastEvent<TNotifyEvent>(Button1.OnClick).Add(Button1Click);
  TMulticastEvent.MulticastEvent<TNotifyEvent>(Button1.OnClick).Add(Button2Click);
end;

While the above techniques are equivalent in this instance, what if the addition or removal of the listeners occurred elsewhere? Instead of keeping a separate reference to the multicast event instance, it is simply fished out of the method pointer value itself. Also, because we know it too will be cleaned up, we can now simply concentrate on using it.

On a final note, the above implementation doesn't currently support automatic removal of listeners when they go away, but that can certainly be done. We'll look at that in the next installment. Your homework is this; using the class static methods above to in-turn implement class static Include() and Exclude() methods that will mimic the Include/Exclude standard functions in Delphi for .NET.

Friday, August 15, 2008

Multicast events using generics

Ever since before Delphi 1, the (then Delphi only) RAD Studio IDE has been full of home-grown multicast event class types. I usually refer to these as an "event bus." This is from my hardware days when I designed microcontroller based security/access control equipment. A CPU has an "address bus" and a "data bus" which can have one sender and any number of "listeners," not unlike a multicast event.

I've been following with interest these series of posts about creating a multicast event class in native Delphi. What struck me was that the class being presented was remarkably similar to the class and its descendants that we've used in the RAD Studio IDE since before Delphi 1. However the one presented in those posts does take it to another level and adds some extra housekeeping to notify the listener and/or the multicast event when each other is being freed. While that's a nice addition, it does tend to complicate its use. From my point of view, regardless of whether you have a single-cast or multicast event, you should always balance assigning the event with clearing the event. IOW, make sure the event isn't pointing to a dead instance. Overall, what was presented is a pretty decent design given what the Delphi language has to offer today.

One of the key new Delphi language features for Tiburón is the introduction of generics. During the development cycle, I began thinking if there was a way to create a multicast event class using generics. At first glance this seemed like an easy thing to do. After all, a multi cast event class is just a wrapper around an array of method pointers. To send out the event you iterate through the array of method pointers invoking each event in turn. So surely it's as simple as this:

type
  TMulticastEvent<T> = class
  private
    FHandlers: array of T;
  public
    procedure Add(Handler: T);
    procedure Remove(Handler: T);
    procedure Invoke(<uh oh>);
  end;

Hmm... Do you see the problem? How can I actually invoke the event? Adding and removing handlers looks straight forward, but invoking events is a problem. The multicast event class inside the Delphi IDE starts with a base class that handles the array of method pointers and it is up to the manually created descendants to add the Add and Remove methods and the specific Invoke (or Send in our case) method that provides the right parameter list and does the job of iterating through the assigned method pointers. Lots of ugly typecasting was happening in addition to having to manually create the descendants.

Ok, now I have a challenge. The gauntlet has been thrown. I'm going to design a generic multicast event using Delphi native code, dagnabbit! The end result still looked very similar to the internal IDE multicast event classes. The way I solved the invoke problem was to declare a property of type T.

type
  TMulticastEvent<T> = class
  private
    FInvoke: T;
    <same above as>
    property Invoke: T read FInvoke;
  end;

Syntactically you just call Invoke like any old method:

type
  TMyEvent = procedure (Sender: TObject; IntValue: Integer; FloatValue: Double) of object;
  MulticastMyEvent: TMulticastEvent<TMyEvent>
begin
  MulticastMyEvent := TMulticastEvent<TMyEvent>.Create;
  MulticastMyEvent.Invoke(Obj, 10, 3.14);
end.

There is still an interesting problem here. Do you see it? So I can use the event property "simulate" a method, but what is assigned to this method pointer? Where does control go when you call through this method pointer? When is is assigned? This is when things started getting a little interesting.

The problem with all the other multicast event class implementations I've seen (including our own internal one) is that for type-safety they require you to manually create a descendant class with the actual type of the event. It's tedious and error prone, and promotes gross duplication of code. These are the problems that generics are suppose to solve, right? However in this instance, it was almost like I needed to instantiate code at run-time (something that .NET actually does which solves many of these problems). Barring that, I had to reach a little deeper into my bag-o-tricks. All the multicast event classes share a couple of things; maintain a list or array of method pointers and iterate over that list and invoke each method. All method pointers are the same size (8 bytes consisting of a pointer to an object instance and a pointer to the method to invoke). I created a non-generic base class much like the existing base class we've had for years:

type
  TMulticastEvent = class
  strict protected type TEvent = procedure of object;
  strict private
    FHandlers: array of TMethod;
    procedure Add(const AMethod: TEvent); overload;
    prodedure Remove(const AMethod: TEvent); overload;
    function IndexOf(const AMethod: TEvent): Integer; overload;
  end;

Hmmmm... Why is the array declared here instead of the descendant? How will this be "typesafe?" You can't cast an arbitrary "T" to a TMethod. Oh and waidaminnit... the Add, Remove and IndexOf methods are private! To address these issues, I pulled out some assembler tricks out of my proverbial bag and I added the following protected methods.

type
  TMulticastEvent = class
    <stuff from above>
  protected
    procedure InternalAdd;
    procedure InternalRemove;
    procedure InternalIndexOf;
  end;

Uh, dude, there are no parameters on those methods. What do they do? Here's the method body from the InternalAdd method. The others are very similar.

procedure TMulticastEvent.InternalAdd;
asm
       XCHG  EAX,[ESP]
       POP   EAX
       POP   EBP
       JMP   Add
end;

What this function does is removes itself and the immediate caller from the call chain and directly transfers control to the corresponding "unsafe" method while retaining the passed in parameter(s). As we'll see later on, we're going to create a generic descendant class from this one where the method bodies of the actual typesafe versions of Add, Remove and IndexOf merely call the corresponding "InternalXXX' versions from this base class. Here's some of what that descendant will look like and the body of the Add method:

type
  TMulticastEvent<T> = class(TMulticastEvent)
  public
    procedure Add(const AMethod: T); overload;
    procedure Remove(const AMethod: T); overload;
    function IndexOf(const AMethod: T): Integer; overload;
  end;

procedure TMulticastEvent.Add(const AMethod: T);
begin
  InternalAdd;
end;

Ah, there's the "T"... It's beginning to look pretty generic. Why do it this way? The primary reason is that you cannot have any assembly code in the method body of any generic type or method, but you can refer to other methods that are written in assembly. So in this case the ancestor class merely provides those "non-generic-able" assembler functions. Another reason, which we'll get to, is that I also want to make sure that iterating through the array of event handlers is also done in some common code, which will also need to be in assembly due to some more stack tricks that must be done.

When this generic class is instantiated with an event type, the Add, Remove, and IndexOf methods will always get a stack frame. Even though the method body doesn't touch the parameter, the frame is still there and needs to be cleaned up because the size of the event type is > 4 bytes. Since it wont fit in a register, it is always passed by value on the stack. This is also why I declared a private TEvent type up in the base class so that the stack pattern will match. If I had merely declared the parameters as a TMethod, which is a record that allows you to "crack open" the event type, it would not have worked. This is because method pointers are passed differently from records because you can directly refer to a method and instance in code as the parameter when calling a method that takes a method pointer:

begin
  ...
  MC.Add(Obj.EventHandler);
  ...
end;

In this case the compiler simply pushes the instance reference from the variable Obj onto the stack, followed by the address of the EventHandler method. However for a record, the compiler passes a pointer to the record, which may even be in a register. Then, depending upon whether or not the parameter is declared as "const" or not, the called method will use the pointer directly or make a local copy of the record on the stack.

I won't bore you with the details of how the Add, Remove, and IndexOf methods actually interact with the array as that really isn't the interesting part and is merely boilerplate code anyway. Let's instead turn our attention to triggering or invoking the event. Let's add back in the FInvoke field and the Invoke property:

type
  TMulticastEvent<T> = class(TMulticastEvent)
  strict private
    FInvoke: T;
  public
    procedure Add(const AMethod: T); overload;
    procedure Remove(const AMethod: T); overload;
    function IndexOf(const AMethod: T): Integer; overload;

    property Invoke: T read FInvoke;
  end;

So here we are again. What is assigned to the FInvoke field? This is where yet another trick from my bag comes out. We're going to leverage some of the dynamic dispatching code that has been laying somewhat dormant down in the ObjAuto.pas unit for many Delphi releases. If you haven't had the chance to look at some of the little gems that live in that unit and the associated ObjComAuto.pas, there is some pretty powerful bits of code in there. ObjAuto's primary purpose is to allow late-bound calling of object methods. It does this through some rather rich RTTI that the compiler has generated for several releases. A lot of this extra information is generated for public and published methods on classes and descendants that have been declared within a {$METHODINFO ON} block. However, for method pointer types, this information is always available regardless of that setting. While I'm not going to be generating code at runtime, I will be calculating some information about call signature of the method at runtime from this extra RTTI.

In ObjAuto, there is an interesting global function called CreateMethodPointer(). This function takes a pointer to a type data record (the compiler RTTI) and an interface. It returns a TMethod record that points to a stub object and method that knows how to "pick apart" the passed in parameters convert them to Variants and then call some methods on the interface. Don't worry, we're not going down that (using Variants) road! Down in the implementation of this unit was something much closer to what I wanted. I just need raw data structure that was an opaque representation of the Invoke call. I don't care what the data is, only how large it is and where it is located. IOW, what data is in a register and what data is on the stack and how much. I won't go into the raw details of the changes made down in ObjAuto since you'll be able to see that once Tiburón ships. I added another overloaded CreateMethodPointer() function, that takes a method pointer callback instead of an interface and a pointer to a type data record. Here's what the declarations look like:

type
  ...
  PParameters = ^TParameters;
  TParameters = packed record
    Registers: array[paEDX..paECX] of Cardinal;
    Stack: array[0..1023] of Byte;
  end;
  TDynamicInvokeEvent = procedure (Params: PParameters; StackSize: Integer) of object;

function CreateMethodPointer(const ADynamicInvokeEvent: TDynamicInvokeEvent; TypeData: PTypeData): TMethod; overload;
procedure ReleaseMethodPointer(MethodPointer: TMethod);

Call CreateMethodPointer with a reference to a method of type TDynamicInvokeEvent and the type data for the method pointer (T). It will return a TMethod record, which is essentially a raw method pointer. We'll use this value to assign to FInvoke. The trick is to get around the fact that the compiler won't allow this cast:

  FInvoke := T(CreateMethodPointer(...)); // compiler chokes on this code

We have to resort to another assembler trick. Remember how I said that passing a method pointer as a parameter is a little different that with records? In this case, it's not going to matter because we're going to declare a method in the generic class that takes a "var" parameter. In this case the compiler passes the address of the method pointer variable rather than the value. Let's add this method:

type
  TMulticastEvent<T> = class(TMulticastEvent)
  strict private
    FInvoke: T;
    procedure SetEventDispatcher(var ADispatcher: T; ATypeData: PTypeData);
  public
    < same as above >
  end;

procedure TMulticastEvent.SetEventDispatcher(var ADispatcher: T; ATypeData: PTypeData);
begin
  InternalSetDispatcher;
end;

There's another "InternalXXXX" call! That one is only slightly different than the others, but the effect is the same. InternalSetDispatcher simply jumps over to the SetDispatcher method on the TMulticastEvent type, which is declared and implemented on the base TMulticastEvent class:

type
  TMulticastEvent = class
  strict protected type TEvent = procedure of object;
  strict private
    FHandlers: array of TMethod;
    FInternalDispatcher: TMethod; // this class needs to keep it's own reference for cleanup later
    ...
    procedure InternalInvoke(Params: PParameters; StackSize: Integer);
    procedure SetDispatcher(var AMethod: TMethod; ATypeData: PTypeData);
  protected
    ...
    procedure InternalSetDispatcher;
  end;

procedure TMulticastEvent.SetDispatcher(var AMethod: TMethod; ATypeData: PTypeData);
begin
  if Assigned(FInternalDispatcher.Code) and Assigned(FInternalDispatcher.Data) then
    ReleaseMethodPointer(FInternalDispatcher);
  FInternalDispatcher := CreateMethodPointer(InternalInvoke, ATypeData);
  AMethod := FInternalDispatcher;
end;

We're almost there. One of the last bits to cover is the InternalInvoke method and the constructor of the generic TMulticastEvent class. InternalInvoke is called from the little stub class instance created by the CreateMethodPointer() call. It passes in the pointer to the TParameters record and the space taken up on the stack by the parameters that "spill" from the CPU registers. As you may have already guessed, we're going to drop down to some more assembly:

procedure TMulticastEvent.InternalInvoke(Params: PParameters; StackSize: Integer);
var
  LMethod: TMethod;
begin
  for LMethod in FEvents do 
  begin 
    // Check to see if there is anything on the stack. 
    if StackSize > 0 then 
      asm
        // if there are items on the stack, allocate the space there and 
        // move that data over. 
        MOV ECX,StackSize
        SUB ESP,ECX 
        MOV EDX,ESP
        MOV EAX,Params 
        LEA EAX,[EAX].TParameters.Stack[8]
        CALL System.Move 
      end; 
    asm
      // Now we need to load up the registers. EDX and ECX may have some data 
      // so load them on up.
      MOV EAX,Params 
      MOV EDX,[EAX].TParameters.Registers.DWORD[0] 
      MOV ECX,[EAX].TParameters.Registers.DWORD[4]
      // EAX is always "Self" and it changes on a per method pointer instance, so 
      // grab it out of the method data. 
      MOV EAX,LMethod.Data 
      // Now we call the method. This depends on the fact that the called method 
      // will clean up the stack if we did any manipulations above. 
      CALL LMethod.Code 
    end; 
  end; 
end;

Now the constructor for the generic version of TMulticastEvent<T>:

constructor TMulticastEvent<T>.Create;
var
  MethInfo: PTypeInfo;
  TypeData: PTypeData;
begin
  MethInfo := TypeInfo(T);
  TypeData := GetTypeData(MethInfo);
  inherited Create;
  Assert(MethInfo.Kind = tkMethod, 'T must be a method pointer type');
  SetEventDispatcher(FInvoke, TypeData);
end;

Notice the Assert, that is there because we cannot "constrain" the generic type at compile-time to only accept method pointer types for "T". So we have to do that check at run-time. This is something we'll look at for future releases. We can safely call SetEventDispatcher because FInvoke will be of type "T".

There you have it, a generic multicast event. At some point, I'll present some extensions that will help make using this easier by adding some automatic cleanup of the multicast event instance itself and a technique for using the source event itself to be the only instance reference. For now here's one potential use scenario:

procedure TForm1.FormCreate(Sender: TObject);
var
  E: TMulticastEvent<TNotifyEvent>;
begin
  E := TMulticastEvent<TNotifyEvent>.Create;
  // Since the Invoke property is of type "T" and T = TNotifyEvent, you can
  // Assign it directly to any event of type TNotifyEvent, such as OnClick.
  Button1.OnClick := E.Invoke;  
  // Now you can add as many "listeners" as you want.
  E.Add(Button1Click);
  E.Add(AnotherButtonClick);
end;

I will try to get the full code posted within the next few weeks or so. I still have Tiburón to get out the door and take a few days off.

There are also a lot of caveats and cases where a multicast event can really mess things up. For instance, what if the method pointer had a result type? What if it had one or more "var" or and "out" parameters? Which method invocation in the list of listening handlers decides what the result is, or assigns to an "out" parameter? What if an exception is raised? Are there any ordering dependencies or expectations? From a framework designer standpoint, these are some real questions one has to ask. Making all events multicast is akin to the shotgun approach of making all methods virtual. It doesn't really foster extensibility in the way one would think. As a matter of fact it can serve to severely limit extensibility and evolve-ability of a given framework.

There are many types of events that lend themselves very well to supporting multicasting, and there are plenty of other cases where multicasting only creates more problems than it solves. Having multicast events as the default behavior for any event type is certainly nice from a consistency and overall availability standpoint, but it just makes it way too easy to hang yourself inadvertently. Supporting multicasting for an event source should be a decision left to the designer of the class on which the event is placed.

Wednesday, July 16, 2008

Tiburón - String Theory

No, not that String Theory, or even this one. What this is about is an interesting extension to AnsiString. During the field test cycle of Tiburón and our own internal porting of the IDE code (which was accomplished in about 1.5 months, by 2-3 folks, with > 2 million LOC), it became clear that there was a need for easily encoding UTF16 character data as UTF8. For the astute among you, you probably already know about the RTL defined UTF8String which is really just an alias to AnsiString. The problem is that it is UTF8 in name only. Unless you explicitly ensured that only UTF8 data was placed into the payload, it could just as easily hold normal Ansi character data. We needed to make the use of UTF8String easier. As we looked at how AnsiString worked, it was clear that AnsiString always had this "affinity" to carry it's data payload encoded as whatever the RTL had determined was the active code page, at runtime. So we wondered, "what if we could create an AnsiString type where the programmer determined at compile time, the code page affinity for AnsiString?"

It turns out that, to Windows, UTF8 encoding is just another code page. Down in the RTL, the conversions to/from UnicodeString or WideString use the Windows API functions WideCharToMultiByte() and MultiByteToWideChar(). One of the parameters is the code page identifier to or from which the data is converted. If you pass CP_UTF8 (65001) to those functions, they'll convert between UTF16 and UTF8. This is a lossless conversion. In Tiburón, we're introducing an enhancement to declaring your own "typed" AnsiString. You have always been able to create a unique type based on any intrinsic type by declaring the new type with the "typed type" syntax:

type
  MyString = type AnsiString;

This would create a new type that is assignment compatible* with AnsiString, but with a unique type name and a unique RTTI structure. We've used this in VCL to distinguish normal "strings" from special strings such as TFileName or TCaption. By creating these unique string types, it was possible to create property editors that would be associated with a specific type of property, regardless of which component it used on or what the property name was. This is how only the Caption property on many components will automatically update the live design-time component as you type in the Caption value.

The thing is, with the above declaration, MyString will continue to always have an affinity for whatever the current runtime code page was. So, we introduced the following syntax for AnsiString only:

type
  MyString = type AnsiString(<1..65534>);

You can now control the code page affinity of any "typed type" AnsiString at compile time. The "parameter" to AnsiString must be a word constant expression. The values 0 and 65535 have special meanings. 0 is a normal AnsiString, and 65535 ($FFFF) means "no affinity." $FFFF is worth noting here as already being declared as a RawByteString. When assigning between AnsiStrings or passing them as parameters, if the code page affinity of the source and destination strings are different, an automatic conversion is done. In order to minimize potential dataloss during the conversion, all conversions go "through" UTF16. However, a string with an affinity of $FFFF tells both the compiler and the RTL, that none of these conversions should be done and to just move over the payload. In practice, however, there would be only a few instances of needing to use RawByteString, but it is there for your use.

So we now have the following declarations in the System unit:

type
  UTF8String = type AnsiString(CP_UTF8);
  RawByteString = type AnsiString($FFFF);

Like I stated previously, any assignment (or passing as a parameter to procedure or function) where the code page affinity between the source and destination are different, the payload will automatically be converted. Say you have a function that must only take UTF8 data. You can declare it like this:

procedure WriteUTF8Data(const Data: UTF8String);
begin
  // write UTF8 data to stream, file, socket, etc...
end;

Now, no matter what type of string you pass to this procedure, you know that the payload will have been coerced into UTF8. Pass a normal AnsiString, and the data arrives as the UTF8 version of that AnsiString converted from whatever the active codepage was or whatever that AnsiString's affinity was set to. Pass a UnicodeString or WideString to the function and it too will be converted to UTF8. Pretty cool, no?

With the title to this post... All those physicists out there are going to hate me and Google.. hehe ;-)

*"Assignment compatible" means that you can assign or pass as a parameter from one "typed type" to another "typed type" or the intrinsic type on which it was based. They are not "var/out parameter" compatible. This means you can pass them by value to a function but not by reference.

Tuesday, July 1, 2008

Brand New Day...

Well, my access card worked. I guess I still have a job :-).

I just finished listening to a company (Embarcadero not Borland) wide conference call announcing to the whole company the closure of the Embarcadero+CodeGear deal. Wayne Williams, our new boss, made some very encouraging statements. Most notably was that just like ER/Studio, RapidSQL, and DBArtisan, Delphi/C++Builder are core Embarcadero product offerings. This means that these products are key to the business. Yes, there are many other products being sold, incubated and introduced, but it is the aforementioned products that form the pillars on which the company is based. Without them, many of the other products would not be possible. Like the foundation of a home, you just don't take a metaphoric jackhammer to them and expect the structure (the company) to remain sound.

Another encouraging (or maybe scary, depending upon your perspective) point was that we are the last independent tools vendor with the breadth of offerings we have out there. This means no vendor or stack lock-in. We have tools for nearly every database and OS platform out there. We are also one the of the very few software companies that offer very strong non-Open Source tools right along side tools built either on or for Open Source stacks. JBuilder and 3rdRail are built on top of the very popular Eclipse framework. Delphi for PHP and 3rdRail are built for the very popular and widely used Open Source PHP and Ruby/Rails environments, respectively.

How things will change or even stay the same is still being planned and scoped. A lot of work had been done between the announcement of this deal and its close, but now is where most of the work can actually take place. Now that we're no longer joined to Borland, we can now chart a new course under a new captain. I wish all the best for Borland as I've seen many happy and exciting days while there.

An interesting anomaly is that many of the CodeGear folks have had their service bridged, which means that some of us have, on paper, now worked for Embarcadero longer then they've existed :-). This is now day 6022 for me.

Monday, June 30, 2008

The Last Day.

Today is my last official day as a Borland/CodeGear employee. After today, I will have been a Borland/CodeGear employee for 6021 days, or 16 years, 5 months, and 25 days. What is interesting about this is that tomorrow, I will continue to drive to the same building, ride the same elevator, and unlock the same office door. The only difference will be that I will be employed by Embarcadero. It has certainly been an interesting ride in my many years at Borland/CodeGear.

For some perspective, I started when the Borland stock was around $80 a share (ouch). The new Borland Campus wasn't completed. Borland occupied a reasonably large number of buildings throughout Scotts Valley. If Borland didn't occupy some or all of a building, chances are Seagate did. To move between the various buildings, there was a continuously running shuttle bus service for Borland employees. I also remember the clout that Borland had locally. When we (my family and myself) first arrived and were looking for someplace to rent, when asked where I worked and I said Borland, suddenly we were at the top of the list without any kind of other checks. It was pretty nice.

After today, there will be no official presence of Borland in Scotts Valley. The only thing linked to Borland will be name on the lease they maintain on part of this building that was once the Borland Corporate HQ and campus. Embarcadero will sub-lease the space we currently occupy from Borland.

Even though there have been many things Borland has done over the years that I firmly disagreed with, they've also done some pretty good stuff too. I've learned so much more than I ever could have imagined about software, the software industry, developer tools, frameworks, and compilers. I've also had the pleasure of knowing and meeting many folks in the industry from all around the world.

Tomorrow will start a new chapter with new challenges and opportunities. I'm sure it will take some time until the dust settles, the cultures merge, and things settle down to the right groove. Don't expect any earth-shattering announcements or radical changes immediately out of the gate. This is a new experience for nearly all involved, so some period of acclimation is bound to take place.