Monday, November 12, 2007

Stupid Enumerator Tricks - And now for something completely different

Sometimes I get admittedly kooky, crazy ideas.  While playing with Tasks, ParallelFor, and Thread Pooling, a rather interesting thought popped into my head.  I've been pouring over the very interesting ideas and techniques outlined in this article on MSDN that is referring to the Parallel FX library that is in the works for the .NET framework.  One thing I've been looking at is how to make something like this applicable to natively compiled code in Delphi for Win32.  While I understand that .NET has a lot of great features and that a managed runtime environment has some interesting advantages, right now I'm all about researching how to take native Win32 code and Delphi to a new level.

What does this all have to do with Enumerators and Stupid tricks.  Well... it turns out that in Win32, the for..in language construct is the only case where the compiler will actually properly manage the lifetime of a class.  If you're curious to what I mean, make sure you look into this excellent rundown on how to write them.  It you look at some of the code in the referenced MSDN article (ignoring, for now the use of anonymous methods) there is this bit of code:

static void For( int from, int to, Action<int> body )
{
int index = from;
ReplicableTask rtask = new ReplicableTask( delegate {
int i;
while ((i = Interlocked.Increment(ref index)-1) < to) {
body(i);
}
});
rtask.Wait();
}

Since .NET is a nice garbage collected environment, there is no need to explicitly worry about freeing the "rtask" variable.  It will be collected for you.  However in the cold-cruel world of natively compiled code, you are afforded no such luxury*... sort of.  Ignoring that whole anonymous method thing, but using the nested procedure trick I mentioned here, here's a potential implementation in Delphi for Win32:



procedure ParallelFor(AFrom, ATo: Integer; AEvent: TIteratorEvent);
var
RTask: TReplicableTask;
Index: Integer;

procedure ExecuteEvent;
var
I: Integer;
begin
I := InterlockedIncrement(Index) - 1;
while I < ATo do
begin
AEvent(I);
I := InterlockedIncrement(Index) - 1;
end;
end;

begin
Index := AFrom;
RTask := TReplicableTask.Create(@ExecuteEvent);
try
RTask.Wait;
finally
RTask.Free;
end;
end;

But what if we could eliminate writing that whole try...finally thing?  Here's where it gets a little out there and on the edge :).  What if, instead we write this:



procedure ParallelFor(AFrom, ATo: Integer; AEvent: TIteratorEvent);
var
RTask: TReplicableTask;
Index: Integer;

procedure ExecuteEvent;
var
I: Integer;
begin
I := InterlockedIncrement(Index) - 1;
while I < ATo do
begin
AEvent(I);
I := InterlockedIncrement(Index) - 1;
end;
end;

begin
Index := AFrom;
for RTask in TReplicableTask.Create(@ExecuteEvent)do
RTask.Wait;
end;

So how does this work?  If you read up on enumerators on Primoz' and Hallvard's blogs, you'll know that as long as the TReplicableTask object has a method called "GetEnumerator" that returns an enumerator with the Current property of type TReplicableTask.  It will also automatically embed a try..finally block into the code and properly destroy the enumerator object for you.  Like I mentioned above, this is the only place where the compiler does this for you.  So if we added a GetEnumerator method to TReplicableTask that looked something like this:



type
TReplicableTask = class;
TTaskDestructor = class
private
FTask: TReplicableTask;
FMoved: Integer;
public
constructor Create(ATask: TReplicableTask);
destructor Destroy; override;
function GetCurrent: TAutoObject; inline;
function MoveNext: Boolean; inline;
property Current: TAutoObject read GetCurrent;
end;

TReplicableTask = class
public
...
function GetEnumerator: TObjectDestructor;
...
end;

{ TTaskDestructor }
constructor TTaskDestructor.Create(ATask: TReplicable);
begin
inherited Create;
FTask := ATask;
FMoved := -1;
end;

destructor TTaskDestructor.Destroy;
begin
FTask.Free;
inherited;
end;

function TTaskDestructor.GetCurrent: TAutoObject;
begin
Result := FTask;
end;

function TTaskDestructor.MoveNext: Boolean;
begin
Result := InterlockedIncrement(FMoved) > 0;
end;

function TReplicableTask.GetEnumerator: TTaskDestructor;
begin
Result := TTaskDestructor.Create(Self);
end;


So there you go...  An enumerator that is not really an enumerator, but actually a poor-man's implementation of an auto-destroyed object.  Yep, a Stupid Enumerator Trick of the highest order :).  So... what if we really did have anonymous methods in some future version of the compiler?  What would the above code look like then?  This is pure speculation, but I'd imagine it might look something like this:



procedure ParallelFor(AFrom, ATo: Integer; AEvent: TIteratorEvent);
var
RTask: TReplicableTask;
Index: Integer;
begin
Index := AFrom;
for RTask in TReplicableTask.Create(
procedure
var
I: Integer;
begin
I := InterlockedIncrement(Index) - 1;
while I < ATo do
begin
AEvent(I);
I := InterlockedIncrement(Index) - 1;
end;
end
) do
RTask.Wait;
end;


And, just to remind you... I'm talking natively compiled code here :).  Let the rampant speculation begin....


 


*Some would argue that this is a "good thing," but that is a can-of-worms I will refuse to open at this point :)

16 comments:

  1. The source code is not properly formatted thus it's really hard to understand your examples.

    ReplyDelete
  2. Um, can't you do exactly the same thing by creating a reference counted wrapper for your little piece of code?

    The lifetime issue is then taken care of by the compiler generating the code to _Release the interface of the container object, which may also involve a compiler generated try..finally, certainly it seems to be guaranteed behaviour even in the presence of exceptions.

    Certainly I've used reference counted lifetime management to achieve what you seem to be describing for "throw away" utility classes. That is, classes whose instances are routinely called into existence, used and then discarded in the life of a single procedure.

    e.g. a string formatter that accepts params that it then uses to format a list of strings either added individually or from a TStrings, and yields the contents as a single, formatted string (for when "CommaText" just isn't good enough - LOL):

    var
    f: IStringListFormatter;
    begin
    f := StringListFormatter( ... );
    f.Add( aStringList );
    Form.Caption := f.Result;
    end;

    Internally this code instantiates a TStringListFormatter which isautomatically disposed of once 'f' falls out of scope.

    Combining this behaviour with a singleton also yielded a very useful little hourglass cursor manager:

    begin
    HourglassOn;
    :
    // lengthy operation that may even call procedures that
    // in turn call HourGlassOn... that's OK, reference counting
    // takes care of everything
    end;


    HourglassOn calls a function which instantiates a singleton object (if it does not already exist) and returns a reference to it. When that reference falls out of scope, it is _Release()ed. When ALL references have been _Release()ed, the singleton gets destroyed, so that the next call to HourglassOn() will create a new one.

    The singleton sets Screen.Cursor = crHourglass when created and to crDefault when destroyed (I know, it could be cleverer than that but it suited my needs).

    GUI methods that wish to turn on the hourglass simply call "HourGlassOn". It will get turned off again when that method returns, whether normally or by raising (or not handling) an exception.

    ReplyDelete
  3. Kinda cute idea, but the code is lying. You're not enumerating. Big-time code smell.

    Why not just add something like C#'s "using" block, so you're not lying at all? Just make it so that for..in *isn't* the only construct where the compiler will memory-manage a class.

    using RTask := TReplicableTask.Create(@ExecuteEvent)do
    RTask.Wait;

    Actually, better yet in this case would be to add a static method, and hide the try..finally inside it:

    TReplicableTask.Execute(@ExecuteEvent);

    ReplyDelete
  4. Just give us garbage collection in win32. Problem solved.

    Sean

    ReplyDelete
  5. Look at my experimental auto object implementation....
    http://cc.codegear.com/item/25108

    ReplyDelete
  6. Apart from the obvious ommisions/errors in the declarations of TTaskDestructor and TObjectDestructor, wouldn't class helpers fit perfectly into this solution?!?
    You could make a helper that adds GetEnumerator/GetCurrent/MoveNext to TReplicableTask, so no extra instance is needed.
    Or are class-helpers incompatible with the for-in construct?

    ReplyDelete
  7. Nice trick but the code reads really funny.

    I'm usually abusing Delphi's ability to auto-destroy interfaces to automatically destroy objects. In your case, it could be something like this:

    with AutoDestroy(TReplicableTask.Create(@ExecuteEvent)).Task do
    Wait;

    or

    AutoDetroy(TReplicableTask.Create(@ExecuteEvent)).Task.Wait;

    Where AutoDestroy is defined along those lines (untested):

    type
    IAutoDestroyWrapper = interface
    function GetTask: TReplicableTask;
    property Task: TReplicableTask read GetTask;
    end;

    TAutoDestroyWrapper = class(TInterfacedObject, IAutoDestroyWrapper)
    FOwnedTask: TReplicableTask;
    constructor Create(task: TReplicableTask);
    destructor Destroy;
    function GetTask: TReplicableTask;
    end;

    function AutoDestroy(task: TReplicableTask): IAutoDestroyWrapper;
    begin
    Result := TAutoDestroyWrapper.Create(task);
    end;

    constructor TAutoDestroyWrapper.Create(task: TReplicableTask);
    begin
    inherited Create;
    FOwnedTask := task;
    end;

    destructor TAutoDestroyWrapper.Destroy;
    begin
    FOwnedTask.Free;
    inherited;
    end;

    function TAutoDestroyWrapper.GetTask: TReplicableTask;
    begin
    Result := FOwnedTask;
    end;

    --
    Primoz

    ReplyDelete
  8. I agree with Joe White. Nice trick, but I would prefer something that is explicit instead of making the code lie.

    ReplyDelete
  9. Jolyon,

    I've used that trick several times in the past as well. The difference here is that the enumerator is freed immediately upon exit from the "for" loop rather than upon exit of the whole procedure. Sometimes you need the instance to be freed immediately.

    Allen.

    ReplyDelete
  10. Joe,

    I never said it was something I'd actually recommend doing.. it was just some kooky idea I thought up. I was really looking for an excuse to show the last experimental implementation :-)

    Allen.

    ReplyDelete
  11. Sean,

    You're jumping ahead. Need to stay with the class... ;-)...

    Allen.

    ReplyDelete
  12. Erick,

    Take another look. I think I fixed all of the formatting errors. Even Windows Live Writer has issues with properly formatting code.

    Allen.

    ReplyDelete
  13. Allen:

    "Sometimes you need the instance to be freed immediately"

    Then use an explicit try..finally (or a hypothetical "using..") construct. Since the lifetime management is compiler defined, it is also subject to compiler changes.

    The same applies of course to ref counted interfaces, but I thought the point of your example was that ensuring that the instance will _certainly_ be cleaned up (and making it less cumbersome to do so) is more important than knowing strictly _when_ it will get cleaned up.

    If you "need" to know exactly when an instance will be/has been freed, free it yourself at the appropriate time.

    Anything else is a convenient side effect that might just turn bite you in the future.

    ReplyDelete
  14. Hi Allen,

    Second post, not quite related to lambdas but more related to "auto destroying" things.
    We have some problems with the WITH construction: when we have an anonymous (!) object created there then, if the programmer doesn't pay attention, then that object is leaked, when, in fact it's obvious that its scope is only in the 'with' block. Think which can be clearly detected by the compiler (or it should). So, imho, it's much safer to inject an "auto-destroy" code if we have an anonymous object creation, something like:

    User input:


    With TStringList.Create do
    Begin
    //bunch of useful code
    End;


    Compiled code:


    With TStringList.Create do
    Begin
    //the same bunch of useful code
    .Free; //in a 'try' block perhaps?
    End;


    …this will apply also when we'll have (in Tiburon, of course :-) ) inline variables, ie. something like 'with tmp:=TStringList.Create do' (QC #679 - #9 Voted in QC!) and 'for i: integer=1 to 10 do' (QC #49588)

    Just my 2c

    ReplyDelete
  15. Why not do it like in .Net?
    And when adding a new feature, do it right, enforce an alias!

    using sl := TStringList.Create() do
    begin
    sl.Add('abc');
    sl....
    end;

    This gives reduced scope and autom. destruction. :-)

    ReplyDelete
  16. Robert,

    That wasn't the point of this post. It was about using the existing facilities that are already present in the language. Besides, who's to say that what you suggest is "right" or "wrong." It is merely different.

    Allen.

    ReplyDelete

Please keep your comments related to the post on which you are commenting. No spam, personal attacks, or general nastiness. I will be watching and will delete comments I find irrelevant, offensive and unnecessary.