Wednesday, November 1, 2006

Exceptional Safety

Back in late 1992 or 1993 or so, we had a dilemma.  We wanted to add exceptions to Turbo Pascal (which was soon to become the basis for Delphi's programming language).  Windows NT was in full-swing development, and with this new OS came a new-fangled thingy called Structured Exception Handling (SEH).  So what was the dilemma here?  If you'll recall, the first release of Delphi targeted Windows 3.1, a.k.a. Win16.  There was no OS-level support for SEH.  Also, Windows NT wasn't going to be released as a mass-market consumer OS.  Again, what's the problem?  Just add your own implementation of SEH to the language and move on.  The problem was all about "safety."  So we, of course, added our own implementation of SEH to the 16-bit version of the Delphi language.  Aren't exceptions supposed to make your code safer?  OK, I'm being obtuse and a little evasive here.

The fundamental problem we faced was the notion of a partially constructed object.  What if an exception is raised halfway through an object's constructor?  How do you know how far the constructor got by the time the exception handler runs and the object's destructor is called?  The constructor is executing merrily along, initializing fields, constructing other objects, allocating memory buffers, etc...  Suddenly, BAM!  One of those operations fails (can't open a file, bad pointer passed in as a constructor parameter, etc...).  Since the compiler has already injected an implicit exception handler around the execution of the constructor, it catches the exception and promptly and dutifully calls the destructor.  Once that is complete and the partially constructed object is destroyed and all the resources are freed, the exception is allowed to continue; in other words, it is re-raised.  The problem in this scenario is that the destructor really has absolutely no clue why it got called (sure, it could see that an exception is currently in play, but so what?).  The destructor doesn't know whether the instance was ever fully constructed and, if not, how much got constructed.
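
To make that sequence concrete, here is a minimal sketch. The class TTwoBuffers, its fields and the AFail parameter are invented purely for illustration. The constructor raises after the first field is assigned, the destructor runs on the half-built instance, and only then does the exception reach the caller.

```pascal
program PartialConstruct;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  // Hypothetical class, invented for this illustration.
  TTwoBuffers = class
  private
    FFirst: TObject;
    FSecond: TObject;
  public
    constructor Create(AFail: Boolean);
    destructor Destroy; override;
  end;

constructor TTwoBuffers.Create(AFail: Boolean);
begin
  inherited Create;
  FFirst := TObject.Create;
  // Simulate an operation failing halfway through construction.
  if AFail then
    raise Exception.Create('could not open file');
  FSecond := TObject.Create;
end;

destructor TTwoBuffers.Destroy;
begin
  // Called automatically when the constructor raises.
  Writeln('Destroy: FFirst assigned = ', Assigned(FFirst),
          ', FSecond assigned = ', Assigned(FSecond));
  FSecond.Free;
  FFirst.Free;
  inherited Destroy;
end;

var
  Obj: TTwoBuffers;
begin
  try
    Obj := TTwoBuffers.Create(True);
    try
      Writeln('not reached when the constructor raises');
    finally
      Obj.Free;
    end;
  except
    on E: Exception do
      Writeln('Caught after Destroy ran: ', E.Message);
  end;
end.
```

Running it should show the Destroy line before the Caught line, with FSecond reported as unassigned.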

The solution turned out to be remarkably simple and somewhat clever at the same time.  Since all classes in Delphi have an associated virtual method table (VMT), each instance must be initialized to point to that table.  Since Delphi classes allow you to make virtual method calls from within the constructor, that VMT pointer has to be initialized before the constructor is allowed to execute.  If the VMT pointer has to be set before the constructor executes, why not just initialize the entire instance to some known state?  This is exactly what is done.  The entire instance is initialized to zero (0), the VMT pointer is set, and if the object implements interfaces, those VMT pointers are also set.  So once the user's code in the object constructor begins to execute, you know that the instance data is in a known state.  Using this fact, the destructor can easily tell how far into the object's initialization sequence things got before the world went haywire.  Remember yesterday's post where I mentioned the FreeAndNil procedure?  Another item to note is the non-virtual TObject.Free method.  Because you can assume that an instance field containing a non-nil or non-zero value must have been successfully initialized, it should also be OK to de-initialize it.  This matters most for any memory allocations or object constructions that took place in the constructor (or any other object method, for that matter).  The destructor has to know when a valid pointer is in that field.  So a non-nil value means: go ahead and free the memory or destroy the object.
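
A small sketch of those two guarantees follows. TBase, TDerived and their members are hypothetical names made up for this example. The constructor reads its own fields before assigning anything, then makes a virtual call, which should already dispatch to the descendant's override.

```pascal
program ZeroInit;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  // Hypothetical classes, invented for this illustration.
  TBase = class
  private
    FCount: Integer;
    FBuffer: Pointer;
    FChild: TObject;
  public
    constructor Create;
    procedure Describe; virtual;
  end;

  TDerived = class(TBase)
  public
    procedure Describe; override;
  end;

constructor TBase.Create;
begin
  inherited Create;
  // Nothing has been assigned yet, so every field should read as zero or nil.
  Writeln('FCount = ', FCount);
  Writeln('FBuffer is nil: ', FBuffer = nil);
  Writeln('FChild is nil: ', FChild = nil);
  // The VMT pointer was set before this code ran, so the virtual call
  // resolves against the actual class being constructed.
  Describe;
end;

procedure TBase.Describe;
begin
  Writeln('TBase.Describe');
end;

procedure TDerived.Describe;
begin
  Writeln('TDerived.Describe');
end;

var
  Obj: TBase;
begin
  Obj := TDerived.Create;
  try
    // Nothing else to do; the constructor did the reporting.
  finally
    Obj.Free;
  end;
end.
```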

We realized, too, that it would be very tedious and error-prone for the user to always have to remember this pattern: if FField <> nil then FField.Destroy;  Enter TObject.Free.  If you open System.pas and look at the implementation of Free, it simply does if Self <> nil then Destroy;  So you can safely call Free on a nil, or unassigned, instance.  That is because the check is done within that method.  All you need to do is FField.Free; and your destructor is now "exception safe."  The same thing can be done for memory allocated with GetMem.  You can safely call FreeMem(FField), even if FField is nil.  It just returns.  Finally, it should be noted that certain "managed types" such as strings, interfaces, dynamic arrays and variants are also handled automatically through some compiler-generated meta-data.  Just before an object instance's memory is freed, an RTL function is called with the instance and this meta-data table, which contains field types and offsets, so the RTL function knows how to free those instance fields.  Again, if a particular field is nil, it is simply passed over with no action needed.
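
Here is a rough sketch of a destructor written that way. TResource and its AllocateAll parameter are hypothetical, invented for this example. The destructor contains no nil checks and should behave the same whether or not the fields were ever assigned.

```pascal
program SafeDestructor;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  // Hypothetical class, invented for this illustration.
  TResource = class
  private
    FList: TObject;
    FData: Pointer;
  public
    constructor Create(AllocateAll: Boolean);
    destructor Destroy; override;
  end;

constructor TResource.Create(AllocateAll: Boolean);
begin
  inherited Create;
  if AllocateAll then
  begin
    FList := TObject.Create;
    GetMem(FData, 64);
  end;
end;

destructor TResource.Destroy;
begin
  FreeMem(FData); // simply returns when FData is nil
  FList.Free;     // Free checks Self for nil before calling Destroy
  inherited Destroy;
end;

var
  Full, Empty, Missing: TResource;
begin
  Full := TResource.Create(True);
  Empty := TResource.Create(False);
  Missing := nil;
  Full.Free;
  Empty.Free;   // fields were never assigned; the destructor still works
  Missing.Free; // calling Free on a nil reference is a no-op
  Writeln('all done without a single nil check');
end.
```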

What about the FreeAndNil thingy?  The astute among you have probably noticed that the implementation of that procedure actually sets the passed-in reference to nil first and then destroys the instance.  Shouldn't the name actually be NilAndFree?  Yeah, probably.  But it just doesn't roll off the tongue very well and is equally confusing.  "So if you set the reference to nil first... how can you destroy it?"  So why was it implemented in this manner?  Exception safety is one big reason.  Another, significantly more obscure, reason involves intimately linked objects.  Suppose you have one object that holds a list of other objects, which in turn hold a reference back to the "owner" object.  Depending on the order in which the objects get destroyed, they may need to notify other objects of their impending doom.  Since the owner and the owned objects are intimately linked, they directly call methods on each other throughout their lifetime.  However, during destruction, it could be very dangerous to willy-nilly call methods on the other instance while it is in the throes of death.  By setting the instance pointer to nil before destroying the object, a simple nil-check can be employed to make sure no method calls are made while the other instance is being destroyed.
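
The linked-object case can be sketched roughly as follows. TOwner, TChild and their methods are hypothetical names invented for this example, with a single child standing in for the list. Because FreeAndNil clears FChild before the child's destructor runs, the owner's nil-check is what keeps it from calling back into a dying child.

```pascal
program LinkedObjects;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  TOwner = class;

  // Hypothetical owned object holding a reference back to its owner.
  TChild = class
  private
    FOwner: TOwner;
  public
    constructor Create(AOwner: TOwner);
    destructor Destroy; override;
    procedure Ping;
  end;

  // Hypothetical owner; one child stands in for a list of them.
  TOwner = class
  private
    FChild: TChild;
  public
    constructor Create;
    destructor Destroy; override;
    procedure ChildChanged;
  end;

constructor TChild.Create(AOwner: TOwner);
begin
  inherited Create;
  FOwner := AOwner;
end;

destructor TChild.Destroy;
begin
  // Notify the owner of the impending doom.
  if FOwner <> nil then
    FOwner.ChildChanged;
  inherited Destroy;
end;

procedure TChild.Ping;
begin
  Writeln('child pinged');
end;

constructor TOwner.Create;
begin
  inherited Create;
  FChild := TChild.Create(Self);
end;

destructor TOwner.Destroy;
begin
  // FChild is set to nil first, then the child is destroyed.
  FreeAndNil(FChild);
  inherited Destroy;
end;

procedure TOwner.ChildChanged;
begin
  if FChild <> nil then
    FChild.Ping
  else
    Writeln('child is going away; not calling into it');
end;

var
  Owner: TOwner;
begin
  Owner := TOwner.Create;
  try
    Owner.ChildChanged; // child is alive, so it gets pinged
  finally
    Owner.Free;         // the child's callback finds FChild already nil
  end;
end.
```

Had TOwner.Destroy used FChild.Free instead, the callback would have found a non-nil FChild and called Ping on an instance that was in the middle of its destructor.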

So there you have it: a few little tips on ensuring your objects are "exception safe" and a couple of hints about when you should use FreeAndNil.  By peeking under the hood and examining the code, you can get a better understanding of why and how things are implemented.  So, you could always use the if FField <> nil then FField.Destroy pattern, but why, when calling FField.Free; does all the work for you?  Using the pattern if FField <> nil then FField.Free; is redundant, as is if Assigned(FField) then FField.Free;

18 comments:

  1. Not seeing a freeAndNil for yesterday, just Assigned. Did I miss something?

  2. Patricio Moschcovich, November 1, 2006 at 3:19 AM

    I want more of these stories!


    How about writing a book with all the tidbits? It could be a "sequel" to Danny's book. It looks like Julian Bucknall's book printed through Lulu was a success.

  3. Thanks Allen, what an enlightening post! Keep 'em coming!

  4. Oh by the way, while we're at it:


    What about the EmptyStr constant vs. '' ?

  5. Ah, forget I ever said anything. I just found the source comment in the SysUtils unit... ;)

    D'Oh!

  6. Lluis (Albert Research), November 1, 2006 at 10:33 PM

    Great ! ;-)

    I really appreciate these kinds of posts and stories, as they teach me how to do better with my code... Keep on writing. :-)

  7. Allen,


    Just to be sure I got you, are you saying that in the code:


    var o: TMyClass;

    if Assigned(o) then o.Free;


    it makes NO sense to test for Assigned, and that one can safely simply call "o.Free" without testing for nil?


    p.s.

    I read this from your last sentence: "Using the pattern, if FField <> nil then FField.Free; is a redundant, as is, if Assigned(FField) then FField.Free;"





  8. Zarko,


    Yes, you've got it. There is no point in "if Assigned(o) then o.Free;", or "if o <> nil then o.Free;". Just "o.Free" works fine. :-)

  9. Allen,


    please go on with these articles.

    I can't get enough of this insider stuff! ;-)


  10. Ken, I wanted to warn about something else here ...


    Calling o.Free after o := nil is very bad as it will not call the destructor.


    And also, calling o.Free without instantiating o to nil would produce some strange effects.

  11. Zarko,


    I think what Ken was getting at was that, in the case of an instance of a class, all the instance fields are initialized to 0 automatically and *before* the constructor is run. Global variables are *also* initialized to 0. I've seen this many times (and have been just as guilty of it myself):


    var
      O: TObject = nil;


    The nil assignment is not needed and in fact can serve to actually *increase* the size of your executable since it changes which segment that variable goes into in the EXE.


    In *most* cases within a destructor, you need not set an instance field to nil after it has been freed, because that memory is going to be deallocated anyway. Only in the rare cases where you have a set of classes that interact with one another *may* you need to use FreeAndNil.

  12. Allen,


    I was not talking about the destructor, but rather the following case


    var obj : TMyClass;
    begin
      {
        a lot of code here BUT
        maybe NOT a line like

        obj:=TMyClass.Create
      }

      obj.Free //AV

      *OR*

      if Assigned(obj) then obj.Free //AV
    end;


    Therefore, obj must be assigned a NIL "value" if you are unsure whether something like obj := TMyClass.Create will be executed, since local (inside a method) variables are NOT initialized to NIL the way global (unit) or private (form-level, for example) ones are.
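
    One way to handle that case is sketched below. The procedure Work and its NeedObject parameter are made up for illustration. The local is set to nil before the try, so the Free in the finally block is safe whether or not the object was ever created.

    ```pascal
    program LocalNil;
    {$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
    {$APPTYPE CONSOLE}

    // Hypothetical procedure: the object is only created on some code paths.
    procedure Work(NeedObject: Boolean);
    var
      Obj: TObject;
    begin
      Obj := nil; // local object references are not initialized for you
      try
        if NeedObject then
          Obj := TObject.Create;
        if Assigned(Obj) then
          Writeln('object was created')
        else
          Writeln('object was never created');
      finally
        Obj.Free; // Obj is either nil or a valid instance here
      end;
    end;

    begin
      Work(True);
      Work(False);
    end.
    ```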

  13. Zarko,


    In that case you should use the following pattern:


    var
      Obj: TMyClass;
    begin
      Obj := TMyClass.Create;
      try
        // ... use Obj here ...
      finally
        Obj.Free;
      end;
    end;


    If an exception is raised in the constructor, the "Obj :=" assignment will not occur, and since you've not entered the try..finally block, the Obj.Free; line will not execute either. In fact, in the above case, the exception should continue on to an outer scope and nothing else in this function will execute.


    Allen.

  14. Allen,


    My only issue here is that not all fields do get created. Sometimes they just don't. They were born nil and they die nil. But if you're making sure you call field.Free in the destructor, you get an A/V if no one ever created it....at least that is what I believe happens ;).


    Am I wrong here? That's why I'm always checking Assigned()...not so much for a double free, but for a 'did you get created at all during the lifetime of this object'?

  15. Randy,


    That was the point. For an object instance, you can be sure that all fields will be nil upon entering the constructor. Since they are guaranteed nil, you can safely just call Obj.Free; in the destructor, even if that object instance was never allocated. The field will remain nil. However, if that field is "cycled" throughout the life of the object, then you *should* set it to nil whenever the instance is freed.
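
    A rough sketch of such a "cycled" field follows. TSession, Open and Close are hypothetical names invented for this example. The field is set back to nil every time the instance is freed, so the destructor's plain Free stays correct no matter where in the cycle the object dies.

    ```pascal
    program CycledField;
    {$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
    {$APPTYPE CONSOLE}

    uses
      SysUtils;

    type
      // Hypothetical class whose field is created and freed repeatedly.
      TSession = class
      private
        FConnection: TObject;
      public
        destructor Destroy; override;
        procedure Open;
        procedure Close;
      end;

    destructor TSession.Destroy;
    begin
      FConnection.Free; // nil after Close, valid after Open
      inherited Destroy;
    end;

    procedure TSession.Open;
    begin
      Close; // drop any previous instance first
      FConnection := TObject.Create;
    end;

    procedure TSession.Close;
    begin
      // Leaves FConnection nil, so a second Close or the destructor is harmless.
      FreeAndNil(FConnection);
    end;

    var
      Session: TSession;
    begin
      Session := TSession.Create;
      try
        Session.Open;
        Session.Close;
        Session.Close; // safe: the field is already nil
        Session.Open;
      finally
        Session.Free;  // frees the still-open connection
      end;
      Writeln('session cycled and destroyed');
    end.
    ```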


    Allen.

  16. One thing to be aware of with FreeAndNil is that it's not thread-safe. If you have 2 threads trying to free the same object it can happen that both read the variable, both set the variable to nil and then both enter the destructor of the object.


    That's one of the reasons why I use this implementation instead:


    procedure nxFreeAndNil(var Obj);
    asm
      xor ecx, ecx
      lock xchg [eax], ecx
      mov eax, ecx
      test eax, eax
    {$IFNDEF DCC6OrLater}
      jnz TObject.Free
    {$ELSE}
      jz @@Exit
      mov dl, 1
      mov ecx, [eax]
      jmp dword ptr [ecx + VMTOFFSET TObject.Destroy]
    {$ENDIF}
    @@Exit:
    end;


  17. Thorsten,


    Of course you're right, but I haven't even gotten to the notion of thread-safety. However your implementation looks interesting... and maybe something we'll look into using ;-)


    Allen.

  18. Allen,


    Feel free to use it...


    The key advantages are:

    a) the lock xchg makes sure that only one of multiple threads concurrently calling the procedure for the same variable will actually go ahead and try freeing the object.

    b) the direct jump into the destructor avoids the need to first call Free, which then checks again whether Self is nil and then calls the destructor, after which the return address on the stack would go back to Free and from there back to FreeAndNil. The stack is in the same state when calling nxFreeAndNil and when calling the destructor. The only thing expected on the stack is the return address. By jumping to the destructor instead of calling it, the RET in the destructor will later return directly to the caller of nxFreeAndNil instead of returning there just to find another RET statement.

