Tuesday, August 25, 2009

Procrastinators Unite… Eventually!

We’re all taught at an early age to “Never put off until tomorrow that which can be done today.” In general, that is wise advice. However, there are some cases where you do want to wait until the last possible moment to do (or not do) something. In fact, that is one of the overall tenets of Agile Programming: delay decisions until the last possible moment, because you always know more about a problem tomorrow than you do today and can make a better, more informed decision. But I digress. I’m not here to talk about philosophies of life, or to introduce another “agile methodology,” or even to talk about a new weight loss plan based on bacon, lard and cheese puffs.

How many of you have written this same bit of boilerplate code or something similar over and over again?


if OSVersion >= 5.0 then
begin
  Module := LoadLibrary('kernel32.dll');
  if Module <> 0 then
  begin
    Proc := GetProcAddress(Module, 'APICall');
    if Assigned(Proc) then
      Proc(Param1, Param2);
  end;
end else
  { gracefully fallback }

 


What if you could just do this?


if OSVersion >= 5.0 then
  APICall(Param1, Param2)
else
  { gracefully fallback }

 


The astute among you will immediately see that with previous Delphi releases, the second form, while certainly preferable, would at some point still have to go through something like the first bit of code in order to call “APICall.” Normally, you would declare an external API reference like this:


procedure APICall(Param1, Param2: Integer); stdcall; external 'kernel32.dll' name 'APICall';

That will cause the linker to emit an external reference into the executable binary that is resolved at load time. Therein lies the problem. That is the situation that the first bit of code above was designed to avoid. If APICall didn’t exist in ‘kernel32.dll’ at load time, the whole application would fail to load. Game over. Thanks for playing. Now, what if you could write code similar to the second block of code above and still declare your external API calls in a manner similar to the above?


Starting with Delphi 2010, you can do exactly that. To make the second code block work even if “APICall” isn’t available on the version of the OS on which your application is currently running, we’ve introduced a new directive to be used only in the above context: delayed.

procedure APICall(Param1, Param2: Integer); stdcall; external 'kernel32.dll' name 'APICall' delayed;


Simply put, adding the delayed directive instructs the linker to generate the external API reference differently in the executable binary. Now Delphi’s RTL will take care of all that ugly “late-binding” boilerplate code for you. Rather than generating the import in the normal “Imports” section of the executable’s PE file, it is generated into the “Delayed Imports” section, following the published PE spec. This also requires that the RTL now has a generic function that does the proper lookups and binds the API the first time it is called. Subsequent calls are just as fast as a normal import.
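To make that a little more concrete, here is a minimal sketch of how this might look with a real API, assuming Delphi 2010 on Windows. It uses GetTickCount64, which kernel32.dll exports only on Windows Vista and later. The names DelayedGetTickCount64 and UptimeMilliseconds are made up for this illustration, and the version check simply mirrors the pattern from the examples above.

```pascal
program DelayedDemo;

{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

// Hypothetical local name for the real kernel32 export GetTickCount64
// (Windows Vista and later). Because of "delayed", the import is bound on
// the first call instead of when the executable is loaded.
function DelayedGetTickCount64: UInt64; stdcall;
  external 'kernel32.dll' name 'GetTickCount64' delayed;

// Illustrative helper: picks the API at runtime based on the OS version.
function UptimeMilliseconds: UInt64;
begin
  if CheckWin32Version(6, 0) then
    Result := DelayedGetTickCount64  // only reached on Vista or later
  else
    Result := GetTickCount;          // 32-bit counter on older systems
end;

begin
  Writeln('Uptime (ms): ', UptimeMilliseconds);
end.
```

Because the import is only bound on first use, a program like this should still load on Windows XP, where kernel32.dll has no GetTickCount64 export. With a regular (non-delayed) external declaration it would fail to load there.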


This is different from the similar functionality available in ILink32 from C++Builder, where you can only specify an entire DLL for which all references are delay loaded. Also, in C++ you cannot specify kernel32.dll or even ntdll.dll to be delay loaded, since the very startup of any application or DLL requires them to already be loaded. In Delphi you can, since it is on an API-by-API basis. The intent of this feature was to make managing all the new APIs from Windows Vista and now Windows 7 much easier without having to continuously and manually write all the delay-loading boilerplate code. Throughout VCL, we can simply make runtime decisions on which APIs to call without always going through those manually coded “thunks.” They are now simply handled by the compiler/linker, and the source code barely reveals the fact that something is late-bound.


In a future release, we are considering adding both the API-by-API capability to C++Builder and a way to specify an entire dll to be late-bound in Delphi.

25 comments:

  1. Personally, I would like to see an extension to the delayed keyword that pointed to a graceful fallback routine to avoid repeating the logic in the code everywhere.

    Something like

    procedure APICall(Param1, Param2: Integer); stdcall; external 'kernel32.dll' name 'APICall' delayed MyAPICallReplacement Test MyAPICallTest;

    With MyAPICallReplacement being a procedure/function with the same parameters, and MyAPICallTest being a boolean function. Then you could easily have the deferred method work AND not have to write in logic everywhere in your code (and if you re-use the function, you can save a boatload of tweaks later on. One routine to test for each OS level, or whatever condition you wanted to support. Heck, you could even abuse it in even more interesting ways)

    Give it a think. It is, I believe, the natural extension of the concept (and follows the Default, Stored etc model for properties that we are all so familiar with)

  2. Xepol,

    At a global level you can "hook" the whole library loading and API lookup process. You can even provide your own fallback DLL or API addresses. It's not exactly what you're asking for, but it does allow for the same thing.

  3. This is a fine step forward. In its absence I've been using Hallvard Vassbotn's delay load unit which has served me very well indeed. Thanks a lot Hallvard!

    If you were to implement something akin to Xepol's extension then you could make the test clause optional. If it was omitted then the fallback would be called whenever the entry point could not be found.

    When I import API functions I tend to wrap them up inside higher level functions or quite often classes. This allows me to convert the status code error checking of Windows API into more natural exception based handling in Delphi. So I personally don't see that Xepol's request would make much of a difference in terms of avoiding duplication of logic.

    Actually when I write code that does this kind of delay loading I prefer to take the decision based on whether or not the entry point can be found as opposed to the version based logic that you use in your example (OSVersion >= 5.0). This is best practice (as I understand it anyway) - for example see http://windowsteamblog.com/blogs/developers/archive/2009/08/05/version-checking-just-don-t-do-it.aspx

    Is there any efficient way to implement this approach? Presumably, if I call a delayed import and it can't be found, then an exception is raised. I guess I could catch this, but it would be nice to have a slicker and better-performing way of doing it. And what if I call the delayed import repeatedly? Presumably the attempt to resolve the import is only performed the first time it is called. Do I need to worry about race conditions where multiple threads may try to resolve the import simultaneously?

  4. Allen. You say "At a global level you can "hook" the whole library loading and API lookup process." Could you give me some more details please? Perhaps I should know about this already but I seem a little ignorant! Is this new in D2010?

  5. David,

    Yes, this is new in D2010. The hook mechanism is identical to how it is done in C++ and very similar to how it is done in VC++. There are a couple of global function pointers that you can assign.

  6. Hi Allen, this is nice, VERY nice indeed!

    Thank you.

  7. Javier Santo Domingo, August 25, 2009 at 11:35 AM

    Whatever makes a problem easier to solve is a big improvement, and that is what the delayed keyword is. It's nice to finally see Delphi with this mechanism too.
    Btw, there are many other common tasks that can be abbreviated automatically with new keywords... there are a couple of posts out there about several ideas around that, and there are some features of other Pascal dialects that are really cool also. In any case it will be nice to see the Delphi evolution.
    In fact, maybe there will be some developers with doubts about the new RTTI extension style (it's ok for me, but since it's attribute-like that may cause some resistance), but I think the whole community will always welcome new keywords like the delayed one. Everything that is respectful to the "orthodoxy" of the "pascal-sense" will be welcome for sure, and more so if it is a really useful language extension like this one. Keep this kind of improvement coming! Congrats!

  8. What if OSVersion >= 5.0 and APICall(Param1, Param2); isn't available?

  9. Assuming that OSVersion is not the only factor which determines the availability of some specific API call, it would also be nice to be able to write something like this:

    try
      APICall1(Param1, Param2);
    except // if APICall1 not available
      APICall2(Param1, Param2);
      ...
    end;

  10. Allen Bauer, thanks for sharing the blog post on “The Oracle at Delphi.” I can relate…no doubt!
    I am really thankful to you for doing such a great job.

  11. IMHO even better would be

    if IsAvailable(APICall) then
    APICall(Param1, Param2)
    else
    { gracefully fallback }

    as APIs (MS's included) are often not tied to a particular OS version and can be retrofitted, and there is always the case of all the non-MS APIs which aren't tied to the OS version at all.
    Best check if the API is there directly IME, it just works.

  12. [...] The Oracle at Delphi » Procrastinators Unite… Eventually! blogs.embarcadero.com/abauer/2009/08/25/38894 – view page – cached Embarcadero Developer Network Embarcadero Developer Network Communities Articles Blogs Resources Downloads Help Embarcadero Blogs » The Oracle at Delphi — From the page [...]

  13. Pity we have to check versions manually -- and worse yet, check them every time we call a given API function, and make sure we're checking for the *correct* version, every single time. If you load the API manually, you don't have any such problems -- if the function isn't there, GetProcAddress returns null. Easy.

    Why didn't you do the same thing with this feature? "if Assigned(APICall)" would fit in so nicely.

  14. Lots of variants on the same question here but no answers yet!

    Any chance of some enlightenment Allen?

  15. [...] DLLs. Allen Bauer posted an article about this new feature for Delphi under the strange headline Procrastinators Unite… Eventually! You can also think of it as “Delay loaded functions for Delphi”. function [...]

  16. @Allen -> I would suggest that a global hook might work ok if you don't use any 3rd party components that have their own global hooks. It would require a lot of fairly complicated and hard to maintain code to get it right that makes duplication of logic look easier to write and maintain in comparison. In other words, it can be done, but it would be better if it was done right. Getting that level of control over it is more in your ball park than ours, wouldn't you say? Yes - I will enter it into QC as a future feature request.

    @David Heffernan -> yes, an object interface would concentrate the core logic, but then a set of unit implementation calls would probably do the same thing. Pretty much defeats the point of the entire 'delayed' keyword tho, don't you think (yes, I have written code just like that.) The point is to HIDE as much of the wrench and hammer code as possible while leaving entry points for flexibility where absolutely required.

    Right now, I do think the new delayed method is actually one step BELOW declaring your own set of procedures to work in place of the delayed load calls with all the logic hidden behind those calls. However, with the appropriate tweaks, that can change.

  17. Xepol,

    The intent of a global hook is to allow some level of control over the functionality. Yes, you can run into problems if several third-party components decide to hook it at the same time. That is the nature of any kind of global hook.

    It was not intended to be the do-all, be-all of delay loading. The intent was to leverage as much of the existing, proven functionality that already existed in the C++ RTL. Rather than focusing on what it *isn’t* good for, how about looking at what it *is* good for? All your points are valid and certainly things to consider. If this new functionality doesn’t work for your case, its presence in no way undermines any other existing mechanism.

  18. @Allen

    I must say that I find your tone quite astonishing. You say, "Rather than focusing on what it *isn’t* good for, how about looking at what it *is* good for?" Are you implying that you don't appreciate constructive critical comment? Do you only wish to receive fan messages here?

    You are in the wonderful position of having a large body of experienced and knowledgeable users of your company's product who are willing to give you the benefit of that experience for free. That's a fabulous position to be in and you should cherish it rather than reject it. Most developers would love to be in your position - it's almost impossible to find customers of mine that would be so forthcoming.

    I would also say that when it comes to adding new language constructs I find it disappointing that you wait until after release before blogging. Surely it makes more sense to get this kind of feedback before the code is released. You'll now have to live with the decisions you have made. If you had got some feedback earlier surely it would have been more useful. Sure you may well have chosen to continue on this path but at least you would have had the option of taking on board some of these comments.

    You also said "All your points are valid and certainly things to consider. If this new functionality doesn’t work for your case, its presence in no way undermines any other existing mechanism." That's a perfectly valid comment but it just underlines to me how important it is to get feedback sooner. Why, oh why, didn't you blog about this months ago?

    OK, that's the general bits covered. Onto the specifics.

    I asked in an earlier comment about thread-safety. I'll ask again. Does the code which resolves the delayed imports (i.e. the bit that calls LoadLibrary and GetProcAddress) have any synchronisation? Or do I need to provide my own locks? In my humble opinion it would be better for the low-level framework to lock.

    I'd also add my support to all the comments that state that a mechanism to check for the availability of an import would be very useful (Hallvard Vassbotn's delay loading code allows for this).

  19. David,

    You're absolutely right. My tone was certainly a little too harsh. For that I apologize. I do appreciate you bringing it up in a respectful and constructive manner. I will endeavor to do the same.

    In a perfect world, I would have liked to have blogged more prior to release. With all the demands on my time coupled with some significant negative events in my family life, that was just not possible this time.

    My follow-up post explains in a little more detail *why* it was implemented in the manner that it was. It was also done to address an immediate problem we were having with adding Windows Vista and Windows 7 support while continuing to support Windows XP and 2000. By all means, do log these suggested enhancements into QC (http://qc.embarcadero.com).

    To your question about thread safety: yes, internally it does do proper thread synchronization. It tries to hold the lock for as short a period as possible. In reviewing the code, I do see some room for improvement that could either eliminate the lock entirely or vastly reduce the necessity for one.

    Finally, I am looking at what it would take to add a mechanism for checking the availability of an import. If I come up with a solution that works with the current system, I will be sure to publish it here, just like I did with the post about adding better exception support.

  20. Allen,

    Thanks a lot for your response - it is much appreciated and reassuring.

    On the matter of locks I presume that it only locks when the framework attempts to resolve the import. Once the import has been resolved one way or the other then the outcome is known and no further attempts to resolve the import are needed. All this presumes that the resolution of the import is done once and once only which is certainly possible (that's how Hallvard's code does it).

    If this is indeed the case then I wouldn't worry about the performance side of the lock. Certainly for all the use cases that I can think of then performance isn't a big deal. For the usage that I can think of there wouldn't be more than a few tens of delayed imports. I think you'd be fine with a single global critical section.

    If there are users that do (for some reason) need to have many threads call many delayed imported functions simultaneously then those users could deal with performance issues themselves.

    I guess what I'm trying to say is that I don't think that there will be any discernable contention on delayed imports and that a coarse, easy to verify, simple to implement, locking solution will suffice. Something more complex would probably not be worth the effort and your time would be better spent elsewhere.

  21. David,

    "On the matter of locks I presume that it only locks when the framework attempts to resolve the import. Once the import has been resolved one way or the other then the outcome is known and no further attempts to resolve the import are needed."

    Yes, that is in fact what it does. It also supports "bound imports" if you used the bind tool from MS. In this case, if the delay import table has bound imports set and the timestamp (the timestamp field of the PE header) of the imported dll matches the information in the delay import table, then *all* bound imports are resolved at once.

    You can also explicitly "unload" a delay imported dll, in which case the delay import table is reset, en masse back to the original state. The whole process will then restart once you call the API again.

  22. @Allen

    Thanks a lot for answering my queries.

  23. @Allen -> I don't mean to be over critical, honestly I don't. The idea is a good one in principle.

    Where I am critical of the current solution is how far it falls short of a good solution. Again, don't take me wrong - there is definitely a problem to solve here. However, with so much left undone, the little bits that are done hardly make it worth using.

    If you use it, you lose backward compatibility. That's fine as long as the trade off is fair. In this case, I'm not convinced that it is yet, specifically because of how much you still have to do to use it.

    Look, I could write code like this:

    Var
      Module : HModule;
      _APICall : TAPICall;

    Function CheckLoaded : Boolean;
    Begin
      If (Module = 0) And (OSVersion >= 5.0) Then
      Begin
        Module := LoadLibrary('library.dll');
        _APICall := GetProcAddress(Module, 'APICall');
      End;
      Result := (Module <> 0);
    End;

    Procedure APICall(Param1, Param2 : Integer);
    Begin
      If CheckLoaded Then
      Begin
        _APICall(Param1, Param2);
      End
      Else
      Begin
        { gracefully fallback }
      End;
    End;

    How much do I really save using the new method? Either I have to copy the test logic everywhere or write a wrapper routine for the test before the call - so the wrapper routine is a wash. I do not have to declare a call type and a holder variable in the new method, so there is that. As for the CheckLoaded routine - encapsulating logic like that is easy, so the CheckLoaded routine itself is no great shake.

    To sum up, compared to how it used to be done, in return for losing reverse compatibility I am saved from a load library call, a GetProcAddress call, and a few variable declarations - but I still end up having to write most of the rest of the wrapper logic or risk forgetting a test somewhere and having my code blow up randomly.

    That's not a good swap in my mind. If I could remove most of the wrapper logic and just have to write a test routine and a graceful handling routine for each, then I could see it.

    I guess my problem is that delayed DLL loading is such an easy bit of logic to write, any solution that limits my code's ability to work on older compilers for any reason really needs to be compelling.

    I hope that communicates my feelings on the topic more clearly.

  24. There is one thing that is not mentioned: is it thread safe?


Please keep your comments related to the post on which you are commenting. No spam, personal attacks, or general nastiness. I will be watching and will delete comments I find irrelevant, offensive or unnecessary.