Friday, November 2, 2007

The Life and Times of a Thread Pool

As promised in this post, I've uploaded a demo project to Code Central. It is an implementation of Conway's Game of Life that uses a thread pool and a parallel loop. The grid is 500x500 and wraps at the edges; only a portion of it is displayed in the UI. In both parallel and serial mode, none of the next-generation calculation is done on the UI thread.
While the implementation of the Life algorithm is interesting, it is not the point of this demo. The main focus is the Parallel.pas unit, which contains the thread pool and a couple of parallel for loop implementations. One implementation takes a method pointer as a callback; the other uses an old trick in which you pass in the address of a nested procedure. The demo uses the nested-procedure technique by default, but a simple modification will switch it to the method-pointer version.
This was developed using Delphi 2007+ (an internal developer build), but it should work fine with the shipping D2007. It is also Win32 only.
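For readers who want the overall shape before opening the project, here is a minimal sketch of the idea: a parallel for loop that hands each row of a wrapping grid to a thread pool through a callback. It is not the code from Parallel.pas, and it is written in Python rather than Delphi so that it can be run as-is. The names parallel_for, next_generation, process_row and the chunks parameter are hypothetical and chosen only for illustration; process_row is a nested function, which loosely mirrors the nested-procedure callback described above.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_for(pool, low, high, body, chunks=8):
    """Call body(i) for every i in [low, high], split across pool workers."""
    count = high - low + 1
    step = max(1, -(-count // chunks))  # ceiling division, at least 1

    def run_range(start):
        # Each worker handles one contiguous slice of the index range.
        for i in range(start, min(start + step, high + 1)):
            body(i)

    # list() waits for every chunk and re-raises any worker exception.
    list(pool.map(run_range, range(low, high + 1, step)))


def next_generation(pool, grid):
    """Return the next generation of a square, wrapping grid of 0/1 cells."""
    n = len(grid)
    result = [[0] * n for _ in range(n)]

    def process_row(y):
        # Nested callback: reads the shared grid, writes only result[y],
        # so rows can be processed concurrently without locking.
        up, down = grid[(y - 1) % n], grid[(y + 1) % n]
        row, out = grid[y], result[y]
        for x in range(n):
            left, right = (x - 1) % n, (x + 1) % n
            neighbours = (up[left] + up[x] + up[right]
                          + row[left] + row[right]
                          + down[left] + down[x] + down[right])
            alive = neighbours == 3 or (row[x] == 1 and neighbours == 2)
            out[x] = 1 if alive else 0

    parallel_for(pool, 0, n - 1, process_row)
    return result


if __name__ == "__main__":
    size = 500  # grid edge length quoted in the post
    grid = [[0] * size for _ in range(size)]
    grid[0][size - 1] = grid[0][0] = grid[0][1] = 1  # blinker across the edge
    with ThreadPoolExecutor(max_workers=4) as pool:
        grid = next_generation(pool, grid)
    print(sum(map(sum, grid)), "live cells after one generation")
```

In standard CPython the global interpreter lock keeps pure-Python loops like this from running truly in parallel, so the sketch shows the structure only and will not reproduce the speedups a native thread pool gives.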
Here's a simple screen-shot:

You can check and uncheck the "Parallel" check box while the application is processing generations to see the effect of going parallel. On a single-core system, toggling Parallel will make little difference.
For more information on Conway's Life and various other algorithms, here are a few links:
Finally, here is a very good implementation of Life written in Delphi:


  1. I had to delete ConwaysLife.dproj to get the project to open in RAD Studio 2007. Before that I got these errors:

    Unable to load project ConwaysLife.dproj
    The imported project "c:\program files\codegear\rad studio\5.0\MSBuild\Targets\Borland.Delphi.Targets" was not found. Confirm that the path in the declaration is correct, and that the file exists on disk.

  2. After adjusting the tag in the .dproj file (*) or after completely removing the .dproj file I could open the project and compile it with Delphi 2007.

    "Parallel" increases the speed by about 60 percent on my Core2 Duo.



  3. I've fixed the .dproj and re-uploaded the project to CC.

  4. Thanks for a great article Allen,

    Gens per sec increases from ~17.5 to ~96 in parallel on my 8 processor (2 x quad core) machine; 5.5 times faster - very impressive.

  5. Mark,

    I *want* your machine! That's a nice sounding box :-).


  6. Thanks for the article and code.

    For the uninitiated: could you just extend the ASM blocks in "Parallel.pas" with brief comments on what they actually do or the equivalent Pascal statements?

    TIA Olaf

  7. This is interesting! Thanks for posting.

    I too found the assembly bits puzzling - I know only a small amount about both x86 assembly and threading, so two mysterious topics in one code sample isn't good for me :) If you can't rewrite them in Pascal, could you perhaps at least comment them please, so we know what they're doing?

    Generations / sec: from ~27 to ~45 on my Core 2 Duo machine (about 1 2/3 times as fast). Not bad!

  8. Mr Allen, I don't have D2007 yet. Could you publish the executable file too? Thanks.

  9. Can this thread pool be used to handle thousands of connections in a high-performance Winsock server? That might be more practical than the Game of Life :)

  10. @D2006 user: after commenting out this line

    Application.MainFormOnTaskbar := True;

    in the dpr-file, building the project under D2006 was no problem.

  11. I like the code, and it gives me a heads-up for my current research. Thanks a lot, Allen! Also, I added some comments to the ASM blocks and uploaded the commented version to

  12. Great example..... Just one little thing though.. i feel that it would make a lot more sense as an example of using thread pools IF IT WAS COMMENTED with what you are really doing in the code.

    By the way, I got 33 gens per sec on my Core 2 Extreme in standard mode and 123 gens per sec in parallel mode. A nice performance increase, IF I can make it work with some other threaded code I have :)

  13. Folks from cnPack implemented their own TThreadPool class. You may find it interesting:

  14. Hey Allen. Awesome thread pool class for parallelization of for-loops! I’ve compared the performance of this code to the Omni-Thread library and another ThreadPool class written by Vitali Burkov and this seems to be the fastest for my purpose. I know this post is really old, but has anyone tried to update this for 64 Bit Delphi? I’ve tried, but have failed, most likely due to an incorrect translation of the assembler code for some of the procedures (I don’t really understand the assembler code). Any help on this would be greatly appreciated!

