Friday, February 22, 2008

Thread pools Task handling.

The Delphi Parallel Library (DPL) tends to occupy a lot of my otherwise idle "thought" time. DPL has been redesigned in my head more times than I can count.  The good news is that it is conceptually beginning to actually take some shape and is finding a direction.  As I've stated before, a lot of the concepts behind the DPL are taken from Intel's TBB and Microsoft's TPL.

When I started down this road, it all began as an investigation into various ways of implementing a thread pool.  It turns out that this simplistic mechanism really doesn't truly address parallelism and concurrent processing.  It is merely a way of shoveling off a task into another thread, which may or may not execute concurrently.  A thread pool doesn't do anything to "load-balance" the work in a constructive manner, nor does it purport to.  That is left as an exercise for the user.

While reading and researching all the various approaches to handling parallelism, a few light bulbs started coming on.  I've been approaching this whole thing from the wrong angle.  I'd been trying to shoehorn task handling (a task being the smallest unit of work that can be made parallel) concepts into the more general notion of a thread pool.  I admit that sometimes I'm not the sharpest knife in the drawer, but I can recognize when one approach is superior to another.

Rather than starting with the concept of thread pools, a parallel library should really begin at the most basic level, the task or just a unit of work.  A task is merely some chunk of code that may or may not execute concurrently.  Another thing to remember is that a parallel library is about getting these tasks done as fast as absolutely possible.  It is also about keeping as many of the CPU cores as busy as possible with as little idle time and overhead.  It is not about fairness. It is about speed.

To best do this, the internal scheduling and management of tasks should control how they are scheduled and processed on the CPU cores.  Remember, we're talking about the actual real, hardware CPU core (or virtual core, like Hyperthreading).  Not a thread.  A thread is merely a way to easily segregate tasks among the cores.  We'll rely on the operating system to make sure that of the available cores, then there are equally as many "threads" actually burning CPU cycles for real work.  Any other threads are stuck waiting for their chance at a core.  Because of this, a task manager should endeavor to keep the number of running threads as close to equal to the number of available cores.  It should also equally handle 1 or 100 cores by making sure as many cores are a busy as possible.

Using a task based approach also requires that the threading and scheduling is completely (or nearly completely) hidden.  The user should never think of threads.  The user only deals with Tasks and the extra abstractions built on top of them.  So it looks like DPL is headed for another rewrite, which is good because each time I do this it is only getting better. I've not spent a whole lot of time on the current iteration as it stands, so this isn't much of a setback. 

Microsoft's TPL is one of the few things to come out in recent years that looks great and performs well.  The concepts behind it are very well suited for Delphi.  To get an idea of some of the things you can do with it (and eventually the DPL ;-) you really should watch this video.  The really interesting bits come on toward the last 1/2 to 1/3 of the video where they go into the idea behind work stealing and work load balancing.  Really, really interesting stuff.  Joe Duffy also addresses how you need to think about parallelism and how it applies to your situation.

As part of this, I just ordered Joe's soon to be released book on Concurrent Programming on Windows Vista.  You can actually buy the book online ahead of the actual print release and download a PDF of the work in progress until then.  From reading only a few of the online excerpts, it looks to be one of those "must have" tomes for anyone doing any kind of parallel programming in native Windows and in .NET.  Even though a lot of the information I know, it is good to have it codified in a consistent organized manner.  As I stated in my first post about thread pools, as I dig deeper into this whole subject, I want to make sure that Delphi developers at all levels will find some value in such a library.  The best thing you can do at this point is to learn as much about these concepts as you can.  Hopefully this blog and others I've been referencing and will reference in the future will be an excellent source of information.  I'm finding that as I write about this, it give me a chance to review my assumptions and design ideas with a critical eye.  Hopefully, we can all learn something along the way as well.  I know I have so far.