Wednesday, September 19, 2007

Wading in the shallow end of the pool - Thread Pools

A few weeks ago I followed with interest Intel's announcement that they had open-sourced their Threading Building Blocks (TBB) library. OK, so it is written in C++, makes heavy use of templates, and, like nearly every C++ library out there, leans heavily on macros with umpteen levels of nesting... Despite all of that, I was able to understand it quite well; either I've gotten way better at reading and understanding C++ or these libraries are simplifying things. This got me thinking a bit, so I decided to take a quick look at Windows Thread Pools.

A thread pool is simply a way to utilize the CPU(s) more efficiently while incurring as little thread context switching overhead as possible. If your application needs to do lots of quick little tasks and can do them concurrently, the build-up and tear-down overhead of threads may actually cancel out any gains from trying to push these tasks into threads. Think of thread pooling as simply a way to queue up a bunch of work items and allow them to run on a number of available threads that are created once and used over and over. Thus you can amortize the build-up and tear-down of the threads across all the work they do. Once a work item is assigned to a thread, that work item has exclusive access to that particular thread until it finishes and allows the thread to return to the pool. A thread pool is not the solution for everything. For instance, if you have a task that is blocked most of the time, you should not run it on a thread pool thread, since it ties up a thread that other work items could be using.

Off I went on a discovery mission to see how Windows thread pooling works. I wanted to see visually what was going on as I scheduled or queued work items, so I came up with a simple TThreadPool class. Here's its declaration:

type
  TThreadPool = class
  private
    type
      TUserWorkItem = class
        FSender: TObject;
        FWorkerEvent: TNotifyEvent;
      end;
    class procedure QueueWorkItem(Sender: TObject; WorkerEvent: TNotifyEvent; Flags: ULONG); overload; static;
  public
    class procedure QueueWorkItem(Sender: TObject; WorkerEvent: TNotifyEvent); overload; static;
    class procedure QueueIOWorkItem(Sender: TObject; WorkerEvent: TNotifyEvent); static;
    class procedure QueueUIWorkItem(Sender: TObject; WorkerEvent: TNotifyEvent); static;
  end;


You'll notice that this is designed to be a singleton class, since all methods are static class methods. The main method of interest is QueueWorkItem. It simply schedules the WorkerEvent to be called from a thread pool thread at some later point, whenever that turns out to be. It is up to you to make sure that the instance on which the WorkerEvent handler is called is still valid at that time. The other two methods simply correspond to some of the flags you can pass to QueueUserWorkItem; they're not used right now. Sender is passed through to the event handler specified by WorkerEvent, so that object should contain the context in which the work item is to run.
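
As a rough illustration of how a caller might use QueueWorkItem, here is a minimal sketch. TWordCountJob and TJobRunner are hypothetical names invented for this example and are not part of the class above. The fragment assumes it lives in a unit that can see TThreadPool. The job object travels through Sender as the task's context, and the handler frees it when done.

```pascal
type
  // Hypothetical context object: everything one task needs to do its work.
  TWordCountJob = class
  public
    Text: string;
  end;

  // Hypothetical owner of the handler. This instance must stay alive
  // until every work item queued against it has finished running.
  TJobRunner = class
  public
    procedure CountWords(Sender: TObject);
    procedure Submit(const AText: string);
  end;

procedure TJobRunner.CountWords(Sender: TObject);
var
  Job: TWordCountJob;
  I, Count: Integer;
  InWord: Boolean;
begin
  // Runs on a thread pool thread; Sender is the context given to QueueWorkItem.
  Job := Sender as TWordCountJob;
  try
    Count := 0;
    InWord := False;
    for I := 1 to Length(Job.Text) do
      if Job.Text[I] <= ' ' then
        InWord := False
      else if not InWord then
      begin
        InWord := True;
        Inc(Count);
      end;
    // Count now holds the result; hand it off in a thread-safe way here.
  finally
    Job.Free; // in this sketch the task owns its context
  end;
end;

procedure TJobRunner.Submit(const AText: string);
var
  Job: TWordCountJob;
begin
  Job := TWordCountJob.Create;
  try
    Job.Text := AText;
    TThreadPool.QueueWorkItem(Job, CountWords);
  except
    Job.Free; // queuing failed, so the handler will never run
    raise;
  end;
end;
```

Letting the handler own and free its context keeps the lifetime question confined to one object, the TJobRunner instance, which is the one the caller has to keep alive.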

Now here's the implementation of that class:

function InternalThreadFunction(lpThreadParameter: Pointer): Integer; stdcall;
begin
  Result := 0;
  try
    try
      with TThreadPool.TUserWorkItem(lpThreadParameter) do
        if Assigned(FWorkerEvent) then
          FWorkerEvent(FSender);
    finally
      TThreadPool.TUserWorkItem(lpThreadParameter).Free;
    end;
  except
    // Eventually this will need to somehow synchronously notify the main thread and either reraise the exception over there or
    // otherwise provide some information about the exception to the main thread.
  end;
end;

{ TThreadPool }

class procedure TThreadPool.QueueWorkItem(Sender: TObject; WorkerEvent: TNotifyEvent);
begin
  QueueWorkItem(Sender, WorkerEvent, WT_EXECUTEDEFAULT);
end;

class procedure TThreadPool.QueueIOWorkItem(Sender: TObject; WorkerEvent: TNotifyEvent);
begin
  QueueWorkItem(Sender, WorkerEvent, WT_EXECUTEINIOTHREAD);
end;

class procedure TThreadPool.QueueUIWorkItem(Sender: TObject; WorkerEvent: TNotifyEvent);
begin
  QueueWorkItem(Sender, WorkerEvent, WT_EXECUTEINUITHREAD);
end;

class procedure TThreadPool.QueueWorkItem(Sender: TObject; WorkerEvent: TNotifyEvent; Flags: ULONG);
var
  WorkItem: TUserWorkItem;
begin
  if Assigned(WorkerEvent) then
  begin
    IsMultiThread := True;
    WorkItem := TUserWorkItem.Create;
    try
      WorkItem.FWorkerEvent := WorkerEvent;
      WorkItem.FSender := Sender;
      if not QueueUserWorkItem(InternalThreadFunction, WorkItem, Flags) then
        RaiseLastOSError;
    except
      WorkItem.Free;
      raise;
    end;
  end;
end;
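
The empty except block in InternalThreadFunction silently swallows exceptions for now. One possible stopgap, sketched here under assumptions and not what the class above does, is to record a description of each exception in a lock-protected list that the main thread can drain later. The unit name PoolErrors and the routines RecordPoolError and DrainPoolErrors are hypothetical.

```pascal
unit PoolErrors; // hypothetical unit name

interface

uses
  Classes;

// Called from a pool thread's except block to record what went wrong.
procedure RecordPoolError(const Msg: string);
// Called from the main thread; moves all recorded messages into Dest.
procedure DrainPoolErrors(Dest: TStrings);

implementation

uses
  SyncObjs;

var
  Lock: TCriticalSection;
  Pending: TStringList;

procedure RecordPoolError(const Msg: string);
begin
  Lock.Acquire;
  try
    Pending.Add(Msg);
  finally
    Lock.Release;
  end;
end;

procedure DrainPoolErrors(Dest: TStrings);
begin
  Lock.Acquire;
  try
    Dest.AddStrings(Pending);
    Pending.Clear;
  finally
    Lock.Release;
  end;
end;

// Assumed change to the except block in InternalThreadFunction:
//   except
//     on E: Exception do
//       RecordPoolError(E.ClassName + ': ' + E.Message);
//   end;

initialization
  Lock := TCriticalSection.Create;
  Pending := TStringList.Create;

finalization
  // Assumes no work items are still running at shutdown.
  Pending.Free;
  Lock.Free;

end.
```

This only preserves the message text. It does not tell the main thread which work item failed, and it does not re-raise anything, so it is a diagnostic aid at best.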


To see just what is going on I wrote this little application:




The numbers in the list box represent the ID of the thread that is currently running. The bands of color show how the threads are scheduled. What is interesting is that this is what it looks like after about the third or fourth run. The first time it runs, each color is painted in sequence in a clearly serialized manner. Subsequent iterations seem to interleave more and more. This is running on a dual-core system. You can get the whole application in Code Central.
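
For readers without access to the download, the thread-ID part of the experiment can be approximated with a small console program. This is not the Code Central application, only a stripped-down sketch. It assumes the TThreadPool class from this post lives in a unit called ThreadPoolUnit (a hypothetical name), and TReporter is likewise invented for the example.

```pascal
program PoolDemo;

{$APPTYPE CONSOLE}

uses
  Windows, SysUtils, Classes, SyncObjs,
  ThreadPoolUnit; // hypothetical unit holding TThreadPool from this post

type
  // Hypothetical helper: prints which pool thread ran each work item.
  TReporter = class
  private
    FLock: TCriticalSection;
    FPending: Integer;
    FDone: TEvent;
  public
    constructor Create(TaskCount: Integer);
    destructor Destroy; override;
    procedure DoWork(Sender: TObject);
    procedure WaitForAll;
  end;

constructor TReporter.Create(TaskCount: Integer);
begin
  inherited Create;
  FLock := TCriticalSection.Create;
  FPending := TaskCount;
  FDone := TEvent.Create(nil, True, False, ''); // manual reset, not signaled
end;

destructor TReporter.Destroy;
begin
  FDone.Free;
  FLock.Free;
  inherited;
end;

procedure TReporter.DoWork(Sender: TObject);
begin
  FLock.Acquire; // serialize console output across pool threads
  try
    Writeln('Work item ran on thread ', GetCurrentThreadId);
  finally
    FLock.Release;
  end;
  if InterlockedDecrement(FPending) = 0 then
    FDone.SetEvent; // the last work item wakes the main thread
end;

procedure TReporter.WaitForAll;
begin
  FDone.WaitFor(INFINITE);
end;

const
  TaskCount = 8;
var
  Reporter: TReporter;
  I: Integer;
begin
  Reporter := TReporter.Create(TaskCount);
  try
    for I := 1 to TaskCount do
      TThreadPool.QueueWorkItem(nil, Reporter.DoWork);
    // Keep Reporter alive until every queued handler has run.
    // (If queuing fails partway, this sketch does not wait for the
    // items already queued; real code would need to handle that.)
    Reporter.WaitForAll;
  finally
    Reporter.Free;
  end;
end.
```

Running it several times in a row and comparing the thread IDs is one way to see whether the pool spreads work across more threads on later runs, as described above.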

As multi-core systems become more and more mainstream (aren't they already??), your applications really should begin to take advantage of them. The problem is that multi-threaded, or concurrent, programming is not very easy: we humans tend to think serially, so it is conceptually a little tricky to understand all the various nuances of concurrency. This is where CodeGear is looking to help. By providing simple, easy-to-understand tools and libraries, we can help bring multi-core programming out of the realm of voodoo and black magic and into the hands of developers of all skill levels. This will involve providing both library and compiler/tool support.

16 comments:

  1. Mr Bauer said:
    "As multi-core systems become more and more mainstream (aren’t they already??),"

    No, they aren't. Well, at least here in Brazil.

  2. Multi-threading and now thread pools are very interesting. And your approach of using a TNotifyEvent, instead of forcing the task to be a member of a specific threading class, is something that could be used on the already existing TThread class.
    I think it would be more usable this way.

  3. Fabricio,

    Using an event is more consistent with how thread pools are intended to be used: short, atomic tasks, which are the very thing an event is generally used for. Using a TThread class descendant is geared more toward longer-lived background tasks.

    At any rate, your point is certainly taken.

    Allen.

  4. Well, not anymore, I guess. TThread now includes a pooling mechanism using queuing.
    (I believe since either D2005 or D2006.)

  5. As far as I can see from the code, you are calling the VCL without using Synchronize.

    Is this safe when using the TThreadPool class?

  6. Hm, on my machine the first time the example executes it operates on a single thread. Only the second time you press the button it executes on separate threads.
    Weird, I can see no obvious things in the code.

  7. I'm sorry to contradict what Fabricio said about multi-core not becoming mainstream in Brazil. Maybe he doesn't have one of his own at this very moment. But there are lots of offers of brand new multi-core systems (I understand that dual core is considered multi-core here) at very competitive prices, although there are still single-core systems to buy as well.

    Anyway, very nice article on thread pooling. I've been using thread pooling for quite some time within an application that uses sockets and accepts lots of TCP connections. Instead of creating a different thread for each connection, I'm using thread pooling to process them. I'm using the Synapse library, and the idea of thread pooling was taken from a sample by Andre Azevedo.

  8. Fabricio: You can already do this with TThread; just add an event property and make the Execute method call it.

    Allen: Regarding this question in your post:

    "Eventually this will need to somehow synchronously notify the main thread and either reraise the exception over there or otherwise provide some information about the exception to the main thread."

    How about handling this similarly to how TThread does it: Acquire the exception object, and set it as a property or event argument. So, for example, in addition to passing WorkerEvent, you could pass an ExceptionEvent which would be called (with the exception itself as an argument) if there was an exception.

  9. Thaddy,

    That is exactly the behavior I see on my system. I alluded to it in the post. Since I'm calling directly into the WinAPI, I can't see how the class has any effect on it. Just weird.

    Allen.

  10. Craig,

    The problem is that the thread pool class isn't associated with just one thread. Also, by the time the main thread gets around to figuring out that one of the tasks raised an exception, the context may be long gone and the task long since done. You can't use the thread ID to identify which task caused the error, since it gets recycled. I've been trying to think of a more robust, easy-to-use mechanism. The only thing I've really thought of is to block the thread until the exception is acknowledged, but that is still fraught with problems.

    Allen.

  11. FYI
    http://cc.codegear.com/ doesn't work right now.

    All I get is "Server is too busy"

    David

  12. Thanks, Allen. I'll play around with this and maybe give you feedback.

  13. Allen,

    I think this:


    ...
    TWorkerColor.Create(Self, clBlue);

    TWorkerColor.Create(Self, clTeal);
    ...


    -is almost too crude an example to show concurrency (mainly because it looks too similar to the TWorkerThread semantics we've all seen so much of). But on the whole I think it's in the right direction.

    Btw, Eric Grange and I are having an interesting conversation regarding concurrency in non-tech in the thread entitled "16-core CPUs are nearly upon us".

    I give a brief sketch of what I think would make a good concurrency class design.

    I'd be interested in your input.

    -d

  14. Dennis,

    Sure, the example is contrived and not truly representative of what you can do with pooling. Besides, the setup of the tasks was not really the focus of the example. I was just looking at how the pooling and scheduling worked.

    Allen.

  15. Allen:

    Pooling and scheduling are cool. Please keep going along this track.

    -d

  16. Can you show us a fuller example, where a class calls a method that "gets data" and then upon return triggers the next method in the sequence, say, "process data", and then "present data"?

    In other words can you prove that this API is robust enough to string together a causal chain in a non-linear environment? I have my doubts.

    Much better to roll your own, I think, than to use this API. But I appreciate the experimentation!
