5.9.13

Simple FIFO with Congestion Control - Why JDK don't consider it?

I think this is a common requirement:

I need to run some tasks in a certain amount of threads. If the tasks created too fast, I want them to wait at the queue. But if there are too many tasks waiting already, I need to slow down the caller and have it wait there until the task queue is available.

ThreadPoolExecutor support this feature except the slow down part. A similar implementation is to apply with CallerRunsPolicy so that the caller thread is used to run a task when the queue is full. However, if I have a queue of 1 million, and the core size/max pool size set as 1, then I probably have task number 1, and task 1 million and 1 finish first. We usually don't expect the tasks finished in a strict order, but having task 1 and task 1 million and 1 finish before others seems too radically, isn't it?

Similar thing happen to maxPoolSize. New threads will only be created when the Queue is full. Those new created threads will be assigned with those tasks that exceed the Queue, instead of assigning the oldest thread already in the Queue. It is the 1 million and 1 issue all over again.

According to those characteristics of ThreadPoolExecutor, I think we need to remember these rules/best practices:


  • Use Executors.* factory methods if they fit your requirements. The Executors.* factory methods crease less confusion:
    • newFixedThreadPool
      • use same size for core and max pool.
      • use unlimited blocking queue.
    • newSingleThreadExecutor
      • A special version of newFixedThreadPool
    • new CachedThreadPool
      • core size = 0
      • max pool = unlimited
      • 60 second alive time
      • SynchronousQueue for queuing, which means all task will just go directly to execution without waiting.
  • Executors.* factory methods will definitely create OOM problem in situation like Denial of Service attack. Then you might want to have your own solution out of these provided ones. In this case, you need to remember:
    • Don't rely on the FIFO concept unless you elaborately set up the thread pool. Make those tasks stateless and totally order independent if it is possible.
    • Avoid using a work queue and changeable pool size at the same time, because the new created threads will pickup the tasks failed to add to the Queue, which will totally mess up with the order in a dramatic way. If Executors.newCachedThreadPool() is not the option, you probably want a pool with same core and max pool size, a long with a limited Queue size.
    • If FIFO and congestion control are important for your system, and the tasks are based on some simple data, you probably want to create your own BlockingQueue to buffer the input. In this way, you can control the data queue easily and dispatch them to the  target thread.

No comments: