
Stopping the Future in Time

In Java 5, a new set of concurrency-related types was added to the Java API, mostly authored by Doug Lea. These include the ExecutorService (Java Doc) and Future (Java Doc) interfaces, to name a few, all part of the concurrency framework introduced in the same version of Java (Concurrency Utilities Overview). This article describes a problem with stopping a group of Futures in time. It first shows the problem and then proposes a simple solution.

All code listed below is available at: http://code.google.com/p/java-creed-examples/source/checkout. Most of the examples do not contain the whole code and may omit fragments that are not relevant to the point being discussed. Readers can download or view all the code from the above link.

This article assumes that the readers have some knowledge of threads and the Java concurrency framework.

Problem Setup

Let’s first create some classes, which we will use in this example. The first class is a simple worker class that implements Runnable. This class sleeps for a given time and then finishes, as illustrated below.

package com.javacreed.examples.concurrency.sfit;

import java.util.concurrent.TimeUnit;

public class MyWorker implements Runnable {

  private final int sleepTime;

  public MyWorker(final int sleepTime) {
    this.sleepTime = sleepTime;
  }

  @Override
  public void run() {
    final long startTime = System.nanoTime();
    try {
      Thread.sleep(TimeUnit.SECONDS.toMillis(sleepTime));
      Util.printLog("Finished");
    } catch (final InterruptedException e) {
      Thread.currentThread().interrupt();
      final long interruptedAfter = System.nanoTime() - startTime;
      Util.printLog("Interrupted after %,d nano seconds", interruptedAfter);
    }
  }
}

Observation

Note that we are interrupting the current thread again within the catch block, as highlighted below.
    } catch (final InterruptedException e) {
      Thread.currentThread().interrupt();
      final long interruptedAfter = System.nanoTime() - startTime;
      Util.printLog("Interrupted after %,d nano seconds", interruptedAfter);
    }

This is very important, as otherwise the caller of the run() method will miss the thread interruption and will not be able to determine whether the thread was interrupted or not. By the time the InterruptedException is caught, the interrupted status of the thread has already been cleared. By interrupting the thread again from within the catch block, we restore that status, and the caller of the run() method will be able to determine that this thread was interrupted. Ideally we would not catch this exception at all, but the signature of the run() method does not permit a checked exception to propagate.
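
The effect of restoring the flag can be observed in isolation. The following is a minimal sketch and is not part of the article's project: the class name InterruptFlagDemo and the method flagAfterCatch() are made up for illustration. It interrupts the current thread, lets Thread.sleep() throw, and then reports whether the interrupted status is still visible afterwards.

```java
// Hypothetical demo class (not part of the article's code base).
final class InterruptFlagDemo {

  /**
   * Interrupts the current thread, catches the resulting InterruptedException
   * and returns the interrupted status that a caller would observe afterwards.
   */
  static boolean flagAfterCatch(final boolean restore) {
    Thread.currentThread().interrupt();
    try {
      // Throws immediately, because the interrupted status is already set
      Thread.sleep(1000);
    } catch (final InterruptedException e) {
      // The status was cleared when the exception was thrown
      if (restore) {
        Thread.currentThread().interrupt();
      }
    }
    // Reads the status and clears it, so the demo leaves no state behind
    return Thread.interrupted();
  }

  public static void main(final String[] args) {
    System.out.println(flagAfterCatch(false)); // prints false
    System.out.println(flagAfterCatch(true));  // prints true
  }
}
```

Without the second call to interrupt(), the information that the thread was interrupted is lost as soon as the catch block finishes.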

The above class makes use of a Util class, which has only one method, as shown next.

package com.javacreed.examples.concurrency.sfit;

public class Util {

  public static void printLog(final String message, final Object... params) {
    System.out.printf("[%tF %<tT] [%s] %s%n", System.currentTimeMillis(), Thread.currentThread().getName(),
        String.format(message, params));
  }
}

This class prints a formatted message to the command prompt. It should not be considered a replacement for log files and loggers, but it keeps the example simple and focused. We use a method so that the output is formatted consistently and carries key information such as the time and the name of the thread.

This concludes our setup. Next we will analyse the problem and then propose a simple solution. In summary, the run() method within the MyWorker class will take approximately sleepTime seconds to finish, unless interrupted. This value is set through the constructor.

The Problem

Consider the following class.

package com.javacreed.examples.concurrency.sfit;

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class Main {

  public static void main(final String[] args) {
    final ExecutorService executorService = Executors.newFixedThreadPool(5);
    try {
      final long startTime = System.nanoTime();
      final List<Future<?>> list = new ArrayList<>();
      for (int i = 0; i < 5; i++) {
        final Future<?> future = executorService.submit(new MyWorker(8));
        list.add(future);
      }

      Util.printLog("Waiting for the workers to finish");
      Main.method1(list, 5, TimeUnit.SECONDS);
      final long finishTime = System.nanoTime();
      Util.printLog("Program finished after: %,d nano seconds", finishTime - startTime);

    } finally {
      executorService.shutdown();
    }
  }

  public static void method1(final List<Future<?>> list, final long timeout, final TimeUnit timeUnit) {
    for (final Future<?> future : list) {
      try {
        future.get(timeout, timeUnit);
      } catch (final TimeoutException e) {
        future.cancel(true);
      } catch (final Exception e) {
        Util.printLog("Failed: %s", e);
      }
    }
  }
}

The above class performs the following:

  1. Creates an ExecutorService that allows 5 workers/threads to run in parallel.
  2. Creates 5 instances of MyWorker using a delay of 8 seconds. Therefore, each instance will be ready after approximately 8 seconds. Note that these run in parallel and not in series, so without a timeout the program will take approximately 8 seconds to complete.
  3. Submits the 5 instances of MyWorker to the ExecutorService created before, which in turn returns 5 instances of Future. These Future instances are then added to a list.
  4. The list of 5 Future instances is passed to the method called method1(), which is expected to cancel any tasks that take longer than the given timeout, 5 seconds in this case.
  5. The program waits for method1() to finish and then terminates after shutting down the ExecutorService. Note that method1() takes a timeout value which indicates how long this method may take before it returns. In our case, this method should not take more than approximately 5 seconds.

Since all tasks take 8 seconds to finish (equivalent to 8,000,000,000 nano seconds) and the timeout used is 5 seconds, we expect all of them to time out and be cancelled. If we run this code, the following will be produced.

[2013-01-05 08:40:58] [main] Waiting for the workers to finish
[2013-01-05 08:41:03] [pool-1-thread-1] Interrupted after 5,023,245,626 nano seconds
[2013-01-05 08:41:06] [pool-1-thread-4] Finished
[2013-01-05 08:41:06] [pool-1-thread-3] Finished
[2013-01-05 08:41:06] [pool-1-thread-5] Finished
[2013-01-05 08:41:06] [pool-1-thread-2] Finished
[2013-01-05 08:41:06] [main] Program finished after: 8,004,657,271 nano seconds

The program takes 8 seconds to finish and not 5 seconds as expected. Only the first worker/thread was cancelled after approximately 5 seconds. The remaining 4 workers/threads ran to completion without being interrupted. We were expecting something like the following, where all workers/threads are cancelled and not just the first one.

[2013-01-05 08:46:55] [main] Waiting for the workers to finish
[2013-01-05 08:47:00] [pool-1-thread-3] Interrupted after 5,027,757,935 nano seconds
[2013-01-05 08:47:00] [pool-1-thread-5] Interrupted after 5,010,777,755 nano seconds
[2013-01-05 08:47:00] [pool-1-thread-4] Interrupted after 5,027,876,873 nano seconds
[2013-01-05 08:47:00] [pool-1-thread-2] Interrupted after 5,027,821,247 nano seconds
[2013-01-05 08:47:00] [pool-1-thread-1] Interrupted after 5,027,940,918 nano seconds
[2013-01-05 08:47:00] [main] Program finished after: 5,029,392,346 nano seconds

What went wrong?

The problem lies in how we are using the Future's get() method, highlighted below.

  public static void method1(final List<Future<?>> list, final long timeout, final TimeUnit timeUnit) {
    for (final Future<?> future : list) {
      try {
        future.get(timeout, timeUnit);
      } catch (final TimeoutException e) {
        future.cancel(true);
      } catch (final Exception e) {
        Util.printLog("Failed: %s", e);
      }
    }
  }

During the first iteration of the loop, the get() method waits 5 seconds and then throws a TimeoutException, after which we cancel the task and so interrupt this worker/thread. In the second iteration, the get() method is allowed to wait another 5 seconds before it times out. Therefore, the second worker/thread has a cumulative timeout of 10 seconds and not 5 as expected. Since our workers/threads have a sleep time of 8 seconds, they make it in time and are ready before the get() times out again. In the worst case, this method takes 5 seconds multiplied by the size of the list, and not just 5 seconds, to complete.

In the next section we will see how we can alter this method to provide a fair timeout.
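
The arithmetic behind this behaviour can be checked without starting any threads. The following is a minimal sketch that only simulates the waiting: the class PerCallTimeoutModel and its method cancelled() are hypothetical and not part of the article's code. It assumes that all tasks start at time 0, need the same number of seconds, and that a task finishing exactly at the timeout counts as finished.

```java
import java.util.Arrays;

// Hypothetical model (not part of the article's code): the waiting is
// simulated with plain arithmetic, no threads are started.
final class PerCallTimeoutModel {

  /**
   * Simulates a loop that gives every get() call the full timeout.
   * Returns, per task, whether that task would be cancelled.
   */
  static boolean[] cancelled(final int tasks, final long taskSeconds, final long timeoutSeconds) {
    final boolean[] result = new boolean[tasks];
    long now = 0; // simulated clock of the thread calling get()
    for (int i = 0; i < tasks; i++) {
      final long stillNeeded = Math.max(taskSeconds - now, 0);
      if (stillNeeded > timeoutSeconds) {
        now += timeoutSeconds; // get() times out and the task is cancelled
        result[i] = true;
      } else {
        now += stillNeeded; // get() returns as soon as the task is done
      }
    }
    return result;
  }

  public static void main(final String[] args) {
    // 5 workers of 8 seconds each, 5 seconds timeout per get() call
    System.out.println(Arrays.toString(cancelled(5, 8, 5)));
    // prints [true, false, false, false, false]
  }
}
```

Only the first task needs more than the 5 seconds it is given. By the time the loop reaches the second task, 5 seconds have already passed, so the remaining 3 seconds fit comfortably within a fresh 5 second timeout.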

The Solution

The problem experienced before was due to the fact that we were not deducting the wait time already spent on the previous workers/threads. For example, if waiting for the previous worker/thread uses 2 seconds out of 5, then the next one should be allowed a maximum of 3 seconds and not more. Note that the iteration happens sequentially, and therefore the wait for the second worker/thread starts only after the wait for the first one has ended.
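
The deduction described above can be sketched with plain numbers first. The class SharedBudgetModel and the method timeoutsPerCall() below are hypothetical names used only for illustration, and the waiting times passed in are made-up values rather than measurements.

```java
import java.util.Arrays;

// Hypothetical model (not part of the article's code): shows the timeout each
// successive wait receives when all waits draw from one shared budget.
final class SharedBudgetModel {

  /**
   * Returns the timeout handed to each wait, given the total budget and the
   * time each wait would like to use (both in seconds).
   */
  static long[] timeoutsPerCall(final long budgetSeconds, final long... waitedSeconds) {
    final long[] timeouts = new long[waitedSeconds.length];
    long remaining = budgetSeconds;
    for (int i = 0; i < waitedSeconds.length; i++) {
      timeouts[i] = remaining;
      // A wait can never use more than the timeout it was given
      final long used = Math.min(waitedSeconds[i], remaining);
      remaining -= used;
    }
    return timeouts;
  }

  public static void main(final String[] args) {
    // Budget of 5 seconds; the waits would like to use 2, 1 and 4 seconds
    System.out.println(Arrays.toString(timeoutsPerCall(5, 2, 1, 4)));
    // prints [5, 3, 2]
  }
}
```

The first wait uses 2 of the 5 seconds, so the second is given 3. Once the budget reaches 0, every later wait receives a timeout of 0 and returns immediately.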

The following example illustrates how we can obtain a fair timeout.

  public static void method2(final List<Future<?>> list, final long timeout, final TimeUnit timeUnit) {
    long globalWaitTime = timeUnit.toNanos(timeout);
    for (final Future<?> future : list) {
      final long waitStart = System.nanoTime();
      try {
        future.get(globalWaitTime, TimeUnit.NANOSECONDS);
      } catch (final TimeoutException e) {
        future.cancel(true);
      } catch (final Exception e) {
        Util.printLog("Failed: %s", e);
      } finally {
        final long timeTaken = System.nanoTime() - waitStart;
        globalWaitTime = Math.max(globalWaitTime - timeTaken, 0);
      }
    }
  }

Instead of using a constant timeout, each worker/thread is given whatever is left of the timeout, ensuring that all workers/threads share one global timeout. This produces the following output to the command prompt.

[2013-01-05 08:46:55] [main] Waiting for the workers to finish
[2013-01-05 08:47:00] [pool-1-thread-3] Interrupted after 5,027,757,935 nano seconds
[2013-01-05 08:47:00] [pool-1-thread-5] Interrupted after 5,010,777,755 nano seconds
[2013-01-05 08:47:00] [pool-1-thread-4] Interrupted after 5,027,876,873 nano seconds
[2013-01-05 08:47:00] [pool-1-thread-2] Interrupted after 5,027,821,247 nano seconds
[2013-01-05 08:47:00] [pool-1-thread-1] Interrupted after 5,027,940,918 nano seconds
[2013-01-05 08:47:00] [main] Program finished after: 5,029,392,346 nano seconds

Now all workers/threads are interrupted after 5 seconds and the program takes only 5 seconds to complete, as expected.
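
As a related note, the ExecutorService interface also provides invokeAll(tasks, timeout, unit), which waits for a collection of tasks under a single timeout and cancels those that have not completed when it expires. The following is a minimal sketch of that alternative and not the approach used in this article: the class InvokeAllSketch and the method cancelledCount() are hypothetical, and an inline sleeping task stands in for MyWorker. Note that invokeAll() takes the tasks themselves rather than Future instances that were already submitted, so it only fits when the tasks can be handed over in one batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch (not part of the article's code).
final class InvokeAllSketch {

  /**
   * Runs the given number of sleeping tasks under one shared timeout and
   * returns how many of them were cancelled, or -1 if the caller was interrupted.
   */
  static int cancelledCount(final int workers, final long sleepMillis, final long timeoutMillis) {
    final ExecutorService executorService = Executors.newFixedThreadPool(workers);
    try {
      final List<Callable<Object>> tasks = new ArrayList<>();
      for (int i = 0; i < workers; i++) {
        // Stand-in for MyWorker: sleeps, and restores the flag if interrupted
        final Runnable sleeper = () -> {
          try {
            Thread.sleep(sleepMillis);
          } catch (final InterruptedException e) {
            Thread.currentThread().interrupt();
          }
        };
        tasks.add(Executors.callable(sleeper));
      }

      // One timeout for the whole batch; unfinished tasks are cancelled
      final List<Future<Object>> futures =
          executorService.invokeAll(tasks, timeoutMillis, TimeUnit.MILLISECONDS);

      int cancelled = 0;
      for (final Future<Object> future : futures) {
        if (future.isCancelled()) {
          cancelled++;
        }
      }
      return cancelled;
    } catch (final InterruptedException e) {
      Thread.currentThread().interrupt();
      return -1;
    } finally {
      executorService.shutdownNow();
    }
  }

  public static void main(final String[] args) {
    // Same figures as in the article: 5 workers, 8 seconds each, 5 seconds timeout
    System.out.println("Cancelled: " + cancelledCount(5, 8000, 5000));
  }
}
```

With the figures used in this article, all five tasks are expected to be reported as cancelled after roughly 5 seconds.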

