Java API Optimization: Parallel External Calls

ํƒœ๊ทธ
Java
Multithreading
Performance
Concurrency
API Optimization
๊ณต๊ฐœ์—ฌ๋ถ€
์ž‘์„ฑ์ผ์ž
2026/05/01

Introduction: The Bottleneck of Sequential API Aggregation

If you are building modern backend applications or microservices, you have likely encountered a scenario where a single incoming API request requires your server to fetch data from multiple external services. For instance, rendering a user dashboard might require calling a billing API, a user profile API, and a recommendation engine API. When you initially build this endpoint, the simplest approach is to execute these network calls sequentially. You make the first HTTP request, wait for the response, make the second request, wait again, and so forth.
While this sequential design is easy to read and theoretically correct, it performs exceptionally poorly in a production environment because it can only handle one request step at a time [1, 2]. If the billing API takes 300 milliseconds, the profile API takes 200 milliseconds, and the recommendation API takes 400 milliseconds, your user is forced to wait for almost a full second just for the data retrieval phase to complete. In an era where user retention is strictly tied to application responsiveness, this cumulative latency is unacceptable.
To achieve optimal API performance optimization, we must eliminate this sequential bottleneck. By leveraging Java multithreading, we can issue these external API calls concurrently, effectively reducing the total wait time from the sum of all individual latencies to the duration of the single longest network call. Drawing upon real-world scenarios and established computer science principles, this guide will provide a deep dive into exactly why we use Java threads in API development, how to implement parallel external calls using the Executor framework, and the mathematical principles that dictate optimal thread pool sizing.
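The latency arithmetic above can be demonstrated directly. The sketch below simulates the three dashboard calls with `Thread.sleep` standing in for network latency (the 300/200/400 ms figures are the illustrative numbers from the text, not real measurements): run sequentially they cost roughly the sum (~900 ms), run in parallel they cost roughly the longest single call (~400 ms).

```java
import java.util.List;
import java.util.concurrent.*;

public class SequentialVsParallel {

    // Stand-in for an external HTTP call; the sleep simulates network latency.
    static String call(String name, long millis) throws InterruptedException {
        Thread.sleep(millis);
        return name;
    }

    static long[] measure() throws Exception {
        // Sequential: total wait is the SUM of the latencies (~900 ms).
        long start = System.nanoTime();
        call("billing", 300);
        call("profile", 200);
        call("recommendations", 400);
        long sequentialMs = (System.nanoTime() - start) / 1_000_000;

        // Parallel: total wait is roughly the LONGEST single latency (~400 ms).
        ExecutorService pool = Executors.newFixedThreadPool(3);
        start = System.nanoTime();
        List<Future<String>> futures = pool.invokeAll(List.<Callable<String>>of(
                () -> call("billing", 300),
                () -> call("profile", 200),
                () -> call("recommendations", 400)));
        for (Future<String> f : futures) {
            f.get();
        }
        long parallelMs = (System.nanoTime() - start) / 1_000_000;
        pool.shutdown();
        return new long[] { sequentialMs, parallelMs };
    }

    public static void main(String[] args) throws Exception {
        long[] t = measure();
        System.out.println("sequential=" + t[0] + "ms, parallel=" + t[1] + "ms");
    }
}
```

On a typical machine this prints a sequential time a little over 900 ms and a parallel time a little over 400 ms.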

Understanding the Root Cause: Network I/O and Blocking

To truly appreciate why threading is necessary for external API calls, we must understand how the underlying operating system and the Java Virtual Machine (JVM) handle network communications.
Programs often have to wait for external operations, such as network input or output, and while they are waiting, they can do no useful work [3, 4]. If a program is completely single-threaded, the processor remains idle while it waits for a synchronous I/O operation to complete [5, 6]. This is highly inefficient. To put this in real-world terms, a single-threaded program is like waiting for the water to boil before starting to read the newspaper, whereas a multithreaded program reads the newspaper while the water boils [5, 6].
When your Java application initiates a synchronous HTTP request to an external API, the executing thread is suspended and placed into a blocked state (specifically WAITING or TIMED_WAITING) [7]. The thread is temporarily removed from the operating system's scheduling queue until the network I/O completes [8, 9]. In a single-threaded server environment, blocking delays the completion of the current request and entirely prevents any other pending requests from being processed [10, 11].
However, if we assign a separate thread to each independent network request, the blocking of one thread does not affect the processing of the others [12, 13]. By utilizing multiple threads, the CPU can switch its attention to other active threads while waiting for the network responses, vastly improving the overall throughput and resource utilization of the system [14, 15].

The Mathematical Foundation: Amdahl's Law and Scalability

When we optimize an API by implementing parallel thread calls, we are relying on a fundamental principle of computer science known as Amdahl's Law. Amdahl's law describes how much a program can theoretically be sped up by adding additional computing resources, based on the strict proportion of parallelizable and serial components [16, 17].
The formula dictates that if F is the fraction of the calculation that must be executed serially, the maximum speedup on a machine with N processors converges to 1/F [16, 17]. This mathematical rule highlights a critical concept for API developers: your application's scalability is strictly limited by the amount of code that must be executed in a synchronized, sequential manner [18, 19].
Fortunately, when a single API endpoint needs to call multiple external APIs, these tasks are typically highly independent. Independent tasks do not depend on the timing, results, or side effects of other tasks [20, 21]. Because there is no shared mutable state being modified during the network fetch, these operations represent the perfectly parallelizable portion of Amdahl's equation. By shifting these independent I/O tasks from serial execution into parallel execution, we maximize our theoretical speedup and harness the true power of multicore processor architectures.
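The practical consequence of Amdahl's law is easy to compute. The sketch below evaluates the speedup bound for a hypothetical endpoint whose serial fraction F is 10% (request parsing and response assembly), with the remaining 90% being parallelizable external I/O; the 10% figure is an illustrative assumption, not a measurement.

```java
public class AmdahlSpeedup {

    // Amdahl's law: speedup(N) = 1 / (F + (1 - F) / N), which converges to 1/F
    // as the number of processors N grows without bound.
    static double speedup(double serialFraction, int processors) {
        return 1.0 / (serialFraction + (1.0 - serialFraction) / processors);
    }

    public static void main(String[] args) {
        double f = 0.10; // hypothetical serial fraction of the endpoint
        System.out.printf("N=4:   %.2fx%n", speedup(f, 4));
        System.out.printf("N=16:  %.2fx%n", speedup(f, 16));
        System.out.printf("Limit: %.1fx%n", 1.0 / f); // the 1/F ceiling
    }
}
```

With F = 0.10, four processors give about a 3.08x speedup, sixteen give 6.4x, and no amount of hardware can exceed the 10x ceiling: shrinking the serial fraction matters as much as adding threads.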

A Concrete Example: The Travel Reservations Portal

To provide a concrete example of why we use threads in an API, let us examine a highly applicable scenario: a travel reservations portal.
Imagine a user logs into a travel application and requests flight and car rental quotes for an upcoming vacation. To fulfill this single client request, the travel portal must solicit bids from dozens of different external airline and car-rental company APIs [22, 23].
Fetching a bid from one specific rental company is entirely independent of fetching bids from another company [23, 24]. Therefore, treating each individual bid retrieval as a separate task forms a sensible task boundary that allows the network requests to proceed concurrently [23, 24].
If the travel portal attempted to process these external API calls sequentially, the user would experience severe latency, as the total response time would be the accumulation of dozens of network round-trips. Furthermore, if even one external airline API was experiencing degraded performance and took five seconds to respond, the entire thread would be blocked, and the user's web page would appear to be completely frozen [25, 26]. By executing these API calls in parallel, the application acts as an efficient aggregator, returning the aggregated results to the user as soon as the slowest single external API responds, completely neutralizing the cumulative delay.

Implementation Strategy: Moving Beyond Basic Threads

Recognizing the need for parallelism is only the first step. How we implement this parallelism in Java determines the stability and reliability of our API.

The Danger of Unbounded Thread Creation

A junior developer might be tempted to solve the travel portal problem by simply creating a new Thread object for every external API call and calling the .start() method. While this thread-per-task approach offers better responsiveness than sequential execution under light loads, it introduces severe architectural flaws [27, 28].
First, thread creation and teardown are not free operations [28, 29]. Creating a new thread requires allocating memory for two execution stacks (one for Java code and one for native code) and requires processing activity from both the JVM and the underlying operating system [28-31]. If your API processes a high volume of requests, constantly creating new threads introduces heavy latency overhead [28, 29].
More importantly, active threads consume massive amounts of system memory [32, 33]. Having many idle threads sitting around waiting for external network APIs to respond can tie up gigabytes of memory, putting immense pressure on the Java Garbage Collector and starving the CPU [32, 33]. If nothing places a limit on the number of threads your application creates, a sudden spike in user traffic will lead to the creation of thousands of threads, ultimately causing the JVM to crash with a fatal OutOfMemoryError [34, 35].
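For concreteness, here is a minimal sketch of the anti-pattern being described: one `new Thread` per external call, with nothing bounding how many are created. The loop is capped at 100 here so the sketch terminates quickly, but in production the bound would be the (unbounded) request arrival rate, and each thread costs stack memory on top of scheduling overhead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Anti-pattern sketch: thread-per-task with no upper bound. Under a traffic
// spike this creates as many threads as there are in-flight requests, until
// the JVM eventually fails with OutOfMemoryError.
public class ThreadPerTaskAntiPattern {

    static final AtomicInteger created = new AtomicInteger();

    // Simulated blocking external call.
    static void fetchQuote(String airline) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        List<Thread> threads = new ArrayList<>();
        // Nothing limits this loop: 10,000 concurrent requests would mean
        // 10,000 live threads. Kept to 100 here purely for illustration.
        for (int i = 0; i < 100; i++) {
            Thread t = new Thread(
                    () -> fetchQuote("airline-" + created.incrementAndGet()));
            t.start();
            threads.add(t);
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println("threads created: " + created.get());
    }
}
```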

The Executor Framework and Thread Pools

To safely optimize our API, we must decouple task submission from task execution using the Executor framework [36, 37]. The java.util.concurrent library provides flexible thread pool implementations that manage a bounded number of worker threads [37, 38].
By using a thread pool, we reuse existing threads instead of creating new ones for every external API call [39, 40]. This amortizes the thread creation costs across multiple requests and dramatically improves the application's responsiveness [39, 40]. Furthermore, by strictly tuning the core pool size and maximum pool size, we place a hard upper bound on the number of concurrent threads [41, 42]. If the arrival rate of external API tasks exceeds the thread pool's capacity, the excess tasks wait safely in a queue of Runnable or Callable objects, preventing the system from exhausting its memory resources [42-45].
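The bounds described above map directly onto the `ThreadPoolExecutor` constructor. The sketch below shows one way to wire them together; the specific numbers (10 core threads, 50 maximum, a 200-slot queue) are illustrative placeholders, not recommendations, and the saturation policy is one of several provided by the library.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Sketch: a bounded pool for outbound API calls, with every limit explicit.
public class BoundedApiPool {

    static ThreadPoolExecutor newApiPool() {
        return new ThreadPoolExecutor(
                10,                    // core pool size: threads kept even when idle
                50,                    // maximum pool size: hard upper bound on threads
                60, TimeUnit.SECONDS,  // idle threads above the core size are reclaimed
                new ArrayBlockingQueue<>(200),             // excess tasks wait here cheaply
                new ThreadPoolExecutor.CallerRunsPolicy()  // back-pressure when saturated
        );
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = newApiPool();
        System.out.println("core=" + pool.getCorePoolSize()
                + " max=" + pool.getMaximumPoolSize()
                + " queue=" + pool.getQueue().remainingCapacity());
        pool.shutdown();
    }
}
```

`CallerRunsPolicy` is worth noting as a design choice: when both the pool and the queue are full, the submitting thread runs the task itself, which naturally slows down task submission instead of dropping work or throwing.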

Extracting Results: Callable, Future, and invokeAll

When optimizing an API that aggregates data from external sources, we do not just want to run background tasks; we critically need the return values from those tasks to formulate our final JSON response.
The standard Runnable interface is a fairly limiting abstraction for this scenario because its run method cannot return a value or throw checked exceptions [46, 47]. Instead, the Callable interface is a far better abstraction for deferred computations like fetching a resource over a network [48, 49]. The Callable interface expects that its main entry point, the call method, will return a specific value and anticipates that network errors might cause it to throw an exception [48, 49].
To execute multiple Callable tasks in parallel, the ExecutorService provides a highly convenient method named invokeAll. The invokeAll method takes a collection of tasks and returns a collection of Future objects [23, 24]. A Future represents the lifecycle of the task and acts as a handle for the asynchronous computation, allowing you to check if the task is done or to retrieve its result using the get() method [48-51].
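Before the full aggregator example, the Callable/Future handshake can be shown in miniature. The multiplication below is a trivial stand-in for a network fetch; note that the lambda is a `Callable<Integer>` precisely because it returns a value, which a `Runnable` cannot do.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class CallableBasics {

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();

        // Callable<Integer>: has a return value and may throw checked exceptions.
        Callable<Integer> task = () -> 6 * 7; // stand-in for a network fetch

        // submit() returns immediately with a Future handle; the work runs
        // asynchronously on a pool thread.
        Future<Integer> future = executor.submit(task);
        System.out.println("done yet? " + future.isDone()); // may still be false
        System.out.println("result: " + future.get());      // blocks until ready
        executor.shutdown();
    }
}
```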

Code Example: Parallel API Calls in Action

Let us look at how the travel reservations portal would implement this using Java's concurrency libraries.
import java.util.concurrent.*;
import java.util.*;

public class TravelAggregatorAPI {

    // 1. Initialize a thread pool sized for network I/O
    private final ExecutorService executor = Executors.newFixedThreadPool(50);

    public List<TravelQuote> fetchQuotesConcurrently(List<String> airlines) {
        List<Callable<TravelQuote>> tasks = new ArrayList<>();

        // 2. Wrap each external API call in a Callable task
        for (String airline : airlines) {
            tasks.add(() -> fetchFromExternalAirlineAPI(airline));
        }

        List<TravelQuote> quotes = new ArrayList<>();
        try {
            // 3. Submit all tasks and enforce a strict timeout of 3 seconds
            List<Future<TravelQuote>> futures =
                    executor.invokeAll(tasks, 3, TimeUnit.SECONDS);

            // 4. Extract the results
            for (Future<TravelQuote> future : futures) {
                if (!future.isCancelled()) {
                    try {
                        quotes.add(future.get());
                    } catch (ExecutionException e) {
                        // Handle a specific API failure without failing the whole request
                        logError("External API call failed", e);
                    }
                } else {
                    logWarning("External API call timed out and was cancelled.");
                }
            }
        } catch (InterruptedException e) {
            // Re-assert the thread's interrupted status
            Thread.currentThread().interrupt();
        }
        return quotes;
    }

    private TravelQuote fetchFromExternalAirlineAPI(String airline) {
        // Simulated synchronous network I/O
        return new TravelQuote(airline, 450.00);
    }
}
The code demonstrates the orchestration of multiple threads. The API generates a list of Callable tasks, one for each external airline [22, 23]. By passing this list to invokeAll, the thread pool executes the HTTP requests concurrently. invokeAll guarantees that the Futures are returned in the same order as the tasks were produced by the task collection's iterator, mapping each response directly to its request [23, 24].

Advanced Performance Tuning: Timeouts and Sizing

While writing the multithreaded code is straightforward, building a reliable API means anticipating failure. External APIs are notoriously unpredictable; they may hang, drop packets, or throttle your IP address.

Bounding Resource Waits with Timeouts

If an external travel API goes offline and stops responding, and your application initiates an unbounded blocking wait for that resource, your thread pool will quickly become clogged [52, 53]. If enough requests pile up, all available threads in the ExecutorService will become permanently blocked waiting for the broken external API, leading to a condition known as thread starvation deadlock [52, 53].
To mitigate the ill effects of long-running tasks and ensure application stability, developers must use timed resource waits instead of unbounded waits [54, 55]. In our code example above, we passed a specific time budget to the invokeAll method (3, TimeUnit.SECONDS).
The timed version of invokeAll will automatically return when all the tasks have completed, the calling thread is interrupted, or the specified timeout naturally expires [23, 24]. Crucially, any tasks that are not complete when the timeout expires are automatically cancelled by the Executor framework [23, 24]. This defensive programming technique guarantees that your API will always respond to your user within an acceptable timeframe, seamlessly omitting the failed bids and falling back to default behavior [22, 23].
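The same technique applies when waiting on a single slow dependency rather than a batch: bound the `Future.get`, then cancel the task so it does not occupy a pool thread indefinitely. The sketch below simulates a hung external API with a long sleep; the `-1.0` fallback value is a hypothetical "no bid" sentinel chosen for illustration.

```java
import java.util.concurrent.*;

public class TimedGetFallback {

    static double fetchQuoteWithBudget(ExecutorService executor, long budgetMillis) {
        Future<Double> future = executor.submit(() -> {
            Thread.sleep(5_000); // simulated hung external API
            return 450.00;
        });
        try {
            // Wait at most budgetMillis for the result.
            return future.get(budgetMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker; free the pool thread
            return -1.0;         // hypothetical fallback: treat as "no bid"
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1.0;
        } catch (ExecutionException e) {
            return -1.0;         // the external call itself failed
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        // Falls back after ~200 ms instead of hanging for 5 seconds.
        System.out.println("quote: " + fetchQuoteWithBudget(pool, 200));
        pool.shutdown();
    }
}
```

Note that `cancel(true)` is essential: without it, the timed-out task keeps a pool thread blocked for the full five seconds even though its result will never be used.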

The Mathematics of Thread Pool Sizing

A final, vital component of optimizing thread utilization is configuring the exact size of your thread pool. Thread pool sizes should rarely be hard-coded arbitrarily; instead, they should be mathematically computed based on the specific characteristics of your deployment system and workload [55, 56].
If a thread pool is too big, threads compete for scarce CPU and memory resources, resulting in excessive context switching [55-58]. If the pool is too small, overall throughput suffers because the computer's processors sit idle despite having available work in the queue [55, 56].
The optimal pool size can be estimated with a simple formula. If N_cpu is the number of CPUs available, U_cpu is the target CPU utilization (between 0 and 1), and W/C is the ratio of wait time to compute time, the optimal thread pool size is [59]:

N_threads = N_cpu * U_cpu * (1 + W/C)

When an API's primary job is fetching data from multiple external APIs, the tasks are heavily I/O-bound. The thread spends almost all of its time waiting for the network response (W) and very little time computing on the CPU (C). Because the W/C ratio is very high in this scenario, the optimal thread pool size for your API will be significantly larger than the number of physical processors on the server [59]. Working from this formula replaces guesswork with a defensible starting point, which you can then refine under load testing.
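The sizing rule is trivial to encode. In the sketch below, the workload numbers are illustrative assumptions: each task is taken to wait roughly 190 ms on the network for every 10 ms of CPU work, giving W/C = 19, with a 90% target utilization.

```java
public class PoolSizing {

    // N_threads = N_cpu * U_cpu * (1 + W/C)
    static int optimalPoolSize(int cpus, double targetUtilization,
                               double waitToComputeRatio) {
        return (int) Math.round(cpus * targetUtilization * (1 + waitToComputeRatio));
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        // Illustrative I/O-bound workload: ~190 ms network wait per ~10 ms of
        // CPU work, so W/C = 19.
        int size = optimalPoolSize(cpus, 0.9, 19.0);
        System.out.println("suggested pool size for " + cpus + " CPUs: " + size);
        // e.g. 8 CPUs -> 8 * 0.9 * (1 + 19) = 144 threads, far above the CPU count
    }
}
```

For comparison, a purely CPU-bound workload (W/C = 0) at full utilization yields a pool no larger than the CPU count, which is exactly what the formula predicts.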

Conclusion: Mastering Concurrency for APIs

Optimizing an API that relies on external data sources requires much more than simply writing clean code; it requires a deep understanding of Java concurrency, memory management, and network I/O behavior.
As we have explored, processing external API calls sequentially creates a severe latency bottleneck, directly harming the end-user experience. By leveraging the ExecutorService, transitioning from basic Runnable to Callable tasks, and extracting concurrent results via Future and invokeAll, developers can aggregate massive amounts of external data in a fraction of the time.
However, mastering Java threads means respecting their complexity. By strictly avoiding unbounded thread creation, mathematically calculating optimal thread pool sizes, and enforcing rigid timeout budgets on all network calls, you guarantee that your application will scale gracefully under load. Ultimately, understanding why we use threads allows us to engineer APIs that are not only blazingly fast, but highly resilient and robust against the unpredictable nature of the internet.

References

[1] Java Concurrency in Practice โ€” class SingleThreadWebServer { public static void main(String[] args) throws IOException { ServerSocket socket = new ServerSocket(80); while (true) { Socket connection = socket.accept(); handleRequest(connection); } } } Listing 6.1. Sequential web server. SingleThreadedWebServer is simple and theoretโ€ฆ
[2] Java Concurrency in Practice โ€” class SingleThreadWebServer { public static void main(String[] args) throws IOException { ServerSocket socket = new ServerSocket(80); while (true) { Socket connection = socket.accept(); handleRequest(connection); } } } Listing 6.1. Sequential web server. SingleThreadedWebServer is simple and theoretโ€ฆ
[3] Java Concurrency in Practice โ€” Several motivating factors led to the development of operating systems that allowed multiple programs to execute simultaneously: Resource utilization. Programs sometimes have to wait for external operations such as input or output, and while waiting can do no useful work. It is more efficient to useโ€ฆ
[4] Java Concurrency in Practice โ€” Several motivating factors led to the development of operating systems that allowed multiple programs to execute simultaneously: Resource utilization. Programs sometimes have to wait for external operations such as input or output, and while waiting can do no useful work. It is more efficient to useโ€ฆ
[5] Java Concurrency in Practice โ€” Using multiple threads can also help achieve better throughput on single-processor systems. If a program is single-threaded, the processor remains idle while it waits for a synchronous I/O operation to complete. In a multithreaded program, another thread can still run while the first thread is waitiโ€ฆ
[6] Java Concurrency in Practice โ€” Using multiple threads can also help achieve better throughput on single-processor systems. If a program is single-threaded, the processor remains idle while it waits for a synchronous I/O operation to complete. In a multithreaded program, another thread can still run while the first thread is waitiโ€ฆ
[7] Java Concurrency in Practice โ€” 5.4 Blocking and interruptible methods Threads may block, or pause, for several reasons: waiting for I/O completion, waiting to acquire a lock, waiting to wake up from Thread.sleep, or waiting for the result of a computation in another thread. When a thread blocks, it is usually suspended and placedโ€ฆ
[8] Java Concurrency in Practice โ€” 3. Market update: at this writing, Sun is shipping low-end server systems based on the 8-core Niagara processor, and Azul is shipping high-end server systems (96, 192, and 384-way) based on the 24-core Vega processor. 230 Chapter 11. Performance and Scalability synchronized (new Object()) { // do soโ€ฆ
[9] Java Concurrency in Practice โ€” 3. Market update: at this writing, Sun is shipping low-end server systems based on the 8-core Niagara processor, and Azul is shipping high-end server systems (96, 192, and 384-way) based on the 24-core Vega processor. 230 Chapter 11. Performance and Scalability synchronized (new Object()) { // do soโ€ฆ
[10] Java Concurrency in Practice โ€” Processing a web request involves a mix of computation and I/O. The server must perform socket I/O to read the request and write the response, which can block due to network congestion or connectivity problems. It may also perform file I/O or make database requests, which can also block. In a singleโ€ฆ
[11] Java Concurrency in Practice โ€” Processing a web request involves a mix of computation and I/O. The server must perform socket I/O to read the request and write the response, which can block due to network congestion or connectivity problems. It may also perform file I/O or make database requests, which can also block. In a singleโ€ฆ
[12] Java Concurrency in Practice โ€” 1.2.3 Simplified handling of asynchronous events A server application that accepts socket connections from multiple remote clients may be easier to develop when each connection is allocated its own thread and allowed to use synchronous I/O. If an application goes to read from a socket when no data iโ€ฆ
[13] Java Concurrency in Practice โ€” 1.2.3 Simplified handling of asynchronous events A server application that accepts socket connections from multiple remote clients may be easier to develop when each connection is allocated its own thread and allowed to use synchronous I/O. If an application goes to read from a socket when no data iโ€ฆ
[14] Java Concurrency in Practice โ€” Since the basic unit of scheduling is the thread, a program with only one thread can run on at most one processor at a time. On a two-processor sys-tem, a single-threaded program is giving up access to half the available CPU resources; on a 100-processor system, it is giving up access to 99%. On theโ€ฆ
[15] Java Concurrency in Practice โ€” Since the basic unit of scheduling is the thread, a program with only one thread can run on at most one processor at a time. On a two-processor sys-tem, a single-threaded program is giving up access to half the available CPU resources; on a 100-processor system, it is giving up access to 99%. On theโ€ฆ
[16] Java Concurrency in Practice โ€” Most concurrent programs have a lot in common with farming, consisting of a mix of parallelizable and serial portions. Amdahlโ€™s law describes how much a program can theoretically be sped up by additional computing resources, based on the proportion of parallelizable and serial components. If F is thโ€ฆ
[17] Java Concurrency in Practice โ€” Designing and tuning concurrent applications for scalability can be very dif-ferent from traditional performance optimization. When tuning for performance, the goal is usually to do the same work with less effort, such as by reusing previ-ously computed results through caching or replacing an O(n2) โ€ฆ
[18] Java Concurrency in Practice โ€” 11.4 Reducing lock contention Weโ€™ve seen that serialization hurts scalability and that context switches hurt per-formance. Contended locking causes both, so reducing lock contention can im-prove both performance and scalability. Access to resources guarded by an exclusive lock is serializedโ€”only oneโ€ฆ
[19] Java Concurrency in Practice โ€” As a thought experiment, imagine what would happen if there was only one lock for the entire application instead of a separate lock for each object. Then execution of all synchronized blocks, regardless of their lock, would be serialized. With many threads competing for the global lock, the chance tโ€ฆ
[20] Java Concurrency in Practice โ€” 8.1 Implicit couplings between tasks and execution policies We claimed earlier that the Executor framework decouples task submission from task execution. Like many attempts at decoupling complex processes, this was a bit of an overstatement. While the Executor framework offers substantial flexi-biliโ€ฆ
[21] Java Concurrency in Practice โ€” Tasks that exploit thread confinement. Single-threaded executors make stronger promises about concurrency than do arbitrary thread pools. They guaran-tee that tasks are not executed concurrently, which allows you to relax the thread safety of task code. Objects can be confined to the task thread, thโ€ฆ
[22] Java Concurrency in Practice โ€” Listing 6.16 shows a typical application of a timed Future.get. It generates a composite web page that contains the requested content plus an advertisement fetched from an ad server. It submits the ad-fetching task to an executor, com-putes the rest of the page content, and then waits for the ad untโ€ฆ
[23] Java Concurrency in Practice โ€” Fetching a bid from one company is independent of fetching bids from an-other, so fetching a single bid is a sensible task boundary that allows bid retrieval to proceed concurrently. It would be easy enough to create n tasks, submit them to a thread pool, retain the Futures, and use a timed get to fโ€ฆ
[24] Java Concurrency in Practice โ€” Fetching a bid from one company is independent of fetching bids from an-other, so fetching a single bid is a sensible task boundary that allows bid retrieval to proceed concurrently. It would be easy enough to create n tasks, submit them to a thread pool, retain the Futures, and use a timed get to fโ€ฆ
[25] Java Concurrency in Practice โ€” Modern GUI frameworks, such as the AWT and Swing toolkits, replace the main event loop with an event dispatch thread (EDT). When a user interface event such as a button press occurs, application-defined event handlers are called in the event thread. Most GUI frameworks are single-threaded subsystemsโ€ฆ
[26] Java Concurrency in Practice โ€” 1. The NPTL threads package, now part of most Linux distributions, was designed to support hun-dreds of thousands of threads. Nonblocking I/O has its own benefits, but better OS support for threads means that there are fewer situations for which it is essential. 1.3. Risks of threads 5 1.2.4 More reโ€ฆ
[27] Java Concurrency in Practice โ€” Tasks can be processed in parallel, enabling multiple requests to be serviced simultaneously. This may improve throughput if there are multiple process-ors, or if tasks need to block for any reason such as I/O completion, lock acquisition, or resource availability. Task-handling code must be thread-โ€ฆ
[28] Java Concurrency in Practice โ€” Under light to moderate load, the thread-per-task approach is an improvement over sequential execution. As long as the request arrival rate does not exceed the serverโ€™s capacity to handle requests, this approach offers better responsiveness and throughput. 116 Chapter 6. Task Execution 6.1.3 Disadvaโ€ฆ
[29] Java Concurrency in Practice โ€” 116 Chapter 6. Task Execution 6.1.3 Disadvantages of unbounded thread creation For production use, however, the thread-per-task approach has some practical drawbacks, especially when a large number of threads may be created: Thread lifecycle overhead. Thread creation and teardown are not free. The aโ€ฆ
[30] Java Concurrency in Practice โ€” 2. On 32-bit machines, a major limiting factor is address space for thread stacks. Each thread main-tains two execution stacks, one for Java code and one for native code. Typical JVM defaults yield a combined stack size of around half a megabyte. (You can change this with the -Xss JVM flag or througโ€ฆ
[31] Java Concurrency in Practice โ€” 2. On 32-bit machines, a major limiting factor is address space for thread stacks. Each thread main-tains two execution stacks, one for Java code and one for native code. Typical JVM defaults yield a combined stack size of around half a megabyte. (You can change this with the -Xss JVM flag or througโ€ฆ
[32] Java Concurrency in Practice โ€” Resource consumption. Active threads consume system resources, especially memory. When there are more runnable threads than available process-ors, threads sit idle. Having many idle threads can tie up a lot of memory, putting pressure on the garbage collector, and having many threads com-peting for โ€ฆ
[33] Java Concurrency in Practice โ€” Resource consumption. Active threads consume system resources, especially memory. When there are more runnable threads than available process-ors, threads sit idle. Having many idle threads can tie up a lot of memory, putting pressure on the garbage collector, and having many threads com-peting for โ€ฆ
[34] Java Concurrency in Practice โ€” Up to a certain point, more threads can improve throughput, but beyond that point creating more threads just slows down your application, and creating one thread too many can cause your entire application to crash horribly. The way to stay out of danger is to place some bound on how many threads youโ€ฆ
[35] Java Concurrency in Practice โ€” Up to a certain point, more threads can improve throughput, but beyond that point creating more threads just slows down your application, and creating one thread too many can cause your entire application to crash horribly. The way to stay out of danger is to place some bound on how many threads youโ€ฆ
[36] Java Concurrency in Practice โ€” public interface Executor { void execute(Runnable command); } Listing 6.3. Executor interface. Executor may be a simple interface, but it forms the basis for a flexible and powerful framework for asynchronous task execution that supports a wide vari-ety of task execution policies. It provides a stanโ€ฆ
[37] Java Concurrency in Practice โ€” 6.2. The Executor framework 117 6.2 The Executor framework Tasks are logical units of work, and threads are a mechanism by which tasks can run asynchronously. Weโ€™ve examined two policies for executing tasks using threadsโ€”execute tasks sequentially in a single thread, and execute each task in its ownโ€ฆ
[38] Java Concurrency in Practice โ€” 6.2. The Executor framework 117 6.2 The Executor framework Tasks are logical units of work, and threads are a mechanism by which tasks can run asynchronously. Weโ€™ve examined two policies for executing tasks using threadsโ€”execute tasks sequentially in a single thread, and execute each task in its ownโ€ฆ
[39] Java Concurrency in Practice โ€” 3. This is analogous to one of the roles of a transaction monitor in an enterprise application: it can throttle the rate at which transactions are allowed to proceed so as not to exhaust or overstress limited resources. 120 Chapter 6. Task Execution Executing tasks in pool threads has a number of adโ€ฆ
[40] Java Concurrency in Practice โ€” 3. This is analogous to one of the roles of a transaction monitor in an enterprise application: it can throttle the rate at which transactions are allowed to proceed so as not to exhaust or overstress limited resources. 120 Chapter 6. Task Execution Executing tasks in pool threads has a number of adโ€ฆ
[41] Java Concurrency in Practice โ€” 8.3.1 Thread creation and teardown The core pool size, maximum pool size, and keep-alive time govern thread cre-ation and teardown. The core size is the target size; the implementation attempts to maintain the pool at this size even when there are no tasks to execute,2 and will 2. When a ThreadPoolEโ€ฆ
[42] Java Concurrency in Practice โ€” Bounded thread pools limit the number of tasks that can be executed concurrently. (The single-threaded executors are a notable special case: they guarantee that no tasks will execute concurrently, offering the possibility of achieving thread safety through thread confinement.) We saw in Section 6.1.โ€ฆ
[43] Java Concurrency in Practice โ€” Bounded thread pools limit the number of tasks that can be executed concurrently. (The single-threaded executors are a notable special case: they guarantee that no tasks will execute concurrently, offering the possibility of achieving thread safety through thread confinement.) We saw in Section 6.1.โ€ฆ
[44] Java Concurrency in Practice โ€” 8.3. Configuring ThreadPoolExecutor 173 handled, requests will still queue up. With a thread pool, they wait in a queue of Runnables managed by the Executor instead of queueing up as threads contend-ing for the CPU. Representing a waiting task with a Runnable and a list node is certainly a lot cheapโ€ฆ
[45] Java Concurrency in Practice โ€” 8.3. Configuring ThreadPoolExecutor 173 handled, requests will still queue up. With a thread pool, they wait in a queue of Runnables managed by the Executor instead of queueing up as threads contend-ing for the CPU. Representing a waiting task with a Runnable and a list node is certainly a lot cheapโ€ฆ
[46] Java Concurrency in Practice — 6.3 Finding exploitable parallelism: public class SingleThreadRenderer { void renderPage(CharSequence source) { renderText(source); List<ImageData> imageData = new ArrayList<ImageData>(); for (ImageInfo imageInfo : scanForImageInfo(source)) imageData.add(imageInfo.downloadImage()); for (ImageDat…
[47] Java Concurrency in Practice — 6.3 Finding exploitable parallelism: public class SingleThreadRenderer { void renderPage(CharSequence source) { renderText(source); List<ImageData> imageData = new ArrayList<ImageData>(); for (ImageInfo imageInfo : scanForImageInfo(source)) imageData.add(imageInfo.downloadImage()); for (ImageDat…
[48] Java Concurrency in Practice — Many tasks are effectively deferred computations—executing a database query, fetching a resource over the network, or computing a complicated function. For these types of tasks, Callable is a better abstraction: it expects that the main entry point, call, will return a value and anticipates that it…
[49] Java Concurrency in Practice — 6.3.3 Example: page renderer with Future: As a first step towards making the page renderer more concurrent, let's divide it into two tasks, one that renders the text and one that downloads all the images. (Because one task is largely CPU-bound and the other is…
[50] Java Concurrency in Practice — Future<ImageData> f = completionService.take(); ImageData imageData = f.get(); renderImage(imageData); } } catch (InterruptedException e) { Thread.currentThread().interrupt(); } catch (ExecutionException e) { throw launderThrowable(e.getCause()); } } } Listing 6.15. Using CompletionService to render…
[51] Java Concurrency in Practice — Future<ImageData> f = completionService.take(); ImageData imageData = f.get(); renderImage(imageData); } } catch (InterruptedException e) { Thread.currentThread().interrupt(); } catch (ExecutionException e) { throw launderThrowable(e.getCause()); } } } Listing 6.15. Using CompletionService to render…
[52] Java Concurrency in Practice — 8.1 Implicit couplings: …waiting for the result of the second task. The same thing can happen in larger thread pools if all threads are executing tasks that are blocked waiting for other tasks still on the work queue. This is called thread starvation deadlock, and can occur whenever a pool task i…
[53] Java Concurrency in Practice — ThreadDeadlock in Listing 8.1 illustrates thread starvation deadlock. RenderPageTask submits two additional tasks to the Executor to fetch the page header and footer, renders the page body, waits for the results of the header and footer tasks, and then combines the header, body, and footer into the…
[54] Java Concurrency in Practice — One technique that can mitigate the ill effects of long-running tasks is for tasks to use timed resource waits instead of unbounded waits. Most blocking methods in the platform libraries come in both untimed and timed versions, such as Thread.join, BlockingQueue.put, CountDownLatch.await, and Select…
[55] Java Concurrency in Practice — One technique that can mitigate the ill effects of long-running tasks is for tasks to use timed resource waits instead of unbounded waits. Most blocking methods in the platform libraries come in both untimed and timed versions, such as Thread.join, BlockingQueue.put, CountDownLatch.await, and Select…
[56] Java Concurrency in Practice — 8.2 Sizing thread pools: The ideal size for a thread pool depends on the types of tasks that will be submitted and the characteristics of the deployment system. Thread pool sizes should rarely be hard-coded; instead pool sizes should be provided by a configuration mechanism or computed dynamically b…
[57] Java Concurrency in Practice — 11.3 Costs introduced by threads: Single-threaded programs incur neither scheduling nor synchronization overhead, and need not use locks to preserve the consistency of data structures. Scheduling and interthread coordination have performance costs; for threads to offer a performance improvement, the…
[58] Java Concurrency in Practice — Context switches are not free; thread scheduling requires manipulating shared data structures in the OS and JVM. The OS and JVM use the same CPUs your program does; more CPU time spent in JVM and OS code means less is available for your program. But OS and JVM activity is not the only cost of conte…
[59] Java Concurrency in Practice — 8.3 Configuring ThreadPoolExecutor: …by running the application using several different pool sizes under a benchmark load and observing the level of CPU utilization. Given these definitions: Ncpu = number of CPUs; Ucpu = target CPU utilization, 0 ≤ Ucpu ≤ 1; W/C = ratio of wait time to compute time…
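The definitions quoted in [59] lead into the well-known pool-sizing formula from JCiP section 8.2: Nthreads = Ncpu * Ucpu * (1 + W/C). As a minimal sketch of how that computation looks in practice (the 0.9 target utilization and 9.0 wait/compute ratio below are illustrative assumptions, not values from the book):

```java
// Sketch of the JCiP 8.2 pool-sizing formula quoted in [56] and [59]:
//   Nthreads = Ncpu * Ucpu * (1 + W/C)
public class PoolSizing {

    /**
     * Suggested pool size for a given workload.
     *
     * @param nCpu              number of available CPUs (Ncpu)
     * @param targetUtilization desired CPU utilization, 0..1 (Ucpu)
     * @param waitComputeRatio  ratio of wait time to compute time (W/C)
     */
    static int optimalPoolSize(int nCpu, double targetUtilization, double waitComputeRatio) {
        return (int) (nCpu * targetUtilization * (1 + waitComputeRatio));
    }

    public static void main(String[] args) {
        int nCpu = Runtime.getRuntime().availableProcessors();
        // I/O-bound external API calls spend most of their time blocked on the
        // network, so W/C is large. The 0.9 and 9.0 here are illustrative only;
        // measure your own workload to obtain a real W/C.
        int size = optimalPoolSize(nCpu, 0.9, 9.0);
        System.out.println("Suggested pool size for " + nCpu + " CPUs: " + size);
    }
}
```

For the mostly-waiting workload this article targets, the formula makes the intuition concrete: a large W/C justifies far more threads than CPUs, because each thread occupies a CPU only a fraction of the time.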