3    Optimizing Applications and CPU Performance

This chapter describes how to optimize CPU resources and applications for high performance.

3.1    Configuring CPU Resources

You must configure enough CPU power in your system to meet the performance needs of your users and applications. In addition, you may be able to improve performance by optimizing the CPU and your applications.

A system must be able to efficiently allocate the available CPU cycles among competing processes. In addition to single-CPU systems, DIGITAL supports multiprocessing systems and processors with different speeds.

Multiprocessing systems allow you to expand the computing power of a system by adding processors. Workloads that benefit most from multiprocessing have multiple processes or multiple threads of execution that can run concurrently, such as database management system (DBMS) servers, World Wide Web (WWW) servers, mail servers, and compute servers.
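
As a rough illustration of why concurrent workloads benefit most, the following sketch applies Amdahl's law, which this guide does not itself cite and which is introduced here only as an estimating aid. The function name and the sample parallel fraction of 0.9 are assumptions chosen for the example, and the model ignores I/O and memory contention.

```python
def estimated_speedup(parallel_fraction: float, processors: int) -> float:
    """Upper bound on speedup for a fixed workload (Amdahl's law).

    parallel_fraction is the share of run time that can execute
    concurrently; it is an assumed input, not a measured value.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    # Hypothetical workload in which 90% of the work can run in parallel.
    for cpus in (1, 2, 4, 8):
        print(cpus, round(estimated_speedup(0.9, cpus), 2))
```

Under this model, the serial portion of a workload limits the gain from each additional processor, so a mostly serial application may see little improvement.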

You may be able to improve the performance of a multiprocessing system that has only a small percentage of idle time by adding processors. However, increasing the number of processors may increase the demands on your I/O and memory subsystems and could cause bottlenecks. If your system is metadata-intensive (that is, it opens large numbers of small files and accesses them repeatedly), you may gain an additional performance benefit if you add Prestoserve or use a write-back cache when you add more processors. See Chapter 5 for information about Prestoserve and write-back caches.

Before you add processors, you must ensure that a performance problem is not caused by the virtual memory or I/O subsystems. For example, increasing the number of processors will not improve performance in a system that lacks sufficient memory resources.

The iostat and vmstat commands let you monitor the memory, CPU, and I/O consumption on your system. The cpustat extension to the kdbx debugger allows application developers to monitor the time spent in user mode, system mode, and kernel mode on each of the processors. This information can help application developers determine how effectively they are achieving parallelism across the system. See Chapter 2 for information about using tools to monitor performance.
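
The arithmetic behind a per-mode CPU breakdown can be sketched as follows. This is a minimal example that assumes cumulative tick counters for each mode are available from two samples; the counter names and values are hypothetical and are not output from vmstat, iostat, or kdbx.

```python
def cpu_mode_percentages(before: dict, after: dict) -> dict:
    """Convert two samples of cumulative tick counters into percentages.

    Both arguments map a mode name to a cumulative tick count. The mode
    names are illustrative assumptions, not fields of a real tool.
    """
    deltas = {mode: after[mode] - before[mode] for mode in before}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("no ticks elapsed between samples")
    return {mode: 100.0 * ticks / total for mode, ticks in deltas.items()}


if __name__ == "__main__":
    # Hypothetical counters sampled at the start and end of an interval.
    first = {"user": 1000, "system": 400, "idle": 8600}
    second = {"user": 1600, "system": 600, "idle": 8800}
    print(cpu_mode_percentages(first, second))
```

A consistently small idle percentage over many intervals suggests that the CPU is heavily used, but it does not by itself rule out a memory or I/O problem.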

3.2    Identifying CPU Bottlenecks

Use the vmstat command to determine CPU usage as follows:

Use the kdbx cpustat extension to display statistics about CPU use, especially for multiprocessing systems. Statistics include the percentages of time the CPU spends in the following states:

See Chapter 2 for information about monitoring systems.

3.3    Optimizing CPU Resources

After you configure the appropriate number of CPUs in your system, you may be able to improve system performance by optimizing your CPU resources. Before optimizing the CPU, ensure that neither the virtual memory subsystem nor the I/O subsystem is the cause of poor performance. If optimizing the CPU does not solve the performance problem, you may need to upgrade to a faster processor or use multiprocessing.

To optimize your CPU resources, use the following methods:

3.4    Identifying Application Bottlenecks

If an application is degrading system performance, use profiling to identify sections of code that consume large portions of execution time. In a typical program, most execution time is spent in relatively few sections of code. To improve performance, concentrate on improving the coding efficiency of those time-intensive sections. See the Programmer's Guide for more information on profiling.
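
The general idea of profiling can be sketched with Python's standard cProfile and pstats modules. This is an illustration of the technique only, not the profiling tools described in the Programmer's Guide; the functions hot_loop, cold_setup, and workload are hypothetical stand-ins for application code.

```python
import cProfile
import io
import pstats


def hot_loop(n):
    # Hypothetical time-intensive section of the application.
    total = 0
    for i in range(n):
        total += i * i
    return total


def cold_setup():
    # Hypothetical inexpensive section.
    return list(range(10))


def workload():
    cold_setup()
    return hot_loop(200_000)


def profile_workload():
    """Run workload() under the profiler and return (result, report text)."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = workload()
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive call paths come first.
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


if __name__ == "__main__":
    _, report = profile_workload()
    print(report)
```

The report lists each function with its call count and time, which makes it easier to see where optimization effort is likely to pay off.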

3.5    Improving Application Performance

Well-written applications use CPU, memory, and I/O resources efficiently. You may be able to improve system and application performance by following these recommendations:

3.6    Interprocess Communications Facilities

Interprocess communication (IPC) is the exchange of information between two or more processes. Examples of IPC include messages, shared memory, semaphores, pipes, signals, process tracing, and processes communicating with other processes over a network. IPC depends on the interaction of several operating system subsystems; elements of it are found in scheduling and networking.

In single-process programming, modules within a process communicate with each other through global variables and function calls, with data passing between functions and their callers. Separate processes have images in separate address spaces and cannot share data this way, so they must use additional communication mechanisms.
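
As one example of such a mechanism, the following sketch passes data from a child process to its parent through a pipe. It uses Python's os.pipe and os.fork, which wrap the corresponding UNIX system calls; the function name send_via_pipe is hypothetical, and the example assumes a UNIX-like system.

```python
import os


def send_via_pipe(message: bytes) -> bytes:
    """Fork a child that writes message to a pipe; the parent reads it back."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: it has its own address space, so it must write to the pipe.
        status = 0
        try:
            os.close(read_fd)
            os.write(write_fd, message)
            os.close(write_fd)
        except OSError:
            status = 1
        finally:
            os._exit(status)  # never fall through into the parent's code
    # Parent: close the unused write end so read() can see end-of-file.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)  # reap the child
    return b"".join(chunks)


if __name__ == "__main__":
    print(send_via_pipe(b"hello from the child process"))
```

The parent receives the same bytes that the child wrote, even though the two processes share no variables.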

The DIGITAL UNIX operating system provides the following facilities for interprocess communication:

You may be able to improve IPC performance by modifying the following attributes:

You can track the use of IPC facilities with the ipcs -a command (see ipcs(1)). The current number of bytes and message headers in the queues shows whether you need to increase the values of the msg-mnb and msg-tql attributes to reduce waiting.
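
The comparison described above can be sketched as a small helper. This is a hedged example: it assumes that msg-mnb bounds the bytes on a queue and that msg-tql bounds the number of queued messages, it takes the counts as plain numbers rather than parsing ipcs output, and the 0.8 threshold and sample values are hypothetical.

```python
def queue_pressure(current_bytes, current_messages, msg_mnb, msg_tql,
                   threshold=0.8):
    """Report how close a message queue is to its configured limits.

    The interpretation of msg_mnb (byte limit) and msg_tql (message
    limit) is an assumption; check the attribute descriptions for your
    system before acting on the result.
    """
    if msg_mnb <= 0 or msg_tql <= 0:
        raise ValueError("limits must be positive")
    byte_usage = current_bytes / msg_mnb
    message_usage = current_messages / msg_tql
    return {
        "byte_usage": byte_usage,
        "message_usage": message_usage,
        "consider_raising_msg_mnb": byte_usage >= threshold,
        "consider_raising_msg_tql": message_usage >= threshold,
    }


if __name__ == "__main__":
    # Hypothetical numbers read by hand from a queue listing.
    print(queue_pressure(current_bytes=14000, current_messages=10,
                         msg_mnb=16384, msg_tql=40))
```

A queue that stays near either limit across repeated observations is a candidate for a higher attribute value; a single reading is weak evidence.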

You may also want to consider modifying several of the following IPC attributes: