Per-Platform Tuning

Many of the performance tuning techniques discussed here (e.g., minimizing the number of state changes and disabling features that aren't required) are a good idea no matter what system you are running on. Other tuning techniques must be tailored to a particular system. For example, before you sort your database based on state changes, you need to determine which state changes are the most expensive on each system you intend to support.
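
As a rough illustration of the sorting example, the sketch below turns per-platform measurements into a sort-key priority. The state categories, the StateCost type, and the function names are hypothetical and chosen only for illustration; the measured costs are assumed to come from your own timing trials.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical categories of state change; a real program would list
   whatever state its database actually switches between. */
enum { STATE_TEXTURE, STATE_MATERIAL, STATE_BLEND, STATE_COUNT };

typedef struct {
    int    kind;     /* one of the STATE_* values */
    double seconds;  /* measured cost of one change on this platform */
} StateCost;

/* qsort comparator: most expensive state change first. */
static int by_cost_descending(const void *a, const void *b)
{
    const StateCost *x = a, *y = b;
    if (x->seconds < y->seconds) return 1;
    if (x->seconds > y->seconds) return -1;
    return 0;
}

/* Fill order[] with state kinds, most expensive first. The result is
   the key priority for sorting the database: group primitives by
   order[0] first, then by order[1] within each group, and so on. */
void rank_state_changes(const double measured[STATE_COUNT],
                        int order[STATE_COUNT])
{
    StateCost costs[STATE_COUNT];
    int i;

    assert(measured != NULL && order != NULL);
    for (i = 0; i < STATE_COUNT; i++) {
        costs[i].kind = i;
        costs[i].seconds = measured[i];
    }
    qsort(costs, STATE_COUNT, sizeof costs[0], by_cost_descending);
    for (i = 0; i < STATE_COUNT; i++)
        order[i] = costs[i].kind;
}
```

Because the ranking is computed from measurements rather than hard-coded, the same database can be sorted differently on each platform without changing the sorting code.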

In addition, you may want to modify the behavior of your program depending on which modes are fast on the system it is running on. This is especially important for programs that must run at a particular frame rate. Features may need to be disabled to maintain the frame rate. For example, if a particular texture mapping environment is slow on one of your target systems, you may need to disable texture mapping or change the texture environment whenever your program is running on that platform.
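
One possible way to decide which features to give up is sketched below. It assumes you have already measured the extra time each optional feature adds per frame on the current platform, and that those costs add up roughly linearly, which real hardware does not guarantee. The Feature type and choose_features() are hypothetical names, not part of any library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical optional feature with a cost measured on this platform. */
typedef struct {
    const char *name;
    double      seconds;  /* extra time per frame when enabled */
    int         enabled;  /* set by choose_features() */
} Feature;

/* Features are listed from most to least important. Starting with
   everything on, switch off the least important features until the
   estimated frame time fits the budget. Returns the estimated time.
   Assumes costs are additive, which is only an approximation. */
double choose_features(Feature *features, size_t count,
                       double base_seconds, double budget_seconds)
{
    double total = base_seconds;
    size_t i;

    assert(features != NULL || count == 0);
    for (i = 0; i < count; i++) {
        features[i].enabled = 1;
        total += features[i].seconds;
    }
    /* Walk backwards so the least important feature goes first. */
    for (i = count; i > 0 && total > budget_seconds; i--) {
        features[i - 1].enabled = 0;
        total -= features[i - 1].seconds;
    }
    return total;
}
```

For a 30 frames per second target the budget would be 1.0 / 30.0 seconds. The program would then enable or disable the corresponding rendering modes according to the enabled flags. If the returned estimate still exceeds the budget, even the base scene is too slow and the geometry itself needs attention.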

Before you can tune your program for each of the target platforms, you will need to do some performance characterization. This isn't always straightforward. Often a particular device is able to accelerate certain features, but not all at the same time. It is therefore important to test the performance of the combinations of features that you will actually be using, not just each feature in isolation. For example, a graphics adapter may accelerate texture mapping but only for certain texture parameters and texture environment settings. Even if all texture modes are accelerated, experimentation will be required to see how many textures you can use at once without causing the adapter to page textures in and out of its local memory.
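
The sketch below shows one way to time every combination of a small set of features rather than each feature alone. The render callback, the bitmask encoding, and the stand-in renderer are assumptions made for illustration. The standard clock() function is used only because it is portable; it measures processor time, so a wall-clock timer is likely a better choice where one is available.

```c
#include <assert.h>
#include <time.h>

#define NUM_FEATURES 3               /* e.g. texture, depth test, stencil */
#define NUM_COMBOS   (1u << NUM_FEATURES)

/* Caller-supplied routine that applies the feature set described by
   mask (bit i set means feature i on) and renders the test scene once.
   With OpenGL it should end with glFinish() so that the timer sees
   completed work rather than queued commands. */
typedef void (*RenderFn)(unsigned mask);

/* Time every combination of features, not just each feature alone,
   and store the average seconds per frame in seconds[mask]. */
void time_all_combinations(RenderFn render, int frames,
                           double seconds[NUM_COMBOS])
{
    unsigned mask;
    int i;

    assert(render != NULL && frames > 0);
    for (mask = 0; mask < NUM_COMBOS; mask++) {
        clock_t start, stop;

        render(mask);                 /* warm-up pass, not timed */
        start = clock();
        for (i = 0; i < frames; i++)
            render(mask);
        stop = clock();
        seconds[mask] = (double)(stop - start) / CLOCKS_PER_SEC / frames;
    }
}

/* Stand-in for a real render pass so the sketch can be exercised
   without a graphics context; it only counts how often it is called. */
static int calls_made;
static void count_calls(unsigned mask) { (void)mask; calls_made++; }
```

Calling time_all_combinations(count_calls, 4, seconds) exercises the loop without drawing anything. In a real program the callback would set the OpenGL state for the given mask and draw a representative part of the scene. The number of combinations doubles with each added feature, so it is worth restricting the test to combinations the program will actually use.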

An even more complicated situation arises if the graphics adapter has a shared pool of memory that is allocated to several tasks. For example, the adapter may not have a frame buffer deep enough to hold both a depth buffer and a stencil buffer. In this case, the adapter would be able to accelerate either depth buffering or stenciling, but not both at the same time. Or perhaps depth buffering and stenciling can both be accelerated, but only for certain stencil buffer depths.

Typically, per-platform testing is done at initialization time. You should do some trial runs through your data with different combinations of state settings and measure the time it takes to render in each case. You may want to save the results in a file so your program doesn't have to repeat the tests each time it starts up. The isfast program on the website is an example of how to measure the performance of particular OpenGL operations and save the results.
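
Saving and reloading the measurements can be as simple as the sketch below. This is not how the isfast program works; the file format (one "index seconds" pair per line) and the function names are hypothetical choices made for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Write one "index seconds" pair per line. Returns 0 on success. */
int save_timings(const char *path, const double *seconds, unsigned count)
{
    FILE *fp;
    unsigned i;

    assert(path != NULL && seconds != NULL);
    fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    for (i = 0; i < count; i++)
        fprintf(fp, "%u %.9g\n", i, seconds[i]);
    return fclose(fp) == 0 ? 0 : -1;
}

/* Read the table back. Returns 0 on success, or -1 if the file is
   missing or does not match the expected layout, in which case the
   caller should rerun the timing trials and save fresh results. */
int load_timings(const char *path, double *seconds, unsigned count)
{
    FILE *fp;
    unsigned i, index;
    int ok = 1;

    assert(path != NULL && seconds != NULL);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    for (i = 0; i < count && ok; i++)
        ok = fscanf(fp, "%u %lf", &index, &seconds[i]) == 2 && index == i;
    fclose(fp);
    return ok ? 0 : -1;
}
```

At startup the program would try load_timings() first and fall back to the trial runs only when it fails. A real program would probably also record something that identifies the hardware, such as the string returned by glGetString(GL_RENDERER), and discard the saved results when it changes.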

David Blythe
Thu Jul 17 21:24:28 PDT 1997