by Benjamin Lipchak
WHAT YOU'LL LEARN IN THIS CHAPTER:
How To | Functions You'll Use |
---|---|
Create and delete query objects | glGenQueries/glDeleteQueries |
Define bounding box occlusion queries | glBeginQuery/glEndQuery |
Retrieve the results from an occlusion query | glGetQueryObjectiv |
Complex scenes contain hundreds of objects and thousands upon thousands of polygons. Consider the room you're in now, reading this book. Look at all the furniture, objects, other people or pets and think of the rendering power needed to accurately represent their complexity. Several readers will find themselves happily sitting on a crate near a computer in an empty studio apartment, but the rest will envision a significant rendering workload around them.
Now think of all the things you can't see: objects hidden behind other objects, in drawers, or even in the next room. From most viewpoints, these objects are invisible to the viewer. If you rendered the scene, the objects would be drawn, but eventually something would draw over the top of them. Why bother doing all that work for nothing?
Enter occlusion queries. In this chapter, we describe a powerful new feature included in OpenGL 1.5 that can save a tremendous amount of vertex processing and texturing at the expense of a bit of extra nontextured fill rate. Often this trade-off is a very favorable one. We explore the use of occlusion detection and witness the dramatic increase in frame rates this technique affords.
To show off the improved performance possible through the use of occlusion queries, we need an experimental control group. We'll draw a scene without any fancy occlusion detection. The scene is contrived so that there are plenty of objects both visible and hidden at any given time.
First, we'll draw the “main occluder.” An occluder is a large object in a scene that tends to occlude, or hide, other objects in the scene. An occluder is often low detail, whereas the objects it occludes may be much higher in detail. Good examples are walls, floors, and ceilings. The main occluder in this scene is a grid made out of six walls, as illustrated in Figure 17.1. Listing 17.1 shows how the walls are actually just scaled cubes.
Listing 17.1. Main Occluder with Six Scaled and Translated Solid Cubes

    // Called to draw the occluding grid
    void DrawOccluder(void)
    {
        glColor3f(0.5f, 0.25f, 0.0f);

        glPushMatrix();
        glScalef(30.0f, 30.0f, 1.0f);
        glTranslatef(0.0f, 0.0f, 50.0f);
        glutSolidCube(10.0f);
        glTranslatef(0.0f, 0.0f, -100.0f);
        glutSolidCube(10.0f);
        glPopMatrix();

        glPushMatrix();
        glScalef(1.0f, 30.0f, 30.0f);
        glTranslatef(50.0f, 0.0f, 0.0f);
        glutSolidCube(10.0f);
        glTranslatef(-100.0f, 0.0f, 0.0f);
        glutSolidCube(10.0f);
        glPopMatrix();

        glPushMatrix();
        glScalef(30.0f, 1.0f, 30.0f);
        glTranslatef(0.0f, 50.0f, 0.0f);
        glutSolidCube(10.0f);
        glTranslatef(0.0f, -100.0f, 0.0f);
        glutSolidCube(10.0f);
        glPopMatrix();
    }
In each grid compartment, we're going to put a highly tessellated textured sphere. These spheres are our “occludees,” objects possibly hidden by the occluder. We need the high vertex count and texturing to accentuate the rendering burden so that we can subsequently relieve that burden courtesy of occlusion queries. Just as we did in the preceding chapter where we were showing off our buffer objects, we need to lay on the vertices!
Figure 17.2 shows the picture resulting from Listing 17.2. If you find this workload too heavy, feel free to reduce the tessellation in glutSolidSphere from the 100s to smaller numbers. Or, if your OpenGL implementation is still hungry for more, go ahead and increase the tessellation.
Listing 17.2. Drawing 27 Highly Tessellated Spheres in a Color Cube

    // Called to draw sphere
    void DrawSphere(GLint sphereNum)
    {
        ...
        glutSolidSphere(50.0f, 100, 100);
        ...
    }

    void DrawModels(void)
    {
        ...
        // Turn on texturing just for spheres
        glEnable(GL_TEXTURE_2D);
        glEnable(GL_TEXTURE_GEN_S);
        glEnable(GL_TEXTURE_GEN_T);

        // Draw 27 spheres in a color cube
        for (r = 0; r < 3; r++) {
            for (g = 0; g < 3; g++) {
                for (b = 0; b < 3; b++) {
                    glColor3f(r * 0.5f, g * 0.5f, b * 0.5f);

                    glPushMatrix();
                    glTranslatef(100.0f * r - 100.0f,
                                 100.0f * g - 100.0f,
                                 100.0f * b - 100.0f);
                    DrawSphere((r*9)+(g*3)+b);
                    glPopMatrix();
                }
            }
        }

        glDisable(GL_TEXTURE_2D);
        glDisable(GL_TEXTURE_GEN_S);
        glDisable(GL_TEXTURE_GEN_T);
    }
Listing 17.2 marks the completion of our picture. If we were happy with the rendering performance, we could end the chapter right here. But if the sphere tessellation is cranked up high enough, or if you were to introduce complex multitexturing or fragment shading to the sphere rendering, frame rates will likely drop to unacceptable levels. So read on!
The theory behind occlusion detection is that if an object's bounding volume is not visible, neither is the object. A bounding volume is any volume that completely contains the object. The whole point of occlusion detection is to cheaply draw a simple bounding volume to find out whether you can avoid drawing the actual complex object. So the more complex our bounding volume is, the more it negates the purpose of the optimization we're trying to create.
The simplest bounding volume is a box, also called a bounding box: eight vertices, six faces. You can easily create a bounding box for any object just by scanning for its minimum and maximum coordinates on each of the x-, y-, and z-axes. For our spheres with a 50-unit radius, the box is a cube, and sides of length 100 units fit perfectly.
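The min/max scan described above can be sketched in a few lines of C. This is a minimal illustration, not code from the sample program: the `BoundingBox` type and the `compute_bounding_box` function are hypothetical names, and the vertices are assumed to be stored as packed x, y, z triples.

```c
#include <assert.h>
#include <stddef.h>

/* Axis-aligned bounding box, stored as its minimum and maximum corners.
 * (Hypothetical type, introduced only for this sketch.) */
typedef struct {
    float min[3];
    float max[3];
} BoundingBox;

/* Scan `count` vertices stored as packed x,y,z triples and record the
 * smallest and largest coordinate seen on each axis. */
void compute_bounding_box(const float *xyz, size_t count, BoundingBox *box)
{
    size_t i;
    int axis;

    assert(xyz != NULL && box != NULL && count > 0);

    /* Start with the first vertex as both corners. */
    for (axis = 0; axis < 3; axis++) {
        box->min[axis] = xyz[axis];
        box->max[axis] = xyz[axis];
    }

    /* Grow the box to include every remaining vertex. */
    for (i = 1; i < count; i++) {
        for (axis = 0; axis < 3; axis++) {
            float v = xyz[i * 3 + axis];
            if (v < box->min[axis]) box->min[axis] = v;
            if (v > box->max[axis]) box->max[axis] = v;
        }
    }
}
```

Feeding this function the six extreme points of a sphere of radius 50 centered at the origin yields a box that spans -50 to 50 on every axis, which is the 100-unit cube used for the spheres in this chapter.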
Be aware of the trade-off when using such a simple and arbitrary bounding volume. The bounding volume may have very few vertices, but it will touch many more pixels than the original object would have. With a few additional strategically placed vertices, you can turn your bounding box into a more useful shape and significantly reduce the fill rate overhead. Fortunately, the bounding box is drawn without any fancy texturing or shading, so its overall fill rate cost will often be less than the original object anyway. Figure 17.3 shows an example of how different bounding volume shapes affect pixel count and vertex count.
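To put a rough number on the extra pixels, consider a sphere of radius $r$ and its bounding cube of side $2r$. The comparison below assumes an orthographic view, so it is only an approximation of what a perspective projection would produce, and it ignores clipping and occlusion.

```latex
\begin{align*}
A_{\text{sphere}} &= \pi r^2 \\
A_{\text{cube, face-on}} &= (2r)^2 = 4r^2,
  & \frac{A_{\text{cube, face-on}}}{A_{\text{sphere}}} &= \frac{4}{\pi} \approx 1.27 \\
A_{\text{cube, corner-on}} &= \sqrt{3}\,(2r)^2 = 4\sqrt{3}\,r^2,
  & \frac{A_{\text{cube, corner-on}}}{A_{\text{sphere}}} &= \frac{4\sqrt{3}}{\pi} \approx 2.21
\end{align*}
```

Under these assumptions the bounding cube covers roughly 1.3 to 2.2 times as many pixels as the sphere, depending on orientation. A tighter bounding shape narrows that gap at the cost of a few more vertices.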
When we draw our bounding volumes, we're going to enable an occlusion query that will count the number of fragments that pass the depth test (and the stencil test if enabled). Therefore, we don't care how the bounding volumes look. In fact, we don't even need to draw them to the screen at all. So we'll shut off all the bells and whistles before rendering the bounding volume, including writes to the color buffer:
    glShadeModel(GL_FLAT);
    glDisable(GL_LIGHTING);
    glDisable(GL_COLOR_MATERIAL);
    glDisable(GL_NORMALIZE);
    // Texturing is already disabled
    ...
    glColorMask(0, 0, 0, 0);
After all this talk about occlusion queries, we're finally going to create some. But first, we need to come up with names for them. Here, we request 27 names, one for each sphere's query, and we provide a pointer to the array where the new names should be stored:
    // Generate occlusion query names
    glGenQueries(27, queryIDs);
When we're done with them, we delete the query objects, indicating there are 27 names to be deleted in the provided array:
    glDeleteQueries(27, queryIDs);
Occlusion query objects are not bound like other OpenGL objects, such as texture objects and buffer objects. Instead, a query object is created the first time its name is passed to glBeginQuery. This call marks the beginning of our query. The query object has an internal counter that keeps track of the number of fragments that would make it to the framebuffer if we hadn't shut off the color buffer's write mask. Beginning the query resets this counter to zero to start a fresh query.
Then we draw our bounding volume. The query object's internal counter is incremented every time a fragment passes the depth test, and thus is not hidden by our main occluder, the grid which we've already drawn. For some algorithms, it's useful to know exactly how many fragments were drawn, but for our purposes here, all we care about is whether the counter is zero or nonzero. This value corresponds to whether any part of the bounding volume is visible or if all fragments were discarded by the depth test.
When we're finished drawing our bounding volume, we mark the end of our query by calling glEndQuery. This tells OpenGL we're done with this query and lets us continue with another query or ask for the result back. Because we're drawing 27 spheres, we can improve the performance by using 27 different query objects. This way, we can queue up the drawing of all 27 bounding volumes without disrupting the pipeline by reading back the query results in between.
Listing 17.3 illustrates the rendering of our bounding volumes, bracketed by the beginning and ending of our query. Then we proceed to redraw the main occluder and possibly draw the actual spheres. Notice the code for visualizing the bounding volume whereby we leave the color buffer's write mask enabled. This way, we can see and compare the different bounding volume shapes.
Listing 17.3. Beginning the Query, Drawing the Bounding Volume, Ending the Query, Then Moving on to Redraw the Actual Scene

    // Called to draw scene objects
    void DrawModels(void)
    {
        GLint r, g, b;

        if (occlusionDetection || showBoundingVolume) {
            // Draw bounding boxes after drawing the main occluder
            DrawOccluder();

            // All we care about for bounding box is resulting depth values
            glShadeModel(GL_FLAT);
            glDisable(GL_LIGHTING);
            glDisable(GL_COLOR_MATERIAL);
            glDisable(GL_NORMALIZE);
            // Texturing is already disabled
            if (!showBoundingVolume)
                glColorMask(0, 0, 0, 0);

            // Draw 27 spheres in a color cube
            for (r = 0; r < 3; r++) {
                for (g = 0; g < 3; g++) {
                    for (b = 0; b < 3; b++) {
                        if (showBoundingVolume)
                            glColor3f(r * 0.5f, g * 0.5f, b * 0.5f);

                        glPushMatrix();
                        glTranslatef(100.0f * r - 100.0f,
                                     100.0f * g - 100.0f,
                                     100.0f * b - 100.0f);
                        glBeginQuery(GL_SAMPLES_PASSED, queryIDs[(r*9)+(g*3)+b]);
                        switch (boundingVolume) {
                        case 0:
                            glutSolidCube(100.0f);
                            break;
                        case 1:
                            glScalef(150.0f, 150.0f, 150.0f);
                            glutSolidTetrahedron();
                            break;
                        case 2:
                            glScalef(90.0f, 90.0f, 90.0f);
                            glutSolidOctahedron();
                            break;
                        case 3:
                            glScalef(40.0f, 40.0f, 40.0f);
                            glutSolidDodecahedron();
                            break;
                        case 4:
                            glScalef(65.0f, 65.0f, 65.0f);
                            glutSolidIcosahedron();
                            break;
                        }
                        glEndQuery(GL_SAMPLES_PASSED);
                        glPopMatrix();
                    }
                }
            }

            if (!showBoundingVolume)
                glClear(GL_DEPTH_BUFFER_BIT);

            // Restore normal drawing state
            glShadeModel(GL_SMOOTH);
            glEnable(GL_LIGHTING);
            glEnable(GL_COLOR_MATERIAL);
            glEnable(GL_NORMALIZE);
            glColorMask(1, 1, 1, 1);
        }

        DrawOccluder();

        // Turn on texturing just for spheres
        glEnable(GL_TEXTURE_2D);
        glEnable(GL_TEXTURE_GEN_S);
        glEnable(GL_TEXTURE_GEN_T);

        // Draw 27 spheres in a color cube
        for (r = 0; r < 3; r++) {
            for (g = 0; g < 3; g++) {
                for (b = 0; b < 3; b++) {
                    glColor3f(r * 0.5f, g * 0.5f, b * 0.5f);

                    glPushMatrix();
                    glTranslatef(100.0f * r - 100.0f,
                                 100.0f * g - 100.0f,
                                 100.0f * b - 100.0f);
                    DrawSphere((r*9)+(g*3)+b);
                    glPopMatrix();
                }
            }
        }

        glDisable(GL_TEXTURE_2D);
        glDisable(GL_TEXTURE_GEN_S);
        glDisable(GL_TEXTURE_GEN_T);
    }
DrawSphere contains the magic where we decide whether to actually draw the sphere. Our query results are waiting for us inside our 27 query objects. Let's find out which are hidden and which we have to draw.
The moment of truth is here. The jury is back with its verdict. We want to draw as little as possible, so we're hoping each and every one of our queries resulted in no fragments being touched. But if you think about this grid of spheres, you know that's not going to happen.
No matter what angle we view the grid from, unless we zoom way in, there will always be at least 9 spheres in view. In the worst case, you'll see all the spheres on three faces of the grid: 19 spheres. Even then, we save ourselves from drawing 8 of the 27 spheres, almost a 30% savings in per-vertex costs. In the best case, we skip 18 spheres and save about 67%. If we zoom in on one sphere, we could conceivably avoid drawing 26 spheres!
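The counts above can be checked with a short C sketch. It treats "on a visible face" as a stand-in for "potentially visible", which is a simplification of what the depth test actually decides; the function name `count_cells_on_visible_faces` is hypothetical.

```c
#include <assert.h>

/* Count the cells of an n x n x n grid that lie on at least one of the
 * outer faces selected by the three flags (nonzero means the near face
 * along that axis is turned toward the viewer). A cell shared by two or
 * three faces is counted only once. */
int count_cells_on_visible_faces(int n, int see_x, int see_y, int see_z)
{
    int x, y, z;
    int count = 0;

    assert(n > 0);

    for (x = 0; x < n; x++)
        for (y = 0; y < n; y++)
            for (z = 0; z < n; z++)
                if ((see_x && x == 0) ||
                    (see_y && y == 0) ||
                    (see_z && z == 0))
                    count++;

    return count;
}
```

For the 3 x 3 x 3 grid, one visible face gives 9 cells and three visible faces give 19, leaving 18 and 8 spheres respectively that never need to be drawn.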
So how do you determine your luck? You simply query the query object. That sounds confusing, but this is a regular old query for OpenGL state. It just happens to be from something called a query object. In Listing 17.4, we call glGetQueryObjectiv to see whether the pass counter is zero, in which case we won't draw the sphere.
Listing 17.4. Checking the Query Results and Drawing the Sphere Only If We Have To

    // Called to draw sphere
    void DrawSphere(GLint sphereNum)
    {
        GLboolean occluded = GL_FALSE;

        if (occlusionDetection) {
            GLint passingSamples;

            // Check if this sphere would be occluded
            glGetQueryObjectiv(queryIDs[sphereNum], GL_QUERY_RESULT,
                               &passingSamples);
            if (passingSamples == 0)
                occluded = GL_TRUE;
        }

        if (!occluded) {
            glutSolidSphere(50.0f, 100, 100);
        }
    }
That's all there is to it. Each sphere's query is checked in turn, and we decide whether to draw the sphere. We've included a mode where we can disable the occlusion detection to see how badly our performance suffers. Depending on how many spheres are visible, you may see a boost of two times or more thanks to occlusion detection.
In addition to the query result, you can also query to find out whether the result is immediately available. If we didn't render the 27 bounding volumes back to back, and instead asked for each result immediately, the bounding box rendering might still have been in the pipeline and the result may not have been ready yet. You can query GL_QUERY_RESULT_AVAILABLE to find out whether the result is ready. If it's not, querying GL_QUERY_RESULT will stall until the result is available. So instead of stalling, you could find something useful for your application to do while you wait for the results to be ready. In our case, we planned ahead to do a bunch of work in between to be certain our first query result would be ready by the time we finished our 27th query.
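The check-before-read pattern might look like the following minimal sketch. To keep it compilable without the OpenGL headers, it takes a function pointer with the same parameter list as glGetQueryObjectiv and repeats the numeric values of GL_QUERY_RESULT and GL_QUERY_RESULT_AVAILABLE. The helper `try_get_query_result`, the typedef, and the fake driver function are hypothetical names that exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Values of the corresponding OpenGL enums, repeated here so the sketch
 * compiles without the OpenGL headers. */
#define QUERY_RESULT           0x8866  /* GL_QUERY_RESULT */
#define QUERY_RESULT_AVAILABLE 0x8867  /* GL_QUERY_RESULT_AVAILABLE */

/* Same parameter list as glGetQueryObjectiv (GLuint, GLenum, GLint *). */
typedef void (*GetQueryObjectivFn)(unsigned int id, unsigned int pname,
                                   int *params);

/* Non-blocking read: returns 1 and stores the sample count if the result
 * is ready; otherwise returns 0 and leaves *samples untouched. */
int try_get_query_result(GetQueryObjectivFn get, unsigned int id,
                         int *samples)
{
    int available = 0;

    assert(get != NULL && samples != NULL);

    get(id, QUERY_RESULT_AVAILABLE, &available);
    if (!available)
        return 0;                   /* caller can do other work and retry */

    get(id, QUERY_RESULT, samples); /* result is ready, so no stall */
    return 1;
}

/* Hypothetical stand-in for the driver, used only to exercise the sketch:
 * the result becomes available on the third availability check. */
int fake_polls = 0;

void fake_get_query_objectiv(unsigned int id, unsigned int pname,
                             int *params)
{
    (void)id;
    if (pname == QUERY_RESULT_AVAILABLE)
        *params = (++fake_polls >= 3);
    else if (pname == QUERY_RESULT)
        *params = 1234;
}
```

In a real program you would pass glGetQueryObjectiv itself, assuming the typedef is adjusted to match the platform's calling convention, and do other useful work each time the helper returns 0.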
Other state queries include the currently active query name (which query is in the middle of a glBeginQuery/glEndQuery pair, if any) and the number of bits in the implementation's pass counter. An implementation is allowed to advertise a 0-bit counter, in which case occlusion queries are useless and shouldn't be used. In Listing 17.5, we check for that case during an application's initialization right after checking for extension strings and entrypoints.
Listing 17.5. Ensuring Occlusion Queries Are Truly Supported

    // Make sure required functionality is available!
    version = glGetString(GL_VERSION);
    // Matches versions 1.5 through 1.9 only; any other version falls
    // back to the extension check below
    if ((version[0] == '1') && (version[1] == '.') &&
        (version[2] >= '5') && (version[2] <= '9')) {
        glVersion15 = GL_TRUE;
    }

    if (!glVersion15 && !gltIsExtSupported("GL_ARB_occlusion_query")) {
        fprintf(stderr, "Neither OpenGL 1.5 nor GL_ARB_occlusion_query"
                        " extension is available!\n");
        Sleep(2000);
        exit(0);
    }

    // Load the function pointers
    if (glVersion15) {
        glBeginQuery = gltGetExtensionPointer("glBeginQuery");
        glDeleteQueries = gltGetExtensionPointer("glDeleteQueries");
        glEndQuery = gltGetExtensionPointer("glEndQuery");
        glGenQueries = gltGetExtensionPointer("glGenQueries");
        glGetQueryiv = gltGetExtensionPointer("glGetQueryiv");
        glGetQueryObjectiv = gltGetExtensionPointer("glGetQueryObjectiv");
        glGetQueryObjectuiv = gltGetExtensionPointer("glGetQueryObjectuiv");
        glIsQuery = gltGetExtensionPointer("glIsQuery");
    } else {
        glBeginQuery = gltGetExtensionPointer("glBeginQueryARB");
        glDeleteQueries = gltGetExtensionPointer("glDeleteQueriesARB");
        glEndQuery = gltGetExtensionPointer("glEndQueryARB");
        glGenQueries = gltGetExtensionPointer("glGenQueriesARB");
        glGetQueryiv = gltGetExtensionPointer("glGetQueryivARB");
        glGetQueryObjectiv = gltGetExtensionPointer("glGetQueryObjectivARB");
        glGetQueryObjectuiv = gltGetExtensionPointer("glGetQueryObjectuivARB");
        glIsQuery = gltGetExtensionPointer("glIsQueryARB");
    }

    if (!glBeginQuery || !glDeleteQueries || !glEndQuery ||
        !glGenQueries || !glGetQueryiv || !glGetQueryObjectiv ||
        !glGetQueryObjectuiv || !glIsQuery) {
        fprintf(stderr, "Not all entrypoints were available!\n");
        Sleep(2000);
        exit(0);
    }

    // Make sure query counter bits is non-zero
    glGetQueryiv(GL_SAMPLES_PASSED, GL_QUERY_COUNTER_BITS, &queryCounterBits);
    if (queryCounterBits == 0) {
        fprintf(stderr, "Occlusion queries not really supported!\n");
        fprintf(stderr, "Available query counter bits: 0\n");
        Sleep(2000);
        exit(0);
    }
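The character-by-character version test in the listing recognizes only versions 1.5 through 1.9. One possible generalization, sketched below, parses the leading "major.minor" of the version string numerically. The function name `gl_version_at_least` is hypothetical, and the sketch assumes the string begins with the major and minor numbers separated by a period.

```c
#include <assert.h>
#include <stdio.h>

/* Returns 1 if a GL_VERSION-style string ("major.minor", optionally
 * followed by a release number and vendor text) reports at least the
 * requested version. Returns 0 otherwise, or if the string cannot be
 * parsed. */
int gl_version_at_least(const char *version, int want_major, int want_minor)
{
    int major = 0;
    int minor = 0;

    assert(want_major >= 0 && want_minor >= 0);

    if (version == NULL || sscanf(version, "%d.%d", &major, &minor) != 2)
        return 0;

    if (major != want_major)
        return major > want_major;
    return minor >= want_minor;
}
```

With a helper like this, the test could be written as `gl_version_at_least((const char *)glGetString(GL_VERSION), 1, 5)`, which would also accept version strings such as "2.1" without relying on the extension string.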
The only other query to be aware of is glIsQuery. This command just checks whether the specified name is the name of an existing query object, in which case it returns GL_TRUE. Otherwise, it returns GL_FALSE.
When rendering complex scenes, sometimes we waste hardware resources by rendering objects that will never be seen. We can try to avoid the extra work by testing whether an object will show up in the final image. By drawing a bounding box, or some other simple bounding volume, around the object, we can cheaply approximate the object in the scene. If occluders in the scene hide the bounding box, they would also hide the actual object. By wrapping the bounding box rendering with a query, we can count the number of pixels that would be hit. If the bounding box hits no pixels, we can guarantee that the original object would also not be drawn, so we can skip rendering it. Performance improvements can be dramatic, depending on the complexity of the objects in the scene and how often they are occluded.
glIsQuery | |
---|---|
Purpose: | |
Include File: | |
Syntax: | GLboolean glIsQuery(GLuint id); |
Description: | This function queries whether the specified name is the name of a query object. |
Parameters: | id (GLuint): the name to test. |
Returns: | GL_TRUE if the specified name is the name of an existing query object; GL_FALSE otherwise. |
See Also: | |