by Benjamin Lipchak
WHAT YOU'LL LEARN IN THIS CHAPTER:
How To | Functions You'll Use |
---|---|
Create, bind to, and delete buffer objects | |
Send data into a buffer object indirectly | |
Write data into a buffer object directly | |
Graphics cards today have nearly as much memory as the rest of the system they're plugged into. The amount of graphics card memory tends to at least be within an order of magnitude, say 25%, of the amount of system memory. That's quite a resource to exploit—or to waste by not making the best use of it.
Video memory has traditionally been used for storing the following:
Front buffers (what you see on the screen)
Back buffers (what you don't see when double-buffering)
Depth buffers (for hidden surface removal)
Other per-pixel storage, such as stencil planes, overlay planes, and so on
Even at high resolutions and color depths, such as 1920×1280 and 32 bits per pixel, graphics consume only in the ballpark of 25 to 40MB. Depending on your available video memory, that can leave hundreds of megabytes at your disposal.
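As a rough check on that ballpark figure, the arithmetic can be sketched in a few lines of C. The helper names here are hypothetical, and the sketch assumes one front buffer, one back buffer, and one depth buffer that all use the same number of bytes per pixel; real configurations vary.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed for one full-screen buffer at the given resolution. */
size_t buffer_bytes(size_t width, size_t height, size_t bytesPerPixel)
{
    assert(bytesPerPixel > 0);
    return width * height * bytesPerPixel;
}

/* Front + back + depth, assuming (for illustration only) that all three
   buffers use the same bytes-per-pixel figure. */
size_t framebuffer_bytes(size_t width, size_t height, size_t bytesPerPixel)
{
    return 3 * buffer_bytes(width, height, bytesPerPixel);
}
```

At 1920×1280 and 4 bytes per pixel, each buffer takes 9,830,400 bytes, so the three together come to about 29.5MB, which sits inside the 25 to 40MB range quoted above.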
This extra space is most often used to cache texture maps so they don't have to be continually transmitted from system memory to the graphics card every time a new texture is used. Instead, they are kept locally in video memory so they're ready when needed. When the texture cache becomes full, old textures that haven't been used recently are evicted to make room for new textures.
Some OpenGL implementations also attempt to cache geometry data in video memory, such as that present in display lists or vertex arrays. Unfortunately, the driver doesn't know how often the geometry is going to change or how much total geometry there's going to be. For vertex arrays, it doesn't know when the data in the arrays has actually been changed by the application. Only the application knows all this information, so if the driver bothers to try at all, the best it can do is guess, and that's not enough to guarantee optimal performance.
Extensions such as GL_EXT_compiled_vertex_array, GL_EXT_draw_range_elements, GL_NV_vertex_array_range, and GL_ATI_vertex_array_object have been introduced over the years to attempt to supply the driver with some of this information. This progress has culminated in a single extension, GL_ARB_vertex_buffer_object, which hands over full control to the application when it comes to storing its geometry in local video memory for optimal rendering performance. This extension was promoted into OpenGL 1.5 as a core feature.
Figure 16.1 illustrates the different types of data that share, and in fact compete for, local video memory on the graphics card.
Buffer objects are repositories for storing data in local video memory. You can store anything you want in there and read it back later. If you want to store your grocery list in there, you're free to do that. But the only useful things to store in there are vertex arrays and array indices. You can clue OpenGL in on the fact that your vertex arrays live in a buffer object, at which point they become blazingly fast vertex arrays.
First, though, you need your vertex arrays. If your application uses immediate mode (glBegin/glEnd pairs), you can't take advantage of buffer objects without first switching over to the vertex array paradigm, discussed in Chapter 11, “It's All About the Pipeline: Faster Geometry Throughput.” When you have vertex arrays working, putting them into buffer objects is relatively easy. It also makes before and after performance comparisons straightforward and gratifying!
For our sample program, we'll construct vertex arrays with the geometry for some sphere-shaped particle clouds. The more of a geometry burden we can introduce, the more improvement we'll see when we get around to accelerating them. So let's lay on the vertices!
The number of particles per sphere is configurable. If your OpenGL implementation cannot handle the number of vertices defined here, or if it eats them for breakfast and wants more, just change this constant:
GLint numSphereVertices = 30000;
We need some geometry for this program, but we don't want to waste space in our code to load anything fancy, nor do we want to waste time explaining it. So we'll settle for something simple to generate, but moderately interesting: particle cloud spheres.
We've already decided how many vertices we want, set by the constant shown in the preceding section. We'll just scatter these points randomly across the surface of a sphere. Sounds complicated, right? Not really. All we have to do is generate a random point in space. We take the vector between this random point and the origin (0,0,0) and normalize it to a unit vector. It now represents a point 1 unit away from the origin in some random direction. Figure 16.2 illustrates the normalization of the random vectors. Repeat 30,000 times, and we have a sphere-shaped cloud of particles.
Here's the code:
    for (i = 0; i < numSphereVertices; i++)
    {
        GLfloat r1, r2, r3, scaleFactor;

        // pick a random vector
        r1 = (GLfloat)(rand() - (RAND_MAX/2));
        r2 = (GLfloat)(rand() - (RAND_MAX/2));
        r3 = (GLfloat)(rand() - (RAND_MAX/2));

        // determine normalizing scale factor
        scaleFactor = 1.0f / sqrt(r1*r1 + r2*r2 + r3*r3);

        sphereVertexArray[(i*3)+0] = r1 * scaleFactor;
        sphereVertexArray[(i*3)+1] = r2 * scaleFactor;
        sphereVertexArray[(i*3)+2] = r3 * scaleFactor;
    }
We have the data prepared. Now we must enable the arrays and set the array pointers so that OpenGL will know where to find the geometry when rendering:
    glNormalPointer(GL_FLOAT, 0, sphereVertexArray);
    glVertexPointer(3, GL_FLOAT, 0, sphereVertexArray);
    ...
    glEnableClientState(GL_NORMAL_ARRAY);
    glEnableClientState(GL_VERTEX_ARRAY);
Notice that we're enabling two arrays: one for the vertex position, but also one for the vertex normal. Normals make lighting possible, and it just so happens that for a unit sphere (where radius is 1) at the origin, the position is the same as the normal! So we can reuse the same data for both arrays.
Figure 16.3 visually depicts data in our vertex array.
Thirty thousand vertices might sound like a lot, but to bring our OpenGL implementation to its knees, we're going to have to throw it a bit more geometry still. So let's draw a 3×3×3 cube of spheres and set a different color for each sphere. As demonstrated in Listing 16.1, we can reuse the same vertex arrays, just changing the modelview matrix to individually resize and locate each sphere in between calls to glDrawArrays.
Example 16.1. Sphere Vertex Array Drawn 27 Times
    // Called to draw scene
    void RenderScene(void)
    {
        static GLTStopwatch stopWatch;
        static int frameCounter = 0;

        // Get initial time
        if (frameCounter == 0)
            gltStopwatchReset(&stopWatch);

        frameCounter++;
        if (frameCounter == 100)
        {
            frameCounter = 0;
            fprintf(stdout, "FPS: %f ", 100.0f / gltStopwatchRead(&stopWatch));
            gltStopwatchReset(&stopWatch);
        }

        // Track camera angle
        glMatrixMode(GL_PROJECTION);
        glLoadIdentity();
        gluPerspective(45.0f, 1.0f, 10.0f, 10000.0f);

        glMatrixMode(GL_MODELVIEW);
        glLoadIdentity();
        gluLookAt(cameraPos[0], cameraPos[1], cameraPos[2],
                  0.0f, 0.0f, 0.0f,
                  0.0f, 1.0f, 0.0f);

        // Clear the window with current clearing color
        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

        if (animating)
        {
            RegenerateSphere();
            SetRenderingMethod();
        }

        // Draw objects in the scene
        DrawModels();

        // Flush drawing commands
        glutSwapBuffers();

        glutPostRedisplay();
    }
Two things to notice from this listing are the performance measurement code and the animation code. We need a way to test dynamically changing geometry, so when the animation toggle is on, we regenerate a new sphere vertex array for every frame. The random animated points sort of look like static.
Because this chapter is all about squeezing more performance out of geometry processing, we would be remiss not to measure that performance in some way. For every 100 frames we render, we look at the time that has elapsed since we started those 100 frames. Divide the 100 frames across the elapsed time, and we have a rough count of frames per second. This number lets us compare vertex array performance against buffer object performance. It is printed to stdout, so look for it in the console window, not in the sample program's graphics window.
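The division described above can be isolated into a small helper, sketched here under the assumption that the elapsed time is available in seconds. The name frames_per_second is hypothetical and not part of the sample program; the guard against a zero interval is an addition for safety on very coarse timers.

```c
#include <assert.h>

/* Frames per second over one measurement window.
   Returns 0 when no measurable time has elapsed, to avoid dividing by zero. */
double frames_per_second(int frames, double elapsedSeconds)
{
    assert(frames >= 0);
    if (elapsedSeconds <= 0.0)
        return 0.0;
    return (double)frames / elapsedSeconds;
}
```

For example, 100 frames rendered in 2 seconds works out to 50 frames per second.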
Believe it or not, we've done the hard part. Generating or loading vertex array data still remains the same burden it always was. All we're going to do differently now is tell OpenGL to store the vertex array data inside a buffer object. Same stuff, different wrapper.
Before getting our hands dirty, we need to take care of one minor detail. Buffer objects are relatively new to OpenGL. The feature was first introduced as the GL_ARB_vertex_buffer_object extension and was promoted as a core feature quickly thereafter when OpenGL 1.5 was ratified. Some new features, such as depth textures and shadows, are easily integrated into applications because all they need are some new token definitions from a header file. Unfortunately, buffer objects need a bit more to make them start working. They introduce new API entrypoints that you need to latch onto before you can start using them.
As you do with any feature, you must make sure the appropriate extension or version of OpenGL is available before trying to use it. Here, we're checking for either OpenGL 1.5, which includes buffer objects, or for the extension that also provides the equivalent functionality:
    // Make sure required functionality is available!
    version = glGetString(GL_VERSION);
    if ((version[0] == '1') && (version[1] == '.') &&
        (version[2] >= '5') && (version[2] <= '9'))
    {
        glVersion15 = GL_TRUE;
    }

    if (!glVersion15 && !gltIsExtSupported("GL_ARB_vertex_buffer_object"))
    {
        fprintf(stderr, "Neither OpenGL 1.5 nor GL_ARB_vertex_buffer_object"
                        " extension is available! ");
        Sleep(2000);
        exit(0);
    }
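The character-by-character test above recognizes versions 1.5 through 1.9 only; on a later major version it falls through to the extension check. As a minimal sketch of a numeric comparison instead, assuming the version string begins with the major and minor numbers separated by a period, a hypothetical helper (not part of the sample program or its tool library) might look like this:

```c
#include <assert.h>
#include <stdio.h>

/* Returns 1 if the version string reports at least major.minor, else 0.
   The string is expected to start with "<major>.<minor>", optionally
   followed by a release number and vendor-specific text. */
int gl_version_at_least(const char *version, int major, int minor)
{
    int haveMajor = 0, haveMinor = 0;

    assert(major >= 0 && minor >= 0);
    if (version == NULL ||
        sscanf(version, "%d.%d", &haveMajor, &haveMinor) != 2)
        return 0;

    return (haveMajor > major) ||
           (haveMajor == major && haveMinor >= minor);
}
```

It could be called as gl_version_at_least((const char *)glGetString(GL_VERSION), 1, 5), with the cast needed because glGetString returns a pointer to GLubyte.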
Now that we know the feature is supported, we need the function pointers for its entrypoints. On Windows platforms, the function wglGetProcAddress queries for the function pointers based on a string containing the entrypoint name. Other platforms have other means of providing these pointers, so we've abstracted them into a tool library function, gltGetExtensionPointer. Note that if only the extension is available, the function names have the ARB suffix. If OpenGL 1.5 is available, we don't need the suffix:
    // Load the function pointers
    if (glVersion15)
    {
        glBindBuffer = gltGetExtensionPointer("glBindBuffer");
        glBufferData = gltGetExtensionPointer("glBufferData");
        glBufferSubData = gltGetExtensionPointer("glBufferSubData");
        glDeleteBuffers = gltGetExtensionPointer("glDeleteBuffers");
        glGenBuffers = gltGetExtensionPointer("glGenBuffers");
        glMapBuffer = gltGetExtensionPointer("glMapBuffer");
        glUnmapBuffer = gltGetExtensionPointer("glUnmapBuffer");
    }
    else
    {
        glBindBuffer = gltGetExtensionPointer("glBindBufferARB");
        glBufferData = gltGetExtensionPointer("glBufferDataARB");
        glBufferSubData = gltGetExtensionPointer("glBufferSubDataARB");
        glDeleteBuffers = gltGetExtensionPointer("glDeleteBuffersARB");
        glGenBuffers = gltGetExtensionPointer("glGenBuffersARB");
        glMapBuffer = gltGetExtensionPointer("glMapBufferARB");
        glUnmapBuffer = gltGetExtensionPointer("glUnmapBufferARB");
    }

    // glBufferSubData is used when animating, so check it as well
    if (!glBindBuffer || !glBufferData || !glBufferSubData ||
        !glDeleteBuffers || !glGenBuffers ||
        !glMapBuffer || !glUnmapBuffer)
    {
        fprintf(stderr, "Not all entrypoints were available! ");
        Sleep(2000);
        exit(0);
    }
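The two branches above differ only in the suffix, so the same idea can be sketched as a single helper that tries the core name first and then the ARB-suffixed name. Everything in this sketch is hypothetical: GenericProc, LookupFn, and load_entrypoint are illustrative names, and fake_lookup merely stands in for a platform query such as wglGetProcAddress so that the sketch runs without an OpenGL context.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef void (*GenericProc)(void);
typedef GenericProc (*LookupFn)(const char *name);

/* Try the core name first, then the same name with an "ARB" suffix. */
GenericProc load_entrypoint(LookupFn lookup, const char *coreName)
{
    char arbName[128];
    GenericProc proc;

    assert(lookup != NULL && coreName != NULL);

    proc = lookup(coreName);
    if (proc != NULL)
        return proc;

    /* Refuse names that would not fit once the suffix is appended */
    if (strlen(coreName) + sizeof("ARB") > sizeof(arbName))
        return NULL;

    snprintf(arbName, sizeof(arbName), "%sARB", coreName);
    return lookup(arbName);
}

/* Stand-in lookup for demonstration: it knows only one suffixed name. */
static void dummy_proc(void) {}

GenericProc fake_lookup(const char *name)
{
    return strcmp(name, "glBindBufferARB") == 0 ? dummy_proc : NULL;
}
```

On some platforms the query can return a non-null pointer even for an unsupported entrypoint, so a fallback like this complements the version and extension check rather than replacing it.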
Buffer objects are treated similarly to other objects in OpenGL, such as texture objects. They are created and their state initialized when first bound with the glBindBuffer command. glGenBuffers can be called first to get a list of available names, but this command doesn't actually create the buffer objects. You still need to bind the object name before it's created.
When you're finished with your buffer objects, you delete them with glDeleteBuffers. If a deleted buffer is currently bound, that binding is undone, and the null buffer object (name zero) is implicitly bound, telling OpenGL to go back to using traditional (nonbuffer object) vertex arrays:
    // Generate a buffer object
    glGenBuffers(1, &bufferID);
    ...
    glBindBuffer(GL_ARRAY_BUFFER, bufferID);
    ...
    glDeleteBuffers(1, &bufferID);
When you have a buffer object bound, it tells OpenGL to source its vertex array data from the buffer object's data store instead of the vertex array pointers set via commands like glVertexPointer. But those pointers still are used. Instead of being pointers to data in client memory, when a buffer object is bound, these pointers are interpreted as offsets within the buffer object's data store.
In our buffer object program, the data used for both the vertex position array and the normal array begins right at the beginning of the buffer object's data store, so an offset of zero is used. Our data is tightly packed without any padding or interleaving, so the stride parameter is also zero:
    glBindBuffer(GL_ARRAY_BUFFER, bufferID);

    // No stride, no offset
    glNormalPointer(GL_FLOAT, 0, 0);
    glVertexPointer(3, GL_FLOAT, 0, 0);
    ...
    glDrawArrays(GL_POINTS, 0, numSphereVertices);
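Our sample gets away with zero for both values, but the offset and stride become nonzero as soon as several attributes share one data store. The following sketch assumes a hypothetical interleaved layout, with each position followed by its normal; InterleavedVertex and buffer_offset are illustrative names, not part of the sample program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical interleaved layout: each position is followed by its normal. */
typedef struct {
    float position[3];
    float normal[3];
} InterleavedVertex;

/* Converts a byte offset into the pointer-typed argument that the
   gl*Pointer commands take; with a buffer object bound, that argument
   is interpreted as an offset into the data store. */
const void *buffer_offset(size_t bytes)
{
    assert(bytes <= UINTPTR_MAX);
    return (const void *)(uintptr_t)bytes;
}
```

With this layout, the stride passed to both glVertexPointer and glNormalPointer would be sizeof(InterleavedVertex), the position offset would be buffer_offset(offsetof(InterleavedVertex, position)), and the normal offset would be buffer_offset(offsetof(InterleavedVertex, normal)).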
We've described how to create buffer objects and how to tell OpenGL to use them as the source for rendering geometry. But we're still missing one important piece of the puzzle: how to load data into the buffer object. A buffer object starts out its life with an empty data store, so we need to take care of that before doing anything useful with it. The next two sections describe your two choices for loading up your buffer object's data store.
The first option for loading your buffer object's data store is analogous to the one you use for loading texture image data into a texture object. You give glTexImage a pointer to your texel data, and OpenGL copies it into its internal texture storage. If you give it a null pointer, glTexImage still creates the texture with the size you want but leaves the texels uninitialized. If you want to respecify a portion of the texture, you can call glTexSubImage and tell it where and how much data to replace.
The procedure is the same with buffer objects. You can call glBufferData to establish the size of your data store and to supply a hint about how it will be accessed. Data is copied from the pointer you provide, unless the pointer is null, in which case the data remains uninitialized. glBufferSubData can be called to respecify a portion of the data store.
The glTexImage and glTexSubImage entrypoints accept a target parameter to indicate which texture target is being specified, such as GL_TEXTURE_2D or GL_TEXTURE_3D. Similarly, glBufferData, glBufferSubData, and other buffer object entrypoints also accept a target parameter. This target reflects which type of buffer object is being operated on, and can either be GL_ARRAY_BUFFER or GL_ELEMENT_ARRAY_BUFFER. The former is used to store vertex array data, including colors, normals, texture coordinates, and positions. The latter is used to store array indices as used by glDrawElements.
In our sample program, if we're not animating, we simply create the data store by calling glBufferData, providing a usage hint that the data will be static. All buffer sizes are measured in terms of “basic machine units,” or bytes:
glBufferData(GL_ARRAY_BUFFER, sizeof(GLfloat) * numSphereVertices * 3, sphereVertexArray, GL_STATIC_DRAW);
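To put a number on that size argument, here is a small sketch of the same multiplication. The helper name vertex_data_bytes is hypothetical, and plain float stands in for GLfloat on the assumption that both are 4 bytes wide, as they are on typical platforms.

```c
#include <assert.h>
#include <stddef.h>

/* Size in bytes of a tightly packed array of float vertex components. */
size_t vertex_data_bytes(size_t vertexCount, size_t componentsPerVertex)
{
    assert(componentsPerVertex > 0);
    return sizeof(float) * vertexCount * componentsPerVertex;
}
```

For our 30,000 three-component vertices, that comes to 360,000 bytes, which is the size of the data store shared by all 27 spheres.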
However, if we are animating, we don't want to incur the expense of re-creating the data store during every frame of animation, when in fact the size of the data remains the same. Instead, we'll create the data store once when entering animation mode, with a null pointer so no data is copied yet, and providing a usage hint that the data will be streaming (used only once). Then, for each frame, we update the data with glBufferSubData:
    // Establish streaming buffer object
    // Data will be loaded with subsequent calls to glBufferSubData
    glBufferData(GL_ARRAY_BUFFER,
                 sizeof(GLfloat) * numSphereVertices * 3,
                 NULL, GL_STREAM_DRAW);
    ...
    glBufferSubData(GL_ARRAY_BUFFER, 0,
                    sizeof(GLfloat) * numSphereVertices * 3,
                    sphereVertexArray);
The usage hints that we've been passing into glBufferData are really just that: hints. One OpenGL implementation may ignore them completely, whereas another implementation may be able to base crucial decisions on a hint that will greatly impact the performance of your buffer objects. If your data never changes, the driver may decide to put the data in local video memory. If you hint that your data is dynamic, the driver may place your data somewhere it is cheaper to constantly update it, such as AGP memory. Table 16.1 shows your three hint options.
Table 16.1. Buffer Object Usage Hints
Note that there are actually READ and COPY variants of these hints in addition to the DRAW variants, but they are just in place for future use by extensions that build on buffer object functionality. Drawing from buffer object data is currently the only usage model available.
This indirect copy method for loading buffer objects works well and follows similar paradigms already used by texture objects. However, after we set up our data in client memory, this method requires OpenGL to perform a copy into the buffer object memory. What if we could cut out the middleman and generate our geometry data right into the buffer object memory?
We can. This procedure is called mapping your buffer object. By calling glMapBuffer, you can get a pointer to the buffer object's data store mapped into the client's address space. This means you can write to it, read from it, whatever you want. The only string attached is that you can't use this memory as a source or destination parameter for other OpenGL entrypoints, such as glTexImage, glLightfv, glDrawPixels, and glReadPixels. Here we map the buffer object:
    glBindBuffer(GL_ARRAY_BUFFER, bufferID);

    // Avoid pipeline flush during glMapBuffer by
    // marking buffer object's data store as empty
    glBufferData(GL_ARRAY_BUFFER,
                 sizeof(GLfloat) * numSphereVertices * 3,
                 NULL,
                 animating ? GL_STREAM_DRAW : GL_STATIC_DRAW);

    sphereVertexArray = (GLfloat *)glMapBuffer(GL_ARRAY_BUFFER,
                                               GL_WRITE_ONLY);
Before calling glMapBuffer, we first call glBufferData with a null pointer. We do this even if we've already created the buffer object because it helps prevent a performance degradation during mapping. If there is any rendering still in the pipeline using old data in the buffer object, glMapBuffer has to wait for the pipeline to flush out before it can return the mapped buffer pointer to the application. (Otherwise, the application might alter data that still needs to be used by previous drawing commands, corrupting that rendering.) Emptying the buffer object's data store with the null pointer tells OpenGL that any data previously loaded into the buffer object is no longer needed. Mapping will then issue a new unused data store, avoiding the implicit pipeline synchronization that would otherwise occur.
When calling glMapBuffer, you must provide an access flag that tells OpenGL in which ways you'll be accessing the buffer object's data store while it's mapped. Table 16.2 describes the three access modes.
Table 16.2. Mapped Buffer Object Access Modes
You can't start drawing from the buffer object until it has been unmapped again with the glUnmapBuffer command. This command lets OpenGL know that you've made your changes to the data, and you're ready to hand control of the data store back to OpenGL. But things can go wrong during the time you have the buffer object mapped. Various system events, such as video mode changes or power-saving suspend/hibernation, can put your buffer object into a state where its integrity is unknown. The memory could have been temporarily reclaimed or powered off, leaving its contents corrupted or otherwise unknown. In this rare case, glUnmapBuffer returns GL_FALSE, meaning that the application is responsible for resupplying all the data. This is an unavoidable burden if you want your application to be robust. In our program, all we have to do is call back into the generation routine to try again:
    if (!glUnmapBuffer(GL_ARRAY_BUFFER))
    {
        // Some window system event has trashed our data...
        // Try, try again!
        RegenerateSphere();
    }
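The same try-again idea can also be written as a bounded loop rather than a recursive call. The sketch below is a swapped-in variation and not the sample program's approach: the function-pointer types stand in for glMapBuffer, glUnmapBuffer, and the generation routine, and the fake_ functions are hypothetical stubs (the first unmap is made to fail) so that the retry logic can run without an OpenGL context.

```c
#include <assert.h>
#include <stddef.h>

typedef float *(*MapFn)(void);   /* stands in for glMapBuffer   */
typedef int (*UnmapFn)(void);    /* stands in for glUnmapBuffer */
typedef void (*FillFn)(float *dst, size_t count);

/* Map, fill, and unmap until the unmap reports success or the attempts
   run out. Returns the number of attempts used, or 0 on failure. */
int fill_mapped_buffer(MapFn map, UnmapFn unmap, FillFn fill,
                       size_t floatCount, int maxAttempts)
{
    int attempt;

    assert(map && unmap && fill);
    for (attempt = 1; attempt <= maxAttempts; attempt++) {
        float *dst = map();
        if (dst == NULL)
            return 0;           /* mapping itself failed */
        fill(dst, floatCount);
        if (unmap())
            return attempt;     /* data store is intact */
        /* Contents were lost while mapped: map and fill again */
    }
    return 0;
}

/* Stubs for demonstration only */
static float fakeStore[6];
static int fakeUnmapCalls = 0;

float *fake_map(void) { return fakeStore; }

/* Reports failure on the first call and success afterward */
int fake_unmap(void) { return ++fakeUnmapCalls > 1; }

void fake_fill(float *dst, size_t count)
{
    size_t i;
    for (i = 0; i < count; i++)
        dst[i] = 1.0f;
}
```

Capping the number of attempts keeps a persistent failure from turning into an endless retry.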
Figure 16.4 shows the output from Listing 16.2. The output looks the same from both the traditional vertex array code and the buffer object code. But if you look at the frames per second output, that's where you'll see the difference. Roughly a 25% to 200% improvement or more may be observed in mapped buffer object mode, depending on your OpenGL implementation and the number of vertices you're throwing at it.
Example 16.2. Code for Regenerating and Loading Sphere Vertex Array Data
    // Called to regenerate points on the sphere
    void RegenerateSphere(void)
    {
        GLint i;

        if (mapBufferObject && useBufferObject)
        {
            // Delete old vertex array memory
            if (sphereVertexArray)
                free(sphereVertexArray);

            glBindBuffer(GL_ARRAY_BUFFER, bufferID);

            // Avoid pipeline flush during glMapBuffer by
            // marking buffer object's data store as empty
            glBufferData(GL_ARRAY_BUFFER,
                         sizeof(GLfloat) * numSphereVertices * 3,
                         NULL,
                         animating ? GL_STREAM_DRAW : GL_STATIC_DRAW);

            sphereVertexArray = (GLfloat *)glMapBuffer(GL_ARRAY_BUFFER,
                                                       GL_WRITE_ONLY);
        }
        else if (!sphereVertexArray)
        {
            // We need our old vertex array memory back
            sphereVertexArray = (GLfloat *)malloc(sizeof(GLfloat) *
                                                  numSphereVertices * 3);
            if (!sphereVertexArray)
            {
                fprintf(stderr, "Unable to allocate memory for vertex arrays!");
                Sleep(2000);
                exit(0);
            }
        }

        for (i = 0; i < numSphereVertices; i++)
        {
            GLfloat r1, r2, r3, scaleFactor;

            // pick a random vector
            r1 = (GLfloat)(rand() - (RAND_MAX/2));
            r2 = (GLfloat)(rand() - (RAND_MAX/2));
            r3 = (GLfloat)(rand() - (RAND_MAX/2));

            // determine normalizing scale factor
            scaleFactor = 1.0f / sqrt(r1*r1 + r2*r2 + r3*r3);

            sphereVertexArray[(i*3)+0] = r1 * scaleFactor;
            sphereVertexArray[(i*3)+1] = r2 * scaleFactor;
            sphereVertexArray[(i*3)+2] = r3 * scaleFactor;
        }

        if (mapBufferObject && useBufferObject)
        {
            if (!glUnmapBuffer(GL_ARRAY_BUFFER))
            {
                // Some window system event has trashed our data...
                // The mapped pointer is no longer valid, so clear it
                // to keep the retry from passing it to free()
                sphereVertexArray = NULL;

                // Try, try again!
                RegenerateSphere();
            }
            sphereVertexArray = NULL;
        }
    }

    // Switch between buffer objects and plain old vertex arrays
    void SetRenderingMethod(void)
    {
        if (useBufferObject)
        {
            glBindBuffer(GL_ARRAY_BUFFER, bufferID);

            // No stride, no offset
            glNormalPointer(GL_FLOAT, 0, 0);
            glVertexPointer(3, GL_FLOAT, 0, 0);

            if (!mapBufferObject)
            {
                if (animating)
                {
                    glBufferSubData(GL_ARRAY_BUFFER, 0,
                                    sizeof(GLfloat) * numSphereVertices * 3,
                                    sphereVertexArray);
                }
                else
                {
                    // If not animating, this gets called once
                    // to establish new static buffer object
                    glBufferData(GL_ARRAY_BUFFER,
                                 sizeof(GLfloat) * numSphereVertices * 3,
                                 sphereVertexArray, GL_STATIC_DRAW);
                }
            }
        }
        else
        {
            glBindBuffer(GL_ARRAY_BUFFER, 0);
            glNormalPointer(GL_FLOAT, 0, sphereVertexArray);
            glVertexPointer(3, GL_FLOAT, 0, sphereVertexArray);
        }
    }
The sample program did not touch upon a couple of things; thus, they have not been discussed yet. The first is state queries. As with all OpenGL state, the new state related to buffer objects can be queried. See the reference section for details, but here's a brief breakdown of the state-querying entrypoints introduced for buffer objects:
glGetBufferParameteriv: Query a buffer object's usage hint, mapped access flag, mapped status, and size.
glGetBufferPointerv: Query a buffer object's mapped address.
glGetBufferSubData: Query data from a buffer object's data store.
glIsBuffer: Query if a buffer object name corresponds to an existing buffer object.
Our sample also did not use array indices. The benefit of using glDrawElements is that it removes redundancy from a pool of vertices. If a particular vertex is used more than once (for example, on a corner shared by several triangles), the vertex need only be present and processed once in the vertex array, but can be referenced many times in the array indices. Our spheres were composed of thousands of individual points, each used only once per sphere. So there was no benefit to be gained from array indices. However, should you use them in your application, it's good to know that they can also be placed in buffer objects. You just need to use the GL_ELEMENT_ARRAY_BUFFER target instead of GL_ARRAY_BUFFER.
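To make the redundancy argument concrete, here is a minimal sketch of a square built from two triangles that share a diagonal. The array names and the helper are hypothetical, and the byte counts assume 4-byte floats and 2-byte unsigned shorts.

```c
#include <assert.h>
#include <stddef.h>

/* Four unique corners of a unit square in the z = 0 plane (x, y, z each). */
const float quadVertices[4 * 3] = {
    0.0f, 0.0f, 0.0f,   /* 0: bottom-left  */
    1.0f, 0.0f, 0.0f,   /* 1: bottom-right */
    1.0f, 1.0f, 0.0f,   /* 2: top-right    */
    0.0f, 1.0f, 0.0f    /* 3: top-left     */
};

/* Two triangles sharing the 0-2 diagonal; corners 0 and 2 appear twice. */
const unsigned short quadIndices[6] = { 0, 1, 2,   0, 2, 3 };

/* Bytes needed without indices, where every triangle corner is stored
   as a full three-component position. */
size_t unindexed_bytes(size_t cornerCount)
{
    assert(cornerCount % 3 == 0);   /* whole triangles only */
    return cornerCount * 3 * sizeof(float);
}
```

The indexed form needs 48 bytes of vertices plus 12 bytes of indices, against 72 bytes for six fully specified corners, and the gap widens as more triangles share each vertex. With the index array loaded into a buffer object bound to GL_ELEMENT_ARRAY_BUFFER, the draw call would be glDrawElements(GL_TRIANGLES, 6, GL_UNSIGNED_SHORT, 0), where the final argument is an offset into the index data store.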
OpenGL implementations have traditionally been crippled in terms of their ability to efficiently take geometry from an application and render it. This has not been due to any technical obstacle. The application had no way of telling the driver how large a data set would be used or how often it would be updated—crucial information for the driver to know where it should store the data. The problem has just been a lack of communication, cleared up by the introduction of buffer objects.
Buffer objects are created and deleted just like texture objects. In fact, there's a way to copy data into buffer objects that is also similar to texture objects. But buffer objects also provide a powerful mechanism for mapping buffer object memory to the application's address space so the data store can be updated directly without an additional copy required.
glIsBuffer | |
---|---|
Purpose: | |
Include File: | |
Syntax: | GLboolean glIsBuffer(GLuint buffer); |
Description: | This function queries whether the specified name is the name of a buffer object. |
Parameters: | |
Returns: | |
See Also: | |