GPUbench (under construction)

GPUbench is a 3D performance benchmark for very early OpenGL accelerators. It was created to get raw performance numbers for pixel fill-rate (how many pixels are drawn per second) and triangle-rate (how many triangles are drawn per second) under various conditions. We were able to benchmark many professional 3D accelerators for PC/Windows and also several UNIX workstations (SGI, Sun…), and the results are directly comparable. The oldest tested UNIX workstation is from 1992 (SGI Indigo) and the oldest tested PC 3D accelerator is from 1995 (Matrox Millennium MGA-2064W).

I will make a proper page explaining this benchmark later. At the moment, you can check the original page: http://swarm.cz/gpubench/ (the download links contain the whole project including the source code and result/spec files for every tested card)

All the results are consolidated here (the same result page is also included in the benchmark archive as a Microsoft Excel file). I know that this is not the most user-friendly way to browse results, but I am too lazy to make it better. Use the tick boxes and buttons to select only the results you want to see. A card may have multiple results if it was measured at multiple color depths or if multiple versions of the card were available. If several graphics cards are selected, they are sorted by release date.

This benchmark does not reduce its result to a single number. My goal is to provide data that helps to better understand how these old chips behave under different conditions. Pixel fill-rate tests are measured in Mpix/s (million pixels per second) and are designed so that they do not stress other parts of the hardware: the triangles are large and the whole scene is just ~10 draw calls. The triangles are placed over each other to lower the frame rate, so buffer flipping and other per-frame maintenance tasks do not affect the fill-rate much. The triangle-rate tests are designed in a similar way: the triangles are very small to ensure that the test is not limited by pixel fill-rate.
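To make the unit concrete, here is a minimal sketch of how a fill-rate figure in Mpix/s can be derived from a measured frame rate. The function name, its parameters, and the example numbers are illustrative assumptions and are not taken from the GPUbench source. The sketch also assumes that every stacked polygon layer covers the whole screen.

```python
def fill_rate_mpix(width, height, layers, fps):
    """Return the fill-rate in Mpix/s (million pixels per second).

    Assumes each of the `layers` overlapping polygons covers the whole
    width x height screen, so one frame draws width * height * layers pixels.
    """
    pixels_per_frame = width * height * layers
    return pixels_per_frame * fps / 1_000_000


# Hypothetical numbers: 640x480 screen, 10 full-screen layers, 20 frames/s.
print(fill_rate_mpix(640, 480, 10, 20))  # 61.44
```

Stacking more layers lowers the frame rate for the same fill-rate, so the fixed per-frame cost (such as the buffer flip) becomes a smaller share of the measured time.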

Each result line tests different effects on polygons (opaque/semi-transparent polygons, hi-res/low-res/no textures, enabled/disabled texture filtering…). These are the pixel fill-rate tests:

shaded only: Vertex-colored polygons
+blend: Vertex-colored polygons + added per-polygon alpha-blending
+Tx: Vertex-colored polygons + bilinear-filtered texture
+Tx(point): Vertex-colored polygons + point-sampled texture (no filtering)
+blend+Tx: Vertex-colored polygons + added per-polygon alpha-blending + bilinear-filtered texture
+blend+Tx(point): Vertex-colored polygons + added per-polygon alpha-blending + point-sampled texture (no filtering)
+LowResTx: Vertex-colored polygons + bilinear-filtered low-resolution texture
+LowResTx(point): Vertex-colored polygons + point-sampled low-resolution texture (no filtering)
multi-texture: Vertex-colored polygons + two bilinear-filtered textures drawn using multi-texturing OpenGL extensions (multi-pass is used instead if a card doesn’t support multi-texturing)
multi-pass: Vertex-colored polygons + two bilinear-filtered textures drawn using multi-pass rendering (each multitextured polygon is drawn as two single-textured polygons, one of them with blending)

To make it clear with an example: +blend+Tx means that the test scene contains multiple large semi-transparent polygons, each with a bilinear-filtered high-resolution texture. Such a test stresses the texture unit and requires the chip to read the existing value from the frame buffer, blend it with the currently processed pixel, and overwrite it with the result. You would see a big performance hit in this test on any card whose local memory is not fast enough.
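The read-blend-write sequence described above can be sketched per pixel as follows. This is only an illustration: it assumes the common source-alpha blend equation (result = source × alpha + destination × (1 − alpha)), which is what `glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA)` produces in OpenGL. The text does not state which blend function GPUbench actually uses, and the function name and pixel values are hypothetical.

```python
def blend_pixel(src, dst, alpha):
    """Blend one 8-bit RGB source pixel over a destination pixel.

    Assumed equation, per color channel:
        result = src * alpha + dst * (1 - alpha)
    """
    return tuple(round(s * alpha + d * (1.0 - alpha)) for s, d in zip(src, dst))


framebuffer_pixel = (0, 0, 200)   # value that must first be read from the frame buffer
incoming_pixel = (200, 100, 0)    # shaded, textured pixel currently being processed

# The blended result is what gets written back to the frame buffer.
print(blend_pixel(incoming_pixel, framebuffer_pixel, 0.5))  # (100, 50, 100)
```

Without blending, the incoming pixel is simply written. With blending, every pixel needs an extra frame-buffer read before the write, which is why slow local memory shows up so clearly in the +blend tests.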

The pixel fill-rate tests are run with and without the Z-buffer to show the Z-buffer performance hit, which is caused either by low memory bandwidth or by a chip architecture that needs more clock cycles for Z testing.
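When comparing the two runs, the hit can be expressed as a percentage drop. The helper below is a small sketch for reading the result table; its name and the sample figures are assumptions for illustration and are not measurements of any real card.

```python
def z_buffer_hit_percent(rate_without_z, rate_with_z):
    """Return the fill-rate drop, in percent, caused by enabling the Z-buffer."""
    if rate_without_z <= 0:
        raise ValueError("rate_without_z must be positive")
    return (1.0 - rate_with_z / rate_without_z) * 100.0


# Hypothetical fill-rates in Mpix/s for the same test, without and with Z-buffer.
print(z_buffer_hit_percent(40.0, 30.0))  # 25.0
```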

The triangle-rate results tell how fast a graphics card is in processing geometry:

Triangle Strips: Drawing small triangles formed in triangle-strips (1 vertex calculated per triangle)
Independent Triangles: Drawing small independent triangles (3 vertices calculated per triangle)
Hi-Res Textured I. Triangles: Drawing small independent textured triangles (3 vertices calculated per triangle)
One Triangle per Draw Call: Drawing one triangle per draw call (stresses command buffers)
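The "1 vertex per triangle" figure for strips is the long-strip limit: a strip of n triangles needs n + 2 vertices, because each triangle after the first reuses the two previous vertices. The sketch below, with illustrative function names that are not part of GPUbench, compares the vertex counts of the two submission styles.

```python
def vertices_for_strip(triangles):
    """A single strip of n triangles needs n + 2 vertices (0 if there are none)."""
    return triangles + 2 if triangles > 0 else 0


def vertices_for_independent(triangles):
    """Independent triangles share nothing, so each one needs 3 vertices."""
    return 3 * triangles


for n in (1, 10, 1000):
    strip = vertices_for_strip(n)
    independent = vertices_for_independent(n)
    # The last column approaches 1 vertex per triangle as the strip gets longer.
    print(n, strip, independent, strip / n)
```

For 1000 triangles, a single strip submits 1002 vertices while independent triangles submit 3000, so the independent-triangle tests put roughly three times more load on vertex processing for the same triangle count.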

Please note that if a graphics chip does not have a true geometry unit, its geometry performance depends heavily on the CPU, for better or worse. When combined with fast (and much newer) Pentium III/IV CPUs, certain cheap graphics accelerators can deliver better geometry performance than high-end cards with dedicated geometry units.
