NT4 OpenGL Mini-Client Driver Model, Matrox Millennium I/II and ATI Rage II/Pro

A long time ago, Microsoft was among the companies that wanted to make the OpenGL 3D API more popular and bring it to the world of PCs. This was before Direct3D, and the effort came from the Windows NT team. Microsoft wanted to make Windows (NT) a competitive operating system in the workstation segment – something that could compete with UNIX systems running on Sun, HP, IBM and SGI workstations. The MCD (Mini-Client Driver) was an extra step to make OpenGL acceleration easier for hardware vendors to implement.

OpenGL support and Windows NT 3.5x

With Windows NT (at least starting with version 3.5 in 1994), Microsoft finally had a sophisticated multitasking, multithreading and multiprocessing operating system that was robust and stable enough to be used in the technical workstation segment. Things got even better in 1995, when Intel released the Pentium Pro processor with out-of-order execution. The Pentium Pro was a big step for professional PCs, because it provided performance comparable with UNIX/RISC workstations. However, one piece was still missing – support for a universal, widely adopted 3D API.

Microsoft bet on SGI’s OpenGL (the open implementation of IRIS GL from SGI workstations), which was a lucky choice. Windows NT 3.5 was among the first operating systems (after SGI IRIX) to implement this 3D API. At first, there were no hardware 3D accelerators available, so Microsoft decided to create an OpenGL 1.0 software renderer and include it with the operating system for free (giving this away for free was not a common practice in the UNIX world). This allowed every Windows NT user to run programs written for OpenGL (albeit slowly) without any special hardware. It also made porting 3D programs from UNIX workstations to PCs easier.

Windows NT 3.5x allowed hardware vendors to create a hardware-accelerated OpenGL implementation for their graphics hardware using a driver model called ICD (Installable Client Driver). An ICD is a complete OpenGL driver (connected directly to a display driver) that replaces Microsoft’s software implementation, so a hardware vendor can accelerate the whole 3D pipeline in the way that suits the hardware best. On the other hand, the complexity and the months of work required to create such a driver were a barrier to bringing affordable OpenGL accelerators to PCs.

I found only one 1994 3D accelerator that listed OpenGL among other 3D APIs (mostly drivers for specific CAD packages like Autodesk AutoCAD) – the Matrox Impression Plus. Matrox claimed support for double-buffering, Z-Buffer, wireframe and Gouraud (smooth) shading (so no textures and no transparency effects). Such hardware supported fewer functions than Microsoft’s software renderer, but it was certainly faster, and CAD packages were fine with this set of features. I cannot take it for granted that the OpenGL driver existed, as I didn’t find anyone mentioning it and I don’t have the card to test it myself. Many vendors claimed support for multiple 3D APIs, but their claims never turned into a real product.

A 4-MB version of the first Matrox Millennium card

However, the cheaper, newer (1995) and more integrated version (with the same set of accelerated features) called Matrox Millennium MGA-2064W did have a hardware OpenGL driver (ICD) for Windows NT 3.5x. I plan to test it one day, because there are indications that it was very crude and that Matrox abandoned it in Windows NT 4.0 just a year later.

Windows NT 4.0 OpenGL Mini-Client Driver (August 1996)

Windows NT 4.0 got an improved version of the software renderer with full support for OpenGL 1.1 (including support for textures). This was not long before Microsoft changed its strategy and started pushing Direct3D instead. Sadly, the software renderer bundled with NT 4.0 is identical to what Microsoft uses in Windows operating systems to this day.

The Windows NT team knew that mass-market adoption of the OpenGL API in PCs was limited by the effort 3D accelerator vendors needed to put into creating the full ICD driver – a proper OpenGL ICD driver usually required almost a year of development. They wanted to overcome the chicken & egg problem by making the OpenGL implementation easier for the hardware vendors. The solution was called MCD (mini-client driver). It was expected that writing an MCD driver for a new 3D accelerator would require just three months.

Finding a low-cost 3D accelerator with support for OpenGL was tough even in the first quarter of 1997 (GLQuake was released somewhere between December 1996 and January 1997). As far as I know, the only low-cost 3D accelerator with support for alpha/textures and a working OpenGL driver was 3Dlabs Permedia (1). Others (S3, ATI, 3Dfx) just promised universal OpenGL drivers in the future, but nothing really happened for many months. The Mini-Client Driver model seemed to be the right way for vendors to deliver something good enough before a full ICD could be programmed.

MCD architecture

Unlike ICD, MCD does not require vendors to write a complete OpenGL driver. An MCD is always connected to Microsoft’s OpenGL software renderer. An OpenGL application calls the system-supplied MCD32 library, which provides the connection between user mode and kernel mode. It sends OpenGL commands to MCDSRV32, which is also system-supplied; MCDSRV32 handles all the complicated tasks such as object management and parameter validation, and provides the connection to the MCD driver supplied by the hardware vendor.

The system-supplied part of the MCD interface handles geometry and lighting and, by default, performs all rasterization functions in software. A hardware vendor routes the color and Z buffers into the video card’s memory (and implements the memory management functions) and decides which rasterizer functions should be replaced with hardware-accelerated versions. The hardware vendor is therefore isolated from OpenGL-related troubles and focuses only on the acceleration routines. If something is left unaccelerated or the accelerated code fails, the drawing operation is finished by the software renderer. Accelerated and unaccelerated operations can be combined in a single frame; this is decided for each primitive drawn (line, triangle) based on the requested parameters (transparency, texture) and environment variables (fog…).
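
The per-primitive decision described above can be sketched roughly as follows. This is only an illustration written in Python, not the real MCD interface: `RenderState`, `HardwareCaps` and `choose_path` are hypothetical names invented for this example, and an actual MCD driver makes this decision in C inside the display driver.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RenderState:
    """Hypothetical subset of the OpenGL state that applies to one primitive."""
    textured: bool = False
    bilinear: bool = False
    alpha_blend: bool = False
    fog: bool = False


@dataclass(frozen=True)
class HardwareCaps:
    """Hypothetical description of what a chip can rasterize on its own."""
    textures: bool = False
    bilinear: bool = False
    alpha_blend: bool = False
    fog: bool = False


def choose_path(state: RenderState, caps: HardwareCaps) -> str:
    """Return "hardware" only if every requested feature is accelerated."""
    checks = [
        (state.textured, caps.textures),
        (state.bilinear, caps.bilinear),
        (state.alpha_blend, caps.alpha_blend),
        (state.fog, caps.fog),
    ]
    for requested, supported in checks:
        if requested and not supported:
            # One missing feature sends the whole primitive to the CPU.
            return "software"
    return "hardware"


# A chip that only does smooth shading and Z-buffering (no textures, no blending).
shading_only = HardwareCaps()
frame = [
    RenderState(),                  # smooth-shaded triangle
    RenderState(textured=True),     # textured triangle
    RenderState(alpha_blend=True),  # semi-transparent triangle
]
paths = [choose_path(state, shading_only) for state in frame]
print(paths)  # ['hardware', 'software', 'software']
```

The point of the sketch is that the fallback is all-or-nothing per primitive, so a single frame can mix both paths.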

Almost invisible

If you are a developer of an OpenGL program, you can ask the system what is responsible for the OpenGL rendering. A 3D accelerator with a full ICD OpenGL implementation can respond with its name and features. It can look like this:

  • GL_VENDOR = 3Dlabs
  • GL_RENDERER = Oxygen 402
  • GL_VERSION = 1.1 ST
  • GL_EXTENSIONS = GL_WIN_swap_hint GL_KTX_buffer_region GL_EXT_bgra GL_EXT_DrawPixels

You can also ask whether the driver is of the ICD type. If an ICD is loaded, the PFD_GENERIC_FORMAT flag is set to zero. This tells you for sure that you are not using Microsoft’s software renderer. Among other things, you can see the OpenGL version supported by the driver and all available extensions (the concept of extensions lets OpenGL drivers expose additional functionality beyond the standardized features of a given OpenGL version).

The built-in software renderer reports the following (since Windows 95 / NT 4.0), and the PFD_GENERIC_FORMAT flag is set to one:

  • GL_VENDOR = Microsoft Corporation
  • GL_RENDERER = GDI Generic
  • GL_VERSION = 1.1.0
  • GL_EXTENSIONS = GL_WIN_swap_hint GL_EXT_bgra GL_EXT_paletted_texture

For many, seeing “Microsoft GDI Generic” in these values is solid proof that the system does not have any OpenGL acceleration, but that’s wrong. MCD drivers report the same values. Even the PFD_GENERIC_FORMAT flag is set to one. The only difference is in the PFD_GENERIC_ACCELERATED flag (which is relevant only for the GDI Generic renderer). This flag is set to zero for the software renderer and to one for MCD.
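
The flag combinations above can be summarized in a few lines of code. The following is a minimal sketch in Python rather than a real diagnostic tool: the two constants use the values from the Windows SDK header wingdi.h, while `classify_pixel_format` is a hypothetical helper name. In a real program, the flags would come from the `dwFlags` field of the `PIXELFORMATDESCRIPTOR` filled in by `DescribePixelFormat`.

```python
# Flag values as defined in the Windows SDK header wingdi.h.
PFD_DOUBLEBUFFER = 0x00000001
PFD_DRAW_TO_WINDOW = 0x00000004
PFD_SUPPORT_OPENGL = 0x00000020
PFD_GENERIC_FORMAT = 0x00000040
PFD_GENERIC_ACCELERATED = 0x00001000


def classify_pixel_format(dw_flags: int) -> str:
    """Map the two 'generic' flags of a pixel format to a driver type."""
    generic = bool(dw_flags & PFD_GENERIC_FORMAT)
    accelerated = bool(dw_flags & PFD_GENERIC_ACCELERATED)
    if not generic:
        return "ICD"       # vendor-supplied full OpenGL driver
    if accelerated:
        return "MCD"       # GDI Generic front end, hardware rasterization
    return "software"      # plain Microsoft software renderer


common = PFD_DRAW_TO_WINDOW | PFD_SUPPORT_OPENGL | PFD_DOUBLEBUFFER
print(classify_pixel_format(common))
print(classify_pixel_format(common | PFD_GENERIC_FORMAT))
print(classify_pixel_format(common | PFD_GENERIC_FORMAT | PFD_GENERIC_ACCELERATED))
```

The three calls print ICD, software and MCD respectively. A diagnostic tool that only shows GL_VENDOR and GL_RENDERER cannot tell the last two cases apart.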

Because only one flag differs, and because that flag is typically not reported by diagnostic tools, people are misled into thinking that 3D acceleration does not work on cards with MCD drivers. This is a bigger issue these days, when vintage hardware collectors test old graphics cards in powerful machines (Pentium 4, Athlon XP), where the software renderer is not much slower than early 3D accelerators.

Limitations

To my surprise, the MCD driver model has very little overhead (geometry, rasterizing). On a Pentium II machine, you will not see any performance difference between MCD and ICD if both are properly done (assuming a consumer 3D accelerator from around 1998). However, MCD is not a good option if a 3D accelerator has a geometry unit (this was relevant only for high-end workstation cards), because the MCD system always does all geometry transformations and lighting calculations in software (and an MCD driver cannot change this).

An MCD driver cannot remove features from, or add features to, the set that is exposed to OpenGL applications. If your card does not support certain types of transparency effects or textures, the hardware vendor cannot disable these features (for example, to keep programs from slowing down, saving CPU power at the expense of worse output quality), and OpenGL applications cannot see which features are hardware accelerated. Also, if your card supports multitexturing, you cannot use it through an MCD driver, because the software renderer does not support it.

These are the reasons why MCD could be considered only a stopgap in the early stages of OpenGL adoption on PCs. On the other hand, the available features were enough for many consumer 3D accelerators released even in 1998.

Matrox Millennium I/II, Mystique, Mystique 220, G100 and G200

Matrox Millennium I (MGA-2064W) was the first card to support OpenGL through MCD. Matrox cooperated with Microsoft on the driver and its source code was included in the Windows NT 4.0 DDK (Driver Development Kit) so other vendors could better understand how to program the MCD support for their hardware.

The MGA-2064W was a great 2D graphics card. It was fast in both DOS and Windows, supported all major 2D acceleration features and had a superb RAMDAC that allowed the card to provide a sharp image in high resolutions with ergonomic refresh rates. A long time ago, I also used this card to drive my 21” sync-on-green SGI CRT, once I had tweaked the RAMDAC behavior. Most people used this card only for 2D.

Its 3D support fits more into the first half of the 1990s. There is no hardware support for alpha-blending or textures, so the card is clearly useless for games even though it has Direct3D support under Windows 9x. The early releases of Moto Racer should work, because the game can run without textures and the background and UI are handled using BitBlt outside the 3D pipeline. You can also run certain games that don’t care about missing features – Turok and Re-Volt will run without textures, with just smooth-shaded polygons (which affects the gameplay). The rest either report an error saying that you don’t have a 3D accelerator or crash on one of the missing features. I assume that the Direct3D driver was added just for VRML or similar usage.

Having some support for OpenGL made way more sense. Many professional programs, including LightWave 3D, 3D Studio Max R2/R3 and multiple CAD packages, were designed to run on 3D accelerators that cannot work with textures. Given the decision to use MCD, this card supported OpenGL only on Windows NT 4.0 and not on Windows 9x (where MCD was not supported by the OS architecture). The driver was ready in time, and OpenGL MCD support for the card was present even in the driver bundled with Windows NT 4.0. This is the only graphics card that supports OpenGL on Windows NT 4.0 out of the box with built-in drivers.

The card improves speed by using hardware Z testing and double-buffering. Only lines and triangles without transparency or textures are processed directly by the graphics chip. Unlike with the Direct3D driver, if a program draws a triangle with a texture or transparency, the effects are drawn properly, but the rasterization of that particular drawing primitive is done by the CPU (even though the result is written in the card’s own back-buffer memory).

If you try to run GLQuake on this card, the speed will not be better than with Microsoft’s software renderer (thus running in seconds per frame rather than frames per second). On the other hand, a 3D/CAD program drawing a combination of wireframe lines and Z-buffered smooth-shaded polygons with optional stipple-“alpha” can draw pixels ten times faster with this card (compared to 3Mpix/s on a 150-MHz Pentium machine without any 3D acceleration) thanks to its relatively high pixel fill-rate of ~30Mpix/s (smooth-shading + Z-Buffer).

Better versions of the card have 4MB of WRAM memory and can be expanded using a special (now rare) module to 8MB, which allows 3D acceleration in higher resolutions (up to 1280×1024 in 16-bit colors or 800×600 in 32-bit colors in case of the 8MB version).
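
A rough back-of-the-envelope check shows where such resolution limits come from. The sketch below is only an estimate under stated assumptions: it assumes that 3D rendering needs a front buffer, a back buffer and a Z-buffer, all at the same bit depth and all in card memory, and it ignores any other memory use. The actual buffer layout of the Matrox driver is not documented here, and the function names are hypothetical.

```python
MIB = 1024 * 1024


def buffers_bytes(width: int, height: int, bits_per_pixel: int, buffers: int = 3) -> int:
    """Bytes needed for `buffers` full-screen buffers (assumed: front, back, Z)."""
    return width * height * (bits_per_pixel // 8) * buffers


def fits(width: int, height: int, bits_per_pixel: int, card_mib: int) -> bool:
    """True if the assumed three buffers fit into the card's memory."""
    return buffers_bytes(width, height, bits_per_pixel) <= card_mib * MIB


for mode in [(1280, 1024, 16), (800, 600, 32), (1024, 768, 32)]:
    print(mode, buffers_bytes(*mode), fits(*mode, card_mib=8))
```

Under these assumptions, 1280×1024 in 16-bit colors needs 7,864,320 bytes and 800×600 in 32-bit colors needs 5,760,000 bytes, so both fit into 8MB, while 1024×768 in 32-bit colors would need 9,437,184 bytes and does not.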

Matrox struggled to respond to the quick changes in the 3D market and to the increased competition. They released the new Matrox Millennium II (MGA-2164WA) in mid-1997. The card is faster at drawing pixels without a Z-Buffer (which can be beneficial in games that don’t insist on the Z-Buffer) and adds support for textures. On the other hand, textures are supported only as point-sampled and there is still no alpha-blending support. The Direct3D driver for this card automatically removes bilinear filtering and replaces alpha-blending with ugly stipple patterns, resulting in horrible effects in many games.

A town and a character partially covered with haze reveal the ugly stipple alpha used by the Matrox Millennium II board. The picture on the right is rendered by the S3 ViRGE GX2 which supports true alpha-blending and bilinear texture filtering.

The OpenGL MCD driver was improved to handle textured polygons in hardware, but that happens only when bilinear filtering is not requested. As a result, you will see a big difference in performance once texture filtering is manually disabled in an OpenGL program. Given that the Matrox Millennium II was released at about the same time as the cheaper (but much better) 3Dlabs Permedia2, it was clearly not a good choice for anyone looking for a cheap OpenGL 3D accelerator.

The Matrox OpenGL MCD driver later also included support for the Mystique and the low-end Productiva G100 cards. The Productiva G100 is a chip from mid-1998 and its 3D core is similar to that in the Millennium II (although faster).

The last Matrox card that supports OpenGL using MCD is the Millennium G200. It has a completely redesigned 3D core with all the features expected in 1998. Matrox struggled to deliver a complete ICD driver until the end of 1999, so NT 4.0 users had to make do with the MCD in the meantime and Windows 9x users could only use Direct3D-to-OpenGL wrappers.

Matrox was one of the companies that could have been among the leaders in 3D thanks to a good start in the early days, but they quickly lost their position and let others steal their share of the “general-purpose” 3D market.

ATI 3D Rage II / II+DVD / Pro

3D Rage II (1996) was released less than half a year after the first 3D Rage and fixed all of its major issues. As a result, the new card combined a well-supported 2D core and advanced video playback acceleration with finally usable low-end 3D acceleration in a cheap consumer product. I would say that this was a direct competitor to the higher-clocked S3 ViRGE DX/GX. It has similar 3D performance and worse visual quality (especially texture filtering, crude mipmapping and no true-color rendering), but it has a small texture cache and produces fewer visual glitches in later 3D games.

Unlike S3 (with ViRGE), ATI created an OpenGL (MCD) driver for the 3D Rage II. This happened somewhere around the summer of 1997, when there was still very little competition among OpenGL consumer cards (but both the NVIDIA Riva128 and the 3Dlabs Permedia2 with ICD drivers had already been announced).

The MCD driver again requires Windows NT 4.0 and cannot run on Windows 9x. If you bought a Rage II (or II+/II+DVD) card in a store, you would not find this OpenGL MCD driver on the bundled driver CD. The bundled driver supported only 2D acceleration under Windows NT 4.0. The driver that included OpenGL MCD was available only online on ATI’s website (and a copy was hard to find when I looked for it a few years ago).

Rage II accelerates most of the features that are possible using the MCD driver model including most common alpha-blending operations and bilinear-filtered textures. Thus, the chance that the driver will fall back into the software mode due to a missing feature is way lower than with cards from Matrox. These are the features that ATI mentioned as supported by hardware in their FAQ:

Q6. What graphics functions are supported with the ATI OpenGL driver for Windows NT?

A6. Stencil test, Polygon and line stippling, Anti-aliasing points/lines/polygons, Polygon offset, Alpha test, Alpha blending, Texture clamp mode, Gouraud Shading, Perspective correct texture mapping, Z-buffering, Fog, Mip mapping, bilinear filtering, and modulation.

Most OpenGL functions in the OpenGL 1.1 spec are supported. However, certain features such as Stencil test, Polygon and line stippling, Anti-aliasing points/lines/polygons, Polygon offset, nicest fog, are currently not accelerated by our hardware.

For the 3D RAGE II family, alpha test, trilinear, alpha blending which uses destination alpha are also not accelerated.

If you are unsure whether the triangles are rendered by your Rage II chip in OpenGL, just focus on the texture perspective correction. Rage II cuts corners here instead of doing it properly, so textures often look wavy.

The texture filtering and dithering quality is surprisingly better than with Direct3D under Windows 9x. Many Direct3D games skip using dithering with this chip, which results in extremely low visual quality. This doesn’t happen under OpenGL:

ATI Rage II and ugly texture filtering in Direct3D

The OpenGL (MCD) performance of Rage II is slightly below the best ViRGE DX/GX boards in terms of pixel fill-rate. The chip’s texture cache is not so beneficial under OpenGL because of higher color depth used for textures (at least 16-bit instead of 4/8-bit in early Direct3D games), but the pixel fill-rate performance is acceptable for OpenGL programs of the era.

A bigger issue is the driver overhead, which limits the number of draw calls per second. This is not the same thing as geometry performance. A program can draw many triangles in a single draw call, but if you want to move them independently of each other, you need to put them into separate draw calls. MCD usually suffers from higher driver overhead (even in comparison with pure software rendering), but this is noticeable only in very dynamic 3D scenes. The Rage II driver, however, is worse than other MCD implementations – just 26 thousand draw calls per second on a fast Pentium III machine (compared to 139 thousand with Brian Paul’s ICD driver for ViRGE VX). This is noticeable in games like GLQuake, and it may be an issue of the hardware itself rather than of the MCD implementation.
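
To put these figures into perspective, a draw-call rate can be converted into a per-frame budget. The following is a simple arithmetic sketch in Python; the target of 30 frames per second is an assumption chosen for illustration, the ViRGE VX figure is read as 139 thousand draw calls per second, and `draw_calls_per_frame` is a hypothetical helper name.

```python
def draw_calls_per_frame(calls_per_second: int, fps: int) -> int:
    """Upper bound on draw calls per frame if the driver overhead were the only cost."""
    return calls_per_second // fps


TARGET_FPS = 30  # assumed target frame rate, not a figure from the measurements

rage2_mcd = draw_calls_per_frame(26_000, TARGET_FPS)
virge_icd = draw_calls_per_frame(139_000, TARGET_FPS)
print(rage2_mcd, virge_icd)  # 866 4633
```

With these numbers, a scene made of about a thousand independently moving objects would already exceed the Rage II budget at 30 frames per second before a single pixel is drawn.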

Anyway, a 4-MB Rage II could be a good low-cost speed-up for 3D modeling tools under Windows NT 4.0 at resolutions up to 1024×768 (if the OpenGL viewports do not cover the whole screen). The performance for simple smooth-shaded polygons with a Z-Buffer is 25-30% lower than with the first Matrox Millennium. On the other hand, once you start drawing semi-transparent objects, Rage II can be 30x faster than the Matrox or Microsoft’s software renderer.

The timing of the OpenGL MCD driver was presumably related to the release of the ATI 3D Rage Pro, the third generation of ATI’s 3D accelerators and the first with performance comparable to 3Dfx Voodoo Graphics. The Rage Pro is not much better in terms of visual effects – it doesn’t support certain blending types, mipmapping is broken and 3D rendering is supported only in 16-bit colors (the texture perspective correction is fixed, though). On the other hand, the core is equipped with a full triangle-setup engine and is very fast at rendering textured triangles. The performance increase over Rage II is extreme.

Unlike Rage II, Rage Pro has had an OpenGL MCD driver (for Windows NT 4.0) since day one, and you could find it on the driver CDs bundled with the cards. Within a year of the card’s release, ATI delivered a full ICD OpenGL driver for Rage Pro (but not for Rage II-series chips). The full ICD and MCD Rage Pro drivers deliver almost equal rasterizer and geometry performance, but the ICD peaks at ~90 thousand draw calls per second (instead of 31 thousand with MCD). The ICD also allowed ATI to support OpenGL on Windows 9x operating systems, so gamers could finally experience native OpenGL support in games based on Quake 3D engines.

Intel i740

I don’t know much about this card except that it was much better than what people say these days. The first version of its OpenGL driver under Windows NT 4.0 was apparently done using the MCD architecture. John Carmack mentioned it in his .plan file in 1998:

“Their current MCD OpenGL on NT runs quake 2 pretty well. I have seen their ICD driver on ’95 running quake 2, and it seems to be progressing well. The chip has the potential to outperform voodoo 1 across the board, but 3DFX has more highly tuned drivers right now, giving it a performance edge. I expect intel will get the performance up before releasing the ICD. It is worth mentioning that of all the drivers we have tested, intel’s MCD was the only driver that did absolutely everything flawlessly. I hope that their ICD has a similar level of quality (it’s a MUCH bigger job). An 8MB i740 will be a very good setup for 3D development work.”

S3 ViRGE and Windows 2000

As I already explained on the S3 ViRGE page, there was in fact OpenGL MCD support for these cards in Windows 2000, because Microsoft used ViRGE as a template driver in the Windows 2000 DDK (Driver Development Kit). The driver was included in the retail version of this operating system. It supported all features of the chip (textures, transparency effects), but was not well optimized for performance. Anyway, nobody cared about OpenGL on ViRGE-series cards in 2000, so users did not even notice this driver.

The end of the MCD

With the increasing speed of 3D accelerators, MCD became less and less usable. New hardware added features such as multitexturing and hardware geometry units that could not be used by MCD drivers, so eventually all vendors switched to the ICD driver model, which is still used to this day.
