UPDATE: There has been a lot of discussion in the comments of the differences between the page flipping method we are discussing in this article and implementations of a render ahead queue. In render ahead, frames cannot be dropped. This means that when the queue is full, what is displayed can have a lot more lag. Microsoft doesn't implement triple buffering in DirectX, they implement render ahead (from 0 to 8 frames with 3 being the default).
The major difference in the technique we've described here is the ability to drop frames when they are outdated. Render ahead forces older frames to be displayed. Queues can help smoothness and stuttering as a few really quick frames followed by a slow frame end up being evened out and spread over more frames. But the price you pay is in lag (the more frames in the queue, the longer it takes to empty the queue and the older the frames are that are displayed).
In order to maintain smoothness and reduce lag, it is possible to hold on to a limited number of frames in case they are needed but to drop them if they are not (if they get too old). This requires a little more intelligent management of already rendered frames and goes a bit beyond the scope of this article.
Some game developers implement a short render ahead queue and call it triple buffering (because it uses three total buffers). They certainly cannot be faulted for this, as there has been a lot of confusion on the subject and under certain circumstances this setup will perform the same as triple buffering as we have described it (but definitely not when framerate is higher than refresh rate).
Both techniques allow the graphics card to continue doing work while waiting for a vertical refresh when one frame is already completed. When using double buffering (and no render queue), while vertical sync is enabled, after one frame is completed nothing else can be rendered out which can cause stalling and degrade actual performance.
When vsync is not enabled, nothing more than double buffering is needed for performance, but a render queue can still be used to smooth framerate if it requires a few old frames to be kept around. This can keep instantaneous framerate from dipping in some cases, but will (even with double buffering and vsync disabled) add lag and input latency. Even without vsync, render ahead is required for multiGPU systems to work efficiently.
So, this article is as much for gamers as it is for developers. If you are implementing render ahead (aka a flip queue), please don't call it "triple buffering," as that should be reserved for the technique we've described here in order to cut down on the confusion. There are games out there that list triple buffering as an option when the technique used is actually a short render queue. We do realize that this can cause confusion, and we very much hope that this article and discussion help to alleviate this problem.