Deferred Shading Shines. Deferred Lighting? Not So Much.

As I indicate in the subtitle of this blog, there is no single way to develop games.  The techniques used in game development are as many and as varied as the games themselves–what’s best for one game is not necessarily best for another.  The phrase YMMV (your mileage may vary) is pretty much a staple of game technology discussions.  On the other hand, few teams have the money or stamina to try every technology on every game, so I hope people won’t hold it against me when I choose to take sides.

I’ve noticed an increase recently in game developers promoting a technique called deferred lighting.  Unfortunately this technique is old enough that not everyone remembers it by that name.  Wolfgang Engel reintroduced it in ShaderX7 under the name light pre-pass rendering, and for many that name seems to be sticking.  The most recent advocate of deferred lighting is Crytek.  Martin Mittring divulged in a presentation at the Triangle Games Conference that Crytek will be utilizing deferred lighting in version 3 of CryENGINE.

Now I get to tell you why that’s a bad idea.

Deferred lighting is similar to a better-known technique called deferred shading.  In deferred shading all the attributes necessary to completely shade a 3-D scene are rendered into off-screen textures called a G-Buffer.  The G-Buffer for a scene contains, per pixel, things like the surface normal, material albedos, and Phong specular exponent.  Shading can then be done in screen-space per light by reading back the necessary data from the G-Buffer.  This has the distinct advantage of decoupling the geometry processing in the scene from the lighting and shading calculations.  It is generally assumed that one can construct the G-Buffer in a single pass over the scene’s geometry and that one can constrain the light rendering in such a way that no more pixels are processed for a given light than are actually affected by the light.  From an algorithmic complexity standpoint this sounds great.  Meshes are rendered only once and no extraneous lighting or shadowing calculations are performed.  There is a drawback, however.  The G-Buffer can be quite heavyweight, containing all those shading attributes, and consequently deferred shading consumes a lot of memory and bandwidth in constructing and reading back the G-Buffer.  Deferred lighting attempts to address that problem.

In deferred lighting only the lighting, not the shading, computations are deferred.  In the initial pass over the scene geometry only the attributes necessary to compute per-pixel lighting (irradiance) are written to the G-Buffer.  The screen-space, “deferred” pass then outputs only diffuse and specular lighting data, so a second pass must be made over the scene to read back the lighting data and output the final per-pixel shading (radiant exitance).  The apparent advantage of deferred lighting is a dramatic reduction in the size of the G-Buffer.  The obvious cost, of course, is the need to render the scene meshes twice instead of once.  An additional cost is that the deferred pass in deferred lighting must output diffuse and specular irradiance separately, whereas the deferred pass in deferred shading need only output a single combined radiance value.

Five years ago, when I was designing the renderer for the Despair Engine, I thought deferred lighting was the ideal choice.  Details on the Playstation 3 were sketchy at that time, but we already knew that render target memory on the Xbox 360 would be severely limited.  The G-Buffer for a deferred shading system wouldn’t fit in EDRAM and consequently it would have to be rendered in two tiles.  With deferred shading on the Xbox 360 requiring two passes over the scene meshes, the primary disadvantage of deferred lighting appeared nullified.

Despair Engine utilized deferred lighting for over two years, and we were generally very happy with the results.  It was implemented initially on the Xbox 360 and PC, but when the Playstation 3 was released it was extended to that platform as well.  Unfortunately our initial implementation on the Playstation 3 yielded significantly worse performance than we were seeing on the Xbox 360.  We had multiple projects well into development at that point, however, so scaling back our expectations on the content side wasn’t a viable option.  Instead the performance deficit on the Playstation 3 motivated our very talented PS3 programmer, Chris McCue, to look for alternate solutions.  From extensive profiling he identified two bottlenecks unique to the Playstation 3.  First, the PS3 struggled far more with vertex processing costs and consequently both the attributes and shading stages of deferred lighting were more frequently vertex bound on the PS3 than on the other platforms.  Second, the PS3 was sometimes ROP bound during the deferred lighting pass itself, a problem that is all but impossible on the Xbox 360 due to the massive bandwidth to EDRAM.

Based on this data, Chris proposed to switch to classical deferred shading on the Playstation 3.  Deferred shading would reduce the number of geometry passes from two to one and reduce the output bandwidth during the deferred pass.  I agreed, and sure enough the move to deferred shading was a success.  It helped narrow the gap between the Playstation 3 and the Xbox 360 to the point where we could ship the same content on both platforms and provide nearly identical play experiences on each.

The move to deferred shading on the PS3 prompted me to take a closer look at my decision to use deferred lighting on the other platforms.  If deferred shading was a win on the PS3, it seemed likely to have some advantages on the PC and maybe even the Xbox 360.  Although I’ve never been a proponent of settling for the least-common-denominator in cross-platform development, if we could move all platforms to the same deferred process without sacrificing performance, I knew it would save us some headaches in maintaining platform compatibility later on.

I implemented deferred shading on the Xbox 360 and PC a few months later and profiled the results.  On the Xbox 360, much to my surprise, deferred shading performed within a few percent of deferred lighting.  I could literally toggle back and forth between the two techniques and barely notice the difference in GPU utilization.  Deferred lighting was a few percent faster in that initial implementation, but considering that we’d been optimizing the deferred lighting pipeline for years, I wasn’t about to quibble over less than a millisecond of GPU time.  Doing head-to-head comparisons on the PC is a little more difficult because of the wide range of PC graphics hardware, but on the high-end DX9 cards and the low-end DX10 cards that I had access to at the time, the difference in rendering performance between the two techniques on the PC was similarly small.  More importantly, on the PC we suffered far more from CPU-side batch overhead and deferred shading handily cut that cost in half.

Having lived with deferred shading for a couple years now, I’ve come to appreciate the many ways in which it is superior to deferred lighting.  Although deferred lighting sounds great in theory, it can’t quite deliver in practice.  It does, in my experience, offer marginal GPU performance advantages on some hardware, but it does so at the expense of a lot of CPU performance and some noteworthy feature flexibility.  To understand this, consider the implementation of a traditional Phong lighting pipeline under deferred shading and deferred lighting.

Deferred shading consists of two stages, the “attributes stage” and the “deferred stage.”

  • The attributes stage:
    • Reads material color textures
    • Reads material normal maps
    • Writes depth to a D24S8 target
    • Writes surface normal and specular exponent to an A8R8G8B8 target
    • Writes diffuse albedo to an X8R8G8B8 target
    • Writes specular albedo to an X8R8G8B8 target
    • Writes emissive to an X8R8G8B8 target
  • The deferred stage:
    • Reads depth, surface normal, specular exponent, diffuse albedo, and specular albedo
    • Blends exit radiance additively into an X16R16G16B16 target

Deferred lighting, on the other hand, consists of three stages: the “attributes stage,” the “deferred stage,” and the “shading stage.”

  • The attributes stage:
    • Reads material normal maps
    • Writes depth to a D24S8 target
    • Writes surface normal and specular exponent to an A8R8G8B8 target
  • The deferred stage:
    • Reads depth, surface normal, and specular exponent
    • Blends specular irradiance additively into an X16R16G16B16 target
    • Blends diffuse irradiance additively into an X16R16G16B16 target
  • The shading stage:
    • Reads material color textures
    • Reads diffuse and specular irradiance
    • Writes exit radiance into an X16R16G16B16 target
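For a simple Phong-style model the two pipelines compute the same exit radiance; they differ only in where the albedo multiply happens. The toy below is a minimal single-pixel sketch of that idea, not code from any engine. The class and function names are hypothetical, scalars stand in for RGB values, and the emissive term is left out to keep it short.

```python
from dataclasses import dataclass


@dataclass
class Surface:
    """Per-pixel material inputs (scalar stand-ins for RGB albedos)."""
    diffuse_albedo: float
    specular_albedo: float


@dataclass
class LightSample:
    """Irradiance that one light contributes at this pixel (made-up values)."""
    diffuse_irradiance: float
    specular_irradiance: float


def deferred_shading(surface, lights):
    # Deferred stage: the albedos come from the G-Buffer for every light,
    # and finished radiance is blended additively into a single target.
    radiance = 0.0
    for light in lights:
        radiance += (surface.diffuse_albedo * light.diffuse_irradiance
                     + surface.specular_albedo * light.specular_irradiance)
    return radiance


def deferred_lighting(surface, lights):
    # Deferred stage: only irradiance is accumulated, into two targets.
    diffuse_sum = 0.0
    specular_sum = 0.0
    for light in lights:
        diffuse_sum += light.diffuse_irradiance
        specular_sum += light.specular_irradiance
    # Shading stage: a second geometry pass applies the albedos once.
    return (surface.diffuse_albedo * diffuse_sum
            + surface.specular_albedo * specular_sum)


if __name__ == "__main__":
    surface = Surface(diffuse_albedo=0.5, specular_albedo=0.25)
    lights = [LightSample(1.0, 2.0), LightSample(0.5, 4.0)]
    print(deferred_shading(surface, lights), deferred_lighting(surface, lights))
```

Both functions return the same value because the albedo multiply distributes over the sum of lights. The rest of the comparison is therefore about memory, bandwidth, and pass count rather than image quality.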

First let’s consider the memory requirements of the two techniques.  Deferred shading uses a G-Buffer that is 20 bytes per pixel and a radiance target that is 8 bytes per pixel for a total of 28 bytes per pixel.  Deferred lighting requires only 8 bytes per pixel for the G-Buffer and 8 bytes per pixel for the radiance target, but it also requires 16 bytes per pixel for two irradiance targets, for a total of 32 bytes per pixel.  So in this configuration deferred lighting actually requires 4 bytes more memory per pixel.  I am assuming that both approaches are using appropriate bit-depth targets for high dynamic range rendering with tone reproduction handled as a post-processing step.  If you assume LDR rendering instead, I would argue that deferred lighting still requires deeper than 8-bit targets for irradiance, because the range of values for irradiance in a scene is typically far greater than the range of values for exit radiance.  In any case, there are a few variations on the layout described above and a number of options for overlapping or reusing targets on the more flexible console architectures that reduce the per-pixel costs of each technique to an equivalent 20-24 bytes per pixel.
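The per-pixel totals can be checked by summing the render-target formats from the two stage lists. This is only a bookkeeping sketch: the dictionary and function names are my own, and it assumes no overlapping or reuse of targets.

```python
# Bytes per pixel for each render-target format named in the stage lists.
BYTES_PER_PIXEL = {
    "D24S8": 4,
    "A8R8G8B8": 4,
    "X8R8G8B8": 4,
    "X16R16G16B16": 8,
}

# Targets written by each technique, taken from the lists above.
DEFERRED_SHADING_TARGETS = {
    "depth": "D24S8",
    "normal + specular exponent": "A8R8G8B8",
    "diffuse albedo": "X8R8G8B8",
    "specular albedo": "X8R8G8B8",
    "emissive": "X8R8G8B8",
    "exit radiance": "X16R16G16B16",
}

DEFERRED_LIGHTING_TARGETS = {
    "depth": "D24S8",
    "normal + specular exponent": "A8R8G8B8",
    "specular irradiance": "X16R16G16B16",
    "diffuse irradiance": "X16R16G16B16",
    "exit radiance": "X16R16G16B16",
}


def total_bytes_per_pixel(targets):
    """Sum the footprint of every target, assuming none share memory."""
    return sum(BYTES_PER_PIXEL[fmt] for fmt in targets.values())


if __name__ == "__main__":
    shading = total_bytes_per_pixel(DEFERRED_SHADING_TARGETS)
    lighting = total_bytes_per_pixel(DEFERRED_LIGHTING_TARGETS)
    print("deferred shading: ", shading, "bytes per pixel")
    print("deferred lighting:", lighting, "bytes per pixel")
    print("difference:       ", lighting - shading, "bytes per pixel")
```

Run as a script, this reports the 28-byte and 32-byte totals for the two layouts.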

Now let’s take a look at bandwidth usage.  The bandwidth required for “material color textures” and “material normal maps” is content dependent, but it is also exactly the same between the two techniques so I can conveniently factor it out of my calculations.  Looking at the layout described above, bandwidth consumed during the attributes and shading stages is measured per pixel and bandwidth consumed during the deferred stages is measured per lit pixel.  Adding everything up except the material color textures and normal maps, we see deferred shading writes 20 bytes per pixel plus an additional 8 bytes per lit pixel and reads 24 bytes per lit pixel.  Deferred lighting, however, writes 16 bytes per pixel plus an additional 16 bytes per lit pixel and reads 16 bytes per pixel plus an additional 24 bytes per lit pixel.  What this means is that if the average number of lights affecting a pixel is greater than 0.5, deferred lighting consumes more write bandwidth than deferred shading.  Furthermore, no matter how many lights affect each pixel, deferred shading consumes 16 fewer bytes of read bandwidth per pixel.
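The bandwidth figures above can be written as simple functions of the average number of lights per pixel. This is a back-of-the-envelope sketch; the function names are mine. The breakdown in the comments is my reading of how each total arises, and it assumes additive blending reads the 8-byte destination target once per light.

```python
def deferred_shading_bandwidth(lights_per_pixel):
    """Return (write_bytes, read_bytes) per pixel, excluding material textures."""
    n = lights_per_pixel
    write = 20 + 8 * n  # 20-byte G-Buffer once, 8 bytes of radiance per light
    read = 24 * n       # per light: 16 bytes of G-Buffer + 8 bytes of blend destination
    return write, read


def deferred_lighting_bandwidth(lights_per_pixel):
    """Return (write_bytes, read_bytes) per pixel, excluding material textures."""
    n = lights_per_pixel
    write = (8 + 8) + 16 * n  # G-Buffer + final radiance, two irradiance targets per light
    read = 16 + 24 * n        # shading stage reads 16; per light: 8 G-Buffer + 16 blend destination
    return write, read


if __name__ == "__main__":
    for n in (0, 0.5, 1, 4, 16):
        ds_write, ds_read = deferred_shading_bandwidth(n)
        dl_write, dl_read = deferred_lighting_bandwidth(n)
        print(f"lights={n}: shading w/r = {ds_write}/{ds_read}, "
              f"lighting w/r = {dl_write}/{dl_read}")
```

The write costs cross at 0.5 lights per pixel, and the read gap stays at 16 bytes for any light count.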

The last thing to consider when comparing the two techniques is feature flexibility.  So far I’ve looked at how traditional Phong lighting might be implemented using the rival deferred techniques.  Proponents of deferred lighting will sometimes argue that handling only the irradiance calculation in screen-space affords more flexibility in the choice of lighting models.  Once the diffuse and specular irradiance buffers have been constructed, each material is free to use them however it sees fit.  Unfortunately there isn’t as much freedom in that as one would like.  Most of the interesting variations in lighting occur in the irradiance calculation, not in the exit radiance calculation.  Anisotropic lighting, light transmission, and subsurface scattering all require additional attributes in the G-Buffer.  They can’t simply be achieved by custom processing in the shading stage.  When you consider the cost of adding additional attributes to each technique, the advantages of deferred shading really come to light.  The 8 byte G-Buffer layout for deferred lighting is completely full.  There is no room for an ambient occlusion or transmissive term without adding an additional render target at the cost of at least 4 bytes per pixel.  The deferred shading layout I’m using for this comparison, however, has unused channels in both the diffuse and specular albedo targets that can be read and written without adding anything to the space and bandwidth calculations above.

To be fair, there is one important detail I should mention.  Most proponents of deferred lighting recognize the excessive cost in generating separate diffuse and specular irradiance buffers and consequently adopt a compromise to the Phong lighting model.  They assume that specular irradiance is either monochromatic or a scalar factor of diffuse irradiance, and consequently it can be stored in the alpha channel of the diffuse irradiance target instead of requiring a full target of its own.  This configuration dramatically improves the results calculated above.  Again in the interests of fairness, when evaluating this form of deferred lighting, a similar compromise should be made for deferred shading.  The specular albedo can be considered monochromatic or a scalar factor of diffuse albedo (or both with sufficient packing).  With these modifications to both techniques deferred lighting does, indeed, have an advantage.  Deferred lighting will now require as little as 16 bytes of memory per pixel on some platforms whereas deferred shading will require 20.  Deferred lighting also ends up having equal write bandwidth requirements to deferred shading and lower read bandwidth requirements as long as the average number of lights per pixel is greater than 2.
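The same bookkeeping can be repeated for the compromise layouts. The byte counts here are my own reconstruction, not figures given above: I assume the specular term folds into the alpha channel of the diffuse albedo target for deferred shading (a 16-byte G-Buffer) and into the alpha channel of the diffuse irradiance target for deferred lighting (one 8-byte irradiance target). Under those assumptions the sketch reproduces the equal write cost and the read crossover at 2 lights per pixel.

```python
def compromise_shading_bandwidth(lights_per_pixel):
    """(write, read) bytes per pixel with specular albedo packed into alpha."""
    n = lights_per_pixel
    write = 16 + 8 * n  # assumed 16-byte G-Buffer, 8 bytes of radiance per light
    read = 20 * n       # per light: 12 bytes of G-Buffer + 8 bytes of blend destination
    return write, read


def compromise_lighting_bandwidth(lights_per_pixel):
    """(write, read) bytes per pixel with specular irradiance packed into alpha."""
    n = lights_per_pixel
    write = (8 + 8) + 8 * n  # G-Buffer + final radiance, one irradiance target per light
    read = 8 + 16 * n        # shading stage reads 8; per light: 8 G-Buffer + 8 blend destination
    return write, read


if __name__ == "__main__":
    for n in (1, 2, 3, 8):
        print(n, compromise_shading_bandwidth(n), compromise_lighting_bandwidth(n))
```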

Nevertheless, the differences are never huge, and ultimately there are a number of subtleties regarding how the bandwidth is distributed across the various stages and whether the stages are typically bandwidth bound that further muddy the waters.  The most damning evidence against deferred lighting remains that in a direct comparison across the content of two games and three platforms it only provided at best a few percent GPU performance advantage over deferred shading at the cost of nearly doubling the CPU-side batch count.  If further evidence is needed, consider that Killzone 2 experimented with deferred lighting early on in its development and also ultimately settled on a classical deferred shading architecture.
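The batch-count penalty is easy to estimate. The sketch below assumes one draw call per mesh per geometry pass and one per light, which is a simplification, and the scene numbers are invented for illustration rather than measured from either game.

```python
def batch_count(meshes, lights, geometry_passes):
    """Rough CPU-side batch estimate: one draw per mesh per pass, one per light."""
    return meshes * geometry_passes + lights


if __name__ == "__main__":
    meshes, lights = 500, 30  # hypothetical scene
    shading = batch_count(meshes, lights, geometry_passes=1)
    lighting = batch_count(meshes, lights, geometry_passes=2)
    print("deferred shading batches: ", shading)
    print("deferred lighting batches:", lighting)
    print("ratio:", round(lighting / shading, 2))
```

When meshes far outnumber lights, the ratio approaches 2.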

So as I said at the start, YMMV, but I for one don’t expect to be returning to deferred lighting anytime soon.


  1. […] Deferred Shading Shines. Deferred Lighting? Not So Much. – Game Angst […]

    1. Michael says:

      How much does this article still apply today? It was written back in 2009. Has hardware changed enough so that deferred lighting is a more viable option? Or have the hardware changes actually made deferred shading even more attractive? It seems like deferred shading actually might be even more attractive now that memory bandwidths are even larger, but I’d like to hear it from you (or someone else) to confirm.

    2. Adrian Stone says:

      For a certain class of content and hardware, I think deferred shading is still the clear winner for performance and functionality. The class of content is still very relevant today, which is moderate display resolutions (say up to 1920×1080) and complex scenes (hundreds of meshes, dozens of lights). What has changed is that the class of hardware is less relevant. This article is based on DX9-level hardware. If you are targeting DX10-level hardware, then there are many alternate approaches to rendering that must be considered.

      In general I think there are still single-pass variations on all the shading techniques that show more promise than their multi-pass equivalents (tiled deferred vs tiled forward, “practical” clustered shading vs clustered shading, etc). These more modern shading pipelines have a number of benefits over classic deferred shading, but picking between them requires a more detailed understanding of the composition of your scenes.

      There is no clear winner among the current options unless you consider how many passes your opaque geometry requires (think decals), the distribution of material types in your scenes, the amount of transparency, the number and size of lights, and probably many other factors. I don’t feel that I have enough experience with the range of techniques available to DX11 hardware to draw any firm conclusions yet.

  2. Anonymous says:

    Interesting read, A+

  3. Anonymous2 says:

    You just talked about performance, what about flexibility?

  4. Adrian Stone says:

    Flexibility is one of my primary arguments in favor of deferred shading. As I point out, deferred shading has a much fatter G-Buffer, which leaves a lot more room for squeezing in extra attributes. I haven’t really found much use for varying the way the second pass of deferred lighting uses the irradiance buffers. All the interesting variations on lighting (ambient occlusion, anisotropic effects, transmission) require more data than simple irradiance. If you try to forward more attributes from the attributes stage of deferred lighting to the deferred stage or from the deferred stage to the shading stage, you bloat the bandwidth and memory requirements and remove the one advantage deferred lighting needs to be competitive at all.

  5. Nice write up, I’m glad to see you’re blogging.

    I can’t recall if it was in his presentation or just in person, but Martin Mittring pointed out that deferred lighting helped ease some of the heavy memory transactions that the deferred approaches incur. Specifically, deferred lighting allows them to avoid requiring tile based rendering on the Xbox 360, since they can fit into EDRAM.

    I’m curious if your approach in Despair slimmed down g-buffer requirements to fit into EDRAM, or if you were using tiled rendering?

    1. Adrian Stone says:

      Trying to avoid predicated tiling on the 360 was a big consideration in choosing deferred lighting over deferred shading in the first implementation. We do not use predicated tiling in our deferred lighting implementation and we do use it for the deferred shading implementation. I think that’s the biggest reason why the deferred lighting implementation is ever-so-slightly faster on the 360 despite the platform’s plentiful framebuffer bandwidth.

  6. […] Adrian Bentley talks about the (non-)use of stenciling to accelerate same. Lastly, Adrian Stone makes an argument for deferred shading and against deferred […]

  7. Andrew Lauritzen says:

    Interesting stuff, thanks for the post.

    To some extent I think it’s a question of whether it requires more bandwidth/work to reconstruct/re-rasterize the scene from raw data (vertex buffers, constant buffers, etc) vs. just storing that data when it’s available in the first pass. For small-ish triangles or complex vertex shaders (skinning for instance) it’s clearly cheaper to just write it out in the first pass, while for very large triangles it probably isn’t.

    One other advantage of deferred shading though is that you have more ability to schedule the hardware SIMD units during the “expensive” pass, while with deferred lighting you end up paying the cost for small triangles in wasted SIMD lanes. Conversely, with more of the shading done “up front” with deferred shading, you lose some of the benefit of eliminating overdraw for that pass, although I’ve yet to see the G-buffer generation pass be a bottleneck in any renderer that I’ve worked with.

    Anyways, good read… thanks again.

  8. JamesD says:

    Thanks for the useful info. It’s so interesting

  9. Bryan McNett says:

    One advantage of deferred lighting is that, because its first pass writes only normal+depth, this pass likely completes more quickly than deferred shading’s single pass, which writes every fragment attribute.

    This means that in deferred lighting, post-processing jobs like SSAO or DOF can launch sooner, which increases parallelism and reduces latency for jobs that depend on their output.

    This is an advantage in heterogeneous shader/compute environments, such as PS3 and modern PC GPUs.

    1. Adrian Stone says:

      Good point, although I think you’d still be better off with a deferred shading solution that constructed the G-Buffer in two passes–laying down depth and normals in one pass, starting your post-processing jobs, and then laying down the albedo buffers before shading.

    2. Bryan McNett says:

      That sounds interesting – could you explain why you feel that way? On the surface, this new idea sounds like deferred lighting, except after depth+normal, there is a full-screen write of everything-else followed immediately by a full-screen read of everything-else. If my interpretation is correct, memory bandwidth would be saved by snipping out the write+read and keeping the work inside a single shader. That does sound like deferred lighting again, however.

    3. Adrian Stone says:

      Breaking the deferred shading G-Buffer into two passes doesn’t change the render target read / write bandwidth. So if you accept the argument I made in this article about deferred shading and deferred lighting having comparable bandwidth requirements, the 2-pass deferred shading implementation is still competitive with deferred lighting. It adds a pass to deferred shading, which I admit removes the biggest advantage that deferred shading has over deferred lighting, but it means the depth and normal buffers are available as early in the frame as with deferred lighting. The only small advantage I see to this over deferred lighting is that the two geometry passes are back-to-back rather than being separated by the deferred pass. I think that lends itself to easier CPU-side optimization when constructing the command buffers.

  10. Vim says:

    Nice post, I’ve been thinking that deferred shading and deferred lighting were the same thing :).

  11. imogenenar says:

    And you so tried?

  12. vince says:

    Great entry which certainly gives some food for thought. I’ve been playing around with deferred lighting on the side and the point that most of the interesting stuff you want to do happens on the irradiance side really clicked.

    One technique I’ve seen that might be more difficult to pull off with full deferred shading vs deferred lighting is this paper on Inferred lighting ( in case it eats the html).

    This technique allows for the lighting to be calculated on a smaller render target than the frame buffer, using depth/contiguous region filtering to select lighting samples in the 2nd geometry pass. It allows for MSAA, and up to 4 layers of lit translucency without a separate forward rendering pass.

    Although to be honest I haven’t really explored whether or not this same technique could be implemented with deferred shading — everything I can think of either requires a 2nd geometry pass for the full-resolution depth buffer or perhaps a discontinuity-aware downsampling of the depth buffer. I’d be interested in hearing people’s thoughts.

    1. Adrian Stone says:

      Thanks for the link to the paper; I hadn’t seen that before. I’m a little skeptical of the value of this approach, however. I’ve investigated multi-resolution rendering with deferred shading and lighting with (to my surprise) disappointing results. There is some overhead to rendering anything at a different resolution because you must construct a depth buffer at the low resolution and then you must composite the low resolution target with the full resolution target. It doesn’t sound like much, but that overhead puts extra pressure on the savings you need to realize from reducing the resolution.

      Everyone assumes that lighting is a low frequency operation and therefore very amenable to rendering at a lower resolution. I did not find that to be the case in my scenes. Shadows can be quite high frequency and normal maps, depending on the art style, can also be very high frequency. I compared our games with and without mixed-resolution rendering of various elements of lighting and found the visual quality loss shockingly high compared to the net performance increase.

      I haven’t completely given up on the idea, but it is not the performance silver bullet that I was hoping for.

  13. Pal Engstad says:

    Hi Adrian,

    I think you’re pretty much spot on that Deferred Lighting requires many passes, and that might cause strain on your engine in more ways than just bandwidth. However, there are a couple of things possible with Deferred Lighting that are beneficial:

    1) The results of the first pass (or two first passes) of depth-only and world-normal accumulation can later be used for occlusion. Typically, the second pass is much slower, hence reducing the cost of the main pass is very beneficial.
    2) The light accumulation pass can be optimized in a number of ways. First off, you can use stenciling as well as scene knowledge to reduce the number of draws. Finally, on PS3 – it makes perfect sense to do this on SPUs – they are much faster than the GPU for this kind of task.
    3) You can experiment with using different render-target sizes and formats for the different passes. With Deferred Shading, you are limited to using the same buffer size, due to this era’s architectures.
    4) Beware the hardware’s depth culling mechanisms – Deferred Lighting will induce some nasty problems that you will have to solve.
    5) And finally, as Bryan mentioned, you can certainly make use of the depth-buffer in some very cunning ways, including SSAO, shadow-resolve pre-processing, and particle-system optimizations.



  14. Hey,
    great article. There are three ways to implement deferred lighting / Light Pre-Pass on the PS3.

    1. You do everything on the GPU: use depth bounds to save bandwidth; maybe sort lights so that closer lights are drawn in batches
    2. You divide the framebuffer into tiles (Pal actually did this). Then you compare the tile frustum with the light bounding volume on the SPE and keep track of which lights fall in which tile. Then you render all the lights per tile with the GPU. That saves substantial bandwidth and ROP.
    3. You write with the GPU into a normal and depth buffer. Then you do all the lighting on the SPU with 2. as a first step. You can also do MSAA on the SPU. Check out Matt Swoboda’s slides and then there is a ShaderX8 / GPU Pro article on this.

    Which version did you guys implement? I can imagine that 1. is slower than the 360. 2. should be at least on par and 3. should be faster. No. 3 favors deferred lighting over deferred shading because you have only two buffers. In other words doing lighting on the SPE is probably only feasible with deferred lighting and not with deferred shading.

    – Wolf
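The per-tile light binning described in option 2 can be sketched roughly as follows. This is a simplified 2D stand-in, not the SPU implementation: it tests a screen-space bounding circle per light against each tile rectangle, where a real version would test the tile frustum against the light volume in 3D. The function name and the light representation are hypothetical.

```python
def bin_lights_into_tiles(width, height, tile_size, lights):
    """Map each (tile_x, tile_y) to the indices of lights overlapping that tile.

    `lights` is a list of (center_x, center_y, radius) circles in pixels.
    """
    tiles_x = (width + tile_size - 1) // tile_size
    tiles_y = (height + tile_size - 1) // tile_size
    bins = {(tx, ty): [] for ty in range(tiles_y) for tx in range(tiles_x)}
    for index, (cx, cy, radius) in enumerate(lights):
        for (tx, ty), light_list in bins.items():
            # Tile rectangle, clamped to the framebuffer edge.
            x0, y0 = tx * tile_size, ty * tile_size
            x1 = min(x0 + tile_size, width)
            y1 = min(y0 + tile_size, height)
            # Closest point of the rectangle to the circle center.
            nearest_x = min(max(cx, x0), x1)
            nearest_y = min(max(cy, y0), y1)
            dx, dy = cx - nearest_x, cy - nearest_y
            if dx * dx + dy * dy <= radius * radius:
                light_list.append(index)
    return bins


if __name__ == "__main__":
    lights = [(16, 16, 10), (64, 32, 5)]
    bins = bin_lights_into_tiles(128, 64, 32, lights)
    for tile, indices in sorted(bins.items()):
        print(tile, indices)
```

Each tile then draws only the lights in its own list, which is where the bandwidth and ROP savings would come from.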

    1. Adrian Stone says:

      I agree that deferred lighting is more attractive if you hope to move a portion of the work onto the SPUs. The pipeline for rendering on the SPUs isn’t nearly as mature as the pipeline for rendering on the GPU, so keeping the “shader” count and complexity low is important.

      At the time that we were implementing this (a few years ago now), our goal was to get something up and running on the PS3 that was comparable to the PC and Xbox 360. We had hoped the GPU alone would be sufficient for rendering and that we could reserve the SPUs as counterparts to CPU threads on the 360 and PC.

      As it turned out, of course, the SPUs can handle the normal CPU workload and still have time left to assist the GPU. We do now offload some GPU work onto the SPUs, but our renderer isn’t as SPU-centric as I would probably make it if we didn’t have to worry about parity between multiple platforms. I think the ideal PS3 rendering pipeline is something different than deferred lighting or deferred shading, something unencumbered by the limitations of current GPUs.

  15. <<<
    Breaking the deferred shading G-Buffer into two passes doesn’t change the render target read / write bandwidth. So if you accept the argument I made in this article about deferred shading and deferred lighting having comparable bandwidth requirements, the 2-pass deferred shading implementation is still competitive with deferred lighting. It adds a pass to deferred shading, which I admit removes the biggest advantage that deferred shading has over deferred lighting, but it means the depth and normal buffers are available as early in the frame as with deferred lighting.
    This is not correct. Let’s say you have 200 lights. It makes a difference whether 200 lights fetch four render targets instead of 2 render targets. The read memory bandwidth is lower. Additionally the cost per light is lower. You do not render the whole lighting equation by fetching each texture … only the light properties.

    1. Adrian Stone says:

      I think you misunderstood me. I was talking about building the deferred shading G-Buffer in two passes, not doing deferred lighting. Bryan was asking about making the depth and normal buffer available to the SPUs earlier in the pipeline, and I was saying that with deferred shading you could build the G-Buffer in two passes. That technique doesn’t really have a name distinct from deferred shading because it is how deferred shading was always done in the days before MRTs.

  16. mattnewport says:

    Deferred rendering also opens up some interesting possibilities for tricks with decals that don’t require tessellating geometry as you can go in and modify the normal map and diffuse and spec without regard for the original geometry. Another idea I’ve been toying with is adding procedural grime and dirt with SSAO type techniques which would be easy to try out with deferred rendering. I think there are probably a number of interesting opportunities when you have the full G buffer information available that are not options with diff and spec lighting buffers. This relates to your point about most of the interesting lighting variations occurring on the irradiance side.

    1. Adrian Stone says:

      Yes, very good point. Deferred techniques (lighting, shading, etc) excel in scenes with high levels of overdraw. The performance benefits of deferred shading are particularly great when you have large numbers of lit decals.

  17. Alec Miller says:

    Isn’t the fundamental problem of deferred shading one of material selection per pixel? Stalker had to go through all sorts of atlases to select between different materials. If you just want to run phong everywhere that’s one thing, but if you have complex materials in each deferred shader then I would argue that deferred lighting wins out.

    1. Adrian Stone says:

      Not in my experience. The parameters that you typically want to change per material are part of the lighting equation, not the shading. Stalker, for example, varied glossiness or specular exponent per pixel, and that is needed during the deferred pass of both deferred lighting and deferred shading. I haven’t personally encountered any useful variations in shading that can be achieved with deferred lighting but not with deferred shading.

  18. […] a deferred rendering setup (see Game Angst for a good discussion of deferred shading & lighting), lights are applied using data from […]

  19. Daniel Brauer says:

    Another benefit of deferred shading is that it makes it easy to cull lights on a per-object basis. Even just four bits of layer information per pixel are probably enough for most uses. Because deferred lighting accumulates irradiance first, however, there is no way of telling how much light should go to each layer.

  20. Ola Olsson says:

    Somewhat late to the party, perhaps.

    I was just wondering if you would care to provide a reference to the original description of “deferred lighting”. I have found the exact technique described in a paper from 2003:
    “Optimized Shadow Mapping Using the Stencil Buffer”, JGT 2003, Jukka Arvo and Timo Aila,
    but they don’t cite a source for the technique in particular, and do not give it a name.

    Hope you still get notifications from this thread 🙂

    1. Adrian Stone says:

      The earliest references I’m aware of are Dean Calver’s 2003 article for Beyond3D and Matt Pritchard and Rich Geldreich’s excellent presentation at GDC in 2005. I don’t know where the term “Deferred Lighting” originates, however, since both of those sources refer to it as an established technique.

    2. Ola Olsson says:

      Thank you for replying!

      Correct me if I’m wrong, but doesn’t Calver’s article actually describe *Deferred Shading*, with full G-Buffers and only one geometry pass? To further fuel the confusion 🙂

      Pritchard and Geldreich’s presentation is the first place I’ve seen the name used. Might see if I can get hold of either author.

      Cheers again

    3. Adrian Stone says:

      You are right. Calver acknowledges the possibilities of deferring different portions of the pipeline (“Deferred Rendering or Deferred Shading or Deferred Lighting?”) but then goes on to describe what is traditionally called “deferred shading” under the heading of “deferred lighting.”

      I was in Matt and Rich’s talk, however, and they definitely distinguished between deferred lighting and deferred shading, and described them much as I’ve described them in this article.

  21. Keith Yerex says:

    In my opinion, light prepass rendering is much more flexible.
    Though you can’t modify the lighting equations, you can add things like emissive, lightmapping, colored specular etc. to a subset of objects at little cost, whereas the deferred rendering approach would require additional G-buffer layers, and any effect you add is paid for at every pixel in the lighting pass even if only a few objects use it.

    Also, with light prepass it is possible to use hardware multisample anti-aliasing on current console hardware (where MSAA is not allowed with multiple render targets).

    In my experience light prepass was faster on PS3 than on 360, partly because of the “depth bounds test”, which only exists on NVIDIA hardware and can cull many of the light volume pixels before shading. However, we used 32 bpp render targets for the light buffer (in RGB-exponent format on PS3, where blending can be handled in pixel shaders by reading the framebuffer before writing, and flushing the texture cache occasionally).
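
The depth bounds test mentioned here can be illustrated with a small CPU-side sketch. As I understand the extension, the test compares the depth already stored in the depth buffer at each pixel against a [min, max] range and rejects the fragment when the stored value lies outside it, so a light volume only shades pixels whose scene depth falls within the light's depth extent. The function names and sample depths below are made up for illustration.

```python
def depth_bounds_pass(stored_depth, z_min, z_max):
    """Return True when the depth already in the depth buffer lies inside the bounds."""
    return z_min <= stored_depth <= z_max


def pixels_to_shade(depth_buffer, light_z_min, light_z_max):
    """Indices of pixels covered by a light volume that survive the depth bounds test."""
    return [
        index
        for index, stored_depth in enumerate(depth_buffer)
        if depth_bounds_pass(stored_depth, light_z_min, light_z_max)
    ]


# Four pixels under the light's screen-space footprint, with hypothetical scene depths.
depth_buffer = [0.10, 0.45, 0.50, 0.90]
# Assume the light volume spans depths 0.40 to 0.60.
survivors = pixels_to_shade(depth_buffer, 0.40, 0.60)
```

On hardware the rejection happens before the pixel shader runs, which is where the saving comes from.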

    1. Adrian Stone says:

      I understand what you’re saying, but I would argue that these techniques can be mixed and matched with equal flexibility. If only a portion of your objects use features that are orthogonal to shading (for example emissive lightmapping), you can choose not to defer that aspect of their rendering in either deferred lighting or deferred shading.

      The second geometry pass is what allows you to do MSAA in deferred lighting, and again, that can be done with deferred shading as well.

      On the question of performance, I make an argument that “deferred lighting” has not been more efficient than “deferred shading” in my experience, but people frequently disagree with me using the term “light pre-pass.” I’m not sure if it was originally intended this way, but I now consider the term “light pre-pass” to imply a monochromatic or otherwise limited specular model. If that is true then I can absolutely see the performance advantages of light pre-pass over traditional deferred shading, but I’d also want to make a fairer comparison using a variant of deferred shading which doesn’t implement true Phong shading.
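
To make the "monochromatic specular" distinction concrete, here is a rough sketch of two ways a light buffer texel might be accumulated. It assumes one common arrangement, not necessarily what any particular engine does: the limited variant keeps RGB diffuse plus a single specular luminance channel (four channels), while the full variant keeps RGB diffuse and RGB specular (six channels). The function names and the tuple layout for lights are hypothetical, and the luminance weights are the Rec. 709 ones.

```python
LUMA = (0.2126, 0.7152, 0.0722)  # Rec. 709 luminance weights


def luminance(rgb):
    """Weighted sum of the three colour channels."""
    return sum(weight * channel for weight, channel in zip(LUMA, rgb))


def accumulate_full(lights):
    """Six channels: RGB diffuse plus RGB specular.

    lights: list of (color_rgb, n_dot_l, spec_term) tuples (hypothetical layout).
    """
    diffuse = [0.0, 0.0, 0.0]
    specular = [0.0, 0.0, 0.0]
    for color, n_dot_l, spec_term in lights:
        for i in range(3):
            diffuse[i] += color[i] * n_dot_l
            specular[i] += color[i] * n_dot_l * spec_term
    return diffuse, specular


def accumulate_mono(lights):
    """Four channels: RGB diffuse plus one specular luminance value."""
    diffuse = [0.0, 0.0, 0.0]
    specular = 0.0
    for color, n_dot_l, spec_term in lights:
        for i in range(3):
            diffuse[i] += color[i] * n_dot_l
        # The light's colour is collapsed to a scalar, so specular hue is lost.
        specular += luminance(color) * n_dot_l * spec_term
    return diffuse, specular


red_light = [((1.0, 0.0, 0.0), 1.0, 1.0)]
full_diffuse, full_specular = accumulate_full(red_light)
mono_diffuse, mono_specular = accumulate_mono(red_light)
```

The four-channel version fits in a single RGBA target, which is the bandwidth saving under discussion; the cost is that the specular colour has to be approximated afterwards.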

  22. Pavel Umnikov says:

    Maybe my comment comes long after the post date, but I want to add my two cents.

    When I was developing our 4DVision Engine, I tried both the Deferred Lighting and Deferred Shading techniques.
    Pre-Lighting/Light Pre-Pass/Deferred Lighting looks like a good approach because of its reputation as a “Low Calorie Deferred Technique”. At first I thought it would be best for our project… I was really stupid to think that way.
    On the latest video cards bandwidth is not the worst thing; the worst thing is vertex processing.

    Testing showed interesting results: DL is better than DS when the scene is not so complex.

    But what about tessellation with Direct3D 11/OpenGL 4.1? Two geometry passes with tessellated objects are quite a nightmare (believe me, I tried). Deferred Shading with one geometry pass has the advantage over Deferred Lighting with two geometry passes in that respect.

  23. […] an interesting post mortem by one guy who implemented both light pre-pass and deferred renderers in their engine on 3 […]

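  24. […] For a more detailed comparison of the two approaches, this article gives a specific analysis, and Wolfgang takes part in the discussion there; it is worth a look (though I feel the author may be biased against DL). […]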

    1. Chris Hoo says:

      Hi Adrian,

      Great article. When people talk about the memory bandwidth consumed by Deferred Shading (1 geometry pass) versus Deferred Lighting (2 geometry passes), they count the read and write operations on the frame buffers. But why does nobody talk about the bandwidth consumed by the geometry pass itself? A draw call that brings all the vertices and their attributes from video memory into the GPU pipeline must consume some bandwidth. Why does nobody take this into account?
      Sorry for my poor English; I hope you get my idea.
      Thank you!
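
The question can at least be framed with back-of-envelope arithmetic. The sketch below is only a rough model with made-up inputs: it counts raw G-Buffer write traffic and raw vertex fetch traffic, and ignores vertex caches, index reuse, overdraw, compression and the lighting-pass reads. The function names and every number in the example are assumptions, not measurements.

```python
def gbuffer_write_bytes(width, height, bytes_per_pixel):
    """Bytes written when every pixel of the G-Buffer is filled exactly once."""
    return width * height * bytes_per_pixel


def vertex_fetch_bytes(vertex_count, bytes_per_vertex, geometry_passes):
    """Bytes fetched if every vertex is read once per geometry pass (no cache reuse)."""
    return vertex_count * bytes_per_vertex * geometry_passes


# Hypothetical frame: 1280x720 target, 16 bytes of G-Buffer data per pixel,
# 500,000 vertices at 32 bytes each.
gbuffer_traffic = gbuffer_write_bytes(1280, 720, 16)
one_pass_vertices = vertex_fetch_bytes(500_000, 32, geometry_passes=1)
two_pass_vertices = vertex_fetch_bytes(500_000, 32, geometry_passes=2)
```

Plugging in figures for a real scene and target resolution shows how the second geometry pass compares with the G-Buffer savings for that particular workload.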

