Thursday, 6 September 2012

2.5D XNA RPG Engine - Some Technical Details

I keep getting questions regarding some of the more technical aspects of my 2.5D shading implementation in my earlier YouTube videos like this one, so I want to dedicate an entry here to explaining a little bit what's going on behind the scenes there and fielding questions.

First things first, the original codebase at the moment is very outdated (still on XNA 3.1) and very tangled. It was one of my first projects and it's very badly organised. At the same time, I do have plans for it in the future, so I won't be releasing the whole source code for the time being.
That said, let's take a look at the basics.


The idea was very simple - to do away with drawing polygons for the most part and, instead, deal with geometry in screenspace, as an extension of the sprites you'd normally use in a 2D engine.


Right off the bat, I'll come out and say that this isn't necessarily as great a performance-saving idea as it seemed to me back then. You're still doing a lot of work across the whole screen with pixel shaders, which is where the majority of the computation time comes in anyway. Moreover, you're essentially loading every sprite three times over, so you have a pretty high memory usage too (and unlike in 3D rendering, where you can easily have much smaller resolution normal textures with high-resolution diffuse maps, you can't really do that here without sacrificing quality dramatically).
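To put a rough number on the "three times over" point, here is a back-of-the-envelope estimate in Python. It assumes every map is stored as an uncompressed 32-bit texture at the sprite's full resolution; the function name and the 512x512 sprite size are made up for illustration, and real figures depend on the surface format you pick.

```python
# Rough memory estimate for one sprite stored as three full-resolution maps.
# Assumption: each map is an uncompressed 32-bit texture (4 bytes per pixel).
BYTES_PER_PIXEL = 4


def sprite_set_bytes(width, height, maps=3):
    """Bytes used by `maps` full-resolution textures of the given size."""
    return width * height * BYTES_PER_PIXEL * maps


single = sprite_set_bytes(512, 512, maps=1)   # colour map alone
triple = sprite_set_bytes(512, 512)           # colour + normal + height
print(single // 1024, "KiB for the colour map alone")
print(triple // 1024, "KiB for colour + normal + height")
```

Under those assumptions a single 512x512 sprite goes from 1 MiB to 3 MiB, and that multiplier applies to every asset in the scene.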

I'm pretty sure something similar could be implemented in, say, Unity3D, with the right scripting and customised shaders, but I don't think this approach will ever be much of a performance saver - certainly not as much of one as I thought back at the time, when my understanding of computational expenses was still very limited. And while on more modern handheld devices you could probably get something like it to run, it'll probably need to be a fairly pared-down version due to constraints on how much per-pixel computation you can do and on memory.

Bear in mind you could easily get a very similar effect with a fully 3D engine, rendered out with an isometric perspective. If you don't make the geometry too complicated, your memory usage would likely be significantly lower as you'd be able to share textures between objects, while the per-pixel computation overhead would be the same as with the 2.5D engine, if not lower.

By going 2.5D you give up quite a few things that would come naturally to a properly 3-dimensional engine - things like having a proper zoom function.


With that in mind, here's what I see as some of the advantages. The main reason to go down this route, I think, mostly comes down to content creation and stylistic considerations since, as I've pointed out above, there's no real reason to do so for performance or graphics, with one exception.

I think, graphically, the only real advantages here are the potential capacity for dealing with really huge polycounts by baking them into your sprites and getting a limited sort of multisampling for free, as your assets are likely going to come out of 3DS Max nicely antialiased (I say 'limited' because geometrical intersections are not going to antialias themselves and, indeed, can end up looking quite rough). Even so, they're a bit dubious - you don't really need huge polycounts and even if you do:
  1. They're not all that expensive, relatively speaking.
  2. It's actually pretty easy to combine pre-rendered elements with an otherwise fully 3D isometric engine.

But here are a few boons contentwise. The main one, really, I think, is that you can do a lot of the texturing work much more easily on flat sprites. While you need a 3D mesh initially to generate a rudimentary colour map, normal map and heightmap, you can then do a lot of extra detailing in an image editor like Photoshop. You can bake in very high-quality ambient lighting from your 3D program without any overhead and you can also potentially save time on a lot of UV mapping and texture creation.

You also get a little bit more freedom with how you go about creating assets in the first place - while you can start off making a simple mesh for calculating normal and height data, you could use a flat photo, or a hand-drawn sprite for the colour, essentially giving it extra depth.


Quite briefly, this is how I generated the assets I used in my videos and here's a sample of one. I'm going with the Utah teapot on this:

To save on memory a bit, only the colour map actually has an alpha - the rest borrow the colour map's alpha channel when they're being drawn. Also, as you can see, the colour map was rendered with some basic 3DS Max ambient occlusion.

All three I rendered in 3DS Max in my case, though it might be easier to create an XNA-based tool for generating the maps more reliably via custom shaders. Or, perhaps, you could just Maxscript it.

Here's what the normal map material looks like:

You're just adding together a red, green and blue world-space falloff map, it's as simple as that.

As far as the height map goes, you've got a few ways to go with that one. In my case, I set up a vertical gradient with a bit of extra dithering and worked out what the top value should be for every given object. The heightmap, like the normal map, is of course in worldspace.

The issue with the heightmap though is that you end up being restricted by the range of colours in it quite badly. The purpose of the heightmap is to, literally, state the worldspace height of every pixel - and that means you need a range of values as large as your object is tall, in pixels. So if you stick with a grayscale 256-colour bitmap, you run out of values pretty damned quickly - if your image is more than 256 pixels high, it's going to run out of height.

My solution, a fairly hacky one, was to displace the gradients in the RGB channels separately by a unit and, when calculating the height of a pixel, just add them for the result. That gave me a maximum height of 768 pixels (once you get really high, you can actually let the gradient clamp, as long as you don't have point-lights that high up to mess things up).
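As a minimal sketch of that staggered-gradient idea, here it is written out in Python rather than as a Max material. The function names are made up for illustration, and it assumes 8-bit channels, so strictly the top value is 3 x 255 = 765 rather than a round 768.

```python
def encode_height(h):
    """Split a height in pixels into three staggered 8-bit gradients."""
    r = min(max(h, 0), 255)          # red covers heights 0..255
    g = min(max(h - 255, 0), 255)    # green takes over for 255..510
    b = min(max(h - 510, 0), 255)    # blue takes over for 510..765
    return r, g, b


def decode_height(rgb):
    """Recover the height by adding the three channels together."""
    return sum(rgb)


print(encode_height(300))                   # red is full, green is partway up
print(decode_height(encode_height(300)))    # round-trips back to 300
print(decode_height(encode_height(1000)))   # anything too tall clamps at the top
```

The shaders further down average the channels and rescale instead of summing them outright, which is the same sum up to a constant factor.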

Alternatively, if you make your own content preprocessor, you could do more elaborate things, such as multiplying the RGB values to get the output. That's just something that's harder to do with Max materials on the fly (and you'd need to think about how you render it if you want to change the object's height).


Last but not least, I'll go over what actually happens in the render pipeline in some broad strokes.

The way my original code works is that you set up three separate RenderTargets that you draw the entire scene onto - once with the diffuse sprites, once with the normal maps and once with the heightmaps. The normal maps and heightmaps, as mentioned earlier, all borrow their alphas from the diffuse map.

A neat trick you get to do at this stage is that you can use the heightmap for depth testing. Now, the heightmap is in worldspace, so it doesn't exactly tell you how 'deep' into a scene a given pixel is, but it's good enough - if we have two objects occupying the same bit of space, one of them's grey, the other's white, that means the white object is higher up in worldspace than the darker object, which in turn means it has to be in front of it. So that's the basis of our depth query.
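To make the depth query concrete, here is a small CPU-side sketch in Python of what the depth test amounts to per pixel. It is only an illustration: the sprite representation (a dict of pixels) and the function name are made up, and in the engine the comparison is done by the GPU's depth buffer rather than by a loop.

```python
def composite(sprites, width, height):
    """Draw sprites in any order, keeping the highest pixel at each position.

    Each sprite is a dict mapping (x, y) -> (colour, height). Pixels that are
    missing from the dict are treated as fully transparent.
    """
    colour = [[None] * width for _ in range(height)]
    depth = [[-1.0] * width for _ in range(height)]   # stands in for the Z-buffer
    for sprite in sprites:
        for (x, y), (c, h) in sprite.items():
            if h > depth[y][x]:       # higher up in world space wins
                depth[y][x] = h
                colour[y][x] = c
    return colour, depth


grey = {(0, 0): ("grey", 0.5), (1, 0): ("grey", 0.5)}
white = {(1, 0): ("white", 1.0)}
image, _ = composite([white, grey], 2, 1)
print(image[0])   # the white pixel stays on top even though grey was drawn last
```

Because the test is per pixel, draw order stops mattering for opaque pixels, which is what lets sprites intersect each other visually.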

This is the shader that does it - as an extra parameter, it scales the heightmap by the scale of the object, allowing you to adjust for, well, scaling:

sampler AlphaSampler : register(s0);
sampler ColorSampler : register(s1);

float Scale = 1;

struct PS_OUTPUT
{
    float4 color : COLOR0;
    float depth : DEPTH;
};

PS_OUTPUT DepthMapPS(in float2 texCoord : TEXCOORD0)
{
    PS_OUTPUT Out;
    float4 alpha = tex2D(AlphaSampler, texCoord);
    float4 tex = tex2D(ColorSampler, texCoord);
    // Average the three height channels and scale by the object's scale.
    Out.depth = ((tex.r + tex.g + tex.b) / 3) * Scale;
    // Write the height out as grey, borrowing alpha from the colour map.
    Out.color = float4(Out.depth, Out.depth, Out.depth, alpha.a);
    return Out;
}

technique DepthMapping
{
    pass Pass0
    {
        PixelShader = compile ps_2_0 DepthMapPS();
    }
}

Next up, let's look at the lighting code. The game had two lighting shaders - one for directional light (i.e. sunlight) and another for point lights. Since the directional light shader is just a subset of the point light shader and overall quite simple, I'll just look at the point light shader.

Now, this implementation is, again, far from being anywhere near optimal. For starters, each light involves a whole new pass being rendered across the entire screen, with no compensation for falloff (it also uses a really goofy light falloff curve I came up with, which never really peters out). As you can see below, it uses three texture samplers to feed in the screen as it was rendered with colour, normal and heightmap sprites, then uses all three to figure out the position of any given pixel. That's a lot of crap going on on a per-pixel basis.

sampler ColorSampler : register(s0);
sampler NormalSampler : register(s1);
sampler DepthSampler : register(s2);

float3 LightPosition;
float LightIntensity;
float LightRange;
float4 LightColor = float4(2, 0, 0, 1);

float ScreenWidth;
float ScreenHeight;

float4 NormalMappingPointPS(float4 color : COLOR0,
                            float2 texCoord : TEXCOORD0) : COLOR0
{
    float4 tex = tex2D(ColorSampler, texCoord);
    // Unpack the normal from the [0, 1] colour range to [-1, 1].
    float3 normal = (2.0 * tex2D(NormalSampler, texCoord).rgb) - 1.0;
    float3 depth = tex2D(DepthSampler, texCoord).rgb;
    float Z = ((depth.r + depth.g + depth.b) / 3) * 1024;
    // Rebuild the pixel's position from its screen coordinates and height.
    // The third component is missing from this listing; Z is assumed here.
    float3 pixelPosition = float3(ScreenWidth * texCoord.x,
                                  (ScreenHeight * texCoord.y) + (Z * 0.7547),
                                  Z);
    float3 lightDir = (LightPosition - pixelPosition) * float3(0.75, 1.0, 1.0);
    float lightDistance = length(lightDir);
    // The goofy falloff curve: the max() keeps the divisor away from zero.
    float distModifier = LightIntensity / (max(lightDistance * LightRange, 1.0 / LightRange));
    float lightAmount = max(dot(normal, normalize(lightDir)), 0.0) * distModifier;
    float4 output = tex * lightAmount * LightColor;
    return output;
}

technique Deferred2DNormalMapping
{
    pass Pass0
    {
        PixelShader = compile ps_2_0 NormalMappingPointPS();
    }
}

I think overall it's pretty self-explanatory otherwise. Between these two functions, that's the main chunk of the work being done, really.
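If the HLSL is hard to follow, here is the same per-pixel arithmetic from the point light shader as a plain Python function that you can poke at with numbers. This is only a reference sketch: the function name is made up, it returns just the scalar light amount (before multiplying by the texture and light colour), it takes the pixel position as an input rather than rebuilding it from the samplers, and the early return for a zero-length light vector is my own assumption.

```python
import math


def point_light_amount(normal, pixel_pos, light_pos, intensity, light_range):
    """Scalar light contribution for one pixel, mirroring the shader's maths."""
    # Vector from the pixel to the light, with x squashed by 0.75 as in the shader.
    lx = (light_pos[0] - pixel_pos[0]) * 0.75
    ly = light_pos[1] - pixel_pos[1]
    lz = light_pos[2] - pixel_pos[2]
    dist = math.sqrt(lx * lx + ly * ly + lz * lz)
    if dist == 0.0:
        return 0.0   # direction undefined; assumption: treat the pixel as unlit
    # Falloff curve from the shader: the max() keeps the divisor away from zero.
    dist_modifier = intensity / max(dist * light_range, 1.0 / light_range)
    # Lambert term: cosine of the angle between the normal and the light direction.
    n_dot_l = (normal[0] * lx + normal[1] * ly + normal[2] * lz) / dist
    return max(n_dot_l, 0.0) * dist_modifier


facing = point_light_amount((0, 0, 1), (0, 0, 0), (0, 0, 10), 1.0, 0.1)
away = point_light_amount((0, 0, -1), (0, 0, 0), (0, 0, 10), 1.0, 0.1)
print(facing, away)
```

Plugging in a few distances shows why the curve never really peters out: once the distance term dominates, the result only falls off as 1/distance.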

Final Notes

To cap off, a few ways to improve on what I've got so far. There's definitely much that can be done to improve the efficiency of my shaders from, what was it, two? No, three years ago, goodness. It's been a while. But yeah, they're not great and drawing point lights in particular is a massive drain at the moment. If you wanted to do this for a handheld, your best bet would be to limit the number of lights on screen and do them with a single shader pass for all of them.
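As a rough sketch of what "a single shader pass for all of them" means, here is the accumulation loop in Python. It is an illustration only: the light dictionaries, the function name and the cap of 4 are all made up, and the falloff is a simple 1/distance stand-in rather than the curve from my shader.

```python
import math

MAX_LIGHTS = 4  # hypothetical cap; a shader would need a fixed array size


def shade_pixel(albedo, normal, pixel_pos, lights):
    """Sum every light's contribution in one pass over the pixel."""
    total = [0.0, 0.0, 0.0]
    for light in lights[:MAX_LIGHTS]:
        to_light = [l - p for l, p in zip(light["position"], pixel_pos)]
        dist = math.sqrt(sum(c * c for c in to_light))
        if dist == 0.0:
            continue   # light sits exactly on the pixel; skip it
        n_dot_l = max(sum(n * c for n, c in zip(normal, to_light)) / dist, 0.0)
        # Simple 1/distance falloff: an assumption, not the shader's own curve.
        amount = n_dot_l * light["intensity"] / max(dist, 1.0)
        for i in range(3):
            total[i] += albedo[i] * light["colour"][i] * amount
    return total


red = {"position": (0, 0, 2), "intensity": 1.0, "colour": (1, 0, 0)}
blue = {"position": (0, 0, 4), "intensity": 1.0, "colour": (0, 0, 1)}
print(shade_pixel((1, 1, 1), (0, 0, 1), (0, 0, 0), [red, blue]))
```

The colour, normal and height samples are read once per pixel and reused for every light, instead of once per light per full-screen pass.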

Secondly - and this is an important one - don't render more than you have to. I mean, the vast majority of the time, all you're going to be doing is moving the camera around while everything else remains static. So the best course of action is to do all the expensive pixel computations once and just keep a large chunk of the playable area in memory, if you've got the memory for it.

If a given area is just made up of lots of static sprites, you could even render your diffuse/normal/heightmap layers for the whole area and then dispose of all those other sprites to free up space, since you don't need them. You can then render dynamic objects on top of it all using depth testing.

Needless to say, when a dynamic object updates, you can draw it separately with lighting computed just for that object. So really, you only need to do the expensive light computation when the lights are actually moving around, and you'll probably want to keep that to a minimum (if you're concerned about performance or battery drain, anyway).
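One way to structure that, sketched in Python, is to keep the lit result around and only redo the expensive pass when the set of lights has actually changed. Everything here is hypothetical (the class, the `render_lighting` callable, the idea of comparing lights by value); the engine as it stands does none of this.

```python
class LightingCache:
    """Re-run the expensive lighting pass only when the lights have changed."""

    def __init__(self, render_lighting):
        self._render = render_lighting   # hypothetical callable: lights -> lit image
        self._key = None
        self._image = None
        self.passes = 0                  # how many times the expensive pass ran

    def get(self, lights):
        key = tuple(lights)              # lights are compared by value
        if key != self._key:
            self._image = self._render(lights)
            self._key = key
            self.passes += 1
        return self._image


cache = LightingCache(lambda lights: "lit with %d light(s)" % len(lights))
cache.get([(10, 20, 30)])    # first frame: the pass runs
cache.get([(10, 20, 30)])    # camera moved, lights did not: reuse the result
cache.get([(11, 20, 30)])    # a light moved: the pass runs again
print(cache.passes)
```

Dynamic objects would still be drawn and lit separately on top of the cached result, as described above.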

As it stands, the engine isn't doing any of that, so as I say, overall it's actually in a pretty basic state.


  1. I think your ideas for depth/height testing are brilliant. The thing I'm most curious about is how you would handle things like 3D world-space position, collision, etc. Having the depth calculated during rendering without an actual Z position in 3d space seems to make it awkward to say the least.

    And thanks for this by the way...

    -sublm66 from youtube

  2. Can you elaborate a bit?

    As the game's isometric, the way the engine was set up was that collision between, say, the player and other objects is just calculated with polygons drawn out on a flat plane.

    If you mean collision in the sense of just having objects intersect, that's just done through the depth testing shader. The one I've got quoted in the code block, by the way, seems to just draw the depth map out into the RGB output, but you can replace Out.color with something like Out.color = alpha and have it output the colour map (which you'd pass into the 'alpha' sampler) which is Z-buffer tested according to the heightmap.

  3. Ok, so by "polygon drawn out on a flat plane" do you basically mean like a quad? So as objects pass each other you can see if they collide (horizontally), but you have to make sure they are at the same depth, right? Otherwise, they're not colliding. But, if I'm understanding correctly, you don't calculate the depth of (each pixel of) an object until the rendering stage.

    So, I guess what I'm asking is how would you tell the objects that they've collided? I'm sure there's something I'm just not understanding. Do you do your rendering and then pass on that information to the objects? I'm imagining an example of say, a player walking behind a tall skinny building. Their polys would be colliding, but in reality (world-space) they shouldn't be.

    Also, the weather system looks pretty incredible. I really like the snow laying on the ground. Is that some sort of particle system you created?

    Thanks for the answers by the way...I'm just really interested and I want to make sure I understand what's happening :)

    1. Yeah, they're just quads rendered with the XNA SpriteBatch (SpriteBatch is pretty flexible about letting you add shaders to it).

      When you're talking about collision though, are you talking about geometrical intersection - such as if I pass one sphere through another - or do you mean gameworld collision, as in the case when a character walking around might bump into a building, or find her way blocked by a fence?

      The former is handled purely at render time and that's entirely visual. The latter, like I said, is done completely separately on a flat, 2D plane. You can see in the videos that every static object in the world has a polygonal 'collision' area around it, and this is what the player character tests their collision against when they walk around.

      If the character, say, walks behind a tree, the tree leaves would be rendered over the top of the character during the draw call because we're writing both the character's and the tree's heightmap to the ZBuffer, in which case the depth testing is done per-pixel (in the shader above) and that determines which object gets drawn on top.

      Snow laying on the ground is a screenspace effect that basically takes all that same data in the heightmap, normal map and colour map samplers to work out how much snow every pixel has. A basic implementation would be to work out lambertian reflection as if the scene is being lit up from a directional light source at the sky's zenith, then clamp it down so values over 0.5 or something return white and everything else is transparent black. My shader adds a bit more noise to it than that by using the brightness of the underlying colour map (based on the idea that dark patches of the colour retain more heat, while bright and reflective colours are cooler and hence get more snow), but it's the same in principle.

      The light passes are rendered as additive quads across the whole screen and the snow works the same, except it's an alpha-blended quad instead.

    2. I was talking about gameworld collision, I understand your rendering pipeline pretty well. Dude, I just watched your video again and realize exactly what you're saying. I never noticed the red polys you have at the base of each object **doh**. I had assumed you meant that each object has a bounding box (quad). So, in that case, it would be difficult to determine when an object went behind another without colliding.

      Sorry about the confusion. I really appreciate your thorough explanation though. I'm sure it'll help out some people interested in your engine. I'm in the process of designing a 2.5D XNA engine myself at the moment, that's why I have so much interest in your methods. I've been a 2D/3D gamedev for years, and I'm finally taking the leap to "independent".

      I really love the weather system...the way that the patches under the trees remain dark during snow, etc. Even the cloud effects are really smooth. I've been doing my own research, but if you know of any specific places I should look (articles, books, etc.) it would be really appreciated. Now I've just come across your Capucine engine...I've got some reading to do ;) Thanks for your work.

    3. Ah! Awesome, glad I could clear that up. Yeah, the collision's all polygon based, with the characters colliding with a circle around the base.

      A terrible, wonky, but somehow kinda-working collision system nonetheless.

      I can't recommend much reading material, most of the stuff I did was pretty ad-hoc. The weather system was fun to do, but it was really more about just tying lots of separate little systems (bloom controls, a fog shader, all the different lighting colours and intensities) to work together and look nice. Good luck!

  4. Hi 9of9,

    It's amazing.
    Thanks to this post I understood where my mistake was.

    I am very grateful to you.

    Best regards,

  5. I added the bloom, glow and blur effects to this lighting system and it looks amazing and sweet :)

    Thank you for the idea about the fog shader, I'd forgotten about it.

  6. Being not much of a 3d artist at all myself ><, I keep ending up with a baby blue material when trying to duplicate your normal mapping method. By any chance, can you elaborate on how you ended up with your normal mapping material the way that you did? Also, did you do any modifying to your color map texture that you got after you rendered it out in 3ds max? The reason I'm asking is because I cannot seem to get my shadows as dark as yours with ambient occlusion.


    1. Well, the way I did ambient occlusion was just with a plain 3DS Max Skylight and a plane under the model that was set to be invisible to the camera (since the skylight generally lights everything around 360 degrees). So part of it might be that you don't have anything shadowing from below, or that your light's too bright, or maybe there's something going on with the exposure settings.

      As for the normal map material, the above picture is really pretty much the extent of it. You plug into the diffuse slot the 'Composite Layers' material and add three layers, each set to be additive. On each of these layers, you set it up like it is on the right panel: give it a falloff map with a primary colour on top and black on the bottom. Set it to 'Towards/Away' and the falloff type to WorldSpace. Then just match the different falloff axes to the different colours. Red to X-Axis, Green to Y-Axis etc.

    2. Awesome, ty, that helped me fix the problem I was running into. The only thing is that what I ended up with for the normal mapping material seems somewhat off coloration-wise, leaving me with normal maps that aren't as prominent in color as they should be. Any idea as to what may be causing it? I uploaded my 3ds max scene, so maybe if you get the time you could take a look at it to get a better idea of what may be causing this issue with color and the ao shadowing, ty!

    3. Will take a look at the scene. With normal maps, make sure to turn self-illumination all the way up, so you don't have pesky lighting interfering.

    4. I've been messing with the settings a bit but still cannot seem to get my normal map colors and ao shadows as dark as yours. Have you been able to check out my scene to see what I may be missing?, I know you are probably busy >< but this is driving me nuts, lol. ty!

    5. I'm afraid I can't open your .max scene, it just throws me some random error. From your screenshot, your normal map looks just fine though. You just need to turn up self-illumination and you'll be good. As for shadows, like I say, try placing a camera-invisible plane under the teapot.

    6. Darn, well I appreciate you trying! I'll definitely use the tips you gave me when it comes time to creating more assets for myself. Thanks for all the help on the matter!

  7. You also take another performance hit by outputting depth from the pixel shader, since this disables early-Z testing. That means you end up running the pixel shader for every pixel, even if it's hidden behind another object that you've already drawn!

    In addition, dynamic shadows don't work with this method.

    On the other hand, your "models" ended up being extremely detailed and beautiful, so it turns out looking really nice.

    1. Yep, that's a downside. It's not a terribly intensive shader, since all it does is pluck things out of the texture samplers and test depth, but you do end up doing a lot of it (especially since you render all the objects in triplicate).

  8. Could you advise how best to handle more than one light source?

    Should I pass an array of sources to the shader, or run the shader once for each source?

    Best regards,

    1. Yeah, definitely go with the array. You'd be limited to a 'maximum' number of lights and possibly if you're using older versions of Pixel Shader it might be more efficient to have a few different variations on the lighting shader (one for one light, one for two lights etc.) but that's not something I can give any definite advice on. It's definitely much cheaper to do everything in one shader pass if you can though.

  9. Thank you for the advice, I'll test different techniques.

    Here's a link to an example of how to calculate multiple lights in pixel shader 2_0. In this article the author suggests handling more than 3 lights with the help of several shader passes.

    Here is the link:
    (sorry, the article is in Russian, but you can see the code)

    1. That's fine, I can read the Russian =) And yeah, in PS 2_0 it's the kind of thing I meant. In newer versions I think you could get away with a variable-sized array, but perhaps not in PS 3_0

  10. This comment has been removed by the author.

    1. Hey 9of9, I really like these posts you have on your blog.

      I'd appreciate a few hints on how you wired your depth testing shader into the XNA code, as I am having quite a few issues myself. I can in a sense see some type of depth happening after a lot of trial and error, but I am still far from the desired result and getting some very weird effects. I really appreciate your help.


      Seems I posted too early, as my trial and error eventually led me in the right direction, lol. The only issue now seems to be that my depth is inverted in its calculation, with black being the highest up and white being the lowest. Here is a picture with a normal height map vs. one with an inverted height map. Any idea what may be causing this?

    2. Post your pixel shaders and SpriteBatch.Begin calls.

    3. You probably just need to set the DepthBufferFunction to make the correct comparison (draw the pixel if higher value than what's in the buffer, discard if otherwise). Check out

    4. Ahh, yes that may be it, but upon looking into it further and implementing it, my sprites end up not being rendered at all. Basically this is the drawing of my height in short:

      Is there anything that I may be missing from what you see in this short blurb of code overviewing my height drawing that is causing the inversion of depth, or what I am doing wrong with depthbufferfunction that is making my sprites invisible? Thank You!

    5. Forgot to add, that previously before creating a depthstencilstate as you can see in the code above, I was simply using the default one, which was when I was getting the inversion of depth problem.

    6. @WhtsTheDeal I ran into the same issue. Inverting the depth buffer resulted in exactly what I was looking for. I still prefer the 'x-ray' look of the original depth buffer so I modified the pixel shader to do the inversion for me.

      //TODO: change this calculation to provide more unique depth values
      Out.depth = ((1.0 - (tex.r + tex.g + tex.b) / 3)) * Scale;

      This was enough to meet my objectives. As for why this is happening, it could be something to do with XNA using a right-handed coordinate system and DirectX being left-handed by default.

  11. Thanks for posting this. I've implemented your technique in a small sample application. I have a question about render passes.

    My first instinct was to completely render the height map first, and then use that for depth testing when I created the normal and diffuse layers.

    However, since additions to the height map require testing against the existing height map, this led to switching render targets for each sprite that needed to go into the height map. (A texture cannot be both the input and output of a pixel shader in XNA 4.0)

    Is there a better way to generate the composite height map? My understanding is that switching render targets is fairly expensive.

    1. Basically, don't forget you've got a depth buffer. When you're drawing the heightmap pass, you're not testing it against the RGB render target that you're in the middle of drawing to: you already have a depth buffer for that very reason. The shader code I have up above shows precisely how to set it up so it actually outputs a depth value from the sprite and that is what performs the depth test.

  12. Thanks for your reply to my message, it was helpful.

    I'm also having some doubts about the method, and I wonder whether it would be better to go with a 3D engine, although in my case that would involve throwing away a lot of code, which I always hate. Originally I was making a simple 2D RPG with a similar look and feel to the Infinity Engine games. Then I started looking for some kind of dynamic lighting solution, after seeing this done in the Eschalon series to very atmospheric effect. (Of course those games are tile based, so they have an easier time with this.)

    The file size issue is one thing I think might be significant for a serious project, there's another indie RPG called Underrail which released an alpha demo recently that came to 500 MB almost all of which was the animated sprites, and this isn't using a method like this.

    I had a question about the depth map: wouldn't it be better to use a 16 bit greyscale image instead of adding the channels as you do? 16 bits is probably overkill, but I think it would be nice in principle to be able to support large vertical structures like, say, the D'arnise keep in BG2:

    Most of the areas on the walls of the keep are actually walkable by the player, which is kind of tricky to do. (I imagine in the IE games it wasn't an issue since you didn't have to worry about dynamic lights in the courtyard area.)

    Another thing I'm not sure about is exactly what settings to use when creating the colour textures for your tiles in the renderer. I'm not really an art guy but I suppose there's a fair bit of freedom here. My first thought was just to set self-illumination to 100, in which case the lighting contribution would be entirely derived from in game dynamic lights. I was looking at some sprites from Icewind Dale:

    and I see a lot of self-shadowing. They generally look great, but they might be made more for a static lighting set-up, or maybe there's a good way to combine the two things.

  13. This comment has been removed by the author.

  14. This comment has been removed by the author.

  15. thanks for the normal map material, was exactly what I needed!:)

  16. Thx!!!!!!
    I'm amazed what you can do with xna +c#+hlsl.....!
    Keep up the good work!

  17. how do i apply this shader on 2d sprite?

  18. Can you Share the Engine?

  19. Hello,
    First I want to express my level of sympathies to you. As I can see in the comments, your work inspires a lot of game developers or just simple gamers. Thanks man for your devotion to your work. Secondly I wish to introduce myself since it is related to the topic (:. I consider myself devoted to my work too, for example I've spent a lot of time on university degrees, learning from the internet and doing projects. The thing is that I am an artist designer, who knows about art but nothing about programming languages. I always dreamed about creating my game with my own world but it was always too much to understand how to do it. I can see that I could probably deal with the engine which you designed. If you would share the engine with me I could credit you as a co creator. Or maybe you have better ideas for me. Maybe you want to work on something together... Sincerely, Kris

  20. The lighting and weather effects are the best I've seen, and the asset layering works spot on. The only thing that is missing is shadows.
    Dude, seriously, I'd be willing to pay a decent amount to use this engine but you do you of course. I wonder how hard it is to recreate in unity

  21. I wonder if Obsidian Entertainment would be interested in this.

    They use a fusion of 2d and 3d elements in pillars of eternity, in unity. They probably want to make Pillars of Eternity 3 after their current project, they now have funding from Microsoft (they were acquired), and no doubt they want to make PoE 3 as pretty as humanly possible.

    There might be something here for them. If you are not going to use this, you should try and reach out.