Exponential Cascaded Shadow Mapping with WebGL

Update: my implementation worked well on “native” OpenGL configurations but suffered from a GLSL-to-HLSL bug in the ANGLE shader cross-compiler used by Firefox and Chrome on Windows. It should now be fixed.

Update 2: you can toggle each cascade frustum using “L” and the camera frustum using “C”.

TL;DR

WebGL Cascaded Shadow Mapping demo powered by Minko

  • arrow keys: rotate around the scene or zoom in/out
  • C: toggle the debug display of the camera frustum
  • L: toggle the debug display of each cascade frustum
  • A: toggle shadow mapping for the 1st light
  • Z: toggle shadow mapping for the 2nd light

Motivations

Lighting is a very important part of rendering convincing real-time 3D scenes. Minko already provides a quite comprehensive set of components (DirectionalLight, SpotLight, AmbientLight, PointLight) and shaders to implement the Phong reflection model. Yet, without projected shadows, the human eye can hardly grasp the actual layout and depth of the scene.

My goal was to work on directional light projected shadows as part of a broader effort to handle sunlight and a dynamic outdoor environment rendering setup (sun flares, sky, weather…).

Full code after the break…

Anamorphic Lens Flare

Update: I’ve just pushed a new SWF with a much-enhanced effect. I’ve tweaked things like the number of vertical/horizontal blur passes – which are now up to 3/6 – but also the flares’ brightness, contrast and dirt texture. I think it looks way better now!

Tonight’s experiment was focused on post-processing. My goal was to implement a simple anamorphic lens flare post-processing effect using Minko. It was actually quite simple to do. Here is the result:


The 1st pass applies a luminance threshold filter:
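
In GLSL, such a filter can be sketched like this (illustrative names; the actual effect is written as a Minko AS3 shader):

```glsl
// Luminance threshold pass: keep only the bright pixels (GLSL ES 1.00 sketch).
precision mediump float;

uniform sampler2D uBackbuffer;   // the scene rendered by the previous pass
uniform float uThreshold;        // e.g. 0.8
varying vec2 vUv;

void main()
{
    vec3 color = texture2D(uBackbuffer, vUv).rgb;
    // Perceptual luminance (Rec. 601 weights).
    float luminance = dot(color, vec3(0.299, 0.587, 0.114));
    // step() returns 1.0 when luminance >= uThreshold, 0.0 otherwise.
    gl_FragColor = vec4(color * step(uThreshold, luminance), 1.0);
}
```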

Then I use a multipass Gaussian blur with 4 passes: 3 horizontal passes and 1 vertical pass. The trick is to apply those 5 passes (1 luminance threshold pass + 4 blur passes) on a texture which is a lot taller than it is wide (32×1024 in this case). This way, everything gets stretched when the flares are composited with the rest of the backbuffer.
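
Each blur pass can be a simple separable Gaussian kernel. Here is a minimal GLSL sketch of a single pass (illustrative names; run it with a horizontal direction for the horizontal passes and a vertical one for the vertical pass):

```glsl
// One separable Gaussian blur pass with a 5-tap binomial kernel (1 4 6 4 1) / 16.
precision mediump float;

uniform sampler2D uSource;
uniform vec2 uDirection;   // (1.0 / width, 0.0) or (0.0, 1.0 / height)
varying vec2 vUv;

void main()
{
    vec3 color = texture2D(uSource, vUv).rgb * 0.375;
    color += texture2D(uSource, vUv + uDirection).rgb * 0.25;
    color += texture2D(uSource, vUv - uDirection).rgb * 0.25;
    color += texture2D(uSource, vUv + 2.0 * uDirection).rgb * 0.0625;
    color += texture2D(uSource, vUv - 2.0 * uDirection).rgb * 0.0625;
    gl_FragColor = vec4(color, 1.0);
}
```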

JIT Shaders For Better Performance

The subject is really vast and complex and I’ve been trying to write an article about it for quite some time now. Recently, I made a small patch to enhance this technique and I thought it was a good occasion to summarize how it works and what its benefits are. In order to talk about this new enhancement, I would like to draw the big picture first.

The Problem

That might look like a complicated post title… but this is complex rather than complicated. Here is how it starts: rendering a 3D object requires executing a graphics rendering program – or “shader” – on the GPU. To make it simple, let’s just say this program will compute the final color of each pixel on the screen. Thus, the operations performed by this shader will vary according to how you want your object to look. For example, rendering with a solid flat color requires different operations than rendering with per-pixel lighting.

Any programming beginner will understand that such a program will test conditions – for example whether to use lighting or not – and perform some operations according to the result of this test. Yes: that’s pretty much exactly what an “if” statement is. It might look like programming trivia to you. And it would be if this program were not meant to be executed on the GPU…

You see, the GPU does not like branching. Not one bit (literally)! For the sake of parallelization, the GPU expects the program to have a fixed number of operations. This is the only efficient way to ensure computations can be distributed over a large number of pipelines without having to care too much about their synchronization. Thus, the GPU has no notion of branching, and each program is a fixed list of instructions that will always be executed in the same order.

Conclusion: shader programs cannot use “if” statements. And of course, loops are out of the game too since they are pretty much pimped-out “if” statements. Can you imagine what such logic would imply for your daily programming tasks? If you simply try to, you will quickly understand that instead of writing one program that can handle many different situations, you will have to write many different programs that each handle a single situation. And then manually choose which one should be launched according to your initial setup…

Workarounds…

Mutables

The simplest workaround is to find “some way” to make sure useless computations do not affect the actual rendering operations. For example, you can “virtually disable” lighting by setting all lights’ diffuse/specular components to 0.
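
Sketched in GLSL (uniform names are illustrative), the idea looks like this:

```glsl
// "Mutables" workaround: the lighting math always runs; the light is
// "disabled" by uploading zeroed components from the CPU.
precision mediump float;

uniform vec3 uLightDiffuse;    // set to vec3(0.0) to "turn the light off"
uniform vec3 uLightDirection;  // normalized, world space
uniform vec3 uMaterialColor;
varying vec3 vNormal;

void main()
{
    // This is still evaluated even when uLightDiffuse is zero
    // and the result is known in advance.
    float lambert = max(dot(normalize(vNormal), -uLightDirection), 0.0);
    gl_FragColor = vec4(uMaterialColor * uLightDiffuse * lambert, 1.0);
}
```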

As you can imagine, this is really a suboptimal option. Performance-wise, it’s actually the worst possible idea: a lot of computations happen and most of them are likely to be useless in most cases.

If/else shader intrinsic instructions

After a few years, shaders evolved and featured more and more instructions. Those instructions are now usable through higher-level languages such as Cg or GLSL, which feature “if” statements (and even loops). How are they compiled into shader code that can run on a GPU? Do they overcome the challenges implied by parallelization?

No. They actually fit in in a very straightforward and simple way: as a shader program must feature a single fixed list of instructions, the two parts of an if/else statement will both be executed. The hardware will then decide which one should be muted according to the actual result of the test performed by the conditional instructions.
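
Sketched in GLSL (illustrative names), the flattening works like this:

```glsl
precision mediump float;

uniform bool uLightingEnabled;
uniform vec3 uLitColor;     // stand-ins for the two branches' computations
uniform vec3 uUnlitColor;

void main()
{
    // What you write: a regular if/else on a uniform.
    vec3 color = uUnlitColor;
    if (uLightingEnabled)
        color = uLitColor;

    // What branch-less GPUs effectively execute: both values are
    // computed, then one of the two results is selected.
    // vec3 color = mix(uUnlitColor, uLitColor, float(uLightingEnabled));

    gl_FragColor = vec4(color, 1.0);
}
```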

The bright side is that you can use this technique to have a single shader program that handles multiple scenarios. The dark side is that this shader is still very inefficient and might eventually exceed the instruction count limit for a single program. On some older hardware, the corresponding hardware instructions simply do not exist…

So even this “brand new” feature that will be introduced in Flash 11.7 and its “extended” profile is far from sufficient.

Pre-compilation

Some engines use high-level shader programming languages (like Cg or GLSL) and a pre-compilation workflow to generate all the possible outcomes. Then, the right shader is loaded at runtime according to the rendering setup. This is the case of the Source engine, created by Valve and used in famous games like Half-Life 2, Team Fortress 2 or Portal.

This solution is efficient performance-wise: there is always a shader that will do exactly and strictly the required operations according to the rendering setup. Plus, it does not have to rely on the availability of specific hardware features. But pre-compilation implies a very heavy and inefficient assets workflow.

Minko’s Solution

We’ve seen the common workarounds and each of them has very strong cons. The most robust implementation seems to be the pre-compilation option despite the obvious workflow issues. Especially when we’re talking about web/mobile applications! But the past 10 years have seen the rise of a technique that could solve this problem: Just In Time (JIT) compilation. This technique is mostly used by virtual machines – such as the JVM (Java Virtual Machine), the AVM2 (ActionScript Virtual Machine) or V8 (Chrome’s JavaScript virtual machine). Its purpose is to compile the virtual machine bytecode into actual machine opcodes at runtime in order to get better performance.

How would the same principle apply to shaders? If you consider your application as the VM and your shader code as this VM’s execution language, then it all falls into place! Indeed, your 3D application could simply compile some higher-level shader code into actual machine shader code according to the available data. For example, some shader might compile differently according to whether lighting is enabled or not, or even according to the number of lights.

With Minko, we tried to keep it as simple as possible. Therefore, we worked very hard to find a way to be able to write shaders using AS3. As the figure above explains, the AS3 shader code you write is not executed on the GPU (because that’s simply not possible). Instead, the application acts as a virtual machine and, as it gets executed at runtime, this AS3 shader code transparently generates what we call an Abstract Shader Graph (ASG). You can see it as an Abstract Syntax Tree for shaders (you can even ask Minko to output ASGs in the console as they get generated, using a debug flag). This ASG is then optimized and compiled into actual shader code for the GPU.

For example: every time you call the add() method in your AS3 shader code, it will create a corresponding ASG node. This very node will be linked with the rest of the ASG as you use it in other operations, until it is finally used as the result of the shader. This result node becomes the “entry point” of the ASG.

Here is what a very simple ASG that just handles solid flat color rendering looks like:

Here is what a (complicated) ASG that handles multiple lights looks like:

Your AS3 shader code is executed at runtime on the CPU to generate this ASG, which will be compiled into actual shader code that will run on the GPU (in the case of Flash, it will actually output AGAL bytecode that will be translated into shader machine code by the Flash Player). As such, you can easily use “if” statements that will shape the ASG. You can even use loops, functions and OOP! You just have to make sure the shader is re-evaluated any time the output might be different (for example when the condition tested in an “if” changes). But that’s for another time…

Using JIT shaders, Minko can dynamically compile efficient shaders shaped by the actual rendering settings occurring at runtime. Thus, it combines the high performance of a pre-compilation solution with all the flexibility of JIT compilation. In my next articles, I will explain how JIT shader compilation can be efficiently automated and how multi-pass rendering can also be made more efficient thanks to this approach.

If you have questions, hit the comments or post in the Minko category on Aerys Answers!

New Minko 2 Features: Normal Mapping And Parallax Mapping

One of Aerys’ engineers – Roman Giliotte – is the most active developer on Minko. He is the one behind the JIT shaders compiler, the Collada loader and the lighting engine. This last project received special attention in the past few days with a lot of new features. Among them: normal mapping and parallax mapping.

The following sample shows the difference between (from left to right) classic lighting, normal mapping and parallax mapping:

The 3 objects are the exact same sphere mesh: they are just rendered with 3 different shaders. You can easily see that the sphere using parallax mapping (on the right) appears to have a lot more details and polygons. And yet it’s just the same sphere rendered with a special shader that mimics the volume effect and details on the GPU.

Parallax mapping can be used to add detail and volume to any mesh. This technique is used in many modern commercial games such as Crysis 2 or Battlefield 3. It makes it possible to load and display far fewer polygons while keeping a high-polygon level of detail.

And of course, thanks to Minko and Flash 11/AIR 3, it works just as well on Android and iOS!

The only thing you need is a normal map and a heightmap, and those two assets are very easy to generate from any actual 3D asset. The technique we use is called “steep parallax mapping”. And thanks to Minko’s exclusive JIT AS3 shaders compiler, you can now use parallax mapping in any of your custom shaders! The code is available on GitHub:
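
To give an idea of what the shader computes, here is a minimal steep parallax mapping sketch in GLSL (illustrative names; the actual Minko implementation is an AS3 shader):

```glsl
// Steep parallax mapping: march along the view ray in tangent space,
// stepping through depth layers until the ray hits the height field.
precision mediump float;

uniform sampler2D uDiffuseMap;
uniform sampler2D uHeightMap;   // 1.0 = raised, 0.0 = lowest point
uniform float uParallaxScale;   // e.g. 0.05

varying vec2 vUv;
varying vec3 vViewDirection;    // tangent-space vector from the surface to the eye

void main()
{
    const float numLayers = 16.0;

    vec3 view = normalize(vViewDirection);
    // UV offset applied each time we step one layer down into the surface.
    vec2 uvStep = uParallaxScale * view.xy / view.z / numLayers;

    vec2 uv = vUv;
    float layerDepth = 0.0;
    // Heights are read as depths below the surface: depth = 1 - height.
    float depthFromMap = 1.0 - texture2D(uHeightMap, uv).r;

    for (float i = 0.0; i < 16.0; i += 1.0)
    {
        if (layerDepth >= depthFromMap)
            break;   // the ray went below the height field: stop marching
        uv -= uvStep;
        layerDepth += 1.0 / numLayers;
        depthFromMap = 1.0 - texture2D(uHeightMap, uv).r;
    }

    gl_FragColor = texture2D(uDiffuseMap, uv);
}
```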

One of the future optimizations is storing the height in the w/alpha component of the normal map. This way, the memory usage will be the same as with normal mapping, but with a much better rendering.

If you have questions or suggestions, you can leave a comment or post on Aerys Answers.

Spritesheets With Minko

Update: the full source code for this tutorial is available in minko-examples on GitHub.

Spritesheets are a simple way to create nice effects with extensive artistic control. You just have to create a texture with a series of cool-looking sprites to get a nice animation. They are widely used for particles, fog and explosions, for example. We used spritesheets to render the explosions and the clouds in our latest game: The Mirage.

When you have the spritesheet itself, you need two things to get it working in your application:

  1. A “frame id” value that will be updated to tell which frame of the spritesheet should be sampled.
  2. A shader that will sample the spritesheet accordingly.
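
Point 2 boils down to remapping the mesh UVs into the current frame’s sub-rectangle. Here is a minimal GLSL sketch, assuming the frames are laid out in a regular grid (uniform names are illustrative):

```glsl
precision mediump float;

uniform sampler2D uSpritesheet;
uniform float uFrameId;      // point 1: updated on the CPU, e.g. by a timer
uniform float uNumColumns;   // e.g. 4.0
uniform float uNumRows;      // e.g. 4.0

varying vec2 vUv;            // mesh UVs in [0, 1]

void main()
{
    // Locate the frame inside the grid...
    float column = mod(uFrameId, uNumColumns);
    float row = floor(uFrameId / uNumColumns);

    // ...and remap the mesh UVs to that frame's sub-rectangle.
    vec2 frameSize = vec2(1.0 / uNumColumns, 1.0 / uNumRows);
    vec2 uv = (vec2(column, row) + vUv) * frameSize;

    gl_FragColor = texture2D(uSpritesheet, uv);
}
```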

And here is what you get:

Get the code right after the break…

Render To Texture With Minko 2

Update: If you want to give it a try, you can start working with Minko 2 beta today!

During last week’s “Starting With Shaders” workshop, nicoptere asked me about Render To Texture (RTT). RTT is a very common technique used for multi-pass rendering. The principle is quite simple: the pre-passes render the scene into one or more textures. Those textures can then be sampled to provide per-pixel data during the final rendering pass.

A very common use for RTT is shadow mapping, a technique to create projected shadows. The shadow mapping rendering pass requires access to the depth of each pixel as seen from the light source which is “emitting” the shadow. Therefore, each mesh will have more than just a rendering pass: we have to add a pre-pass for each light in the scene to generate its “depth map”.
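
As a sketch of what such a pre-pass outputs, here is a minimal GLSL fragment shader (illustrative; without floating-point render targets, the depth is usually packed into the 4 bytes of a regular RGBA texture for extra precision):

```glsl
precision mediump float;

varying float vDepth;   // depth as seen from the light, remapped to [0, 1]

void main()
{
    // Spread the [0, 1] depth across the 4 RGBA bytes.
    const vec4 bitShift = vec4(1.0, 255.0, 65025.0, 16581375.0);
    const vec4 bitMask = vec4(1.0 / 255.0, 1.0 / 255.0, 1.0 / 255.0, 0.0);

    vec4 packedDepth = fract(vDepth * bitShift);
    packedDepth -= packedDepth.yzww * bitMask;   // remove each channel's overflow
    gl_FragColor = packedDepth;
}
```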

RTT can also be used to create portals! So it’s a very cool feature.

Demo/Code snippet after the jump…

Procedural Animated Flag With Minko ShaderLab

I used Alexandre’s work on GPU waves simulation with the ShaderLab to create a French flag with a simple blue-white-red procedural texture. Here is the result:

As always with the ShaderLab, not a single line of ActionScript and everything is hardware accelerated! If you want to subscribe to the Minko ShaderLab beta, you can apply on the “Minko ShaderLab: Beta testers wanted!” thread on Aerys Answers.

Minko ShaderLab: Waves Simulation On The GPU with Flash

Alexandre Cyprien – one of Aerys’ engineers – trained himself on the Minko ShaderLab with quite a challenge: waves simulation on the GPU! Alexandre is doing very extensive work on 3D compression and artificial intelligence algorithms. He is not a GPU programmer, so I’m very happy he was able to create such a cool and complex effect in no time with the ShaderLab! He took his inspiration from the GPU Gems article “Effective Water Simulation from Physical Models”. Here are the results:

There is not a single line of ActionScript here: everything is done on the GPU! Even the little embed above was created using the “share” feature of the ShaderLab I already detailed in a previous post.
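
Under the hood, the GPU Gems technique boils down to summing a few sine waves in the vertex shader to displace a flat grid. Here is a minimal GLSL sketch of that idea (illustrative names and wave parameters; the actual effect was authored visually in the ShaderLab):

```glsl
uniform mat4 uModelToScreen;
uniform float uTime;

attribute vec3 aPosition;   // a flat grid in the xz plane

void main()
{
    vec3 position = aPosition;

    // Sum a few sine waves, each with its own direction,
    // amplitude, frequency and phase speed.
    position.y += 0.20 * sin(dot(vec2(1.0, 0.0), position.xz) * 1.0 + uTime * 1.0);
    position.y += 0.10 * sin(dot(vec2(0.7, 0.7), position.xz) * 2.0 + uTime * 1.5);
    position.y += 0.05 * sin(dot(vec2(0.0, 1.0), position.xz) * 4.0 + uTime * 2.0);

    gl_Position = uModelToScreen * vec4(position, 1.0);
}
```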

You can actually embed it on your own page/blog with the following HTML code:

More samples and implementation details after the jump…

ShaderLab: Embedding Shaders In A Web Page

As you can read in the online Backsun FAQ, we are currently working on a tool called the “ShaderLab”. This tool provides a visual programming environment to create, debug and test shaders in a user-friendly way:

The ShaderLab is a Flex web application. It’s really easy to use and anyone can start playing with it. The only requirement is to know shader basics and 3D maths. The tool is actually so powerful that you can build an entire hardware-accelerated particle engine without a single line of AS3 (you can look at the first screenshot if you don’t believe me). But I won’t speak about that today.

Today I want to introduce a very cool feature based on the fact that we are working on 3D for the web. To me, the web platform is something that must be leveraged to provide new features. Being able to program the GPU and create shaders with a web Flex application is one of those features. Another one is the possibility to share and embed those very shaders in a web page.

Indeed, the ShaderLab will provide a “Share” button “à la YouTube”. It will provide an HTML code – an iframe, really – to embed your creations on your blog/website. Here is the embedding code in action:


More samples and details after the jump…

Single Pass Cel Shading

Cel shading – aka “toon shading” – is an effect used to make 3D rendering look like a cartoon. It was used in many flavours in games such as XIII or Zelda: The Wind Waker.

TL;DR

Click on the picture above to launch the demonstration.

The final rendering owes a lot to the original diffuse textures used for rendering. But this cartoon style is also achieved using two distinct features on the GPU:

  1. The light is discretized using a level function.
  2. A black outline is drawn on the edges.

This is usually done using two passes (or even more). One pass renders the scene into the backbuffer with the lighting modified by the level function. Another pass renders the normals only, and then a final pass is done as a post-process to perform an edge detection algorithm (a Sobel filter) that adds the black outline.

Another technique uses two passes: one to render the object with the lighting, and a second one with front-face culling and a scale offset to render the outline with a solid black color.

But I thought it might be done more efficiently in one pass using a few tricks. The most difficult part here is how to get the black outline. Indeed, we are working on a per-vertex (vertex shader) or per-pixel (fragment shader) basis. Thus, it’s pretty hard to work on edges. It’s pretty much the same problem we encountered when working on wireframe, except here we want to outline the object and not the triangles. But this little difference actually makes it a lot easier for us.

Why? Because detecting the edges of the 3D shape is much easier.

The Outline

The trick is to get the angle between the eye-to-vertex vector and the vertex normal. If that very angle is close to PI/2 (= 90°), then the vertex is on an edge. If the vertex is on an edge, then we will displace it a bit toward its normal. The vertices displaced this way will form an outline around the shape:
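
A minimal GLSL vertex-shader sketch of this displacement (illustrative names; the original is a Minko AS3 shader):

```glsl
uniform mat4 uModelToWorld;     // assumed to have uniform scaling
uniform mat4 uWorldToScreen;
uniform vec3 uEyePosition;      // world space
uniform float uOutlineWidth;    // e.g. 0.05
uniform float uEdgeThreshold;   // e.g. 0.2: how close to 90° counts as an edge

attribute vec3 aPosition;
attribute vec3 aNormal;

varying float vIsEdge;          // 1.0 on the outline, 0.0 elsewhere

void main()
{
    vec3 worldPosition = (uModelToWorld * vec4(aPosition, 1.0)).xyz;
    vec3 worldNormal = normalize((uModelToWorld * vec4(aNormal, 0.0)).xyz);
    vec3 eyeToVertex = normalize(worldPosition - uEyePosition);

    // The dot product is the cosine of the angle: close to 0 means close to 90°.
    vIsEdge = abs(dot(eyeToVertex, worldNormal)) < uEdgeThreshold ? 1.0 : 0.0;

    // Push edge vertices a bit along their normal to form the outline.
    worldPosition += worldNormal * uOutlineWidth * vIsEdge;

    gl_Position = uWorldToScreen * vec4(worldPosition, 1.0);
}
```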

Our fragment shader is pretty simple here: _isEdge is supposed to contain 1 if the vertex is on an edge, and 0 otherwise. Therefore, as we want our outline to be black, we simply use the “lessThan” operation: if the vertex is on an edge, the outline value will be 0, and 1 otherwise. We just have to multiply “outline” with whatever color we want to use.
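
A matching GLSL sketch (vIsEdge plays the role of _isEdge and comes from the vertex shader above; GLSL’s vector lessThan() becomes a scalar comparison here):

```glsl
precision mediump float;

uniform vec3 uDiffuseColor;
varying float vIsEdge;

void main()
{
    // 0.0 on the outline, 1.0 elsewhere: multiplying by it paints the edges black.
    float outline = vIsEdge < 0.5 ? 1.0 : 0.0;
    gl_FragColor = vec4(uDiffuseColor * outline, 1.0);
}
```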

You’ll get the following result:

The Light Level Function

There are many ways to transform a continuous per-pixel lighting equation into one that uses levels. The best way to make it customizable by artists is to use a light texture: the Lambert factor is then used as the UV coordinate to sample that texture.
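
Sketched in GLSL, the light-texture variant looks like this (illustrative names):

```glsl
// The continuous Lambert factor indexes a 1D ramp texture authored by an artist.
precision mediump float;

uniform sampler2D uLightRamp;   // e.g. a 256x1 texture with a few flat bands
uniform vec3 uLightDirection;   // normalized

varying vec3 vNormal;

void main()
{
    float lambert = max(dot(normalize(vNormal), -uLightDirection), 0.0);
    gl_FragColor = texture2D(uLightRamp, vec2(lambert, 0.5));
}
```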

But here I wanted this effect to rely on no texture (except possibly a diffuse one). So I implemented a very simple level function using a Euclidean division:
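
Something along these lines in GLSL (a sketch with illustrative names; floor() plays the role of the Euclidean division):

```glsl
precision mediump float;

uniform vec3 uLightColor;
uniform vec3 uLightDirection;   // normalized
uniform float uNumLevels;       // e.g. 4.0

varying vec3 vNormal;

void main()
{
    float lambert = max(dot(normalize(vNormal), -uLightDirection), 0.0);
    // Quantize the continuous Lambert factor into uNumLevels flat bands.
    lambert = floor(lambert * uNumLevels) / uNumLevels;
    gl_FragColor = vec4(uLightColor * lambert, 1.0);
}
```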

Here is what you get:

The Final Shader

You can combine both effects by simply multiplying the quantized Lambert factor by the outline value computed above.
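
Putting it together, a minimal GLSL sketch of the final fragment shader (same illustrative inputs as the sketches above):

```glsl
precision mediump float;

uniform vec3 uDiffuseColor;
uniform vec3 uLightColor;
uniform vec3 uLightDirection;   // normalized
uniform float uNumLevels;

varying vec3 vNormal;
varying float vIsEdge;

void main()
{
    float outline = vIsEdge < 0.5 ? 1.0 : 0.0;

    float lambert = max(dot(normalize(vNormal), -uLightDirection), 0.0);
    lambert = floor(lambert * uNumLevels) / uNumLevels;

    // Multiplying the quantized lighting by the outline value keeps the edges black.
    gl_FragColor = vec4(uDiffuseColor * uLightColor * lambert * outline, 1.0);
}
```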

And voilà!