To me, this came as a big surprise, as I had never before come into contact with reddit, the main drive behind this rapid build-up of public interest. I don't even know to whom exactly I owe this incredible amount of publicity; in the end, it probably comes down to a lot of people spreading the word, so thank you all for your great interest in this project. I feel like I have been reading through hundreds and hundreds of tweets and comments by now, and I am really grateful that so many of you have found such kind words, even defending my work against some of the criticism without me having to intervene in any way whatsoever.
The PC gaming magazine 'Rock, Paper, Shotgun' was even so nice as to write a short review on the whole thing. Being compared to momentous projects like Narbacular Drop and TAG really is a great honour to me, especially since it was never planned for this extremely experimental prototype to be seen by such a vast audience, and it does clearly not (yet?) match the final depth or quality of either of them.
Of course, one of the most frequently asked questions was whether I would at some point continue working on this project. For the time being, all I can say is that I would definitely love to continue exploring the idea of this prototype, especially with PhysX 3 being out and providing some nifty new ways of customization, plus DirectX 11 becoming more and more available on consumer desktop PCs, providing hardware support for tessellation, which seems a rather perfect fit for this kind of project. However, I have to admit that time is a bit scarce at this very moment, with me being only halfway through my studies. I will certainly keep you posted via this website as well as Twitter when there is new stuff to share with the world.
Another thing that came up in about every other comment was, as expected, the comparison to the by now pretty commonly known mechanics of portals. At the heart of this comparison lies the eternal question that, as someone put it, must be asked: "But is it fun?". I, for my part, would have put it slightly differently: "But could it be made fun?". That one, for now, I cannot answer for sure. While many have already considered the early playable preview I uploaded last year quite fun, others correctly remarked that this prototype is still rather weak in terms of pacing, complexity and reward, in fact providing more of a sandbox-like testing environment than real challenge or variation. This remains subject to lots of thinking, creativity and feedback; rest assured that I have been taking notes while reading through your comments. To sum things up, I will simply end this entry with this link Chris Hecker sent me around this time yesterday, only shortly after the whole world had gone insane.
In other news, I have finally taken the time to capture some footage of a gameplay prototype that I worked on for several weeks about one year ago. The player is given the ability to warp space by placing two warp anchors anywhere in the world, warping the space in-between those anchors. Space warping also affects physics, which may be used to make objects roll off curved surfaces, bridge gaps etc. Like the previous demos, this prototype was built on top of my engine/framework/sandbox called breeze.
Best viewed in HD, click the title and switch to full screen on Vimeo.
Head over to the Torsion project page for more information.
For now I am happy to announce that we're once again featured in 4sceners' list of the best demo scene productions released throughout the last year, this time ranked 1st in the category of 64k Intros.
It has been about this time last year that we actively started working on this production and it's always nice to see that the effort put into it has made at least some kind of longer-lasting impression on other people who know what lies behind such work. The collaboration definitely was great fun, great thanks go out to Christian Rösch (Code), Mark (Code) and Turri (Music).
Best viewed in HD, click the title and switch to full screen on Vimeo.
Download: Liquidiced - 64k PC Intro, Executable
Download: Liquidiced - Unpacked PC Intro, Executable (for those experiencing malware warnings using the packed one)
Again, this was a collaboration, so more credits go to Christian Rösch (Code, GFX), Mark (Code), xTr1m (Code, Music), rip (GFX) and LPChip (Music).
The major trouble that comes with a static CRT is due to the fact that any self-contained module of your program (e.g. an EXE or DLL file) will instantiate its own private copy of the CRT, including its own private heap. The moment modules start sharing objects allocated on one of these heaps across module boundaries, it becomes essential that any shared object be de-allocated on the very heap that it was originally allocated on.
Things get even worse when it comes to sharing more complex objects (e.g. STL strings or containers). These objects may very well appear to have been stack-allocated, yet memory allocation on some heap is inevitably going to occur inside them, and might do so any time you invoke one of their methods.
Fortunately, I have been aware of these issues from the start. Still, circumventing the pitfalls described can be tedious from time to time, which is why I am going to share a few tricks on that matter in this entry.
The first thing to notice is that most of the problems arise from header-only / header-centric libraries, as header code is always instantiated inside the module that is actually using the library, therefore inheriting both the module's private CRT implementation and heap. Some code might even get inlined, blurring the lines between your own module and the foreign library. This is why cross-module STL container sharing works so badly in a static CRT environment: it almost inevitably leads to crashes whenever memory that has been internally allocated by the container code instantiated in one module is freed in another module (e.g. to make way for more memory to be allocated, see vector reallocation).
In consequence, many sources suggest that you never share complex types across module boundaries at all, which would indeed be the simplest and most elegant way to overcome these issues without further caution and thinking. However, this simple rule puts harsh limits on the functionality and simplicity of your library, as there no longer is a way of sharing element collections, strings and so on without the extensive use of pre-allocated buffers.
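To illustrate what sharing via pre-allocated buffers usually looks like, here is a minimal sketch. The function name GetObjectName and the returned string are made up for this example; the point is only that the caller owns the memory, so nothing allocated on one heap is ever freed on another.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical exported library function (name and content are assumptions).
// Copies a name into a buffer owned by the caller and returns the number of
// characters required, not counting the terminator.
std::size_t GetObjectName(char* buffer, std::size_t bufferSize)
{
    // A null buffer is only valid when asking for the required size.
    assert(buffer != nullptr || bufferSize == 0);

    static const char name[] = "breeze";
    const std::size_t required = sizeof(name) - 1;

    if (buffer != nullptr && bufferSize > 0)
    {
        // Truncate if the caller's buffer is too small, always terminate.
        const std::size_t count = (required < bufferSize - 1) ? required : bufferSize - 1;
        std::memcpy(buffer, name, count);
        buffer[count] = '\0';
    }
    return required;
}
```

A caller would typically ask for the required size first (passing a null buffer of size 0), allocate on its own heap, and then call the function a second time. This is safe, but it also shows why such interfaces quickly become tedious.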
Another possible solution would be to write your own non-header-based container library, as code that is compiled into your library and subsequently only accessed via import/export linkage will always use the same heap, namely the one created for the module it was compiled into. This is not really feasible for larger container libraries, yet the observations made here remain relevant for the solutions that follow.
The third possibility to fix issues concerning the STL is to make use of allocators. STL allocators provide a not-too-complicated way of supplying your own source of memory to most of the STL objects. Thus, it is possible to acquire (and release) memory using an internal function that has been firmly compiled and linked into one of your modules, always accessing the same private heap, regardless of the module in which the container is actually used.
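A minimal sketch of such an allocator might look as follows. ModuleAlloc, ModuleFree, ModuleAllocator and shared_vector are names invented for this example; in a real project, the two functions would be exported from exactly one module and merely imported everywhere else. Here they simply wrap malloc and free and count live blocks, so that the sketch can be tried out in a single program.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Number of blocks currently allocated, for demonstration only.
int g_liveBlocks = 0;

// Hypothetical functions that would be compiled into one module only, so
// every call ends up on that module's private heap.
void* ModuleAlloc(std::size_t size)
{
    void* memory = std::malloc(size);
    if (memory)
        ++g_liveBlocks;
    return memory;
}
void ModuleFree(void* memory)
{
    if (memory)
        --g_liveBlocks;
    std::free(memory);
}

// Minimal C++11 allocator forwarding all requests to the functions above.
template <class T>
struct ModuleAllocator
{
    using value_type = T;

    ModuleAllocator() noexcept = default;
    template <class U>
    ModuleAllocator(const ModuleAllocator<U>&) noexcept { }

    T* allocate(std::size_t count)
    {
        void* memory = ModuleAlloc(count * sizeof(T));
        if (!memory)
            throw std::bad_alloc();
        return static_cast<T*>(memory);
    }
    void deallocate(T* memory, std::size_t) noexcept
    {
        assert(memory != nullptr); // containers only return what they acquired
        ModuleFree(memory);
    }
};

// All instances are interchangeable, as they share one source of memory.
template <class T, class U>
bool operator ==(const ModuleAllocator<T>&, const ModuleAllocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator !=(const ModuleAllocator<T>&, const ModuleAllocator<U>&) noexcept { return false; }

// Vector type whose storage always comes from ModuleAlloc.
template <class T>
using shared_vector = std::vector<T, ModuleAllocator<T>>;
```

Note that this only takes care of where the memory comes from. All modules still have to agree on the layout of the container itself, meaning the same compiler, the same standard library version and the same relevant build settings.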
Having overcome STL container problems, there are more issues to be solved, such as creation and deletion of custom-type objects scattered across several modules. One simple and elegant solution would be to make use of the abstract factory pattern, once again firmly compiling and linking both creation and deletion into one module, subsequently making calls to these functions instead of constructing or destroying objects manually. This is also the pattern that COM follows.
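The factory approach could be sketched like this. Texture, CreateTexture and DestroyTexture are hypothetical names chosen for the example, and everything lives in one file here, whereas in practice the implementation part would be compiled into the library module and only the interface and the two functions would be visible to clients.

```cpp
#include <cassert>
#include <memory>

// Interface shared across modules.
class Texture
{
public:
    virtual int Width() const = 0;

protected:
    // Not accessible to clients, who have to go through DestroyTexture().
    ~Texture() = default;
};

// Hypothetical exported functions: creation and deletion both happen inside
// the module these are compiled into, and therefore on the same heap.
Texture* CreateTexture(int width);
void DestroyTexture(Texture* texture);

// --- Implementation, hidden inside the library module ---
namespace
{
    int g_liveTextures = 0; // for demonstration only

    class TextureImpl final : public Texture
    {
    public:
        explicit TextureImpl(int width) : m_width(width) { ++g_liveTextures; }
        ~TextureImpl() { --g_liveTextures; }
        int Width() const override { return m_width; }

    private:
        int m_width;
    };
}

Texture* CreateTexture(int width)
{
    assert(width > 0);
    return new TextureImpl(width);
}

void DestroyTexture(Texture* texture)
{
    delete static_cast<TextureImpl*>(texture);
}

// Client-side convenience: a smart pointer that calls the factory's
// deletion function instead of the client module's operator delete.
struct TextureDeleter
{
    void operator ()(Texture* texture) const { DestroyTexture(texture); }
};
using TexturePtr = std::unique_ptr<Texture, TextureDeleter>;
```

Wrapping the returned pointer in TexturePtr right away means that client code can never accidentally delete the object on the wrong heap.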
Another solution involves LOCALLY overloading operator new and operator delete inside often-shared classes. Overloading these will ensure that calling new and delete for the corresponding classes will acquire and release memory using the specified overloads instead of the global operators. In this case, the overloads have to follow the same rules as the custom STL allocators discussed earlier. It is also possible to put these overloads into a common base class, making sharing as easy as deriving from this class.
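As a sketch of the base class variant: Shared, Mesh, ModuleAlloc and ModuleFree are names made up for this example, and the two functions again stand in for functions that would be exported from a single module.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Number of blocks currently allocated, for demonstration only.
int g_liveBlocks = 0;

// Hypothetical functions that would be compiled into one module only.
void* ModuleAlloc(std::size_t size)
{
    void* memory = std::malloc(size);
    if (memory)
        ++g_liveBlocks;
    return memory;
}
void ModuleFree(void* memory)
{
    if (memory)
        --g_liveBlocks;
    std::free(memory);
}

// Common base class: anything derived from it is allocated and freed via
// ModuleAlloc / ModuleFree, no matter which module calls new or delete.
class Shared
{
public:
    static void* operator new(std::size_t size)
    {
        assert(size > 0);
        void* memory = ModuleAlloc(size);
        if (!memory)
            throw std::bad_alloc();
        return memory;
    }
    static void operator delete(void* memory) noexcept
    {
        ModuleFree(memory);
    }

    // Virtual, so that deleting via a base class pointer works as well.
    virtual ~Shared() = default;

protected:
    Shared() = default;
};

// Hypothetical shared class, becoming shareable simply by deriving.
struct Mesh : Shared
{
    int vertexCount = 0;
};
```

Note that this sketch only covers single objects. Arrays created with new[] would still use the global operators unless operator new[] and operator delete[] are overloaded in the same way.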
One final remark to be made concerns virtual methods. As, at compile time, the compiler has no way of knowing which method is going to be called in the end, a virtual method called at run-time has to lead to a method instantiated in the module that the corresponding object has originally been created in, thus, code inside virtual methods will always use the same private heap that was used on construction of the object in question.
Curiously enough, this is the reason why it is possible to use STL exceptions in a static CRT DLL environment, although these exceptions are commonly implemented using a standard STL string without any additional allocation care. Due to the fact that the standard exception classes' destructors are virtual, the message string is always freed on the same heap as it has been constructed on, regardless of module boundaries crossed by exception handling.
Final thought: Avoid statically linking against the CRT when you can afford it, as the effort, complexity and implications of supporting it are high. Also, if you do, be prepared to build your own binaries of the libraries that your project depends on, as static CRT binaries have been dropped from the binary packages of many libraries (however, building your own binaries may also be necessary for many other reasons, e.g. when tuning the performance of your project via the _SECURE_SCL define).
Further reading:
[KK's Blog] Dynamically linking with MSVCRT.DLL using Visual C++ 2005 ("History of the CRT" + Loads of most valuable CRT information)
[Stack Overflow] What's the differences between VirtualAlloc and HeapAlloc?
[MSDN] Heap: Pleasures and Pains
[MSDN] Comparing Memory Allocation Methods
Best viewed in HD, click the title and switch to full screen on Vimeo.
Download: Imagine - 48k PC Intro, Executable
Download: Imagine - Unpacked PC Intro, Executable (for those experiencing malware warnings using the packed one)
As this was a collaboration, huge credits go to Christian Rösch (Code), Mark (Code) and Turri (Music).
For more information and video footage, see the second post below this one.
So now it's time to catch up with all the new stuff introduced in preparation for my big Devmania presentation of the breezEngine in its current state. To avoid writing the same things over and over again, I just uploaded the original presentation as well as an English translation for the non-German readers of this blog:
Download (German original): breezEngine Devmania 2009 [~2MB]
Download (English translation): breezEngine Devmania 2009 [~2MB]
Besides, I just released the first public tech demo incorporating physics and minor game play, a mix of jump'n'run and puzzle elements. The demo was built in less than a week, more or less in parallel to the first prototype implementation of the new physics module. Therefore, it is neither balanced nor really finished, yet it already shows off some quite nice visuals and dynamics.
For those of you interested and matching the ridiculously high system requirements (Shader Model 3.0, lots of VRAM and fill rate, meaning quite a decent graphics card + latest DirectX Runtime, PhysX System Software, VC 2005 Runtime, for download and installation links, either click or see READ ME), there now is a public download available:
Download (for installation instructions, see READ ME): Bonny Nightmare [~1 MB]
Best viewed in HD, click the title and switch to full screen on Vimeo.
Download real-time demo: RAR - Devmania Invitation 2009 [18 MB]
Watch ambience capture: RAR presented live at Evoke 2009 on YouTube
Note that this demo is the result of a very close cooperation with Christian Rösch, who came up with the idea of using the breezEngine to create PC Demos and who was the creative mind behind all but one scene in this particular demo - it should be pretty obvious which scene I am talking about. ;-)
As already stated in the title, the breezEngine rendering pipeline is fully shader-driven, allowing programmers to change the way the scene is rendered and processed simply by changing shader code. This enables programmers to introduce a great variety of entirely new shaders to the engine without any need for custom application-side integration code.
Communication between engine and shaders is handled by the so-called effect binders. The engine offers a whole bunch of these classes, providing the data necessary to transform and render objects as well as to perform lighting, shadowing and processing. In addition, these effect binders handle the creation of temporary, permanent and persistent render targets, automatically setting, swapping and scaling them at the request of the given shader. Effect binders even allow for dynamic flow control when it comes to rendering multiple passes, repeating or skipping passes according to the context in which the shader is used.
This all sounds extremely general, so here you go with some example code:
// Enables additive Blending
pass AdditiveBlending < bool bSkip = true; >
{
    AlphaBlendEnable = true;
    BlendOp = ADD;
    SrcBlend = ONE;
    DestBlend = ONE;
}

// Light pass
pass LightDP < string Type = "Main"; string LightTypes = "Directional,Point";
    bool bRepeat = true; string PostPassState = "AdditiveBlending"; >
{
    VertexShader = compile vs_2_0 RenderPhongVS();
    PixelShader = compile ps_2_0 RenderPhongPS(GetLightingPermutation(LIGHT_DIRECTIONAL, LIGHT_POINT));
}

This snippet is taken from the engine's default phong shader which makes use of quite a lot of the features described in the text above: The first pass is only used as a state block, the annotation bSkip telling the effect binder being responsible for dynamic flow control to skip this pass during normal rendering. The second pass is one of the passes that actually perform lighting, each of these passes providing one permutation for all of the different (pre-sorted) possible light type combinations. The Type annotation specifies that this pass is to be applied during the main (shading) stage of the current frame, there are also a pre (depth and additional data) stage as well as several processing stages. The LightTypes annotation specifies order and type of the lights that may be applied to this pass, bRepeat stating that the pass may be repeated several times if there are more lights to follow fitting this permutation. Finally, the PostPassState annotation specifies the name of the state-block pass defined earlier, thus once a lighting pass has successfully been applied, render states are changed to enable additive blending of the passes to follow.
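To give a rough idea of how the flow control described for this snippet could work on the application side, here is a small C++ sketch. This is not the engine's actual code: the Pass structure, the greedy matching strategy and all names are assumptions made for illustration, and the handling of PostPassState is left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class LightType { Directional, Point };

// Application-side view of one effect pass and its annotations.
struct Pass
{
    std::string name;
    bool skip;                         // bSkip
    bool repeat;                       // bRepeat
    std::vector<LightType> lightTypes; // LightTypes
};

// Checks whether the lights starting at 'offset' fit the pass's permutation.
bool Matches(const Pass& pass, const std::vector<LightType>& lights, std::size_t offset)
{
    if (pass.lightTypes.empty() || offset + pass.lightTypes.size() > lights.size())
        return false;
    for (std::size_t i = 0; i < pass.lightTypes.size(); ++i)
        if (lights[offset + i] != pass.lightTypes[i])
            return false;
    return true;
}

// Returns the names of the passes in the order they would be drawn,
// consuming the given light list front to back.
std::vector<std::string> SchedulePasses(const std::vector<Pass>& passes,
                                        const std::vector<LightType>& lights)
{
    std::vector<std::string> schedule;
    std::size_t nextLight = 0;

    for (const Pass& pass : passes)
    {
        if (pass.skip)
            continue; // state-block passes are never drawn on their own

        while (Matches(pass, lights, nextLight))
        {
            schedule.push_back(pass.name);
            nextLight += pass.lightTypes.size();
            if (!pass.repeat)
                break;
        }
    }

    assert(nextLight <= lights.size());
    return schedule;
}
```

With a light list of directional, point, directional, point, the LightDP pass from the snippet would be scheduled twice under these assumptions, while AdditiveBlending would never show up in the schedule.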
Here's another extract taken from the tonemapping effect:
// Average luminance
pass LogLuminance < string Destination0 = "LuminanceTexture";
    float ScaleX = 0.25f; float ScaleY = 0.25f; >
{
    VertexShader = compile vs_3_0 Prototype::RenderScreenVS();
    PixelShader = compile ps_3_0 RenderDownscaledLogLuminancePS(g_screenSampler, g_fScreenResolution);
}

pass AvgLuminance < string Destination0 = "LuminanceTexture";
    float ScaleX = 0.25f; float ScaleY = 0.25f;
    int ResolutionX = 4;
    bool bRepeat = true; >
{
    VertexShader = compile vs_2_0 Prototype::RenderScreenVS();
    PixelShader = compile ps_2_0 RenderDownscaledLuminancePS(g_luminanceSampler, g_fLuminanceResolution);
}

pass ExpLuminance < string Destination0 = "AdaptedLuminanceTexture"; int ResolutionX = 1; int ResolutionY = 1; >
{
    VertexShader = compile vs_2_0 Prototype::RenderScreenVS();
    PixelShader = compile ps_2_0 RenderDownscaledLuminancePS(g_luminanceSampler, g_fLuminanceResolution, true);
}

This snippet shows off some of the post-processing features provided by the effect binders. The first pass specifies a custom temporary texture to render the average logarithmic luminance to, at the same time requesting the engine to scale the render target down to 1/4 of the original resolution. The second pass further averages the logarithmic luminance rendered by the previous pass, repeating the down-scaling in steps of 1/4 until an x resolution of 4 is reached. Note that as this is a processing effect, render targets are automatically swapped unless explicitly specified otherwise. The third pass performs one more step of averaging, outputting the exponential value of the result into a different custom persistent render target of the resolution 1x1 (persistent, as the result needs to be blended with the luminance value of the previous frame to simulate eye adaptation). Of course, the 0 in Destination0 implies that it is possible to make use of multiple render targets at once.
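To make the arithmetic behind the repeated down-scaling a bit more concrete, here is a small C++ sketch computing the chain of horizontal resolutions. How the engine actually rounds and clamps intermediate sizes is not stated in this entry, so truncation and clamping to the target resolution are assumptions, as is the function name.

```cpp
#include <cassert>
#include <vector>

// Repeatedly scales a horizontal resolution by 'scale' until 'targetX' has
// been reached, returning every intermediate resolution along the way.
std::vector<int> DownscaleChain(int startX, float scale, int targetX)
{
    assert(startX > 0 && targetX > 0);
    assert(scale > 0.0f && scale < 1.0f);

    std::vector<int> chain;
    int x = startX;

    while (x > targetX)
    {
        x = static_cast<int>(x * scale); // truncation is an assumption
        if (x < targetX)
            x = targetX;                 // never drop below the target
        chain.push_back(x);
    }
    return chain;
}
```

For a screen that is 1024 pixels wide, a scale of 0.25 and a target of 4, this yields 256 (the LogLuminance pass), followed by 64, 16 and 4 (three repetitions of the AvgLuminance pass), before ExpLuminance finally writes the 1x1 result.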
Creating new render targets to perform averaging, blurring and similar operations on is easy as pie:
// Luminance texture
Texture g_luminanceTexture : LuminanceTexture <
    string Type = "Temporary";
    string Format = "R32F";
>;

// Adapted luminance texture
Texture g_adaptedLuminanceTexture : AdaptedLuminanceTexture <
    string Type = "Persistent";
    string Format = "R32F";
>;

In that way, the whole rendering pipeline may be customized simply by changing shader code:

// Depth texture
Texture g_sceneDepthTexture : SceneDepthTexture <
    string Type = "Permanent";
    string Format = "R32F";
    string DefaultIn = "Pre";
    string FinalIn = "Pre";
    bool bClear = true;
    bool bClearDepth = true;
    float4 ClearColor = 2.0e16f;
>;

// Scene texture
Texture g_sceneTexture : SceneTexture <
    string Type = "Permanent";
    string Format = "R16G16B16A16F"; // HDR
#ifndef SCREEN_PROCESSING
    string DefaultIn = "Main,Processing";
    string FinalIn = "Main,Processing";
#else
    string DefaultIn = "Main";
    string FinalIn = "Main";
#endif
    string DefaultSlot = 0; // Allows for MRT
    bool bClear = true;
    float4 ClearColor = float4(0.0f, 0.0f, 0.0f, 1.0f);
>;

The DefaultIn and FinalIn annotations also explain why it is not always necessary to specify custom render targets: DefaultIn specifies the stages that the render target is to be used in whenever there is no explicit destination given. FinalIn specifies the stages during which the render target is to be promoted onto the screen or onto different objects (e.g. when rendering reflections), given that one of the stages specified is the last to be rendered.
One more important feature concerning the shader-drivenness is the possibility to define render queues inside the shader framework:
// Solid renderables
RenderQueue g_solidRenderQueue : SolidRenderQueue <
    unsigned int Layer = 0;
    bool bDefault = true;
>;

// Canvas renderables
RenderQueue g_canvasRenderQueue : CanvasRenderQueue <
    unsigned int Layer = 1;
    bool bPrePass = false;
    bool bDepthSort = true;
>;

// Alpha renderables
RenderQueue g_alphaRenderQueue : AlphaRenderQueue <
    unsigned int Layer = 2;
    bool bPrePass = false;
    bool bDepthSort = true;
>;

Render queues specify a layer number that influences the order in which the queues are rendered (similar to the css layer attribute, haha), as well as certain flags, such as switching off specific stages for a particular queue, or enabling depth sort for alpha-transparent objects. A shader may then specify a render queue inside its technique annotations:

technique SM_2_0 < string RenderQueue = "AlphaRenderQueue"; >
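On the application side, the Layer and bDepthSort annotations could translate into something like the following C++ sketch. The structures and the function name are invented for this example, and sorting back to front is an assumption (it is the usual choice for alpha-transparent objects); the engine's real data structures are not shown in this entry.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

struct Renderable
{
    std::string name;
    float viewDepth; // distance from the camera along the view direction
};

// Application-side view of a render queue and its annotations.
struct RenderQueue
{
    std::string name;
    unsigned int layer; // Layer
    bool depthSort;     // bDepthSort
    std::vector<Renderable> renderables;
};

// Returns the names of all renderables in the order they would be drawn:
// queues by ascending layer, depth-sorted queues from back to front.
std::vector<std::string> BuildDrawOrder(std::vector<RenderQueue> queues)
{
    std::stable_sort(queues.begin(), queues.end(),
        [](const RenderQueue& a, const RenderQueue& b) { return a.layer < b.layer; });

    std::vector<std::string> order;

    for (RenderQueue& queue : queues)
    {
        if (queue.depthSort)
            std::stable_sort(queue.renderables.begin(), queue.renderables.end(),
                [](const Renderable& a, const Renderable& b) { return a.viewDepth > b.viewDepth; });

        for (const Renderable& renderable : queue.renderables)
        {
            // Objects behind the camera are assumed to be culled earlier.
            assert(renderable.viewDepth >= 0.0f);
            order.push_back(renderable.name);
        }
    }
    return order;
}
```

Solid objects on layer 0 would thus always be drawn before anything in the alpha queue on layer 2, no matter in which order the queues were declared.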
I won't go into the details today, instead I will just try to please you with another video showcasing the already well-known Amsterdam TechDemo in a completely new light:
Stay tuned, great things are about to happen in the very near future. (And by the way, thanks for all the nice feedback!) Wonder what Devmania is? Check it out here. And sorry for the shaky free-hand cam...
The demo was developed in exactly one week, inspired by my uncle's idea of putting up towers that are connected by ropes all over Amsterdam, transporting people anywhere solely by means of gravity, and thereby solving the city's traffic problems. Besides being a quite innovative concept, this idea proved to be a great opportunity to evaluate the engine's current capabilities and workflow. There's even a video online, showing some footage taken from this testing application:
Shortly after the previous entry, I finished my experimental implementation of Screen Space Ambient Occlusion. Blurring the occlusion buffer turned out to be quite a bit harder than expected, as the random noise generated by the ambient occlusion algorithm turned into nasty blotchy artifacts the moment I tried applying some Gaussian blur. After lots of experimenting, I ended up using a 12-sample Poisson disc to blur the occlusion buffer, which resulted in an ok-ish image with some barely noticeable smooth random patterns left. Finally, I combined the occlusion buffer with the original scene:

Besides doing lots of architectural reorganization and refactoring, I also started to implement shadow mapping. Up to now, directional light (sun) shadows are the only shadows available, as high-quality long-range shadows seemed the most demanding to me, clearing the way for other light types' shadows as well. The current implementation uses a technique known as PSSM (parallel-split shadow maps), orthographically rendering the scene to one of three different shadow splits, depending on the viewer's distance (visualized on the left screenshot). Filtering & softness still missing:
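For those wondering how the split distances are typically chosen: the PSSM literature suggests a "practical split scheme" that blends a logarithmic with a uniform distribution. The following C++ sketch implements that scheme and the selection of a split by view depth. Whether the current implementation uses exactly this scheme is not stated here, so take it as an illustration; the function names and the lambda parameter are assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Computes splitCount + 1 split plane distances between the near and the far
// plane, blending a logarithmic and a uniform distribution via 'lambda'.
std::vector<float> ComputeSplitDistances(float nearPlane, float farPlane,
                                         int splitCount, float lambda)
{
    assert(nearPlane > 0.0f && farPlane > nearPlane && splitCount > 0);
    assert(lambda >= 0.0f && lambda <= 1.0f);

    std::vector<float> distances(splitCount + 1);

    for (int i = 0; i <= splitCount; ++i)
    {
        const float fraction = static_cast<float>(i) / splitCount;
        const float logarithmic = nearPlane * std::pow(farPlane / nearPlane, fraction);
        const float uniform = nearPlane + (farPlane - nearPlane) * fraction;
        distances[i] = lambda * logarithmic + (1.0f - lambda) * uniform;
    }
    return distances;
}

// Returns the index of the split that contains the given view depth.
int SelectSplit(const std::vector<float>& distances, float viewDepth)
{
    assert(distances.size() >= 2);
    const int splitCount = static_cast<int>(distances.size()) - 1;

    for (int i = 0; i < splitCount; ++i)
        if (viewDepth < distances[i + 1])
            return i;

    return splitCount - 1; // beyond the far plane: use the last split
}
```

With lambda set to 1, the splits become purely logarithmic and concentrate resolution close to the viewer; with lambda set to 0, they are evenly spaced. Values in-between are a common compromise.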

Yet, by far the largest number of commits throughout these last two months was dedicated to the API. These changes include both generalization (e.g. to make lights available in processing effects) and simplification (making overly nested methods more accessible). Lots of classes were renamed according to their final responsibilities, making the API much more understandable and intuitive. Although there are still some methods missing, the basic class hierarchy may now be considered close to final.
My original intention was to implement a simple class taking over the management of scene elements that would frequently be required throughout the whole rendering process of a scene. This concept included collections of lights, renderables and perspectives, a central interface providing all the information needed to render a typical 3D object. The concept worked out pretty well. Soon, I had my first lit objects on screen, which can still be seen on the breezEngine project page.
Following this rather basic functionality, I started implementing a post-processing framework. This processing pipeline basically consisted of a list of processing effects being applied to the fully drawn scene, one after the other. The processing pipeline also provided depth, scene and screen targets any effect could write to and read from. In addition, it implemented the swapping mechanism necessary to allow for chaining of several effects. Of course, I also ran into the mysterious pixel shifting issues that almost certainly occur whenever people start implementing post-processing for the first time. Fortunately, there is this great article by Simon Brown on the net, explaining all about these issues.
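The swapping mechanism mentioned above is commonly known as ping-ponging: each effect reads the result of the previous one from one target and writes to the other, then the two are swapped. A CPU-side C++ sketch of the idea, with plain float vectors standing in for render targets and all names being assumptions, could look like this:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Stand-in for a render target.
using Image = std::vector<float>;

// An effect reads from 'source' and writes every element of 'destination'.
using Effect = std::function<void(const Image& source, Image& destination)>;

// Applies all effects one after the other, swapping source and destination
// after each effect so that every effect sees the previous effect's output.
Image RunEffectChain(const Image& scene, const std::vector<Effect>& effects)
{
    Image front = scene;      // holds the latest result
    Image back(scene.size()); // receives the next result

    for (const Effect& effect : effects)
    {
        // Effects are expected to keep the size of the targets.
        assert(back.size() == front.size());
        effect(front, back);
        std::swap(front, back);
    }
    return front;
}
```

Two targets are sufficient for a chain of any length, as an effect never needs anything older than the output of its direct predecessor.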
Next, I introduced intermediate render targets into the processing pipeline, enabling shaders to define additional textures of arbitrary dimensions to write their temporary output to, allowing for blurring and downsampling without any need for additional engine code. The result of these efforts can also be seen on the breezEngine project page as well as in the second entry below this one.
Afterwards, I realized that the concept of intermediate texture targets had even more potential than the actual implementation made use of. The basic idea was to generalize the possibility of defining additional render targets for all effects, moreover introducing the possibility to share these intermediate target textures among all effects. This led to the distinction between "temporary" and "permanent" render targets, the former only existing throughout the execution of the corresponding shader code, the latter existing throughout the rendering process of the whole scene. With this functionality implemented, it is not only possible to add pre-processing effects preparing scene-wide textures such as ambient occlusion, but it is also possible to change the whole process of rendering. For example, by introducing additional render targets, it is now possible to also render positions, normals and material IDs, allowing for the implementation of deferred shading only by changing shader code. In the end, I even removed all of the predefined render targets except for the depth buffers (and the back buffer, naturally), which led to a pretty neat design.
Lastly, the obligatory screen shots of my first attempt implementing Screen Space Ambient Occlusion:

I might also cover some of the theory behind this technique in another entry.
First thing, I reinstalled Subversion as I was once again fed up with all those hacky *.*_ and *.*__ backup file names. (Yes, I did work without any version control since I moved to Vista, due to laziness I guess, although it could have saved me some trouble). After some nasty fiddling around with Apache and its modules, I finally got it working again and I'm really pleased with this new comfort.
Afterwards, having all the freedom of messing around with my code without breaking anything, I did some testing on the efficient rendering of landscape, which resulted in a first prototype of a hierarchical patch-based level of detail approach, theoretically allowing for infinite tessellation, although, as always, strongly bounded by memory constraints. This approach basically works like a quadtree, splitting the landscape into four patches, each again being split into four patches, and so on. Every patch uses the same vertex and index data, allowing for extremely memory-efficient rendering. The height data is read from texture, which sadly limits this approach to Shader Model 3.0 hardware. I did some testing on reading the height data from texture on-the-fly, interleaving it into the vertex data using vertex streams on Shader Model 2.0 hardware. Updating heights only when necessary, this did indeed work out. Yet, naturally, it greatly increased the number of vertex buffers, requiring a lot more VRAM. I ended up using a pool of vertex buffers shared among all patches, acquired and released according to the patches' current level of detail and visibility.
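The quadtree-like subdivision can be sketched in a few lines of C++. The split criterion used here (a patch is split while the viewer is closer than the patch size times some detail factor) is an assumption made for this example, as are all names; the prototype's actual metric is not described in this entry.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// One square landscape patch selected for rendering.
struct Patch
{
    float centerX, centerZ;
    float size;
    int level;
};

// Recursively splits a patch into four children while the viewer is close
// enough, collecting the patches that are to be rendered in 'output'.
void SelectPatches(float centerX, float centerZ, float size, int level, int maxLevel,
                   float viewerX, float viewerZ, float detailFactor,
                   std::vector<Patch>& output)
{
    assert(size > 0.0f && level <= maxLevel);

    const float distance = std::hypot(viewerX - centerX, viewerZ - centerZ);

    // Far away or fully refined: render this patch as is.
    if (level == maxLevel || distance > size * detailFactor)
    {
        output.push_back(Patch{centerX, centerZ, size, level});
        return;
    }

    const float half = size * 0.5f;
    const float quarter = size * 0.25f;

    for (int i = 0; i < 4; ++i)
    {
        const float childX = centerX + ((i & 1) ? quarter : -quarter);
        const float childZ = centerZ + ((i & 2) ? quarter : -quarter);
        SelectPatches(childX, childZ, half, level + 1, maxLevel,
                      viewerX, viewerZ, detailFactor, output);
    }
}
```

As every selected patch can be drawn with the same vertex and index data, only the position, size and height data differ per patch. Patches near the viewer end up small and numerous, while distant ones stay large.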

An early stage of my landscape implementation can also be seen on the last screen shot featuring god rays on the breezEngine project page. More images to follow soon, including my early experiences with Screen Space Ambient Occlusion.
Besides, I am still working on the rendering pipeline, currently introducing simulation-wide scene management, perspectives and lots of other stuff. The post-processing framework is more or less complete, there has been a lot of reworking since the last entry, opening up lots of new possibilities such as mixing of forward and deferred rendering, pre-processing, shader-controlled temporary and permanent render targets etc., which I am also going to describe in one of the next entries.
Hope you like 'em. The framework's still WIP, yet I think it's already working out quite fine.