Category Archives: Programming

DirectX 12 Demos

This last month, DirectX 12 launched alongside windows 10 so I decided to see what would happen if I ported my engine backend to run on something besides OpenGL. Because I’m crazy enough to attempt something like this, I dove into what limited documentation there was on D3D12 and went at it. I wanted to get a head start on some of the concepts that I know are coming when Vulkan launches later this year, and to get myself off of a strict dependency of OpenGL for my engine graphics. One crucial decision that I made early was to continue using just one shader language (GLSL) for even the D3D12 engine backend. I did this because Verto Studio files often contain embedded shader code that is written against the Verto Studio shader standard, and I wasn’t about to break that part of the platform. Luckily, the Angle project’s shader translator proved to be a perfect solution to auto-translate my GLSL code to HLSL and interoperate with the rest of my D3D12 layer quite nicely.

I had worked at this low of a level before and I was amazed at how hard I had to bust my ass to get performance that would first match, and eventually beat my windows NVIDIA OpenGL drivers in some cases. Whats more interesting, is that I had managed to double the framerate performances that I was getting on OpenGL on OSX, which really speaks to how bad the Apple OpenGL drivers really are in comparison to windows. Mechanisms such as pipeline state caching, frame latency planning, and large ahead-of-time heap allocations were all necessary to bring my performance up to par. There still are some hiccups but for the most part, I’m really pleased with what amounted from about a week and a half of messing around with D3D12.

Windows 10 demos are below. Keyboard shortcuts aren’t that great but namely arrow keys revolve the camera, and shift+alt combinations allow you to move forward, and vertically. These demos don’t push the bar graphically and they certainly come nowhere close to pushing D3D12 to what it can do, but if you just want something to run that uses DirectX 12, well here you go.

Download DirectX 12 Demos

OpenGL equivalent demos are provided in the zip archive as well for comparison..

Ambient Occlusion

So I got the ambient occlusion effect implemented in the actual game shader code.  I followed the technique outlined in this article.  I must say, it’s very subtle, but it looks nice.  The real benefit here is that when I turn off the “check depth range” mechanism, I get a very awesome surreal dark outline around all of the models in the scene.  This will be an excellent effect to use when I work on the second half of the game (which will use an artistic black and white noire post processing style signifying the protagonists slowly weakening grip on reality).

I didn’t understand how ambient occlusion worked at all before implementing the effect, but now I must say, I have a pretty reasonable grasp on it.  Basically in one brief description, it takes random samples at every point on the surface to determine how much ambient light from the rest of the scene can reach that point.  In other words, how much of the ambient light is occluded by other nearby surfaces.  This is done by randomly sampling points within a small hemisphere emanating from the surface.  What makes SSAO so nice, is that this technique is approximated entirely in screen space, making it reasonable for real-time rendering in a game.  The downside is it seems to be a little noisy and I see artifacts from time to time on the screen, but hopefully I’ll be able to clean some of that up in time.

Below are some more progress shots showing the kinds of subtle shadowing I can achieve using SSAO in the VGLPP/DrivebyGangster engine.

Ambient Occlusion!

Today was huge!  I am exhausted.  I busted my ass today to get an affect known as screen-space ambient occlusion (SSAO) working in a prototype scene in verto studio.  Right when I was about to give up, I got it working!  I’m too groggy to explain anything about it, so I’ll just post pictures!

This is going to add a huge boost to the visual look-and-feel of the game without a severe performance penalty.  Needless to say, I’m pretty stoked!

 

 

OpenGL point light shadows & Atomspheric FX hacks

So asset creation in the form of modeling the final environment that’ll be used in the game took the past few days.  It’s amazed how much faster it went this time compared the first scene (the city street).  I chalk this up to experience that I’ve picked up now that I’ve been doing this awhile, and my understanding of things that are very time consuming to model vs things I can quickly grab, import, texture, and place from turbosquid.  The perfect balance of both techniques has led to me being done with the dark night alley street scene after just a few days.

Where I spent the most time this time was in the shaders.  I really wanted to have quality rendering and go out with a bang since this’ll likely be the last environment that I do for the game.  I want the dark alleyway to appear foggy and dark, like its just about to rain.  That includes moist shiny cobblestone rocks, and a damp fog effect that feels very palpable to the player.

This idea of mine posed two major problems that I haven’t had to tackle until now.

Point Light Shadows

The first major problem, is that I could no longer “cheat” using a single directional light source representing the “sun” in the scene.  Instead, I had to place a street lamp every building or so and have these be the only sources of light in the scene.  This means, in order to keep my shadows working, I now had to implement shadows from a point light source.  In some ways this was easier than point lighting (no longer necessary to compute an optimal shadow bounding box to be used for the light’s ortho matrix projection).  However, now I needed render to and utilize cube maps to store the shadow depth information.  Surprisingly, there was very little comprehensive information on the web about how to properly do PCF shadows using cubemaps.

What I found that works is the following.

  • Create a cubemap renderer that is positioned at the exact same position as one of the point lights - this special vgl engine object renders the scene into 6 faces of a cubemap with 90-degree fov angles to properly capture the entire scene from “all angles”.
  • Format the cubemap shader to have each face hold floating point depth-component values and keep the size small (256×256) since there will be alot of these.
  • Define an “override” shader to be used by the above cubemap renderer to ensure a simple specialized “light render” shader was used when rendering the scene into the cubemap faces.  
  • Set the cubemap compare mode (GL_TEXTURE_COMPARE_MODEfor the rendered cubemap texture to GL_COMPARE_REF_TO_TEXTURE which helps with percentage-closest-filtering (PCF) mode to allow smoother linear-interpolated shadow edges.  
  • Lastly, pass a rendered shadow cubemap for each light to the scene shaders in the form of samplerCubeShadow

The “light render” shader mentioned above which is used when rendering the depth information to the cubemaps looks like so:

Basically this encodes the distance from the surface point to the light in the form normalized depth information.  This is done by manually overriding the depth value stored in gl_FragDepth.  Each light has its own cubemap renderer with a custom-tailored shader like this (using the correct light index to compute distance from).

The per-fragment lighting shader code that utilizes the cube shadow maps looks like:

Essentially the highlighted code above compares the distance between the surface point to the light to the stored depth value in the cube texture using the fragment-light vector as a lookup value into the cubemap.  Because we’re sampling using a special “shadow” version of the cubemap sampler, the result will be properly interpolated between shadow texels to avoid ugly edges between shadowed and non-shadowed areas.

Luckily, I was able to build this into the Verto Studio editor and associated graphics system and test this out with relatively little trouble.  Even though I have 4 or 5 statically rendered cubemap shadow textures in the entire scene.  I was able to keep performance high by building versions of the shaders for each side of the street so that each individual shader only has to shade with at most 3 lights at a time.  This worked out better than I had expected.

Light-Fog Interaction (Volume FX)

This part was tricky..  I had an idea in my head of how I wanted this to look.  So I did something that usually leads to trouble;  I tried to come up with a volume rendering technique on my own from scratch and implement it and just kinda see how it goes.

The basic idea stems from what I’ve observed in real life from foggy, dark, damp nights and lighting.  Essentially, fog at night is black… IF there is no light around to interact with the fog.  Naturally, if the water droplets don’t have any light to interact with them, you won’t see them and the fog will appear black.  However, if any light interacts with the water vapor in the air, it’ll create the illusion of a whiter colored and denser fog.  So this is what I set out to emulate with my shader.

Now atmospheric shader effects can often lead to the necessity of raymarching and heavy iteration in the fragment shader to simulate the accumulation of light-atomosphere interaction.  To this I said “hell no” since ray marching of any kind in a shader terribly robs performance.  I quickly realized that I could avoid raymarching entirely if I used a simple model to represent the light-fog interaction that I was going for.

In my case, it turned out I could do the whole effect using something as simple as a sphere intersection test.  Basically, when I’m shading a pixel (a point on the surface), I’m interested in what happens to the light on its way back from the surface to the viewer, the surface-to-viewer vector.  If the atmosphere affects the light at any point along this vector, I’ll need to compute that.  In other words, if the ray from the surface to the viewer intersects a sphere centered at the light, then the fog affects the light on the way back to the viewer.  How much?  Well if I calculate the length of the segment between the entry and exit points of the ray intersection (how much of the sphere the ray pierces), I find that length is proportional to both the perceived increase in density of the fog and the brightening of the fog color.

This algorithm is given below in fragment shader code:

 

I’m sure I’m not the first one to come up with this idea before, but it still did feel pretty cool to reason my way through a shading problem like this.  The visual result looks amazing.

The progress of this all is below in a gallery.  

How to 3D sound in windows (hint: it’s not OpenAL)

So yesterday and today, I bit the bullet and started working on compiling the game for windows.  I wanted to do this before the project got too much further along so I didn’t write any code that would have serious implications on windows.  I must say, I’m amazed at how much trouble I didn’t have with the rendering system.  The levels/screens load up and look great!  I did have some issues with visual studio that I chalk up to lack of experience in windows dev on my part.  Namely getting the builds to be 64-bit instead of the default 32, SSE3 operator overload stuff, and getting debug builds to forego some of the extremely slow settings that make the engine run at a crawl.  Once I took care of all that, my little game was running on windows like a champ, complete with same stellar OpenGL 3.3 performance I get on mac.  All was well… that is except for one little tiny issue.  The sound.

One thing that became apparent almost right away was that OpenAL support on windows is in a terrible state.  Counting on it to just be there like OpenGL is a no no.. and installing your own via something like OpenAL soft gives pretty poor support.  I realized that I had to do something I’ve never done before, and that’s directly use a DirectX subsystem.  After some quick googling I discovered what I need can be done using a combination of xAudio2 and X3DAudio.  Both of these systems come with DirectX and the DX SDK so I was already set up to use em.

So basically, the plan on windows was pretty simple:  since I already have all my 3D sound calls abstracted out in my SoundManager class, I all I had to do is re-implement the stuff that OpenAL handled using this new xAudio2 stuff.. and pray that it played nice along side SDL_Mixer which I use for mp3 music.  More or less, things went relatively painless and I was able to find an xAudio2/X3DAudio analog for just about everything I was doing in OpenAL EXCEPT for a little issue I had with source playback.

On mac, I’m able to play a source overtop of itself as many times as I want to.  On windows, this seems to be a no-go.  So for win, I had to manually pre-allocate pools of sources for the sounds that I would be playing rapidly such as bullet ricochet sounds and typewriter keystrokes and ensure that I return an available source whenever I ask to play that sound.  That process more or less worked out just fine.

After all that, we have my reworked soundmanager class for windows.  The code is below for anyone who wants to learn from it.

All of the multiple source stuff is essentially handled by my inner MultiSource class which handles the souce pools for sounds that need em.  I found that I never needed more than 8 sources per sound.  I also had to add a new update3D method to update the 3D sound calculations and call it once every other frame or so (which wasn’t needed in OpenAL since it was done automatically).

Not shown here is a Wave class which essentially is a slightly modified version of the one from this tutorial.  That class basically handles wav file loading for me and manages the sound buffer data internally.  

That’s just about it.  Everything else was pretty straightforward.

C++11′s take on retain cycles

I’d like to take a brief moment to discuss something rather programmy today.  Namely, the topic of retain cycles.  If you’ve ever programmed in Objective-C before (since the addition of ARC) or Swift, this concept will be very familiar to you.  But if you’re coming from other programming languages such as the older (pre-11) C++, This concept will be new and you might want to read on.

Now before I get started, let me say that this example may not be the most correct way to handle delegation via lambda (block) functions in C++11.  Many will tell you that shared ownership of objects in C++ can get you into trouble, and this is certainly a case of how that can happen.  However it’s the most conceptually compatible with other languages, which I value over “c++ correctness” due to the fact that I’m porting code from another language to start with, and shared ownership allows me to get work done without reasoning around the lifespan of my objects.

Now as many newcomers to C++11 know, concepts such as smart pointers and lambda callback delegation can really save you alot of time when developing new systems.  However, there’s ahidden evil in using these two concepts together, that must be uncovered.

Example 1 :: All is well

Consider this example:

and its output:

What we’ve essentially done here is created an object (via shared pointer), set a callback lambda, and issued a method call which at some point calls our callback.  We see that everything is working as desired.  Most importantly, we see that our object is properly deleted when we are finished our program (when the shared pointer goes out of scope).

Example 2 :: Not so much

Now lets look what happens when I add a slightly more complicated, second case.

Obj2 essentially is the same as obj1 right?  All we do is add a little printout of obj2′s someProperty value from within the lambda… and of course, C++11 dictates that we must explicitly declare all external objects that we access within our lambda (a capture list).

So what is the output then?

Note the alarming lack of “object freed” being displayed for obj2.  That’s right, the above code memory-leaks obj2.  You see, obj2 owns someCallback, and someCallback owns obj2 via a capture. This is a retain cycle and it permanently causes obj2 to be “forever retained” and therefore leaks.  Once a retain cycle is made, you can consider things to be more-or-less, too late.

This type of bug is a dangerous silent killer of the night.  Why?  Because it can be so easy to miss when developing.  Objective-C and swift have evolved over time to add compiler and IDE warnings about this kind of “retain cycle” problem.  C++11 assumes that you just wouldn’t write this type of code.  But when porting code over from Objective-C, this kind of code can pop up ALL the time.

So what do we do?

Solution 1 :: The ObjC way

Well Objective-C handles this problem by introducing the concept of “weak pointers”.  These are special pointers that ensure that they do not own their reference, they only keep track of whether or not it is allocated (if any shared pointers to it exist).  Luckily, C++11 brings the weak pointer to us as well.  A solution that uses them looks like so:

You’ll note now that running this code will solve the leak.  It’s also important to note that, while this is an ugly solution, this is/was the only way to solve a retain cycle of this nature in Objective-C.

However, for this explicitly unique case, in C++, there’s another way.

Solution 2 :: The tempting C++ way

What if I told you there was a way to solve the original problem by adding one character to the original code?  Here it is:

See that little ‘&’ added before obj2 in the capture?  This allows us to capture a reference to the smart pointer and not a value.  Why does this matter?  Well when a smart_pointer value is copied, it essentially increments its retain count of the object in question, and when it finally is destroyed by going out of scope, it’ll decrement the retain count, restoring the natural balance to the universe.  However, as I mentioned before, a retain cycle prevents the retain count from ever reaching zero in this case.  A reference to obj2′s smart pointer value allows us to forego the copy-increment mechanism and essentially stop obj2 from being over-retained.  Its essentially like taking a pointer to a pointer….

However, this solution can be dangerous, as it assumes the coder understands the rammifcations of doing things like capturing a reference to a stack variable, an argument, or anything that may be “dead” by the time the callback is ran.  Almost always, this “solution” may give unintended consequences.

The “correct” way

Here’s where things get somewhat opinionated. The “correct” way to handle this really depends on the situation at hand. I suggest that if you a working with a team, you might want to use the safer and more verbose weak pointer method since it is more explicit in its intentions and helps explain why strong ownership could be a problem.  Furthermore, if you are working in a smaller system where shared pointership is not necessary, you might want to avoid using shared pointers altogether (assuming you fully understand the ownership semantics of the objects being used in the captures).

Either way, the golden rule is to understand the side effects of the code you are writing and when using shared pointers, keep that little retain cycle demon in the back of your mind… always

Parallax Occlusion Mapping – Advances in shader land

Today was cool… at least the first part of it was, before I spend 5 hours tracking down a bug in some of the game code.

I stumbled onto a pretty cool little article geared towards a very awesome technique called “parallax occlusion mapping”. This is essentially bump mapping with some extra samples taken at each pixel to provide a realistic depth (or parallax) effect to the surface. This provides the illusion of realistic surface detail all while shading a single polygon. The article was somewhat dated and targeted for Direct3D instead of OpenGL, but I was able to port it over to GLSL 150 for OpenGL 3.2.

Here’s the original article:
http://www.gamedev.net/page/resources/_/technical/graphics-programming-and-theory/a-closer-look-at-parallax-occlusion-mapping-r3262

And here’s my port of the shader code for anyone who is interested.

 

Some of this code is “massaged” by Verto Studio’s code converter when ran on mobile.  It’s definitely a pretty cool effect!  It requires 3 texture maps to work currently, a standard diffuse (color texture) map, a normal map, and a displacement or height map.  Lucky for me, there’s a sick little program called “crazy bump” for mac that can generate both normal maps and displacement maps from any standard diffuse texture map!

 WebGL demo

For those who want to see the shader effect in action, I got a WebGL demo which runs on chrome and safari (firefox and IE don’t work).

It’s an expensive effect however, so I’m not sure yet if I can work it in for Driveby Gangster.  Either way, it’ll definitely be a nice little addition to the shader arsenal.  If I do actually put it into use, I hope to eventually add self-shadowing effects as well.

Swift is fired

So it had to happen…

The more I work on this game the more I realize that I kind of like it… and that means there’s certain things need to change.  For starters, it’s definitely running longer than I expected it to (the original two-week estimate now sounds pretty “out there”).  For this reason, I need to shave down on the programming time. It has become pretty obvious to me that for every 1/2 hour that I spend writing code in swift for this game, 10 minutes of it is spent screwing around with the language and that is friction then I can no longer accept.  So I’m dropping swift for this game.

Since I’m switching languages and I’m going to be rewriting some code, I decided to open things up quite a bit and go back to C++.  It’s going to cost me roughly a week or two to rewrite things, but the benefit is going to be huge.  For starters, since this game depended on my very large and very complicated Objective-C graphics engine (Verto Studio), I now have the “opportunity” to continue rewriting all of this code in C++ (this was originally started on my github as VGLPP).  This means that this game and any other game that I want to use with the ported engine will be able to run on ANY platform including windows.  This is a major win for me since making a game for OSX/iOS only isn’t really that smart.  I’ve been wanting to do this for awhile and my frustrations with swift finally pushed me to do it.

So that’s what I’m going to be up to for awhile.  As I’ve mentioned on twitter, I’ll be streaming roughly twice a week on my twitch channel for anyone who is curiuos about the process and wants to follow along.  http://twitch.tv/vertostudio3d

I’ve already gotten quite a bit done and have surpassed my biggest barrier for doing this which was finding a reasonable way to parse verto studio files (spat out by NSKeyedArchiver/NSCoder objective-c mechanisms) in a portable way in C++.  TinyXML and a little of reverse engineering did the trick.

With respect to all of the “benefits” swift and Objective C were providing over a lower level language such as C++, I’ve essentially found a reasonable replacement for each one thanks to C++11′s new features.

Basic Outline of Porting to C++ 

Swift/Objc C++11
Strong Pointer (Reference Counting) std::shared_ptr
Weak Pointer (Reference Counting) std::weak_ptr
Blocks & Closures Lambda Functions
Protocols Multiple Inheritence
Dynamic Typechecking via “isKindOfClass” C++ typeid / RTTI

So you get the idea, nothings really out of reach with C++. It’s still more annoying and I’m not 100% thrilled with dealing with things like templates, header files and verbose syntax again, but I’ll take it over an unstable language any day.

Lots of code… lots

I’ve been less active on here because I’ve been writing quite a bit of code.  Like I said previously, I’m in the main swing of developing this game.  This is where things start to get crazy.  Namely, I’ve added quite a bit of classes to the project to handle everything from basic collisions and 3D math extensions to generating and displaying 3D text on the screen.  I feel like (despite my previous post regarding Swift) things are keeping organized quite well and I haven’t strayed too far from my original architecture plan.  I’ve spent most of my time working on the PracticeGameStateRunner class which runs the “Target Practice” initial level in the game.  This mini level serves as a point for the player to learn the very simple controls and game mechanics of the game… and it’s serving excellently as a sandbox for me to test these all during development as well.  

I’ll cover just a few pieces of code today to show the changes that I’ve made regarding the game State Runner protocol, and some cool stuff that I’ve been able to do with “smart” enums in swift.  There’s a lot more I can talk about that I don’t want this post to go on forever.

Some Code

State runner protocol.  Now I have some stuff in there to quickly respond to “game controller” and mouse events, all stemming from the game loop class.

Mesh Line Collider – a badly needed construct to determine whether or not a bullet-trajectory would intersect a polygonal mesh in the scene or not (and if so where).  I tried porting this over to ObjC before swift and it was a nitemare.  Swift’s version is definitely simpler thanks to operator overloading and multiple return values.  Note the unsafe pointer craziness in swift which is considerably easier to deal with in C.  (ported from http://geomalgorithms.com/a06-_intersect-2.html)

Cool little WIP controller button enum nested in the game loop class

 

Some Screens

More about swift…

Okay so when you use a language daily, you begin to really pick up some opinions and feelings about it that tend to really stick with you.  With swift, I definitely have some of those.

Let me focus on just two… of the things that really irk me about the freaking swift language.

Number one – Casting

This is my number one gripe with the swift language.  The compiler just wont shut the hell up and do what I was intending.  Now I know that apple has designed the language to absolutely not allow any implicit casting, and I understand their reasons behind this, but man, you’ve taken one benefit, and added 10-fold the costs.  Requiring explicit casting takes code in swift that should look extremely simple and adds about 3-4 explicit constructor-casts to it causing the code to display as bloated, confusing ill-intended crap.  What really bothers me is that this is the case even for upgrading types such as assigning a float into a double.  Another problem is that in the C world (all of the APIs that we work with on a daily basis) there might be 4 or 5 different integer types (int, Sint32, GLuint, etc) and multiple float types (GLfloat, float, CGFloat, etc).  Casting between these in C/C++/Objc is completely harmless 99% of the time and therefore done implicitly.  In swift, I have to pepper this code with constructor casts every, single, freaking time.

In C++, you could allow implicit types by overloading the constructor so that an expression such as int x = someFloat would be the same as a constructor for int that takes a float as an argument.  Swift badly needs this.

For now I’ve gotten to the point where I’ve added very simple computed properties to the various types in swift to at least make some of this casting more painless.  Just a few of those extensions are below.

This makes it so that code that would have to look like

Can be…

 

Number two – Stability (source kit crashes)

Google swift and “sourcekit” and you’ll see, this is nowhere NEAR an actual finished language.  Apple has rushed this out the door and I’m honestly sorry that I’ve decided to use it to make a game.  Imagine typing code and then every half hour you get 30 seconds of a frozen spinner before ”sourcekit crashed” appears in your IDE.  If you’re lucky, it’ll stop for another 5 minutes, but if you aren’t it’ll happen in a loop and you have to go delete your “ModuleCache” dir and quit and restart xcode and start working again…. until about an hour later.  This is NOT how I prefer to work.  I like the “idea” of swift, but its just way too unstable to use.  I’m going to finish this game with it, and then probably stay away from swift for any real projects for another year or so until apple fixes this extremely annoying stability problem.

 

These two things may seem small, but they’re enough to waste 10 minutes of time for every hour I work on this game, and for me, that’s just completely unacceptable.