Takua Render Revision 5

Rough blue metallic XYZRGB Dragon model in a Cornell Box, rendered entirely with Takua Render a0.5

I haven’t posted much at all this past year, but I’ve been working on some stuff that I’m really excited about! For the past year and a half, I’ve been building a new, much more advanced version of Takua Render completely from scratch. In this post, I’ll give a brief introduction and runthrough of the new version of Takua, which I’ve numbered as Revision 5, or a0.5. Since I first started exploring the world of renderer construction a few years back, I’ve learned an immense amount about every part of building a renderer, ranging from low-level architecture all the way up to light transport and surface algorithms. I’ve also been fortunate enough to meet and talk to a lot of people working on professional, industry-quality renderers and people from some of the best rendering research groups in the world, so this new version of my own renderer is an attempt at applying everything I’ve learned and building a base for further improvements and research projects.

Very broadly, the two things I’m most proud of with Takua a0.5 are the internal renderer architecture and a lot of work on integrators and light transport. Takua a0.5’s internal architecture is heavily influenced by Disney’s Sorted Deferred Shading paper, the internal architecture of NVIDIA’s OptiX engine, and the modular architecture of Mitsuba Render. In the light transport area, Takua a0.5 implements not just unidirectional pathtracing with direct light importance sampling (PT), but also correctly implements multiple importance sampled bidirectional pathtracing (BDPT), progressive photon mapping (PPM), and the relatively new vertex connection and merging (VCM) algorithm. I’m planning on writing a series of posts in the next few weeks/months that will dive in depth into Takua a0.5’s various features.
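
As a small taste of the machinery involved: combining sampling strategies in BDPT and VCM relies on multiple importance sampling weights, which in the common case boil down to Veach’s power heuristic. Here’s a generic sketch of that weight (not Takua’s actual implementation):

```cpp
// Power heuristic (beta = 2) for combining two sampling strategies, after Veach.
// pdfA is the pdf of the strategy that actually generated the sample; pdfB is the
// pdf with which the competing strategy could have generated the same sample.
float powerHeuristic(float pdfA, float pdfB) {
    float a = pdfA * pdfA;
    float b = pdfB * pdfB;
    return a / (a + b);
}
```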

Takua a0.5 has also marked a pretty large shift in strategy in terms of targeted hardware. In previous versions of Takua, I did a lot of exploration with getting the entire renderer to run on CUDA-enabled GPUs. In the interest of increased architectural flexibility, Takua a0.5 does not have a 100% GPU mode anymore. Instead, Takua a0.5 is structured in such a way that certain individual modules can be accelerated by running on the GPU, but overall much of the core of the renderer is designed to make efficient use of the CPU to achieve high performance while bypassing a lot of the complexity of building a pure GPU renderer. Again, I’ll have a detailed post on this decision later down the line.

Here is a list of some of the major new things in Takua a0.5:

  • Completely modular plugin system
    • Programmable ray/shader queue/dispatch system (see the illustrative sketch after this list)
    • Natively bidirectional BSDF system
    • Multiple geometry backends optimized for different hardware
    • Plugin systems for cameras, lights, acceleration structures, geometry, viewers, materials, surface patterns, BSDFs, etc.
  • Task-based concurrency and parallelism via Intel’s TBB library
  • Mitsuba/PBRT/Renderman 19 RIS style integrator system
    • Unidirectional pathtracing with direct light importance sampling
    • Lighttracing with camera importance sampling
    • Bidirectional pathtracing with multiple importance sampling
    • Progressive photon mapping
    • Vertex connection and merging
    • All integrators designed to be re-entrant and capable of deferred operations
  • Native animation support
    • Renderer-wide keyframing/animation support
    • Transformational AND deformational motion blur
    • Motion blur support for all camera, material, surface pattern, light, etc. attributes
    • Animation/keyframe sequences can be instanced in addition to geometry instancing
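
To give a rough sense of what the ray/shader queue/dispatch idea means, here’s a toy sketch of deferring shading work until a whole batch of rays has been traced, which is the general approach the Sorted Deferred Shading paper advocates. The names and interfaces below are purely illustrative and are not Takua’s actual code:

```cpp
// Toy sketch of deferred ray batching: rays accumulate in a queue along with the
// work to run once their hits are known, and the whole batch is dispatched at once
// so that traversal and shading can be sorted/batched for coherence.
#include <functional>
#include <vector>

struct Ray { /* origin, direction, etc. */ };
struct Hit { /* intersection record */ };

class RayQueue {
public:
    using OnHit = std::function<void(const Ray&, const Hit&)>;

    // Enqueue a ray along with the deferred work to run once its hit is known.
    void enqueue(const Ray& ray, OnHit callback) {
        m_rays.push_back(ray);
        m_callbacks.push_back(std::move(callback));
    }

    // Dispatch the whole batch; a real system would sort rays for coherence and hand
    // the batch to whatever geometry backend (CPU or GPU) is currently active.
    template <typename Intersector>
    void dispatch(Intersector&& intersect) {
        for (size_t i = 0; i < m_rays.size(); ++i) {
            Hit hit = intersect(m_rays[i]);
            m_callbacks[i](m_rays[i], hit);
        }
        m_rays.clear();
        m_callbacks.clear();
    }

private:
    std::vector<Ray> m_rays;
    std::vector<OnHit> m_callbacks;
};
```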

The blue metallic XYZRGB dragon image is a render that was produced using only Takua a0.5. Since I now have access to the original physical Cornell Box model, I thought it would be fun to use a 100% measurement-accurate model of the Cornell Box as a test scene while working on Takua a0.5. All of these renders have no post-processing whatsoever. Here are some other renders made as tests during development:

Vanilla Cornell Box with measurements taken directly off of the original physical Cornell Box model.

Glass Stanford Dragon producing some interesting caustics on the floor.

Floating glass ball as another caustics test.

Mirror cube.

Deformational motion blur test using a glass rectangular prism with the top half twisting over time.

A really ugly texture test that for some reason I kind of like.

More interesting non-Cornell Box renders coming in later posts!

Edit: Since making this post, I found a weighting bug that was causing a lot of energy to be lost in indirect diffuse bounces. I’ve since fixed the bug and updated this post with re-rendered versions of all of the images.

SIGGRAPH Asia 2014 Paper- A Framework for the Experimental Comparison of Solar and Skydome Illumination

One of the projects I worked on in my first year as part of Cornell University’s Program of Computer Graphics has been published in the November 2014 issue of ACM Transactions on Graphics and is being presented at SIGGRAPH Asia 2014! The paper is “A Framework for the Experimental Comparison of Solar and Skydome Illumination”, and the team on the project was my junior advisor Joseph T. Kider Jr., my lab-mates Dan Knowlton and Jeremy Newlin, myself, and my main advisor Donald P. Greenberg.

The bulk of my work on this project was in implementing and testing sky models inside of Mitsuba and developing the paper’s sample-driven model. Interestingly, I also did a lot of climbing onto the roof of Cornell’s Rhodes Hall building for this paper; Cornell’s facilities staff was kind enough to give us access to the roof of Rhodes Hall to set up our capture equipment on. This usually involved Joe, Dan, and myself hauling multiple tripods and backpacks of gear up onto the roof in the morning, and then taking it all back down in the evening. Sunny clear skies can be a rare sight in Ithaca, so getting good captures took an awful lot of attempts!

Here is the paper abstract:

The illumination and appearance of the solar/skydome is critical for many applications in computer graphics, computer vision, and daylighting studies. Unfortunately, physically accurate measurements of this rapidly changing illumination source are difficult to achieve, but necessary for the development of accurate physically-based sky illumination models and comparison studies of existing simulation models.

To obtain baseline data of this time-dependent anisotropic light source, we design a novel acquisition setup to simultaneously measure the comprehensive illumination properties. Our hardware design simultaneously acquires its spectral, spatial, and temporal information of the skydome. To achieve this goal, we use a custom built spectral radiance measurement scanner to measure the directional spectral radiance, a pyranometer to measure the irradiance of the entire hemisphere, and a camera to capture high-dynamic range imagery of the sky. The combination of these computer-controlled measurement devices provides a fast way to acquire accurate physical measurements of the solar/skydome. We use the results of our measurements to evaluate many of the strengths and weaknesses of several sun-sky simulation models. We also provide a measurement dataset of sky illumination data for various clear sky conditions and an interactive visualization tool for model comparison analysis available at http://www.graphics.cornell.edu/resources/clearsky/.

The paper and related materials can be found at:

Joe Kider will be presenting the paper at SIGGRAPH Asia 2014 in Shenzhen as part of the Light In, Light Out Technical Papers session. Hopefully our data will prove useful to future research!


Addendum 2017-04-26

I added a personal project page for this paper to my website, located here. My personal page mirrors the same content found on the main site, including an author’s version of the paper, supplemental materials, and more.

PIC/FLIP Simulator Meshing Pipeline

In my last post, I gave a summary of how the core of my new PIC/FLIP fluid simulator works and gave some thoughts on the process of building OpenVDB into my simulator. In this post I’ll go over the meshing and rendering pipeline I worked out for my simulator.

Two years ago, when my friend Dan Knowlton and I built our semi-Lagrangian fluid simulator, we had an immense amount of trouble finding a good meshing and rendering solution. We used a standard marching cubes implementation to construct a mesh from the fluid levelset, but the meshes we wound up with had a lot of flickering issues. The flickering was especially apparent when the fluid had to fit inside of solid boundaries, since the liquid-solid interface wouldn’t line up properly. On top of that, we rendered the fluid using Vray, but relied on an irradiance map + light cache approach that wasn’t very well suited for high motion and large amounts of refractive fluid.

This time around, I’ve tried to build a new meshing/rendering pipeline that resolves those problems. My new meshing/rendering pipeline produces stable, detailed meshes that fit correctly into solid boundaries, all with minimal or no flickering. The following video is the same “dambreak” test from my previous post, but fully meshed and rendered using Vray:

One of the main issues with the old meshing approach was that marching cubes was run directly on the same level set we were using for the simulation, which meant that the resolution of the final mesh was effectively bound to the resolution of the fluid. In a pure semi-Lagrangian simulator, this coupling makes sense; however, in a PIC/FLIP simulator, the resolution of the simulation is dependent on the particle count and not the projection step grid resolution. This property means that even on a simulation with a grid size of 128x64x64, extremely high resolution meshes should be possible if there are enough particles, as long as a level set is constructed directly from the particles, completely independently of the projection step grid dimensions.

Fortunately, OpenVDB comes with an enormous toolkit that includes tools for constructing level sets from various types of geometry, including particles, and tools for adaptive level set meshing. OpenVDB also comes with a number of level set operators that allow for artistic tuning of level sets, such as tools for dilating, eroding, and smoothing level sets. At the SIGGRAPH 2013 OpenVDB course, Dreamworks had a presentation on how they used OpenVDB’s level set operator tools to extract really nice looking, detailed fluid meshes from relatively low resolution simulations. I also integrated Walt Disney Animation Studios’ Partio library for exporting particle data to standard formats, so that I could write out particles in addition to level sets and meshes.
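
For a rough idea of what the particles-to-level-set-to-mesh path looks like in code, here’s a minimal sketch using OpenVDB’s tools. The `ParticleList` adapter and the specific parameter values are my own illustrative assumptions, and exact signatures may vary a bit between OpenVDB versions:

```cpp
#include <openvdb/openvdb.h>
#include <openvdb/tools/LevelSetFilter.h>
#include <openvdb/tools/ParticlesToLevelSet.h>
#include <openvdb/tools/VolumeToMesh.h>
#include <vector>

// Hypothetical adapter exposing the simulator's particles through the interface
// that openvdb::tools::ParticlesToLevelSet expects.
struct ParticleList {
    using PosType = openvdb::Vec3R;
    std::vector<openvdb::Vec3R> positions;
    openvdb::Real radius; // every particle is treated as a sphere of this radius

    size_t size() const { return positions.size(); }
    void getPos(size_t n, openvdb::Vec3R& p) const { p = positions[n]; }
    void getPosRad(size_t n, openvdb::Vec3R& p, openvdb::Real& r) const {
        p = positions[n];
        r = radius; // should span at least a couple of voxels, or particles get skipped
    }
};

void meshParticles(const ParticleList& particles, double voxelSize,
                   std::vector<openvdb::Vec3s>& points,
                   std::vector<openvdb::Vec3I>& tris,
                   std::vector<openvdb::Vec4I>& quads) {
    openvdb::initialize();

    // The level set resolution is set by voxelSize, independent of the sim grid.
    openvdb::FloatGrid::Ptr sdf =
        openvdb::createLevelSet<openvdb::FloatGrid>(voxelSize, /*halfWidth=*/3.0);

    // Union a sphere per particle into the level set.
    openvdb::tools::ParticlesToLevelSet<openvdb::FloatGrid> raster(*sdf);
    raster.rasterizeSpheres(particles);
    raster.finalize();

    // Optional smoothing to knock down the "blobby sphere" look before meshing.
    openvdb::tools::LevelSetFilter<openvdb::FloatGrid> filter(*sdf);
    filter.gaussian();

    // Adaptive meshing; adaptivity = 0.0 gives a uniform, temporally stable mesh.
    openvdb::tools::volumeToMesh(*sdf, points, tris, quads,
                                 /*isovalue=*/0.0, /*adaptivity=*/0.0);
}
```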

Zero adaptive meshing (on the left) versus adaptive meshing with 0.5 adaptivity (on the right). Note the significantly lower poly count in the adaptive meshing, but also the corresponding loss of detail in the mesh.

I started by building support for OpenVDB’s adaptive level set meshing directly into my simulator and dumping out OBJ sequences straight to disk. In order to save disk space, I enabled fairly high adaptivity in the meshing. However, upon doing a first render test, I discovered a problem: since OpenVDB’s adaptive meshing optimizes the adaptivity per frame, the result is not temporally coherent with respect to mesh resolution. By itself this property is not a big deal, but it makes reconstructing temporally coherent normals difficult, which can contribute to flickering in final rendering. So, I decided that disk space was not as big of a deal and just disabled adaptivity in OpenVDB’s meshing for smaller simulations; in sufficiently large sims, the scale of the final render more often than not makes normal issues far less important while disk resource demands become much greater, so the tradeoffs of adaptivity become more worthwhile.

The next problem was getting a stable, fitted liquid-solid interface. Even with a million particles and a 1024x512x512 level set driving mesh construction, the produced fluid mesh still didn’t fit the solid boundaries of the sim precisely. The reason is simple: level set construction from particles works by treating each particle as a sphere with some radius and then unioning all of the spheres together. The first solution I thought of was to dilate the level set and then difference it with a second level set of the solid objects in the scene. Since Houdini has full OpenVDB support and I wanted to test this idea quickly with visual feedback, I prototyped this step in Houdini instead of writing a custom tool from scratch. This approach wound up not working well in practice. I discovered that in order to get a clean result, the solid level set needed to be extremely high resolution to capture all of the detail of the solid boundaries (such as sharp corners). Since the output levelset from VDB’s difference operator has to match the resolution of the highest resolution input, that meant the resultant liquid level set was also extremely high resolution. On top of that, the entire process was extremely slow, even on smaller grids.

The mesh on the left has a cleaned up, stable liquid-solid interface. The mesh on the right is the same mesh as the one on the left, but before going through cleanup.

The solution I wound up using was to process the mesh instead of the level set, since the mesh represents significantly less data, and at the end of the day the mesh is what we want to have a clean liquid-solid interface. The approach is: for every vertex in the liquid mesh, raycast to find the nearest point on the solid boundary (this can be done either stochastically, or a level set version of the solid boundary can be used to inform a good starting direction). If the closest point on the solid boundary is within some epsilon distance of the vertex, move the vertex to be at the solid boundary. Obviously, this approach is far simpler than attempting to difference level sets, and it works pretty well. I prototyped this entire system in Houdini.
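
Here’s a minimal sketch of that snapping step, under the assumption that a signed distance field for the solid boundary (and its normalized gradient) is available rather than doing the stochastic raycast variant; the `Vec3` type and the `solidSdf`/`solidSdfGradient` callables are hypothetical stand-ins for whatever representation of the solid boundary is handy:

```cpp
#include <cmath>
#include <functional>
#include <vector>

struct Vec3 {
    float x = 0.0f, y = 0.0f, z = 0.0f;
};

// Snap liquid mesh vertices that lie within epsilon of the solid boundary onto the
// boundary itself. solidSdf returns the signed distance to the solid surface, and
// solidSdfGradient returns its normalized gradient (the direction away from the
// surface), so walking back along the gradient by the signed distance lands the
// vertex on the boundary.
void snapVerticesToSolid(std::vector<Vec3>& vertices,
                         const std::function<float(const Vec3&)>& solidSdf,
                         const std::function<Vec3(const Vec3&)>& solidSdfGradient,
                         float epsilon) {
    for (Vec3& v : vertices) {
        float d = solidSdf(v);
        if (std::fabs(d) < epsilon) {
            Vec3 g = solidSdfGradient(v);
            v.x -= d * g.x;
            v.y -= d * g.y;
            v.z -= d * g.z;
        }
    }
}
```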

For rendering, I used Vray’s ply2vrmesh utility to dump the processed fluid meshes directly to .vrmesh files and rendered the result in Vray using pure brute force pathtracing to avoid flickering from temporally incoherent irradiance caching. The final result is the video at the top of this post!

Here are some still frames from the same simulation. The video was rendered with motion blur; these stills do not have any motion blur.

New PIC/FLIP Simulator

Over the past month or so, I’ve been writing a brand new fluid simulator from scratch. It started as a project for a course/seminar type thing I’ve been taking with Professor Doug James, but I’ve kept working on it since the course ended just for fun. I wanted to try out implementing the PIC/FLIP method from Zhu and Bridson; in industry, PIC/FLIP has more or less become the de facto standard method for fluid simulation. Houdini and Naiad both use PIC/FLIP implementations as their core fluid solvers, and I’m aware that Double Negative’s in-house simulator is also a PIC/FLIP implementation.

I’ve named my simulator “Ariel”, since I like Disney movies and the name seemed appropriate for a project related to water. Here’s what a “dambreak” type simulation looks like:

That “dambreak” test was run with approximately a million particles, with a 128x64x64 grid for the projection step.

PIC/FLIP stands for Particle-In-Cell/Fluid-Implicit Particles. PIC and FLIP are actually two separate methods that each have certain shortcomings, but when used together in a weighted sum, they produce a very stable fluid solver (my own solver uses approximately a 90% FLIP to 10% PIC ratio). PIC/FLIP is similar to SPH in that it’s fundamentally a particle based method, but instead of attempting to use external forces to maintain fluid volume, PIC/FLIP splats particle velocities onto a grid, calculates a velocity field using a projection step, and then copies the new velocities back onto the particles for each step. This difference means PIC/FLIP doesn’t suffer from the volume conservation problems SPH has. In this sense, PIC/FLIP can almost be thought of as a hybridization of SPH and semi-Lagrangian level-set based methods. From this point forward, I’ll refer to the method as just FLIP for simplicity, even though it’s actually PIC/FLIP.
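
For concreteness, the per-particle velocity update at the end of each step looks roughly like the sketch below. The grid sampling callables are hypothetical stand-ins for trilinear interpolation of the grid’s velocity field before and after the pressure projection, and the 0.1 PIC weight mirrors the roughly 90/10 FLIP-to-PIC blend mentioned above:

```cpp
#include <functional>

struct Vec3 {
    float x = 0.0f, y = 0.0f, z = 0.0f;
};

Vec3 operator+(const Vec3& a, const Vec3& b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 operator-(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 operator*(const Vec3& a, float s) { return {a.x * s, a.y * s, a.z * s}; }

// Blend the PIC and FLIP updates for a single particle. sampleOld/sampleNew are
// hypothetical stand-ins for interpolating the grid velocity field at a world-space
// point before and after the projection step, respectively.
Vec3 updateParticleVelocity(const Vec3& position, const Vec3& oldVelocity,
                            const std::function<Vec3(const Vec3&)>& sampleOld,
                            const std::function<Vec3(const Vec3&)>& sampleNew,
                            float picWeight = 0.1f) {
    // PIC: throw away the particle's velocity and re-sample the grid (stable, but dissipative).
    Vec3 vPic = sampleNew(position);
    // FLIP: keep the particle's velocity and add only the grid's change (lively, but noisy).
    Vec3 vFlip = oldVelocity + (sampleNew(position) - sampleOld(position));
    // A ~90% FLIP / 10% PIC blend keeps energy while damping FLIP's noise.
    return vFlip * (1.0f - picWeight) + vPic * picWeight;
}
```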

I also wanted to experiment with OpenVDB, so I built my FLIP solver on top of OpenVDB. OpenVDB is a sparse volumetric data structure library open sourced by Dreamworks Animation, and now integrated into a whole bunch of systems such as Houdini, Arnold, and Renderman. I played with it two years ago during my summer at Dreamworks, but didn’t really get too much experience with it, so I figured this would be a good opportunity to give it a more detailed look.

My simulator uses OpenVDB’s mesh-to-levelset toolkit for constructing the initial fluid volume and solid obstacles, meaning any OBJ meshes can be used to build the starting state of the simulator. For the actual simulation grid, things get a little bit more complicated; I initially started with using OpenVDB for storing the grid for the projection step, with the idea that storing the projection grid sparsely should allow for scaling the simulator to really, really large scenes. However, I quickly ran into the ever present memory-speed tradeoff of computer science. I found that while the memory footprint of the simulator stayed very small for large sims, it ran almost ten times slower compared to when the grid was stored using raw floats. The reason is that since OpenVDB under the hood is a B+tree, constant read/write operations against a VDB grid end up being really expensive, especially if the grid is not very sparse. The fact that VDB enforces single-threaded writes due to the need to rebalance the B+tree does not help at all. As a result, I’ve left in a switch that allows my simulator to run in either raw float or VDB mode; VDB mode allows for much larger simulations, but raw float mode allows for faster, multithreaded sims.
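
As a rough illustration of what that kind of switch can look like, here’s a toy sketch of a grid channel abstraction with a dense raw-float backend and a sparse VDB backend; the interface and names are my own illustration rather than the simulator’s actual code:

```cpp
#include <openvdb/openvdb.h>
#include <vector>

// Toy abstraction over projection-grid storage: a dense raw-float backend for speed,
// and a sparse VDB backend for memory efficiency on very large sims.
class GridChannel {
public:
    virtual ~GridChannel() = default;
    virtual float get(int i, int j, int k) const = 0;
    virtual void set(int i, int j, int k, float value) = 0;
};

class RawFloatChannel : public GridChannel {
public:
    RawFloatChannel(int x, int y, int z)
        : m_x(x), m_y(y), m_data(static_cast<size_t>(x) * y * z, 0.0f) {}
    float get(int i, int j, int k) const override { return m_data[index(i, j, k)]; }
    void set(int i, int j, int k, float v) override { m_data[index(i, j, k)] = v; }
private:
    size_t index(int i, int j, int k) const {
        return (static_cast<size_t>(k) * m_y + j) * m_x + i;
    }
    int m_x, m_y;
    std::vector<float> m_data;
};

class VdbChannel : public GridChannel {
public:
    VdbChannel() : m_grid(openvdb::FloatGrid::create(0.0f)) {}
    float get(int i, int j, int k) const override {
        return m_grid->tree().getValue(openvdb::Coord(i, j, k));
    }
    void set(int i, int j, int k, float v) override {
        // Writes touch the B+tree, so in practice they have to be serialized across
        // threads (and real code would use a value accessor for speed).
        m_grid->tree().setValue(openvdb::Coord(i, j, k), v);
    }
private:
    openvdb::FloatGrid::Ptr m_grid;
};
```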

Here’s a video of another test scene, this time patterned after a “waterfall” type scenario. This test was done earlier in the development process, so it doesn’t have the wireframe outlines of the solid boundaries:

In the above videos and stills, blue indicates higher density/lower velocity, while white indicates lower density/higher velocity.

Writing the core PIC/FLIP solver actually turned out to be pretty straightforward, and I’m fairly certain that my implementation is correct, since it closely matches the result I get out of Houdini’s FLIP solver for a similar scene with similar parameters (although not exactly, since there are bound to be some differences in how I handle certain details, such as slightly jittering particle positions to prevent artifacting between steps). Figuring out a good meshing and rendering pipeline turned out to be more difficult; I’ll write about that in my next post.

Takua Chair Renders

A while back, I did some test renders with Takua a0.4 to test out the material system. The test model was an Eames Lounge Chair Wood, and the materials were glossy wood and aluminum. Each render was done with a single large, importance sampled area light and took about two minutes to complete.

These renders were the last tests I did with Takua a0.4 before starting the new version. More on that soon!

Throwback- Holiday Card 2011

Two years ago, I was asked to create CG@Penn’s 2011 Holiday Card. Shortly after finishing that particular project, I started writing a breakdown post but for some reason never finished/posted it. While going through old content for the move to Github Pages, I found some of my old unfinished posts, and I’ve decided to finish up some of them and post them over time as sort of a series of throwback posts.

This project is particularly interesting because I wouldn’t bother using almost any of the approaches I took two years ago if I were doing it today. But it’s still interesting to look back on!

Amy and Joe wanted something wintery and nonreligious for the card, since it would be sent to a very wide and diverse audience. They suggested some sort of snowy landscape piece, so I decided to make a snow-covered forest. This particular idea meant I had to figure out three key elements:

  • Conifer trees
  • Modeling snow ON the trees
  • Rendering snow

Since the holiday card had to be just a single still frame and had to be done in just a few days, I knew right away that I could (and would have to!) cheat heavily with compositing, so I was willing to try more unknown elements than I normally would throw into a single project. Also, since the shot I had in mind would be a wide, far shot, I knew that I could get away with less up-close detail for the trees.

I started by creating a handful of different base conifer tree models in OnyxTree and throwing them directly into Maya/Vray (this was before I had even started working on Takua Render) just to see how they would look. Normally models directly out of OnyxTree need some hand-sculpting and tweaking to add detail for up-close shots, but here I figured if they looked good enough, I could skip those steps. The result looked okay enough to move on:

The textures for the bark and leaves were super simple. To make the bark texture’s diffuse layer, I pulled a photograph of bark off of Google, modified it to tile in Photoshop, and adjusted the contrast and levels until it was the color I wanted. The displacement layer was simply the diffuse layer converted to black and white and with contrast and brightness adjusted. Normally this method won’t work well for up close shots, but again, since I knew the shot would be far away, I could get away with some cheating. Here’s a crop from the bark textures:

The pine needles were also super cheatey. I pulled a photo out of one of my reference libraries, dropped an opacity mask on top, and that was all for the diffuse color. Everything else was hacked in the leaf material’s shader; since the tree would be far away, I could get away with basic transparency instead of true subsurface scattering. The diffuse map with opacity flattened to black looks like this:

With the trees roughed in, the next problem to tackle was getting snow onto the trees. Today, I would immediately spin up Houdini to create this effect, but back then, I didn’t have a Houdini license and hadn’t played with Houdini enough to realize how quickly it could be done. Not knowing better back then, I used 3dsmax and a plugin called Snowflow (I used the demo version since this project was a one-off). To speed up the process, I used a simplified, decimated version of the tree mesh for Snowflow. Any inaccuracies between the resultant snow layer and the full tree mesh were acceptable, since they would look just like branches and leaves poking through the snow:

I tried a couple of different variations on snow thickness, which looked decent enough to move on with:

The next step was a fast snow material that would look reasonably okay from a distance, and render quickly. I wasn’t sure if the snow should have a more powdery, almost diffuse look, or if it should have a more refractive, frozen, icy look. I wound up trying both and going with a 50-50 blend of the two:

From left to right: refractive frozen ice, powdery diffuse, 50-50 blend

The next step was to compose a shot, make a very quick, simple lighting setup, and do some test renders. After some iterating, I settled on this render as a base for comp work:

The base render is very blueish since the lighting setup was a simple, grey-blueish dome light over the whole scene. The shadows are blotchy since I turned Vray’s irradiance cache settings all the way down for faster rendertimes; I decided that I would rather deal with the blotchy shadows in post and have a shot at making the deadline than wait for very long rendertimes. I wound up going with the thinner snow at the time since I wanted the trees to be more recognizable as trees, but in retrospect, that choice was probably a mistake.

The final step was some basic compositing. In After Effects, I applied post-processed DOF using a z-depth layer and Frischluft, color corrected the image, cranked up the exposure, and added vignetting to get the final result:

Looking back on this project two years later, I don’t think the final result looks really great. The image looks okay for two days of rushed work, but there is enormous room for improvement. If I could go back and change one thing, I would have chosen to use the much heavier snow cover version of the trees for the final composition. Also, today I would approach this project very, very differently; instead of ping-ponging between multiple programs for each component, I would favor an almost pure-Houdini pipeline. The trees could be modeled as L-systems in Houdini, perhaps with some base work done in Maya. The snow could absolutely be simmed in Houdini. For rendering and lighting, I would use either my own Takua Render or some other fast physically based renderer (Octane, or perhaps Renderman 18’s iterative pathtracing mode) to iterate extremely quickly without having to compromise on quality.

So that’s the throwback breakdown of the CG@Penn Holiday 2011 card! I learned a lot from this project, and looking back and comparing how I worked two years ago to how I work today is always a good thing to do.