Scandinavian Room Scene

Almost three years ago, I rendered a small room interior scene to test an indoor, interior illumination scenario. Since then, a lot has changed in Takua, so I thought I’d revisit an interior illumination test with a much more complex, difficult scene. I don’t have much time to model stuff anymore these days, so instead I bought Evermotion’s Archinteriors Volume 48 collection, which is labeled as Scandinavian interior room scenes (I don’t know what’s particularly Scandinavian about these scenes, but that’s what the label said) and ported one of the scenes to Takua’s scene format. Instead of simply porting the scene as-is, I modified and added various things in the scene to make it feel a bit more customized. See if you can spot what they are:

Figure 1: A Scandinavian room interior, rendered in Takua a0.8 using VCM.

I had a lot of fun adding all of my customizations! I brought over some props from the old complex room scene, such as the purple flowers and vase, a few books, and Utah teapot tray, and also added a few new fun models, such as the MacBook Pro in the back and the copy of Physically Based Rendering 3rd Edition in the foreground. The black and white photos on the wall are crops of my Minecraft renders, and some of the books against the back wall have fun custom covers and titles. Even all of the elements that came with the original scene are re-shaded. The original scene came with Vray’s standard VrayMtl as the shader for everything; Takua’s base shader parameterization draws some influence from Vray, but also draws from the Disney Bsdf and Arnold’s AlShader and as a result has a parameterization that is sufficiently different that I wound up just re-shading everything instead of trying to write some conversion tool. For the most part I was able to re-use the textures that came with the scene to drive various shader parameters. The skydome is from the noncommercial version of VizPeople’s HDRi v1 collection.

Speaking of the skydome… the main source of illumination in this scene comes from the sun in the skydome, which presented a huge challenge for efficient light sampling. Takua has had domelight/environment map importance sampling using CDF inversion sampling for a long time now, which helps a lot, but the indoor nature of this scene still made sampling the sun difficult. Sampling the sun in an outdoor scene is fairly efficient since most rays will actually reach the sun, but in indoor scenes, importance sampling the sun becomes inefficient without taking occlusion into account since only rays that actually make it outdoors through windows can reach the sun. The best known method currently for handling domelight importance sampling through windows in an indoor scene is Portal Masked Environment Map Sampling (PMEMS) by Bitterli et al. I haven’t actually implemented PMEMS yet though, so the renders in this post all wound up requiring a huge number of samples per pixel to render; I intend on implementing PMEMS at some point in the near future.

Apart from the skydome, this scene also contains several other practical light sources, such as the lamp’s bulb, the MacBook Pro’s screen, and the MacBook Pro’s glowing Apple logo on the back of the screen (which isn’t even visible to camera, but is still enabled since it provides a tiny amount of light against the back wall!). In addition to choosing where on a single light to sample, choosing which light to sample is also an extremely important and difficult problem. Until this rendering this scene, I hadn’t really put any effort into efficiently selecting which light to sample. Most of my focus has been on the integration part of light transport, so Takua’s light selection has just been uniform random selection. Uniform random selection is terrible for scenes that contain multiple lights with highly varying emission between different lights, which is absolutely the case for this scene. Like any other importance sampling problem, the ideal solution is to send rays towards lights with a probability proportional to the amount of illumination we expect each light to contribute to each ray origin point.

I implemented a light selection strategy where the probability of selecting each light is weighted by the total emitted power of each light; essentially this boils down to estimating the total emitted power of each light according to the light’s surface texture and emission function, building a CDF across all of the lights using the total emission estimates, and then using standard CDF inversion sampling to pick lights. This strategy works significantly better than uniform random selection and made a huge difference in render speed for this scene, as seen in Figures 2 through 4. Figure 2 uses uniform random light selection with 128 spp; note how the area lit by the wall-mounted lamp is well sampled, but the image overall is really noisy. Figure 3 uses power-weighted light selection with the same spp as Figure 2; the lamp area is more noisy than in Figure 2, but the render is less noisy overall. Notably, Figure 3 also took a third of the time compared to Figure 2 for the same sample count; this is because in this scene, sending rays towards the lamp is significantly more expensive due to heavier geometry than sending rays towards the sun, even when rays towards the sun get occluded by the walls. Figure 4 uses power-weighted light selection again, but is equal-time to Figure 2 instead of equal-spp; note the significant noise reduction:

Figure 2: The same frame from Figure 1, 128 spp using uniform random light selection. Average pixel RMSE compared to Figure 1: 0.439952.

Figure 3: Power-weighted light selection, with equal spp to Figure 2. Average pixel RMSE compared to Figure 1: 0.371441.

Figure 4: Power-weighted light selection again, but this time with equal time instead of equal spp to Figure 2. Average pixel RMSE compared to Figure 1: 0.315465.

Figure 5: Zoomed crops of Figures 2 through 4. From left to right: uniform random sampling, equal sample power-weighted sampling, and equal time power-weighted sampling.

However, power-weighted light selection still is not even close to being the most optimal technique possible; this technique completely ignores occlusion and distance, which are extremely important. Unfortunately, because occlusion and distance to each light varies for each point in space, creating a light selection strategy that takes occlusion and distance into account is extremely difficult and is a subject of continued research in the field. In Hyperion, we use a cache point system, which we described on page 97 of our SIGGRAPH 2017 Production Volume Rendering course notes. Other published research on the topic includes Practical Path Guiding for Efficient Light-Transport Simulation by Muller et al, On-line Learning of Parametric Mixture Models for Light Transport Simulation by Vorba et al, Product Importance Sampling for Light Transport Path Guiding by Herholz et al, Learning Light Transport the Reinforced Way by Dahm et al, and more. At some point in the future I’ll revisit this topic.

For a long time now, Takua has also had a simple interactive mode where the camera can be moved around in a non-shaded/non-lit view; I used this mode to interactively scout out some interesting and fun camera angles for some more renders. Being able to interactively scout in the same renderer used to final rendering is an extremely powerful tool; instead of guessing at depth of field settings and such, I was able to directly set and preview depth of field with immediate feedback. Unfortunately some of the renders below are noisier than I would like, due to the previously mentioned light sampling difficulties. All of the following images are rendered using Takua a0.8 with VCM:

Figure 6: A MacBook Pro running Takua Render to produce Figure 1.

Figure 7: Physically Based Rendering Third Edition sitting on the coffee table.

Figure 8: Closeup of the same purple flowers from the old Complex Room scene.

Figure 9: Utah Teapot tea set on the coffee table.

Figure 10: A glass globe with mirror-polished metal continents, sitting in the sunlight from the window.

Figure 11: Close-up of two glass and metal mugs filled with tea.

Beyond difficult light sampling, generally complex and difficult light transport with lots of subtle caustics also wound up presenting major challenges in this scene. For example, note the subtle caustics on the wall in the upper right hand part of Figure 10; those caustics are actually visibly not fully converged, even though the sample count across Figure 10 was in the thousands of spp! I intentionally did not use adaptive sampling in any of these renders; instead, I wanted to experiment with a common technique used in a lot of modern production renderers for noise reduction: in-render firefly clamping. My adaptive sampler is already capable of detecting firefly pixels and driving more samples at fireflies in the hopes of accelerating variance reduction on firefly pixels, but firefly clamping is a much more crude, biased, but nonetheless effective technique. The idea is to detect on each pixel spp if a returned sample is an outlier relative to all of the previously accumulated samples, and discard or clamp the sample if it in fact is an outlier. Picking what threshold to use for outlier detection is a very manual process; even Arnold provides a tuning max-value parameter for firefly clamping.

I wanted to be able to directly compare the render with and without firefly clamping, so I implemented firefly clamping on top of Takua’s AOV system. When enabled, firefly clamping mode produces two images for a single render: one output with firefly clamping enabled, and one with clamping disabled. I tried re-rendering Figure 10 using unidirectional pathtracing and a relatively low spp count to produce as many fireflies as I could, for a clearer comparison. For this test, I set the firefly threshold to be samples that are at least 250 times brighter than the estimated pixel value up to that sample.

Figure 12: The same render as Figure 10, but rendered with a lower sample count and using unidirectional pathtracing instead of VCM to draw out more fireflies.

Figure 13: From the same run of Takua Render as Figure 12, but the firefly-clamped render output instead of the raw render.

Note how Figure 13 appears to be completely firefly-free compared to Figure 12, and how Figure 13 doesn’t have visible caustic noise on the walls compared to Figure 10. However, notice how Figure 13 is also missing significant illumination in some areas, such as in the corner of the walls near the floor behind the wooden step ladder, or in the deepest parts of the purple flower bunch. Finding a threshold that eliminates all fireflies without loosing significant illumination in other areas is very difficult or, in some cases, impossible since some of these types of light transport essentially manifest as firefly-like high energy samples that only smooth out over time. For the final renders in Figure 1 and Figures 6 through 11, I wound up not actually using any firefly clamping. While biased noise-reduction techniques are a necessary evil in actual production, I expect that I’ll try to avoid relying on firefly clamping in the vast majority of what I do with Takua, since Takua is meant to just be a brute-force, hobby kind of thing anyway.

Aventador Renders Revisited

A long time ago, I made some posts that featured a cool Lamborghini Aventador model. Recently, I revisited that model and made some new renders using the current version of Takua, mostly just for fun. To me, one of the most important parts of writing a renderer has always been being able to actually use the renderer to make fun images. The last time I rendered this model was something like four years ago, and back then Takua was still in a very basic state; the renders in those old posts don’t even have any shading beyond 50% grey lambertian surfaces! The renders in this post utilize a lot of advanced features that I’ve added since then, such as a proper complex layered Bsdf and texturing system, advanced bidirectional light transport techniques, huge speed improvements to ray traversal, advanced motion blur and generalized time capabilities, and more. I’m way behind in writing up many of these features and capabilities, but in the meantime, I thought I’d post some for-fun rendering projects I’ve done with Takua.

All of the renders in this post are directly from Takua, with a basic white balance and conversion from HDR EXR to LDR PNG being the only post-processing steps. Each render took about half a day to render (except for the wireframe render, which was much faster) on a 12-core workstation at 2560x1440 resolution.

Figure 1: An orange-red Lamborghini Aventador, rendered in Takua a0.7 using VCM.

Shading the Aventador model was a fun, interesting exercise. I went for a orange-red paint scheme since, well, Lamborghinis are supposed to look outrageous and orange-red is a fairly exotic paint scheme (I suppose I could have picked green or yellow or something instead, but I like orange-red). I ended up making a triple-lobe shader with a metallic base, a dielectric lobe, and a clear-coat lobe on top of that. The base lobe uses a GGX microfacet metallic Brdf. Takua’s shading system implements a proper metallic Fresnel model for conductors, where the Fresnel model includes both a Nd component representing refractive index and a k component representing the extinction coefficient for when an electromagnetic wave propagates through a material. For conductors, the final Fresnel index of refraction for each wavelength of light is defined by a complex combination of Nd and k. For the base metallic lobe, most of the color wound up coming from the k component. The dielectric lobe is meant to simulate paint on top of a car’s metal body; the dielectric lobe is where most of the orange-red color comes from. The dielectric lobe is again a GGX microfacet Brdf, but with a dielectric Fresnel model, which has a much simpler index of refraction calculation than the metallic Fresnel model does. I should note that Takua’s current standard material implementation actually only supports a single primary specular lobe and an additional single clear-coat lobe, so for shaders authored with both a metallic and dielectric component, Takua takes a blend weight between the two components and for each shading evaluation stochastically selects between the two lobes according to the blend weight. The clear-coat layer on top has just a slightly amount of extinction to provide just a bit more of the final orange look, but is mostly just clear.

All of the window glass in the render is tinted slightly dark through extinction instead of through a fixed refraction color. Using proper extinction to tint glass is more realistic than using a fixed refraction color. Similarly, the red and yellow glass used in the head lights and tail lights are colored through extinction. The brake disks use an extremely high resolution bump map to get the brushed metal look. The branding and markings on the tire walls are done through a combination of bump mapping and adjusting the roughness of the microfacet Brdf; the tire treads are made using a high resolution normal map. There’s no displacement mapping at all, although in retrospect the tire treads probably should be displacement mapped if I want to put the camera closer to them. Also, I actually didn’t really shade the interior of the car much, since I knew I was going for exterior shots only.

Eventually I’ll get around to implementing a proper car paint Bsdf in Takua, but until then, the approach I took here seems to hold up reasonable well as long as the camera doesn’t get super close up to the car.

I lit the scene using two lights: an HDR skydome from HDRI-Skies, and a single long, thin rectangular area light above the car. The skydome provides the overall soft-ish lighting that illuminates the entire scene, and the rectangular area light provides the long, interesting highlights on the car body that help with bringing out the car’s shape.

For all of the renders in this post, I used my VCM integrator, since the scene contains a lot of subtle caustics and since the inside of the car is lit entirely through glass. I also wound up modifying my adaptive sampler; it’s still the same adaptive sampler that I’ve had for a few years now, but with an important extension. Instead of simply reducing the total number of paths per iteration as areas reach convergence, the adaptive sampler now keeps the number of paths the same and instead reallocates paths from completed pixels to high-variance pixels. The end result is that the adaptive sampler is now much more effective at eliminating fireflies and targeting caustics and other noisy areas. In the above render, some pixels wound up with as few as 512 samples, while a few particularly difficult pixels finished with as many as 20000 samples. Here is the adaptive sampling heatmap for Figure 1 above; brighter areas indicate more samples. Note how the adaptive sampler found a number of areas that we’d expect to be challenging, such as the interior through the car’s glass windows, and parts of the body with specular inter-reflections.

Figure 2: Adaptive sampling heatmap for Figure 1. Brighter areas indicate more samples.

I recently implemented support for arbitrary camera shutter curves, so I thought doing a motion blurred render would be fun. After all, Lamborghinis are supposed to go fast! I animated the Lamborghini driving forward in Maya; the animation was very basic, with the main body just translating forward and the wheels both translating and rotating. Of course Takua has proper rotational motion blur. The motion blur here is effectively multi-segment motion blur; generating multi-segment motion blur from an animated sequence in Takua is very easy due to how Takua handles and understands time. I actually think that Takua’s concept of time is one of the most unique things in Takua; it’s very different from how every other renderer I’ve used and seen handles time. I intend to write more about this later. Instead of an instantaneous shutter, I used a custom cosine-based shutter curve that places many more time samples near the center of the shutter interval than towards the shutter open and close. Using a shutter shape like this wound up being important to getting the right look to the motion blur; even the car is moving extremely quickly, the overall form of the car is still clearly distinguishable and the front and back of the car appear more motion-blurred than the main body.

Figure 3: Motion blurred render, using multi-segment motion blur with a cosine-based shutter curve.

Since Takua has a procedural wireframe texture now, I also did a wireframe render. I mentioned my procedural wireframe texture in a previous post, but I didn’t write about how it actually works. For triangles and quads, the wireframe texture is simply based on the distance from the hitpoint to the nearest edge. If the distance to the nearest edge is smaller than some threshold, draw one color, otherwise, draw some other color. The nearest edge calculation can be done as follows (the variable names should be self-explanatory):

float calculateMinDistance(const Poly& p, const Intersection& hit) const {
    float md = std::numeric_limits<float>::infinity();
    const int verts = p.isQuad() ? 4 : 3;
    for (int i = 0; i < verts; i++) {
        const glm::vec3& cur = p[i].m_position;
        const glm::vec3& next = p[(i + 1) % verts].m_position;
        const glm::vec3 d1 = glm::normalize(next - cur);
        const glm::vec3 d2 = hit.m_point - cur;
        const float l = glm::length((cur + d1 * glm::dot(d1, d2) - hit.m_point));
        md = glm::min(md, l * l);
    return md;

The topology of the meshes are pretty strange, since the car model came as a triangle mesh, which I then subdivided:

Figure 4: Procedural wireframe texture.

The material in the wireframe render only uses the lambertian diffuse lobe in Takua’s standard material; as such, the adaptive sampling heatmap for the wireframe render is interesting to compare to Figure 2. Overall the sample distribution is much more even, and areas where diffuse inter-reflections are present got more samples:

Figure 5: Adaptive sampling heatmap for Figure 4. Brighter areas indicate more samples. Compare with Figure 2.

Takua’s shading model supports layering different materials through parameter blending, similar to how the Disney Brdf (and, at this point, most other shading systems) handles material layering. I wanted to make an even more outrageous looking version of the Aventador than the orange-red version, so I used the procedural wireframe texture as a layer mask to drive parameter blending between a black paint and a metallic gold paint:

Figure 6: An outrageous Aventador paint scheme using a procedural wireframe texture to blend between black and metallic gold car paint.

Olaf's Frozen Adventure

After an amazing 2016, we’re having a bit of a break year this year at Walt Disney Animation Studios. We don’t have a feature film this year; instead, we made a half-hour featurette called Olaf’s Frozen Adventure, which will be released in front of Pixar’s Coco during Thanksgiving. I think this is the first time a Disney Animation short/featurette has accompanied a Pixar film. Olaf’s Frozen Adventure is a fun little holiday story set in the world of Frozen, and I had the privilege of getting to play a small role in making Olaf’s Frozen Adventure! I got an official credit as part of a handful of engineers that did some specific, interesting technology development for Olaf’s Frozen Adventure.

Olaf’s Frozen Adventure is really really funny; because Olaf is the main character, the entire story takes on much more of a self-aware, at times somewhat absurdist tone. The featurette also has a bunch of new songs- there are six new songs in total, which is somehow pretty close to the original film’s count of eight songs, but in a third of the runtime. Olaf’s Frozen Adventure was originally announced as a TV special, but the wider Walt Disney Company was so happy with the result that they decided to give us a theatrical release instead!

Something I personally find fascinating about Olaf’s Frozen Adventure is comparing it visually with the original Frozen. Olaf’s Frozen Adventure is rendered entirely with Disney’s Hyperion Renderer, compared with Frozen, which was rendered using pre-RIS Renderman. While both films used our Disney BRDF, Olaf’s Frozen Adventure benefits from all of the improvements and advancements that have been made during Big Hero 6, Zootopia, and Moana. The original Frozen used dipole subsurface scattering, radiosity caching, and generally had fairly low geometric complexity relative to Hyperion-era films. In comparison, Olaf’s Frozen Adventure uses brute force subsurface scattering, uses path-traced global illumination, uses the full Disney BSDF (which is significantly extended from the Disney BRDF), uses our advanced fur/hair shader developed during Zootopia, and has much greater geometric complexity. Some shots even utilize an extended version of the photon mapped caustics we developed during Moana. Extensions to our photon mapping system is one of the things I worked on for Olaf’s Frozen Adventure. Even the water in Arendelle Harbor looks way better than in Frozen, since we were able to make use of the incredible water systems developed for Moana. All of these advancements are discussed in our SIGGRAPH 2017 Course Notes.

As usual with Disney Animation projects I get to work on, here are some stills in no particular order, pulled from marketing material. Even though Olaf’s Frozen Adventure was originally meant for TV, we still put the same level of effort into it that we do with theatrical features, and I think it shows!

All images in this post are courtesy of and the property of Walt Disney Animation Studios.

SIGGRAPH 2017 Course Notes- Recent Advances in Disney's Hyperion Renderer

This year at SIGGRAPH 2017, Luca Fascione and Johannes Hanika from Weta Digital organized a Path Tracing in Production course. The course was split into two halves: a first half about production renderers, and a second half about using production renderers to make movies. Brent Burley presented our recent work on Disney’s Hyperion Renderer as part of the first half of the course. To support Brent’s section of the course, the entire Hyperion team worked together to put together some course notes describing recent work in Hyperion done for Zootopia, Moana, and upcoming films.

Image from course notes Figure 8: a production frame from Zootopia, rendered using Disney's Hyperion Renderer.

Here is the abstract for the course notes:

Path tracing at Walt Disney Animation Studios began with the Hyperion renderer, first used in production on Big Hero 6. Hyperion is a custom, modern path tracer using a unique architecture designed to efficiently handle complexity, while also providing artistic controllability and efficiency. The concept of physically based shading at Disney Animation predates the Hyperion renderer. Our history with physically based shading significantly influenced the development of Hyperion, and since then, the development of Hyperion has in turn influenced our philosophy towards physically based shading.

The course notes and related materials can be found at:

The course wasn’t recorded due to proprietary content from various studios, but the overall course notes for the entire course cover everything that was presented. The major theme of our part of the course notes (and Brent’s presentation) is replacing multiple scattering approximations with accurate brute-force path-traced solutions. Interestingly, the main motivator for this move is primarily a desire for better, more predictable and intuitive controls for artists, as opposed to simply just wanting better visual quality. In the course notes, we specifically discuss fur/hair, path-traced subsurface scattering, and volume rendering.

The Hyperion team also had two other presentations at SIGGRAPH 2017:

SIGGRAPH 2017 Paper- Spectral and Decomposition Tracking for Rendering Heterogeneous Volumes

Some recent work I was part of at Walt Disney Animation Studios has been published in the July 2017 issue of ACM Transactions on Graphics as part of SIGGRAPH 2017! The paper is titled “Spectral and Decomposition Tracking for Rendering Heterogeneous Volumes”, and the project was a collaboration between the Hyperion development team at Walt Disney Animation Studios (WDAS) and the rendering group at Disney Research Zürich (DRZ). From the WDAS side, the authors are Peter Kutz (who was at Penn at the same time as me), Ralf Habel, and myself. On the DRZ side, our collaborator was Jan Novák, the head of DRZ’s rendering research group.

Image from paper Figure 12: a colorful explosion with chromatic extinction rendered using spectral tracking.

Here is the paper abstract:

We present two novel unbiased techniques for sampling free paths in heterogeneous participating media. Our decomposition tracking accelerates free-path construction by splitting the medium into a control component and a residual component and sampling each of them separately. To minimize expensive evaluations of spatially varying collision coefficients, we define the control component to allow constructing free paths in closed form. The residual heterogeneous component is then homogenized by adding a fictitious medium and handled using weighted delta tracking, which removes the need for computing strict bounds of the extinction function. Our second contribution, spectral tracking, enables efficient light transport simulation in chromatic media. We modify free-path distributions to minimize the fluctuation of path throughputs and thereby reduce the estimation variance. To demonstrate the correctness of our algorithms, we derive them directly from the radiative transfer equation by extending the integral formulation of null-collision algorithms recently developed in reactor physics. This mathematical framework, which we thoroughly review, encompasses existing trackers and postulates an entire family of new estimators for solving transport problems; our algorithms are examples of such. We analyze the proposed methods in canonical settings and on production scenes, and compare to the current state of the art in simulating light transport in heterogeneous participating media.

The paper and related materials can be found at:

Peter Kutz will be presenting the paper at SIGGRAPH 2017 in Log Angeles as part of the Rendering Volumes Technical Papers session.

Instead of repeating the contents of the paper here (which is pointless since the paper already says everything we want to say), I thought instead I’d use this blog post to talk about some of the process we went through while writing this paper. Please note that everything stated in this post are my own opinions and thoughts, not Disney’s.

This project started over a year ago, when we began an effort to significantly overhaul and improve Hyperion’s volume rendering system. Around the same time that we began to revisit volume rendering, we heard a lecture from a visiting professor on multilevel Monte Carlo (MLMC) methods. Although the final paper has nothing to do with MLMC methods, the genesis of this project was in initial conversations we had about how MLMC methods might be applied to volume rendering. We concluded that MLMC could be applicable, but weren’t entirely sure how. However, these conversations eventually gave Peter the idea to develop the technique that would eventually become decomposition tracking (importantly, decomposition tracking does not actually use MLMC though). Further conversations about weighted delta tracking then led to Peter developing the core ideas behind what would become spectral tracking. After testing some initial implementations of these prototype version of decomposition and spectral tracking, Peter, Ralf, and I shared the techniques with Jan. Around the same time, we also shared the techniques with our sister teams, Pixar’s RenderMan development group in Seattle and the Pixar Research Group in Emeryville, who were able to independently implement and verify our techniques. Being able to share research between Walt Disney Animation Studios, Disney Research, the Renderman group, Pixar Animation Studios, Industrial Light & Magic, and Imagineering is one of the reasons why Disney is such an amazing place to be for computer graphics folks.

At this point we had initial rudimentary proofs for why decomposition and spectral tracking worked separately, but we still didn’t have a unified framework that could be used to explain and combine the two techniques. Together with Jan, we began by deep-diving into the origins of delta/Woodcock tracking in neutron transport and reactor physics papers from the 1950s and 1960s and working our way forward to the present. All of the key papers we dug up during this deep-dive are cited in our paper. Some of these early papers were fairly difficult to find. For example, the original delta tracking paper, “Techniques used in the GEM code for Monte Carlo neutronics calculations in reactors and other systems of complex geometry” (Woodcock et al. 1965), is often cited in graphics literature, but a cursory Google search doesn’t provide any links to the actual paper itself. We eventually managed to track down a copy of the original paper in the archives of the United States Department of Commerce, which for some reason hosts a lot of archive material from Argonne National Laboratory. Since the original Woodcock paper has been in the public domain for some time now but is fairly difficult to find, I’m hosting a copy here for any researchers that may be interested.

Several other papers we were only able to obtain by requesting archival microfilm scans from several university libraries. I won’t host copies here, since the public domain status for several of them isn’t clear, but if you are a researcher looking for any of the papers that we cited and can’t find it, feel free to contact me. One particularly cool find was “The Relativistic Doppler Problem” (Zerby et al. 1961), which Peter obtained by writing to the Oak Ridge National Laboratory’s research library. Their staff were eventually able to find the paper in their records/archives, and subsequently scanned and uploaded the paper online. The paper is now publicly available here, on the United States Department of Energy’s Office of Scientific and Technical Information website.

Eventually, through significant effort from Jan, we came to understand Galtier et al.’s 2013 paper, “Integral Formulation of Null-Collision Monte Carlo Algorithms”, and were able to import the integral formulation into computer graphics and demonstrate how to derive both decomposition and spectral tracking directly from the radiative transfer equation using the integral formulation. This step also allowed Peter to figure out how to combine spectral and decomposition tracking into a single technique. With all of these pieces in place, we had the framework for our SIGGRAPH paper. We then put significant effort into working out remaining details, such as finding a good mechanism for bounding the free-path-sampling coefficient in spectral tracking. Producing all of the renders, results, charts, and plots in the paper also took an enormous amount of time; it turns out that producing all of this stuff can take significantly longer than the amount of time originally spent coming up with and implementing the techniques in the first place!

One major challenge we faced in writing the final paper was finding the best order in which to present the three main pieces of the paper: decomposition tracking, spectral tracking, and the integral formulation of null-collision algorithms. At one point, we considered first presenting decomposition tracking, since on a general level decomposition tracking is the easiest of the three contributions to understand. Then, we planned to use the proof of decomposition tracking to expand out into the integral formulation of the RTE with null collisions, and finally derive spectral tracking from the integral formulation. The idea was essentially to introduce the easiest technique first, expand out to the general mathematical framework, and then demonstrate the flexibility of the framework by deriving the second technique. However, this approach in practice felt disjointed, especially with respect to the body of prior work we wanted to present, which underpinned the integral framework but wound up being separated by the decomposition tracking section. So instead, we arrived on the final presentation order, where we first present the integral framework and derive out prior techniques such as delta tracking, and then demonstrate how to derive out new decomposition tracking and spectral tracking techniques. We hope that presenting the paper in this way will encourage other researchers to adopt the integral framework and derive other, new techniques from the framework. For Peter’s presentation at SIGGRAPH, however, Peter chose to go with the original order since it made for a better presentation.

Since our final paper was already quite long, we had to move some content into a separate supplemental document. Although the supplemental content isn’t necessary for implementing the core algorithms presented, I think the supplemental content is very useful for gaining a better understanding of the techniques. The supplemental content contains, among other things, an extended proof of the minimum-of-exponents mechanism that decomposition tracking is built on, various proofs related to choosing bounds for the local collision weight in spectral tracking, and various additional results and further analysis. We also provide a nifty interactive viewer for comparing our techniques against vanilla delta tracking; the interactive viewer framework was originally developed by Fabrice Rousselle, Jan Novák and Benedikt Bitterli at Disney Research Zürich.

One of the major advantages of doing rendering research at a major animation or VFX studio is the availability of hundreds of extremely talented artists, who are always eager to try out new techniques and software. Peter, Ralf, and I worked closely with a number of artists at WDAS to test our techniques and produce interesting scenes with which to generate results and data for the paper. Henrik Falt and Alex Nijmeh had created a number of interesting clouds in the process of testing our general volume rendering improvements, and worked with us to adapt a cloud dataset for use in Figure 11 of our paper. The following is one of the renders from Figure 11:

Image from paper Figure 11: an optically thick cloud rendered using decomposition tracking.

Henrik and Alex also constructed the cloudscape scene used as the banner image on the first page of the paper. After we submitted the paper, Henrik and Alex continued iterating on this scene, which eventually resulted in the more detailed version seen in our SIGGRAPH Fast Forward video. The version of the cloudscape used in our paper is reproduced below:

Image from paper Figure 1: a cloudscape rendered using spectral and decomposition tracking.

To test out spectral tracking, we wanted an interesting, dynamic, colorful dataset. After describing spectral tracking to Jesse Erickson, we arrived at the idea of a color explosion similar in spirit to certain visuals used in recent Apple and Microsoft ads, which in turn were inspired by the Holi festival celebrated in India and Nepal. Jesse authored the color explosion in Houdini and provided a set of VDBs for each color section, which we were then able to shade, light, and render using Hyperion’s implementation of spectral tracking. The final result was the color explosion from Figure 12 of the paper, seen at the top of this post. We were honored to learn that the color explosion figure was chosen to be one of the pictures on the back cover of this year’s conference proceedings!

At one point we also remembered that brute force path-traced subsurface scattering is just volume rendering inside of a bounded surface, which led to the translucent heterogeneous Stanford dragon used in Figure 15 of the paper:

Image from paper Figure 15: a subsurface scattering heterogeneous Stanford dragon rendered using spectral and decomposition tracking.

For our video for the SIGGRAPH 2017 Fast Forward, we were able to get a lot of help from a number of artists. Alex and Henrik and a number of other artists significantly expanded and improved the cloudscape scene, and we also rendered out several more color explosion variants. The final fast forward video contains work from Alex Nijmeh, Henrik Falt, Jesse Erickson, Thom Wickes, Michael Kaschalk, Dale Mayeda, Ben Frost, Marc Bryant, John Kosnik, Mir Ali, Vijoy Gaddipati, and Dimitre Berberov. The awesome title effect was thought up by and created by Henrik. The final video is a bit noisy since we were severely constrained on available renderfarm resources (we were basically squeezing our renders in between actual production renders), but I think the end result is still really great:

Here are a couple of cool stills from the fast forward video:

An improved cloudscape from our SIGGRAPH Fast Forward video.

An orange-purple color explosion from our SIGGRAPH Fast Forward video.

A green-yellow color explosion from our SIGGRAPH Fast Forward video.

We owe an enormous amount of thanks to fellow Hyperion teammate Patrick Kelly, who played an instrumental role in designing and implementing our overall new volume rendering system, and who discussed with us extensively throughout the project. Hyperion teammate David Adler also helped out a lot in profiling and instrumenting our code. We also must thank Thomas Müller, Marios Papas, Géraldine Conti, and David Adler for proofreading, and Brent Burley, Michael Kaschalk, and Rajesh Sharma for providing support, encouragement, and resources for this project.

I’ve worked on a SIGGRAPH Asia paper before, but working on a large scale publication in the context of a major animation studio instead of in school was a very different experience. The support and resources we were given and the amount of talent and help that we were able to tap into made this project possible. This project is also an example of the incredible value that comes from companies maintaining in-house industrial research labs; this project absolutely would not have been possible without all of the collaboration from DRZ, in both the form of direct collaboration from Jan and indirect collaboration from all of the DRZ researchers that provided discussions and feedback. Everyone worked really hard, but overall the whole process was immensely intellectually satisfying and fun, and seeing our new techniques in use by talented, excited artists makes all of the work absolutely worthwhile!

Subdivision Surfaces and Displacement Mapping

Two standard features that every modern production renderer supports are subdivision surfaces and some form of displacement mapping. As we’ll discuss a bit later in this post, these two features are usually very closely linked to each other in both usage and implementation. Subdivision and displacement are crucial tools for representing detail in computer graphics; from both a technical and authorship point of view, being able to represent more detail than is actually present in a mesh is advantageous. Applying detail at runtime allows for geometry to take up less disk space and memory than would be required if all detail was baked into the geometry, and artists often like the ability to separate broad features from high frequency detail.

I recently added support for subdivision surfaces and for both scalar and vector displacement to Takua; Figure 1 shows an ocean wave was rendered using vector displacement in Takua. The ocean surface is entirely displaced from just a single plane!

Figure 1: An ocean surface modeled as a flat plane and rendered using vector displacement mapping.

Both subdivision and displacement originally came from the world of rasterization rendering, where on-the-fly geometry generation was historically both easier to implement and more practical/plausible to use. In rasterization, geometry is streamed to through the renderer and drawn to screen, so each individual piece of geometry could be subdivided, tessellated, displaced, splatted to the framebuffer, and then discarded to free up memory. Old REYES Renderman was famously efficient at rendering subdivision surfaces and displacement surfaces for precisely this reason. However, in naive raytracing, rays can intersect geometry at any moment in any order. Subdividing and displacing geometry on the fly for each ray and then discarding the geometry is insanely expensive compared to processing geometry once across an entire framebuffer. The simplest solution to this problem is to just subdivide and displace everything up front and keep it all around in memory during raytracing. Historically though, just caching everything was never a practical solution since computers simply didn’t have enough memory to keep that much data around. As a result, past research work put significant effort into more intelligent raytracing architectures that made on-the-fly subdivision/displacement affordable again; notable papers include Matt Pharr’s 1996 geometry caching for raytracing paper, Brian Smits et al.’s 2000 direct raytracing of displacement mapped triangles paper, Johannes Hanika et al.’s 2010 reordered raytracing paper, and Takahiro Harada’s 2015 article on GPU raytraced vector displacement in GPU Pro 6.

In the past five years or so though, the story on raytraced displacement has changed. We now have machines with gobs and gobs of memory (at a number of studios, renderfarm nodes with 256 GB of memory or more is not unusual anymore). As a result, raytraced renderers don’t need to be nearly as clever anymore about managing displaced geometry; a combination of camera-adaptive tessellation and a simple geometry cache with a least-recently-used eviction strategy is often enough to make raytraced displacement practical. Heavy displacement is now common in the workflows for a number of production pathtracers, including Arnold, Renderman/RIS, Vray, Corona, Hyperion, Manuka, etc. With the above in mind, I tried to implement subdivision and displacement in Takua as simply as I possibly could.

Takua doesn’t have any concept of an eviction strategy for cached tessellated geometry; the hope is to just fit in memory and be as efficient as possible with what memory is available. Admittedly, since Takua is just my hobby renderer instead of a fully in-use production renderer, and I have personal machines with 48 GB of memory, I didn’t think particularly hard about cases where things don’t fit in memory. Instead of tessellating on-the-fly per ray or anything like that, I simply pre-subdivide and pre-displace everything upfront during the initial scene load. Meshes are loaded, subdivided, and displaced in parallel with each other. If Takua discovers that all of the subdivided and displaced geometry isn’t going to fit in the allocated memory budget, the renderer simply quits.

I should note that Takua’s scene format distinguishes between a mesh and a geom; a mesh is the raw vertex/face/primvar data that makes up a surface, while a geom is an object containing a reference to a mesh along with transformation matrices, shader bindings, and so on and so forth. This separation between the mesh data and the geometric object allows for some useful features in the subdivision/displacement system. Takua’s scene file format allows for binding subdivision and displacement modifiers either on the shader level, or per each geom. Bindings at the geom level override bindings on the shader level, which is useful for authoring since a whole bunch of objects can share the same shader but then have individual specializations for different subdivision rates and different displacement maps and displacement settings. During scene loading, Takua analyzes what subdivisions/displacements are required for which meshes by which geoms, and then de-duplicates and aggregates any cases where different geoms want the same subdivision/displacement for the same mesh. This de-duplication even works for instances (I should write a separate post about Takua’s approach to instancing someday…).

Once Takua has put together a list of all meshes that require subdivision, meshes are subdivided in parallel. For Catmull-Clark subdivision, I rely on OpenSubdiv for calculating subdivision stencil tables, evaluating the stencils, and final tessellation. As far as I can tell, stencil calculation in OpenSubdiv is single threaded, so it can get fairly slow on really heavy meshes. Stencil evaluation and final tessellation is super fast though, since OpenSubdiv provides a number of parallel evaluators that can run using a variety of backends ranging from TBB on the CPU to CUDA or OpenGL compute shaders on the GPU. Takua currently relies on OpenSubdiv’s TBB evaluator. One really neat thing about the stencil implementation in OpenSubdiv is that the stencil calculation is dependent on only the topology of the mesh and not individual primvars, so a single stencil calculation can then be reused multiple times to interpolate many different primvars, such as positions, normals, uvs, and more. Currently Takua doesn’t support creases; I’m planning on adding crease support later.

No writing about subdivision surfaces is complete without a picture of a cube being subdivided into a sphere, so Figure 2 shows a render of a cube with subdivision levels 0, 1, 2, and 3, going from left to right. Each subdivided cube is rendered with a procedural wireframe texture that I implemented to help visualize what was going on with subdivision.

Figure 2: A cube with 0, 1, 2, and 3 subdivision levels, going from left to right.

Each subdivided mesh is placed into a new mesh; base meshes that require multiple subdivision levels for multiple different geoms get one new subdivided mesh per subdivision level. After all subdivided meshes are ready, Takua then runs displacement. Displacement is parallelized both by mesh and within each mesh. Also, Takua supports both on-the-fly displacement and fully cached displacement, which can be specified per shader or per geom. If a mesh is marked for full caching, the mesh is fully displaced, stored as a separate mesh from the undisplaced subdivision mesh, and then a BVH is built for the displaced mesh. If a mesh is marked for on-the-fly displacement, the displacement system calculates each displaced face, then calculates the bounds for that face, and then discards the face. The displaced bounds are then used to build a tight BVH for the displaced mesh without actually having to store the displaced mesh itself; instead, just a reference to the undisplaced subdivision mesh has to be kept around. When a ray traverses the BVH for an on-the-fly displacement mesh, each BVH leaf node specifies which triangles on the undisplaced mesh need to be displaced to produce final polys for intersection and then the displaced polys are intersected and discarded again. For the scenes in this post, on-the-fly displacement seems to be about twice as slow as fully cached displacement, which is to be expected, but if the same mesh is displaced multiple different ways, then there are correspondingly large memory savings. After all displacement has been calculated, Takua goes back and analyzes which base meshes and undisplaced subdivision meshes are no longer needed, and frees those meshes to reclaim memory.

I implemented support for both scalar displacement via regular grayscale texture maps, and vector displacement from OpenEXR textures. The ocean render from the start of this post uses vector displacement applied to a single plane. Figure 3 shows another angle of the same vector displaced ocean:

Figure 3: Another view of the vector displaced ocean surface from Figure 1. The ocean surface has a dielectric refractive material complete with colored attenuated transmission. A shallow depth of field is used to lend added realism.

For both ocean renders, the vector displacement OpenEXR texture is borrowed from Autodesk, who generously provide it as part of an article about vector displacement in Arnold. The renders are lit with a skydome using’s HDRI Sky 193 texture.

For both scalar and vector displacement, the displacement amount from the displacement texture can be controlled by a single scalar value. Vector displacement maps are assumed to be in a local tangent space; which axis is used as the basis of the tangent space can be specified per displacement map. Figure 4 shows three dirt shaderballs with varying displacement scaling values. The leftmost shaderball has a displacement scale of 0, which effectively disables displacement. The middle shaderball has a displacement scale of 0.5 of the native displacement values in the vector displacement map. The rightmost shaderball has a displacement scale of 1.0, which means just use the native displacement values from the vector displacement map.

Figure 4: Dirt shaderballs with displacement scales of 0.0, 0.5, and 1.0, going from left to right.

Figure 5 shows a closeup of the rightmost dirt shaderball from Figure 4. The base mesh for the shaderball is relatively low resolution, but through subdivision and displacement, a huge amount of geometric detail can be added in-render. In this case, the shaderball is tessellated to a point where each individual micropolygon is at a subpixel size. The model for the shaderball is based on Bertrand Benoit’s shaderball. The displacement map and other textures for the dirt shaderball are from Quixel’s Megascans library.

Figure 5: Closeup of the dirt shaderball from Figure 4. In this render, the shaderball is tessellated and displaced to a subpixel resolution.

One major challenge with displacement mapping is cracking. Cracking occurs when adjacent polygons displace the same shared vertices different ways for each polygon. This can happen when the normals across a surface aren’t continuous, or if there is a discontinuity in either how the displacement texture is mapped to the surface, or in the displacement texture itself. I implemented an optional, somewhat brute-force solution to displacement cracking. If crack removal is enabled, Takua analyzes the mesh at displacement time and records how many different ways each vertex in the mesh has been displaced by different faces, along with which faces want to displace that vertex. After an initial displacement pass, the crack remover then goes back and for every vertex that is displaced more than one way, all of the displacements are averaged into a single displacement, and all faces that use that vertex are updated to share the same averaged result. This approach requires a fair amount of bookkeeping and pre-analysis of the displaced mesh, but it seems to work well. Figure 6 is a render of two cubes with geometric normals assigned per face. The two cubes are displaced using the same checkerboard displacement pattern, but the cube on the left has crack removal disabled, while the cube on the right has crack removal enabled:

Figure 6: Displaced cubes with and without crack elimination.

In most cases, the crack removal system seems to work pretty well. However, the system isn’t perfect; sometimes, stretching artifacts can appear, especially with surfaces with a textured base color. This stretching happens because the crack removal system basically stretches micropolygons to cover the crack. This texture stretching can be seen in some parts of the shaderballs in Figures 5, 7, and 8 in this post.

Takua automatically recalculates normals for subdivided/displaced polygons. By default, Takua simply uses the geometric normal as the shading normal for displaced polygons; however, an option exists to calculate smooth normals for the shading normals as well. I chose to use geometric normals as the default with the hope that for subpixel subdivision and displacement, a different shading normal wouldn’t be as necessary.

In the future, I may choose to implement my own subdivision library, and I should probably also put more thought into some kind of proper combined tessellation cache and eviction strategy for better memory efficiency. For now though, everything seems to work well and renders relatively efficiently; the non-ocean renders in this post all have sub-pixel subdivision with millions of polygons and each took several hours to render at 4K (3840x2160) resolution on a machine with dual Intel Xeon X5675 CPUs (12 cores total). The two ocean renders I let run overnight at 1080p resolution; they took longer to converge mostly due to the depth of field. All renders in this post were shaded using a new, vastly improved shading system that I’ll write about at a later point. Takua can now render a lot more complexity than before!

In closing, I rendered a few more shaderballs using various displacement maps from the Megascans library, seen in Figures 7 and 8.

Figure 7: A pebble sphere and a leafy sphere. Note the overhangs on the leafy sphere, which are only possible using vector displacement.

Figure 8: A compacted sand sphere and a stone sphere. Unfortunately, there is some noticeable texture stretching on the compacted sand sphere where crack removal occured.