Extracting sufficient coherency from path tracing in order to be able to get good SIMD utilization is a surprisingly difficult problem that much research effort has been poured into, and Moonray has a really interesting solution!
This paper is the first place I've found a production use of Knights Landing / Xeon Phi, Intel's massively multicore Atom-with-AVX512 accelerator system, outside of HPC / science use cases.
It seems like Dreamworks ended up having to add GPU acceleration using CUDA and OptiX on top of the SIMD/AVX and Embree work - the released OpenMoonRay code supports both.
It kind of feels like Dreamworks got burned a little by Intel here - they invested a ton of research effort (alongside Intel) in Embree, SIMD, and Knights Landing, and then had to add another implementation for OptiX/GPU/SIMT anyway.
Probably not. Even if the code were entirely reliant on the compiler performing vectorization (GPU and CPU instruction sets are, at the very least, nominally quite different), GPU programming is quite different even from a "lots of little CPUs" style of CPU programming.
Yes, the instruction set is quite commonly supported, but Knights Landing was a whole different beast. The main focus of KNL was to maximize AVX-512 throughput for scientific workloads, but the processor ended up being a business failure.
This is exactly why I jumped into the comments. I was hoping someone had some relevant implementation details that isn't just a massive GitHub repo (which is still awesome, but hard to digest in one sitting).
> Extracting sufficient coherency from path tracing in order to be able to get good SIMD utilization is a surprisingly difficult problem
Huh, I'd have assumed SIMD would just be exploited to improve quality without a perf hit, by turning individual paths into ever so slightly dispersed path-packets likely to still intersect the same objects. More samples per path traced...
If you only ever-so-slightly perturb paths, you generally don't get anywhere near as much of a benefit from Monte Carlo integration, especially for things like light transport at a global, non-local scale (it might plausibly be useful for splitting for scattering bounces or something in some cases).
So it's often worth paying the penalty of having to sort rays/hitpoints into batches to intersect/process them more homogeneously, at least in terms of noise variance reduction per progression.
But very much depends on overall architecture and what you're trying to achieve (i.e. interactive rendering, or batch rendering might also lead to different solutions, like time to first useful pixel or time to final pixel).
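To make the "sort rays/hitpoints into batches" idea concrete, here's a minimal sketch in C++. It's a hypothetical illustration, not MoonRay's actual queuing code, and HitPoint / shadeBatch are made-up names: defer shading until hits can be grouped by the shader they'll run, so each group executes one code path over contiguous data.

    // Minimal sketch of the "sort hit points into coherent batches" idea.
    // Illustrative only -- not MoonRay's actual queuing/batching code.
    #include <algorithm>
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    struct HitPoint {
        uint32_t shaderId;   // which material/shader will shade this hit
        float    u, v;       // surface parameters
        float    px, py, pz; // hit position
    };

    // Instead of shading each hit as soon as its ray returns (incoherent),
    // accumulate hits, sort them by shader, then shade contiguous runs.
    // Each run executes one shader's code over adjacent data, which is what
    // makes a vectorized (SIMD) shading kernel worthwhile.
    void shadeCoherently(std::vector<HitPoint>& hits) {
        std::sort(hits.begin(), hits.end(),
                  [](const HitPoint& a, const HitPoint& b) {
                      return a.shaderId < b.shaderId;
                  });

        std::size_t i = 0;
        while (i < hits.size()) {
            std::size_t j = i;
            while (j < hits.size() && hits[j].shaderId == hits[i].shaderId) ++j;
            // shadeBatch(hits.data() + i, j - i);  // placeholder for a
            // vectorized shading kernel run over the run hits[i..j)
            i = j;
        }
    }

    int main() {
        std::vector<HitPoint> hits = {
            {2, 0.1f, 0.2f, 0.f, 0.f, 0.f},
            {0, 0.3f, 0.4f, 1.f, 0.f, 0.f},
            {2, 0.5f, 0.6f, 0.f, 1.f, 0.f},
        };
        shadeCoherently(hits);
        return 0;
    }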
Are there any comparisons to GPU-accelerated rendering? It seems most people are going in that direction rather than trying to optimize for CPUs these days, especially via AVX instructions.
CPUs are still king at the scale Dreamworks/Pixar/etc operate at, GPUs are faster up to a point but they hit a wall in extremely large and complex scenes. They just don't have enough VRAM, or the work is too divergent and batches too small to keep all the threads busy. In recent years the high-end renderers (including MoonRay) have started supporting GPU rendering alongside their traditional CPU modes, but the GPU mode is meant for smaller scale work like an artist iterating on a single asset, and then for larger tasks and final frame rendering it's still off to the CPU farm.
So the idea is that one CPU can have hundreds of gigabytes of RAM at a time, and the speed of the CPU is no problem because you can scale the process over as many CPUs as you want?
It's more to do with the movies using higher-fidelity assets than you'd typically use for a game, which is what GPUs are made for. In a movie, a single element of a scene might have as many polygons as an entire character in a video game, and that's because of the differences in how you 'film' them.
Imagine a brick wall rendered for a video game vs. one rendered for a movie. The one for the game is probably going to be a plane with a couple of textures on it, because the wall is a background element that the player isn't going to get up close with. In the movie, the wall is more likely to be made of a handful of individually modeled bricks with much more detailed surface textures, because maybe the director wants the camera to start really close to the surface of that wall and then pull out to a wider shot. That means the individual brick you start zoomed in on might have a 4K texture for itself alone, whereas the entire wall in the video game could easily be a single 4K texture, since the player never gets close enough to notice the missing detail.
Now multiply that level of detail across every rendered thing in the scene, because the director may want to reframe the shot, or you need realistic lighting to sell that a rendered thing is integrated with filmed footage and 'real'. Every little bit of that detail adds more data you have to track. So in my wall example, you might have 2 or 3 4k textures vs literally hundreds for all the bricks, grout, defects, chipped faces, etc of a movie quality wall.
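Some rough back-of-the-envelope arithmetic (my own numbers, assuming uncompressed 8-bit RGBA and ignoring mipmaps and compression) shows how quickly "hundreds of 4K textures" stops fitting in GPU memory:

    // Back-of-the-envelope texture memory, assuming uncompressed 8-bit RGBA
    // and no mipmaps. Real pipelines use compression, tiled/mipped formats
    // and higher bit depths, so treat these as rough orders of magnitude.
    #include <cstdio>

    int main() {
        const double bytesPer4k = 4096.0 * 4096.0 * 4.0;   // ~64 MiB each
        const double gameWall   = 3.0   * bytesPer4k;      // a few textures
        const double filmWall   = 300.0 * bytesPer4k;      // hundreds of maps
        std::printf("one 4K RGBA texture : %.1f MiB\n", bytesPer4k / (1 << 20));
        std::printf("game-style wall     : %.1f MiB\n", gameWall   / (1 << 20));
        std::printf("film-style wall     : %.2f GiB\n", filmWall   / (1 << 30));
        return 0;
    }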
The work is always divided per frame, yes. Everything is baked, so that happens easily even with water simulations; they've been reduced to some form of geometry or something similar that can be rendered frame by frame.
VRAM is indeed one of the main issues. But, as someone else said, I believe cost per final pixel is still lower on CPU. That was particularly true during the GPU shortage.
As someone said above: GPUs are fine & faster as long as your scene stays simple. As soon as you hit a certain scene-complexity ceiling, they become much slower than CPU renderers.
I would also argue that for this specific task, i.e. offline rendering of such frames, the engineering overhead of making stuff work on GPUs is better spent making stuff faster and scaling more efficiently on CPUs.[1]
I worked in blockbuster VFX for 15 years. It's been a while, but I have a network of people in that industry, many working on these renderers. The above is kinda the consensus whenever I talk to them.
[1] With the aforementioned caveat: if the stuff you work on is always under that complexity ceiling targeting GPUs can certainly make sense.
It's easier said than done; there's consistently a huge gulf between CPU and GPU memory limits. Even run-of-the-mill consumer desktops can run 128GB of RAM, which exceeds even the highest-end professional GPU's VRAM, and the sky is the limit with workstation and server platforms. AMD EPYC can support 2TB of memory!
It's not just about memory. The path tracing algorithm is a natural fit for CPU threads but very difficult to design for efficient use of GPU threads. It's very easy to leave many of your GPU's threads idle due to divergence, register pressure, and any number of other things that come naturally with path tracing.
GPUs need RAM that can handle a lot of bandwidth, so that all of the execution units can remain constantly fed. Bandwidth has both a width and a rate of transfer (often bounded by the clock speed), which combined yield the overall bandwidth, i.e. a 384-bit bus at XXXX million transfers/second. It will never matter how much compute or RAM they have if these don't align and you can't feed the cores. Modern desktop DDR has bandwidth that is, in general, too low for this, given the compute characteristics of a modern GPU, which has shitloads of compute. On top of that, signal integrity on parallel RAM interfaces has very tight tolerances; DDR sockets are very carefully placed on motherboards with this in mind, for instance. GDDR, which most desktop-class graphics cards use instead of normal DDR, has much higher bandwidth (e.g. GDDR6X offers 21 Gbps/pin while DDR5-4800 is only around 4.8 Gbps/pin) but even tighter interface characteristics. That's one reason why you can't socket GDDR: the physical tolerances required for the interface are extremely tight, and the signal integrity required means a socket is out of the question.
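To put rough numbers on the width times rate point (the per-pin rates below are common published figures, used purely for illustration):

    // Theoretical peak bandwidth = (bus width in bytes) * (data rate per pin).
    #include <cstdio>

    double peakGBs(int busWidthBits, double gbpsPerPin) {
        return busWidthBits * gbpsPerPin / 8.0; // bits -> bytes
    }

    int main() {
        // e.g. a 384-bit GDDR6X bus at 21 Gbps/pin vs. a 128-bit (dual-channel)
        // DDR5-4800 setup at 4.8 Gbps/pin
        std::printf("GDDR6X, 384-bit @ 21 Gbps/pin  : %.0f GB/s\n", peakGBs(384, 21.0));
        std::printf("DDR5,   128-bit @ 4.8 Gbps/pin : %.1f GB/s\n", peakGBs(128, 4.8));
        return 0;
    }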
Here is an example: go compare the RAM interfaces of an Nvidia A100 (HBM2) versus an Nvidia 3080 (GDDR6X), and see how this impacts performance. On bandwidth-bound workloads, an A100 will absolutely destroy a 3080 in terms of overall efficiency. One reason for this is that the A100's stacked-HBM memory interface is far wider, which is absolutely vital for lots of workloads: far more data can be fed into the execution units in the same clock cycle. That means you can clock the overall system lower, and that means you're using less power, while achieving similar (or better) performance. The only way a 3080, with its much narrower 320-bit GDDR6X bus, can compare is by pushing the clocks higher (thus increasing the rate of transfers/second), but that causes more heat and power usage, and it scales very poorly in practice, e.g. a 10% clock speed increase might result in a measly 1-2% improvement.
So now, a bunch of things fall out of these observations. You can't have extremely high-bandwidth RAM today without very tight interface characteristics. For desktops and server-class systems, CPUs don't need bandwidth like GPUs do, so they can get away with sockets. That has some knock-on benefits; CPU memory can benefit from economies of scale on RAM sticks, for example: lots of people need RAM sticks, so you're in a good spot to buy more. And because sockets exist "in three dimensions", there's a huge increase in "density per square inch" on the motherboard. If you want a many-core GPU to remain fed, you need soldered RAM, which necessitates a fixed SKU for deployment, or you need to cut down on the compute so lower-bandwidth memory can feed things appropriately, negating the reason you went to GPUs in the first place (more parallel compute). Soldered RAM also means the compute/memory ratio is now fixed forever. One nice thing about a CPU with sockets is that you can more flexibly arbitrage resources over time; if you find a way to speed something up with more RAM, you can just add it, assuming you aren't maxed out.
Note that Apple Silicon is designed for lower power profiles; it has good perf/watt, not necessarily the best overall performance in every profile. It uses 256- or 512-bit LPDDR5/LPDDR5X, and apparently even goes as high as 1024-bit(!!!) on the Ultra parts. But they can't just ignore the laws of physics; at extremely high bandwidths and bus widths you're going to be very subject to signal interface requirements. You have physical limitations that prevent the bountiful RAM sticks that each have multiple, juicy Samsung DDR5 memory chips on them. The density suffers. So Apple is limited to only so much RAM; there's very little way around this unless they start stacking in three dimensions or something. That's one of the other reasons they have likely used soldered memory for so long now; it simply makes extremely high-performance interfaces like this possible.
All in all the economies of scale for RAM sticks combined with their density means that GPUs will probably continue to be worse for workloads that benefit from lots of memory. You just can't meet the combined physical interface and bandwidth requirements at the same density levels.
Do you think there’s any hope for UMA on PC / x86 systems? Seems like Intel would have an incentive to offer parts, but would it be possible to remain Windows/legacy OS compatible with a UMA implementation?
Redshift is cool; I use it and I know many studios that use it.
It just feels "young". As a 3D artist, when I see the absurd level of shader detail and edge-case solutions in something like V-Ray, you know that level of software detail only comes from the years it's been in the field, taking in and solving customer feedback.
Those are generally being used on much smaller productions, or at least "simpler"-fidelity things (i.e. non-photoreal CG animation like Blizzard's Overwatch).
So for Pixar/Dreamworks-style things (they look great, but aren't photo-real) they're usable and provide a definite benefit in terms of iteration time for lookdev artists and lighters, but it's not there yet in terms of high-end rendering at scale.
I talked to some folks who worked there, years ago, and was surprised they didn't use GPUs. I got the impression that the software was largely based on code that dated back to the 1990s.
Quality 3D animation software is available to anyone with Blender. If someone gets this renderer working as an add-on (which will obviously happen), artists will get a side-by-side comparison of what their work looks like with both Cycles and a professional studio product, for free.
This is win, win, win for Blender, OSS and the community.
This. Pixar's RenderMan has been an "option" for a while, though it was out of band. The Cycles team will look at the theory behind what's going on in renderers like this and will make the tech work inside Cycles. Maybe someone will port this as another render option, but really the sauce is the lighting models and parallel vectorization, which could improve Cycles' already abysmally slow render times.
Surprised nobody has mentioned this, but it looks like it implements the render kernels in ISPC, which is a tool that exposes a CUDA-like SPMD model that runs over the vector lanes of the CPU.
Vectorization is the best part of writing Fortran. This looks like it makes it possible to write Fortran-like code in C. I wonder how it compares to ifort / OpenMP?
OpenMP starts with a serial model, the programmer tags the loop they want to run in parallel with a directive, and the compiler tries to vectorize the loop. This can always fail, since it relies on a computer program reasoning about an arbitrary block of code. So you have to really dig into the SIMD directives and understand their limitations in order to write performant code that actually does get vectorized.
ISPC starts with an SPMD programming model, and exposes only operations that conform to the model. "Vectorization" is performed by the developer - it's up to them to figure out how to map their algorithm or problem to the vector lanes (just like in CUDA or OpenCL). So there's no unexpectedly falling off the vectorization path - you start on it, and you can only do stuff that stays on it, by design.
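A toy contrast of the two models, written in plain C++ (this is not ISPC syntax; it's only meant to illustrate who is responsible for the mapping to vector lanes):

    // Toy contrast between the two models described above, in plain C++.
    // This is NOT ISPC syntax; it only illustrates the mental shift.
    #include <cstddef>

    // OpenMP style: write a serial loop and ask the compiler to vectorize it.
    // If the body does something the compiler can't prove safe (aliasing,
    // complex control flow), vectorization can quietly not happen.
    void saxpy_omp(std::size_t n, float a, const float* x, float* y) {
        #pragma omp simd
        for (std::size_t i = 0; i < n; ++i)
            y[i] = a * x[i] + y[i];
    }

    // SPMD style (the model ISPC/CUDA encourage): the programmer thinks in
    // terms of one program instance per lane and lays the data out so a whole
    // lane-wide batch is processed together. There is no scalar fallback to
    // fall off of -- the mapping to lanes is explicit by construction.
    constexpr int kLanes = 8; // e.g. one AVX2 register of floats

    struct FloatBatch { float v[kLanes]; };

    void saxpy_spmd(std::size_t nBatches, float a,
                    const FloatBatch* x, FloatBatch* y) {
        for (std::size_t b = 0; b < nBatches; ++b)
            for (int lane = 0; lane < kLanes; ++lane) // maps 1:1 to vector lanes
                y[b].v[lane] = a * x[b].v[lane] + y[b].v[lane];
    }

    int main() {
        float x[8] = {1, 2, 3, 4, 5, 6, 7, 8}, y[8] = {};
        saxpy_omp(8, 2.0f, x, y);

        FloatBatch xb{{1, 2, 3, 4, 5, 6, 7, 8}}, yb{};
        saxpy_spmd(1, 2.0f, &xb, &yb);
        return 0;
    }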
It's an offline 3D rendering software that turns a scene description into a photorealistic image. Usually such a description is for a single frame of animation.
Offline being the opposite of realtime. I.e. a frame taking possibly hours to render whereas in a realtime renderer it must take fractions of a second.
Maybe think of it like a physical camera in a movie, and a very professional one at that. But then a camera doesn't get you very far if you consider the list of people you see when the credits roll by. :]
Similarly, at the very least, you need something to feed the renderer a 3D scene, frame by frame. Usually this is a DCC app like Maya, Houdini, etc., or something created in-house. That's where you do your animation, after you've created the stuff you want to animate and the sets where it lives ... etc., etc.
Moonray has a Hydra USD delegate. That is an API to send such 3D scenes to a renderer. There is one for Blender too[1]. That would be one way to get data in there, I'd reckon.
In the most casual sense, a renderer is what "takes a picture" of the scene.
A scene is made of objects, light sources, and a camera. The renderer calculates the reflection of light on the objects' surfaces from the perspective of the camera, so that it can decide what color each pixel is in the resulting image.
Objects are made up of a few different data structures: one for physical shape (usually a "mesh" of triangles); one for "texture" (color mapped across the surface); and one for "material" (alters the interaction of light, like adding reflections or transparency).
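As a very rough sketch of "taking a picture" (every name here is hypothetical, and everything that makes a real renderer hard is omitted), the core job looks something like this:

    // Extremely simplified sketch of a renderer's core loop: for every pixel,
    // build a camera ray and ask the scene what color it sees. All names here
    // are made up for illustration; real renderers are vastly more involved.
    #include <cstddef>
    #include <vector>

    struct Vec3 { float x, y, z; };
    struct Ray  { Vec3 origin, dir; };

    struct Scene {
        // Placeholder: a real scene holds meshes, an acceleration structure,
        // materials and lights, and shade() would intersect geometry and
        // possibly trace more rays for reflections / global illumination.
        Vec3 shade(const Ray& ray) const {
            return {0.5f + 0.5f * ray.dir.x, 0.5f, 0.8f};
        }
    };

    std::vector<Vec3> render(const Scene& scene, int width, int height) {
        std::vector<Vec3> image(static_cast<std::size_t>(width) * height);
        for (int y = 0; y < height; ++y) {
            for (int x = 0; x < width; ++x) {
                // Map the pixel to a direction through a pinhole camera.
                Ray ray{{0.f, 0.f, 0.f},
                        {(x + 0.5f) / width - 0.5f,
                         (y + 0.5f) / height - 0.5f,
                         -1.0f}};
                image[static_cast<std::size_t>(y) * width + x] = scene.shade(ray);
            }
        }
        return image;
    }

    int main() {
        Scene scene;
        std::vector<Vec3> image = render(scene, 64, 64);
        return image.empty() ? 1 : 0;
    }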
People don't write the scene data by hand: they use tools to construct each object, often multiple tools for each data structure. Some tools focus on one feature: like ZBrush for "sculpting" a mesh object shape. Other tools can handle every step in the pipeline. For example, Blender can do modeling, rigging, animation, texturing and material definition, rendering, post-processing, and even video editing; and that's leaving out probably 95% of its entire feature set.
If you are interested at all in exploring 3D animation, I recommend downloading Blender. It's free software licensed under GPLv3, and runs well on every major platform. It's incredibly full-featured, and the UI is excellent. Blender is competitive with nearly every 3D digital art tool in existence; particularly for animation and rendering.
It's by and large mathematical software, like all renderers. So it isn't interactive in the manner of software that lets you move a character model and sequence frames to make an animation. It's a kind of "kernel", in some sense, for animation and 3D modelling software.
The source files contain the algorithms/computations needed to solve the various equations that people in computer graphics research have come up with to simulate various physical/optical phenomena (lighting, shadows, water reflections, smoke, waves) in the most efficient (fast) and usually photorealistic way, for a single image (static scene) already created (character/landscape models, textures) in another program.
Since there are various different techniques for the simulation of one specific phenomenon, it's interesting to peek into the tricks used by a very large animation studio.
I have no experience with MoonRay, but it being a renderer, the answer would be... no.
The renderer is only one piece of the entire animated movie production pipeline.
Modeling -> Texturing / Rigging -> Animation -> Post-processing effects -> Rendering -> Video editing
That's a simplified view of the visual part of producing a short or long CGI film.
It is a lot of knowledge to acquire, so a production team is likely made of specialists and sub-specialists (lighting?) working to a degree together.
The best-achieving software, especially given its affordability, is likely Blender. Other tools like Cinema 4D, Maya, and of course 3ds Max are also pretty good all-in-one products that cover the whole pipeline, although pricey.
Start with modeling, then texturing, then animation, etc. Then dive into the slice that attracts you the most. Realistically you aren't going to ship a professional-grade film, so you may as well just learn what you love, and who knows, perhaps one day you'll become a professional and appear in the long list of credits at the end of a Disney/Pixar or Dreamworks hit.
> Modeling -> Texturing / Rigging -> Animation -> Post-processing effects -> Rendering -> Video editing
In animation (and VFX), editing comes at the beginning. Throwing away frames (and all the work done to create them) is simply too expensive. Handles (the extra frames at the beginning and end of a shot) are usually very small, I'd say <5 frames.
Also modeling & texturing and animation usually happen in parallel. Later, animation and lighting & rendering usually happen in parallel as well.
Are there books that teach people about the sorts of systems used to make animated movies? I’ve seen game engine books and the like. Physically based rendering is on my list, but I wonder if there are other interesting reads I’m missing.
Yes, there are classic books from the '90s like "The RenderMan Companion" or "Advanced RenderMan", and then there are tooling books for each tool. I used to own many Maya and 3ds Max books.
MoonRay is a renderer that creates photorealistic images of computer-generated 3D scenes, using a technique called Monte Carlo ray tracing. MoonRay can be used as part of an animation project, but it is not an animation tool itself. Instead, it is a rendering engine that produces the final images that make up the animation.
To create an animated movie using MoonRay, you would need to use other tools to create the 3D models, textures, and animations that make up the scenes in your movie. Some examples of these tools include Autodesk Maya, Blender, and Cinema 4D. These tools allow you to create and manipulate 3D models, animate them, and add textures and lighting to create the final look of your scenes.
In addition to these 3D modeling and animation tools, you would also need to have a basic understanding of computer graphics and animation principles. This includes concepts such as keyframe animation, camera movement, lighting, and composition.
Once you have created your 3D scenes, you can use MoonRay to render them into high-quality images that can be used in your final animated movie. MoonRay can render images on a single computer, or it can be used with cloud rendering services to speed up the rendering process.
In summary, MoonRay is a rendering engine that produces photorealistic images of 3D scenes created using other 3D modeling and animation tools. To create an animated movie using MoonRay, you would need to use additional tools to create the scenes and have a basic understanding of computer graphics and animation principles.
I'm curious: what is the incentive for Dreamworks to open-source this? Surely having exclusive access to a parallel renderer of this quality is a competitive advantage over other studios?
I can imagine a few reasons why they'd do this, but some of it may just be 'why not'. Studio Ghibli has done the same thing with their animation software and it hasn't turned into a disaster for them. Making movies, especially movies that people will pay to watch is hard, and any serious competitors already have their own solutions. If people use moonray and that becomes a popular approach, competitors who don't use it are at a disadvantage from a hiring perspective. Also, DreamWorks controls the main repo of what may become a popular piece of tooling. There's soft power to be had there.
The competitive advantage is in storytelling, not necessarily visual fidelity. People will watch a somewhat worse looking movie with a better story than a better looking movie with a worse story. And honestly, can anyone really tell slightly worse graphical quality these days when so many animated movies already look good?
The exception, of course, is James Cameron and his Avatar series. People will absolutely watch something that looks 10x better because the visual fidelity itself is the draw, it's the main attraction over the story. This is usually not the case in most movies however.
The rendering in the Avatar movies is at the cutting edge. But quite apart from the very uninteresting storytelling, there's something there that just doesn't work for me visually - I don't know if it's the uncanny valley effect of the giant skinny blue people with giant eyes or what, but I'd definitely rather watch something creative and painterly like the Puss in Boots movie, or even something like The Last of Us with really well integrated CG visuals and VFX that aren't necessarily top of the line, but are well integrated and support a good story.
Did you watch in IMAX 3D? I watched in both 3D and 2D and the 2D simply cannot compare to the 3D. The way most 3D movies work is the 3D effects are done after the fact in post-production. 3D in Avatar movies are done entirely in the shooting phase, through 3D cameras. Hence, the 3D in Avatar films is much more immersive to me than in something like Dr Strange 2, which simply could not compare.
I try to find 3D movies, but so few of them are made. And like you say, most of them are automated billboard extractions rather than actually shot with 2 cameras.
I haven't seen the second Avatar film at all; my observations are merely from seeing the first one in 3D and the trailers for the second. I'm aware that it's shot entirely 3D as well. While I was wowed at the 3D effect when I saw the first one, the thrill of that entirely wore off within a week or two and is not a big enough draw for me to see the second. I don't think I'm in the minority here, 3D was huge in cinemas for a year or so after the first Avatar film and then interest from the general public waned as well, probably in part due to subsequent 3D films using the post-production method and I agree the effect is not as good.
3D (including "real 3D") just doesn't seem to be the drawcard though that the geek community seems to think it is - I think the public in general would prefer better story, CG that serves the film and the story etc. And that is why I probably won't see the second - the story is not strong or interesting enough for me, and even the "wow" full 3D effect is not strong enough to pull me back, given the uncanny valley effect of the characters and the lacklustre story.
I'm talking about 3D in the Avatar series specifically, not 3D (including real 3D) as a whole. Avatar has made billions of dollars so far so it's doing something right. But yes, I agree that generally speaking, people like better stories with worse CGI than vice versa, it's just that James Cameron's movies are an exception to that rule.
At this point every studio has their own renderer, Pixar has RenderMan, Illumination has one from MacGuff, Disney has their Hyperion, and Animal Logic has Glimpse.
They just released a feature film with this renderer, grossing $462 million and widely praised for its animation.
Large studios don't update so regularly vs e.g. a startup. They have very specific setups, which is in fact a large part of why it took them so long to release moonray vs when they said they would last year. And they are moving to Rocky Linux soon IIRC.
>dumping it into FOSS community
They are not "dumping" anything. Would it have hurt to look into the facts before commenting?
re: CentOS7, the VFX Reference Platform (https://vfxplatform.com/) is probably relevant here. Their latest Linux Platform Recommendations report from August last year already covers migrations off of CentOS 7 / RHEL7 before the end of maintenance in 2024
Studios don't want to upgrade fast (e.g. they're not interested in running Debian unstable or CentOS's streaming updates thing)... they're interested in stability for hundreds of artists' workstations.
Getting commercial Linux apps like Maya, Houdini, Nuke, etc. working well at scale is hard enough without the underlying OS changing all the time.
Major animated movies take years to develop, and they don’t like to change the build process during. I used to cover a major animation studio for a major Linux vendor and they did in fact use very old shit.
It's not a big deal to take something built for CentOS7 and port to a later Red Hat (or clone) distro. It appears that they released a setup for what they use, which is CentOS7.
The tool may be good, but the output visuals are only as good as the artists that use said tools. They can open source the tools all they want and try to hire all the talent that can use it. :)
> surely having exclusive access to a parallel renderer of this quality is a competitive advantage over other studios?
The renderer is an important part of the VFX toolkit, but there are more than a few production-quality renderers out there; some of them are even FOSS. A studio or film's competitive advantage is more around storytelling and art design.
Unreal is eating everyone's lunch. If they cannot get anyone else to contribute to their renderer, it will wind up getting shelved for Unreal, with a lot of smaller animation studios already using Unreal instead of more traditional 3D rendering solutions like Maya.
Tons of studios are now using Unreal for final rendering, including Disney and several blockbuster movies.
The fantastic thing about Unreal is that you can do realtime rendering on-set (e.g. for directorial choices/actor feedback) and then post-production upscale it with the ceiling only being cost. Unreal in the TV/Movie industry is already huge and only getting bigger, year-on-year.
You've definitely seen a TV or Movie that used Unreal.
Which Disney films use Unreal for final render? Disney has two separate path tracing renderers that are in active development and aren’t in danger of being replaced by Unreal.
These renderers are comparable in use case & audience to MoonRay, which is why I don’t think you’re correct that MoonRay needs external contribution to survive.
“Used unreal” for on-set rendering is hand-wavy and not what you claimed. Final render is the goal post.
Hey that’s pretty cool! Thanks for the link, it’s helpful to see the shots in question. Am I understanding correctly that the K droid was rendered from behind using Unreal in those shots, and the front shots were rendered with the in-house renderer? If true, I’d love to hear what the reasons were for not being able to use it on all the shots in the sequence. Are there more recent examples? Is Unreal still being tested like this at ILM, or is the focus on the in-house real time renderer?
BTW I’m hugely in favor of pushing real-time rendering for film (and I work on high performance rendering tech, aiming squarely at film + real-time!) I only was disputing the broad characterization by @Someone1234 that Unreal is widely used today for final, and that film studio path tracers are in imminent danger of death by game engine.
So, it's been a while, but I'll try to add clarification from the best of my memory.
So, in this sequence, I think it is the case that K2-SO was only rendered from behind.
IIRC, the reasons for not using it on more shots, and specifically the front shots in the sequence, were two-fold. Primarily, we only had one TD/lighting artist trained in using Unreal in our pipeline, which was still a little clunky to fit in, so we were time limited. Additionally, K2-SO was not rendered from the front in a close-up due to complexities with his eyes (some details from Naty later in the talk). Specifically, K2's eyes require fully lit transparencies with the full lighting model; at least at the time, Unreal only supported its full lighting feature set in its deferred renderer, and its forward renderer, used for transparencies, was a vastly simplified lighting model that isn't able to fully capture the effect of K2's eyes. We were building a capable forward renderer inside of Unreal internally, but this was not finished in time for Rogue One.
As an aside, we had a parallel internal renderer we were building for use on Rogue One, that even at the time had advantages, but Unreal was chosen for what I saw as political reasons.
I do not know of more recent examples, but I'm not involved in this project anymore. I know they used Unreal for Season 1 of The Mandalorian, but moved to their internal real-time renderer for Season 2. The internal renderer has a few advantages: not having to deal with the complexity of merging significant changes (the forward renderer, for example) with Epic's engine was one major advantage, but my understanding is that the major win is just being able to build a renderer that integrates much better with their existing pipeline. Unreal's renderer is pretty strongly integrated into the rest of their engine, and the engine itself is very opinionated about how content is managed. And as you can imagine, ILM has their own opinions going back 30+ years.
I agree with your dispute of the broad characterization, but thought the counter-example would be illustrative.
BTW, I'm starting a new project investigating real-time rendering for film, and I'm always interested in new perspectives; hit me up if you want to chat real-time rendering sometime.
Yes, the example is very illustrative, thanks again for posting it, and thanks for the context here! This history is fun to read. I was partly curious if texture sampling is still one of the reasons for avoiding real-time tech. Back when I was in film production at PDI two decades ago, texture sampling was near the top of the list. It seems true still today that games tolerate (suffer from) texture aliasing and sizzling routinely, while film sups will not tolerate it at all, ever. High-quality texture sampling was, at the time, one of the main reasons offline renders took a long time. I remember being blown away by how sensitive the lighting sup was to the differences between texture filters (Lanczos, sinc, Blackman, Gauss, etc.), and how quickly he could see it. Today maybe it's more often about how many samples are used for path tracing.
The K-2S0 droid character in Rogue One (voiced by Alan Tudyk) was, in fact, rendered in real-time using Unreal, then composited into shots afterwards.
John Knoll from ILM gave a talk at GDC 2017 about it.
The catch, though, is that ILM took the Unreal Engine source code and modified it extensively in order to be able to render K-2S0 as he appeared in the film. It's not like they just downloaded it from the Epic Store and ran with it.
Yup. The state of the art for real-time rendering just isn't there yet for hero work. Even ILM's custom Helios renderer is only used for environments and environment-based lighting, as far as I've read. Assets, fx shots, and characters are still rendered offline.
Even with real-time rendering for environments, I'm sure there's plenty of post-processing "Nuke magic" to make it camera-ready. It's not like they're shooting UE straight to "film".
I have seen reports of Unreal Engine being used quite successfully for pre-viz, shot planning, animatics, etc., though.
Unreal is used in TV quite often, yes. But no major studios use it for theatrical releases, and I'm not aware of any that plan to. (Partner is in the industry)
I'm not really sure if they are competing with Unreal. Large studios will probably never use real time rendering for the final render unless it achieves the same quality. Dreamworks have built a renderer specifically for render farms (little use of GPUs, for example) which means they are not targeting small studios at all, rather something like Illumination Entertainment or Sony (think Angry Birds movie).
It has a Hydra render delegate so that is nice. Does Blender support being a Hydra client yet? It would be nice to have it supported natively in Blender itself. If it did, one could easily switch renderers between this and others.
I understand Autodesk is going this way with its tooling.
> It would be nice to have it supported natively in Blender itself. If it did, one could easily switch renderers between this and others.
Blender in general is set up to work with different renderers, especially since the work on Eevee, which is the latest renderer to be added. Part of the work on integrating Eevee also laid some groundwork to make it easier to add more of them in the future.
Most probably this renderer would be added as an add-on (if someone in the community does it), rather than in the core of Blender.
I had the same question. There is a USD add-on for Blender that supports Hydra, so you could probably get that to work with a bit of trial and error!
Is anybody else intrigued by the mention of multi-machine and cloud rendering via the Arras distributed computation framework?
Is this something new? The code seems to be included as sub-modules of OMR itself, and all the repos[1][2][3] show recent "Initial Commit" messages, so I'm operating on the assumption that it is. If so, I wonder if this is something that might prove useful in other contexts...
I can maybe add a bit of context to this. I worked on Moonray/Arras at DWA about 8-9 years ago.
Arras was designed to let multiple machines work on a single frame in parallel. Film renderers still very much leverage the CPU for a lot of reasons, and letting a render run to completion on a single workstation could take hours. Normally this isn’t a problem for batch rendering, which typically happens overnight, for shots that will get reviewed the next day.
But sometimes it’s really nice to have a very immediate, interactive workflow at your desk. Typically you need to use a different renderer designed with a more real-time architecture in mind, and many times that means using shaders that don’t match, so it’s not an ideal workflow.
Arras was designed to be able to give you the best of both worlds. Moonray is perfectly happy to render frames in batch mode, but it can also use Arras to connect dozens of workstations together and have them all work on the same frame in parallel. This basically gives you a film-quality interactive lighting session at your desk, where the final render will match what you see pixel for pixel because ultimately you’re using the same renderer and the same shaders.
Neat! Parallelizing a single frame across multiple machines was something I'd wanted to try back when I was working on RenderMan. It used to be able to do it back in the REYES days via netrender, but was something we lost with the move to pathtracing on the RIS architecture.
Could you go into a bit more detail on how the work is distributed? Is it per tile (or some other screen-space division like macro-tiles or scan-lines)? Per sample pass? (Surely it's not scene distribution like the old Kilauea renderer from Square!) Dynamic or static scheduling? Sorry, so many questions. :-)
My knowledge is probably outdated at this point (the now open source code is probably a better reference than my memory!) but at the time it was exactly as you described. Each workstation loaded the scene independently and work was distributed in screen space tiles and final assembly of the tiles was done on the client. I can’t remember if we implemented a work stealing queue to load balance the tile queue or not… my brain may be inventing details on that part. :)
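For anyone curious, here's a minimal sketch of the screen-space tile splitting described above (purely illustrative; the real scheduling and transport live in the Arras/OpenMoonRay repos):

    // Minimal sketch of screen-space tile distribution, as described above.
    // Illustrative only; Arras's real scheduling/transport is in the repos.
    #include <algorithm>
    #include <cstddef>
    #include <cstdio>
    #include <vector>

    struct Tile { int x0, y0, x1, y1; };

    // Each worker machine loads the full scene independently, renders the
    // tiles it is handed, and the client stitches finished tiles into a frame.
    std::vector<Tile> makeTiles(int width, int height, int tileSize) {
        std::vector<Tile> tiles;
        for (int y = 0; y < height; y += tileSize)
            for (int x = 0; x < width; x += tileSize)
                tiles.push_back({x, y,
                                 std::min(x + tileSize, width),
                                 std::min(y + tileSize, height)});
        return tiles;
    }

    int main() {
        std::vector<Tile> tiles = makeTiles(1920, 1080, 64);
        // Simplest static schedule: tile i goes to worker (i % numWorkers).
        // A work-stealing queue would instead let fast workers pull extra tiles.
        const std::size_t numWorkers = 12;
        std::printf("%zu tiles across %zu workers (~%zu tiles each)\n",
                    tiles.size(), numWorkers, tiles.size() / numWorkers);
        return 0;
    }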
I built a scene distribution renderer similar to Kilauea for my masters thesis in school, except with a feed forward shader design which exploited the linear color space to never send the results of computations back up the call stack… kind of neat but yeah, all sorts of reasons why that kind of design would not work well under production workloads. And RAM has gotten so stinking cheap!
So, I've finally managed to compile Moonray and play with it.
TBH, this was really not as straightforward as it ought to have been:
- some OptiX-using code failed to compile against the latest OptiX SDK; the code had to be patched.
- the build instructions at [4] aren't completely foolproof either (don't quit the container before you manage to snapshot it, or else)
When I finally got a clean compile, I tried to render a USD scene [1] ... it turns out Moonray only reads its own proprietary format (RDL) and you need to convert like so (I haven't found this documented anywhere, [3] now fails to load for me)
MoonRay comes with a Hydra Render Delegate that is compatible with any DCC tool with Hydra support, for interactive preview rendering. Upon finalization of the Hydra API specification, MoonRay will provide support for final frame batch rendering, and its Hydra Render Delegate will be the supported path to transform USD into MoonRay's internal RDL scene format.
The conversion is not without hiccups:
Warning: in Tf_PyLoadScriptModule at line 122 of /build/USD-prefix/src/USD/pxr/base/tf/pyUtils.cpp -- Import failed for module 'pxr.Glf'!
ModuleNotFoundError: No module named 'pxr'
The render is also problematic; it spits out a long list of stuff like this (it fails to load textures, basically):
Invalid image file "/tmp/Attic_NVIDIA/Materials/PreviewSurfaceTextures/curtain_mat_inst_Roughness.png": OpenImageIO could not find a format reader for "/tmp/Attic_NVIDIA/Materials/PreviewSurfaceTextures/curtain_mat_inst_Roughness.png". Is it a file format that OpenImageIO doesn't know about?
Resulting render looks ugly (no textures)
In conclusion: fantastic that Dreamworks decided to release Moonray, but at this point, it's still got some very sharp edges.
I wrote the RDL2 library for Moonray when I worked at DWA about 8-9 years ago. At the time, USD was still very nascent, and we already had RDL (v1) as an internal reference point, so that’s ultimately why Moonray uses something “non-standard” by modern conventions.
RDL has two on-disk formats, RDLA (for “Ascii”) and RDLB (for “Binary”). The text format is literally just a Lua script which uses various function calls to instantiate scene objects and set their parameters. It’s great for spinning up test scenes and doing development work on shaders or the renderer itself.
The binary format (which at the time used Protobuf to serialize scene objects, not sure if that’s still true) is more suited to production workflows where you don’t want to deal with things like floating point to text precision issues and a more space efficient representation is preferred.
And it looks like the user documentation has some examples of how to do things, including instantiating various types of scene objects: https://docs.openmoonray.org/user-reference/
There isn't even a monopoly within Disney, they acquired Pixar 17 years ago but Disney Animation Studios still develop their own Hyperion renderer completely independently of Pixar/Renderman.
Hyperion has the advantage of being exclusive to WDAS, so they can tailor it to their exact workflow and requirements, while Renderman is a commercial product with many users outside of Pixar all with their own wants and needs.
Hyperion was designed around a specific ray-batching architecture that's particularly good for out-of-core rendering, which can help with particularly large and complex scenes.
Can someone please explain the differences between real-time renderers and offline renderers? Do real-time renderers optimize frame by frame and focus on retaining some quality while prioritizing performance, using techniques like LOD and occlusion? Do offline renderers focus solely on quality? Are scene descriptions for both types of renderers different? What are the standard description files in games versus movies?
Like someone else said, real-time renders need to output at a reasonable frame-rate, which is the top priority. Therefore, per-frame image quality can take a fairly severe hit before things start being noticeable.
For the record, most real-time renderers are rasterisation-based, where geometry is assembled, rasterised, and then the fragments shaded. This is what almost all video games have been running on since the 1990s. Many so-called 'RTX' games you see today still do the bulk of their rendering using rasterisation and all the associated hacks to achieve photorealism, and only enable path-tracing for specular reflection, soft shadows, and diffuse-diffuse global illumination.
A high-quality real-time path-traced pipeline was impossible to achieve at playable framerates until very recently (~5 years ago). This is because we simply didn't have the hardware to do it, and denoising algorithms weren't very powerful until we got ML-based denoisers and upscalers (the OptiX denoiser, DLSS, etc.). Even today, any real-time path-traced pipeline renders far fewer samples than an offline render does (usually 3 or 4 orders of magnitude fewer), simply because it would be too slow and a waste to render so many samples for a frame that will be displayed for several milliseconds and then promptly discarded.
Offline renderers do jack the quality up, and they use massive render farms with hundreds of thousands of cores and memory on the order of 10^14-10^15 bytes. The scales are completely off the charts; a single frame from an offline renderer can take several hours to render on an average home computer.
Real time renderers focus on being real time as the most important constraint and therefore make lots of compromises and take a lot of shortcuts.
Offline renderers try to simulate light transport as exactly as possible within a time budget.
For example, one of the best algorithms for creating high-quality renders of scenes with very complicated light transport problems (something like this: [1]) uses ray-traced Monte Carlo integration techniques. Up until very recently, this was completely out of reach for a real-time renderer.
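For anyone unfamiliar, the Monte Carlo part boils down to estimating an integral by averaging random samples. Here's a tiny 1-D example of the same machinery a path tracer applies to light paths (a toy illustration, not renderer code):

    // Minimal Monte Carlo estimator: approximate an integral by averaging
    // f(x)/p(x) over random samples x drawn from p. A path tracer does the
    // same thing over light paths; here we just integrate x^2 over [0,1].
    #include <cstdio>
    #include <random>

    int main() {
        std::mt19937 rng(42);
        std::uniform_real_distribution<double> uniform(0.0, 1.0); // p(x) = 1
        const int numSamples = 1 << 16;

        double sum = 0.0;
        for (int i = 0; i < numSamples; ++i) {
            double x = uniform(rng);
            sum += x * x; // f(x) = x^2; the exact integral is 1/3
        }
        std::printf("estimate = %f (expected ~0.3333)\n", sum / numSamples);
        return 0;
    }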
While I'm not terribly familiar with the subject myself: Note that figure 2 in TFA calls out Astrid's character model as consisting of "1.67GB of geometry and 11.1GB of textures". That's stupidly massive compared to asset sizes for e.g. video game character models and texturing, and would probably choke a commercial real-time engine all on its own.
Man, I can't wait for this to be properly (luxrender-level) integrated to Blender.
Especially the shaders (materials), which I feel are currently the weakest part of all the open-source renderers Blender supports natively (Eevee, Cycles, Lux).
Can you elaborate on what's not good about eevee/cycles shaders? By proper integration do you imagine it will use Blender's node shader system or a different system?
I'm not being combative, I'm in the process of learning enough of Blender's code to be able to contribute.
I could write a book about this, but to make it short, what most users of a 3D system like Blender want when they get to the shading/rendering part of their work is to simply:
- pick an object in the scene
- pick a pre-made rich and complicated shader (wood, glass, tar, etc...) from a huge library of shaders expressed in a standard form (as in: that will work with whichever render, be it cycles, luxrender, moonray, etc...)
- drop it to the object without having to screw around for hours with texture scaling, rotation, uvs
- rinse and repeat until all objects in the scene are shaded
Very few people have the skills/knowledge and patience to put their own shaders together themselves by assembling basic nodes into a large and complicated shader tree.
There have been some (feeble) attempts to build standard libraries of shaders around Blender:
I've used Blender only a little, but surely you're aware of the Asset Browser introduced in the 3.0? They seem easy enough to use, but of course they'd need to be high-quality to work similarly with all renderers.
Granted I don't actually know where to find such libraries except for the small free asset bundles at https://www.blender.org/download/demo-files/#assets or perhaps by paying Blender organization for them, so maybe this doesn't really address your core point :).
http://www.tabellion.org/et/paper17/MoonRay.pdf