In the previous article we covered the general aspects and benefits of Dolby Atmos for music, as well as considerations for planning your workflow. In this article we’ll take a look at composition and mixing recommendations as they relate to game music presented in Dolby Atmos. As a reminder, check out my video series found here, which walks through most of this content with tangible examples.
Composition, but not really
When I say composition, I don’t mean actual notes, phrases, or anything else musically structural. Most modern game composers use instruments and DAWs as part of their compositional process, so the line between composition and orchestration or instrumentation quickly becomes blurred. This is where monitoring in Dolby Atmos will help you decide whether the Em arpeggio should be played by the piccolo or the double bass, and what the panning implications would be for each.
Back to basics
First, let’s do a quick review of the fundamentals of what constitutes a Dolby Atmos mix: beds and objects.
The Dolby Atmos bed is a 10-channel stream that maps to the standard channel-based configuration of 7.1 with the addition of left and right overhead channels (7.1.2). Any source can be routed to the bed, so it’s great for instruments that don’t need highly accurate positioning, and especially useful for diffuse or volumetric sound sources like pads, washes, and reverb returns. Because the bed is channel-based, it can actually be a collection of beds mixed together, so you can stem out as many beds as you want during your pre-dub.
The 7.1.2 configuration for the bed was designed for maximum compatibility with legacy software and hardware pipelines found in linear media creation and distribution environments. In the gaming world, the spatial audio API for Windows devices actually supports a 7.1.4 bed, and so do the largest audio middleware solutions, Wwise and FMOD. This adds an additional pair of overhead channels so you can split the top into quadrants, allowing more spatial resolution for sources panned in the bed.
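To make the two bed layouts concrete, here is a small sketch comparing their channel lists. The channel labels follow common Dolby-style nomenclature, but the exact naming and ordering can differ per pipeline, so treat this as illustrative only.

```python
# Illustrative channel lists for the two bed layouts discussed above.
# Labels are common Dolby-style names; real pipelines may order or name
# channels differently.
BED_7_1_2 = ["L", "R", "C", "LFE", "Lss", "Rss", "Lrs", "Rrs", "Lts", "Rts"]
BED_7_1_4 = ["L", "R", "C", "LFE", "Lss", "Rss", "Lrs", "Rrs",
             "Ltf", "Rtf", "Ltr", "Rtr"]

def overhead_channels(layout):
    """Return just the overhead (top) channels in a bed layout."""
    return [ch for ch in layout if ch.startswith(("Lt", "Rt"))]

print(len(BED_7_1_2), overhead_channels(BED_7_1_2))  # 10 ['Lts', 'Rts']
print(len(BED_7_1_4), overhead_channels(BED_7_1_4))  # 12 ['Ltf', 'Rtf', 'Ltr', 'Rtr']
```

The extra top pair in 7.1.4 is what lets you split the overhead plane into front and rear quadrants.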
An object is a mono piece of audio with associated metadata that describes its 3D position in space, along with a few other bits of info. The easiest way to think about this metadata is as panning automation data for the associated audio. Instead of baking the panning into a channel-based mix, the Dolby Atmos Renderer reads this metadata for each object and “renders” the panning at playback time according to the number of speakers attached to that specific system. One of the main benefits of objects is that they can be panned with pinpoint accuracy.
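The “panning automation” mental model can be sketched as a mono stream plus timed position keyframes that a renderer interpolates at playback. Real Atmos metadata is considerably richer than this, and the class and field names here are hypothetical, but the core idea is the same.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical sketch of object metadata: a named mono stream plus timed 3D
# position keyframes. The renderer reads positions at playback time rather
# than playing back pre-baked channel panning.

@dataclass
class AtmosObject:
    name: str
    # (time_seconds, x, y, z), each axis normalized to the range -1..1
    keyframes: List[Tuple[float, float, float, float]] = field(default_factory=list)

    def position_at(self, t: float):
        """Linearly interpolate the object's position at time t."""
        ks = self.keyframes
        if t <= ks[0][0]:
            return ks[0][1:]
        for (t0, *p0), (t1, *p1) in zip(ks, ks[1:]):
            if t0 <= t <= t1:
                a = (t - t0) / (t1 - t0)
                return tuple(c0 + a * (c1 - c0) for c0, c1 in zip(p0, p1))
        return ks[-1][1:]

# A lead synth sweeping from hard left to hard right over four seconds:
lead = AtmosObject("lead_synth", [(0.0, -1.0, 0.0, 0.0), (4.0, 1.0, 0.0, 0.0)])
print(lead.position_at(2.0))  # (0.0, 0.0, 0.0) -- dead center at the midpoint
```

At playback, the renderer maps each interpolated position onto however many speakers the listener actually has.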
There can be up to 128 objects in total, and objects are always mono, though you can group multiple objects together for stereo or multichannel sources that need to preserve their relative channel relationships. Objects can be static or dynamic, so you can move them anywhere during your mix or leave them sitting in one place the whole time. In fact, the bed is actually a collection of objects statically placed at the optimum locations of a 7.1.2 speaker array! The bed uses the first 10 objects for its “channels”, which leaves 118 objects available for other sources.
Object display of the Dolby Atmos Renderer
Early decision time
Now that we’re clear on beds and objects, let’s talk about some initial information that will help you decide whether a sound or instrument should be panned as an object or to the bed. First and foremost: how much panning accuracy do you need for an instrument? Pads, section mics, reverbs, and textures are best mixed to the bed due to their volumetric nature. They occupy a much larger amount of space without the need for hyper-specific localization. Multichannel sources are also fine candidates for the bed, whether mapped 1-to-1 to bed channels or repanned, because you’re usually creating a wide image based on the relative relationship of their source channels. For many composers, mixing everything to the bed provides all the spatial resolution required for a rich and enveloping mix.
Objects are great for sounds that benefit from a more localized point source: lead or melody instruments, percussion details, or aleatoric samples. Really, anything for which positional fidelity is required or desired, fidelity that will be maintained as the number of speakers increases in the user’s venue. This really comes into play in larger playback environments, or as virtualization rendering algorithms continue to improve.
How much spatial interactivity will be needed during gameplay also affects whether an instrument should be an object or panned to the bed. For example, let’s say you have a non-diegetic soundtrack playing in the background of your 3D game, and you also have a diegetic musician character that plays an instrument in sync with your soundtrack. The soundtrack is locked to the player’s perspective, but the musician’s instrument is panned relative to where the player points the camera. The musician’s instrument would need to be an object so that the panning can happen interactively instead of getting baked into the bed mix.
Game objects with attached audio emitters using Wwise and Unreal 4
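The musician example boils down to deriving a panning angle from the camera each frame. A minimal sketch of that math, with purely illustrative function and parameter names (not any engine’s actual API), might look like this:

```python
import math

# Hedged sketch: deriving an emitter's panning angle from the player's
# camera. Names are illustrative; a real engine computes this from full
# 3D transforms, not 2D positions and a yaw angle.

def relative_azimuth(listener_pos, listener_yaw_deg, emitter_pos):
    """Angle of the emitter relative to the listener's facing direction,
    in degrees: 0 = straight ahead, positive = to the right."""
    dx = emitter_pos[0] - listener_pos[0]
    dz = emitter_pos[1] - listener_pos[1]
    world_angle = math.degrees(math.atan2(dx, dz))  # 0 deg = facing +z
    # Wrap the difference into the range -180..180
    return (world_angle - listener_yaw_deg + 180.0) % 360.0 - 180.0

# Musician directly ahead of the player...
print(relative_azimuth((0, 0), 0.0, (0, 5)))   # 0.0
# ...then the player turns 90 degrees right: the musician is now to the left.
print(relative_azimuth((0, 0), 90.0, (0, 5)))  # -90.0
```

Because this angle changes with the camera every frame, the panning has to be applied at runtime to an object; it cannot be pre-baked into the bed.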
Keep in mind that once the music has been either mastered or implemented, sounds that were mixed to the bed are baked in, just like traditional channel-based mixing. Objects, on the other hand, are separate and can be manipulated independently during implementation or at runtime. It’s also worth pointing out that objects are a finite resource that you may share with SFX and VO, with only 32 active objects available at once. However, most audio engines dynamically assign objects based on priority, and if no objects are available at that specific moment the sound is routed to the bed, so nothing ever gets lost.
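The priority scheme described above can be sketched as a simple allocation pass: a fixed pool of object slots, with lower-priority sounds falling back to the bed once the pool is full. The 32-slot limit mirrors the active-object count mentioned in the text; the policy details here are illustrative, not any specific engine’s behavior.

```python
# Hedged sketch of priority-based object assignment with bed fallback.
MAX_ACTIVE_OBJECTS = 32

def assign_routes(sounds):
    """sounds: list of (name, priority); higher priority wins an object slot.
    Returns {name: "object" | "bed"}. Anything that misses out on a slot
    is routed to the bed so nothing ever gets lost."""
    ranked = sorted(sounds, key=lambda s: s[1], reverse=True)
    routes = {}
    for i, (name, _prio) in enumerate(ranked):
        routes[name] = "object" if i < MAX_ACTIVE_OBJECTS else "bed"
    return routes

# 30 mid-priority SFX competing with music and VO for 32 slots:
mix = [(f"sfx_{i}", 5) for i in range(30)]
mix += [("musician_violin", 9), ("vo_line", 10), ("pad_wash", 1), ("riser", 6)]
routes = assign_routes(mix)
print(routes["vo_line"], routes["pad_wash"])  # object bed
```

In practice engines re-evaluate this continuously, so a sound can gain or lose an object slot from moment to moment.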
Currently there is no direct method for importing a Dolby Atmos master into a commercially available game audio engine. This means that the panning metadata for your objects will not be imported, and you will need to replicate that panning either in the game audio engine or the game engine itself. In the above example with the musician character, this isn’t a problem because the audio object’s position is derived from the game object. In other implementation circumstances this may be a challenge, which may influence your decision to skip objects and just pan instruments in the bed.
Building your soundstage
Now that you’ve considered the implementation and resource requirements that help you decide whether certain sounds should be mixed as objects or to the bed, let’s look at placement. Regardless of whether an instrument uses object or bed panning, it needs to go somewhere in space, right? Well, 90% of figuring this out comes down to your existing skills as a composer/mixer: taking the context of the game, movie, or VR experience and letting that guide your creative and mixing decisions. All of your instincts for mixing in stereo, 5.1, or 7.1 are just as applicable in Atmos; you just have one more dimension in your toolbox.
Beyond basic left and right, I like to initially approach my mixes by breaking my listening space into quadrants whose boundaries intersect at the listener position. That way I can still think of traditional concepts like “Front Left” or “Rear Right”, but to that I add the concept of fore, mid, and background. I consider the boundary of the 7.1 speakers the mid-ground, the space in the room between the speakers and the listener is the foreground, and the space beyond the speakers away from the listener as the background. Of course there are infinite layers between these rudimentary distinctions, which is where most of the action happens. With the addition of the height axis I use the terms horizon, mid-room, and overhead to describe the elevation plane.
One possible way to define spatial zones when planning your mix
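The quadrant/depth/elevation lexicon above can be captured in a tiny helper that names the zone a position falls in. The normalized coordinates and the exact boundary values here are assumptions made for illustration (listener at the origin, the 7.1 speaker boundary at radius 1.0, elevation from 0 at ear level to 1 at the ceiling).

```python
import math

# Hypothetical helper naming the spatial zone a position occupies, using the
# fore/mid/background and horizon/mid-room/overhead lexicon from the text.
# Coordinates are normalized: listener at origin, speaker ring at radius 1.0.

def spatial_zone(x, y, z):
    quadrant = ("Front" if y >= 0 else "Rear") + " " + ("Left" if x < 0 else "Right")
    r = math.hypot(x, y)  # horizontal distance from the listener
    depth = "foreground" if r < 0.9 else ("mid-ground" if r <= 1.1 else "background")
    height = "horizon" if z < 0.33 else ("mid-room" if z < 0.66 else "overhead")
    return f"{quadrant}, {depth}, {height}"

print(spatial_zone(-0.5, 0.5, 0.0))  # Front Left, foreground, horizon
print(spatial_zone(1.0, -1.0, 0.8))  # Rear Right, background, overhead
```

A shared vocabulary like this is mostly useful for planning and for talking with collaborators about where things sit before you start automating pans.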
Armed with this lexicon, I can start to plan and talk about what spaces a sound or instrument will occupy. One rule of thumb I also use is to rely less on bass content the higher I position a sound within the room. Because bass frequencies do not spatialize well, sounds panned overhead should be weighted more toward the higher end of the spectrum, around 2 kHz and above. So if I have a bass line that I want positioned higher up in the room, or want to give more of a sense of spatial motion, I will make sure there is enough existing or supplemental content in the higher frequencies so the ear can perceive the motion better. For example, I’ll use a synth that has a bit of bite on the top end, or add another higher-frequency instrument that mirrors the notes and movement of the bass.
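One way to picture the idea is as a band split: keep a bass line’s fundamentals anchored low in the room (in the bed) and give only its upper content to a moving overhead object. A real mix would use a proper crossover filter or a layered instrument; this one-pole split is just a toy illustration of the low/high separation.

```python
# Toy illustration: split a signal into low and high bands so the low band
# can stay anchored in the bed while the high band rides a moving object.
# A real mix would use a proper crossover; this one-pole filter is only a
# demonstration of the separation idea.

def one_pole_split(samples, alpha=0.1):
    """Split a signal into (low, high) bands with a one-pole lowpass.
    alpha sets the cutoff: smaller = lower cutoff."""
    low, lows, highs = 0.0, [], []
    for x in samples:
        low += alpha * (x - low)   # smoothed (low-frequency) component
        lows.append(low)
        highs.append(x - low)      # residual (high-frequency) component
    return lows, highs

# A steady (DC-like) signal ends up almost entirely in the low band:
lows, highs = one_pole_split([1.0] * 200)
print(round(lows[-1], 3), round(highs[-1], 3))  # 1.0 0.0
```

In practice the same result is usually achieved musically, by doubling the bass with a brighter instrument rather than by filtering a single track.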
One of the major benefits of mixing in Dolby Atmos is the improved ability to pull your mix into the listening space, making room for all the musical elements as well as any on-screen action. For example, pan some of your instruments a bit wider into the front left and right mid-ground, and others up into the front left and right mid-room. Even within a single instrument you can spread things out so they’re not all overlaid on each other, giving each element its own space. This helps you create a more open and natural-sounding mix without as much need for severe EQ notching and dynamic compression just to make things fit.
Distribution of objects for a simple music mix
With all this talk about spreading instruments around to create room in your mix, this is a good time to bring up the concept of density. With an expanded soundstage, the question becomes: how much can or should you fill it? Again, the answer lies in your own ears and creative judgement. As with channel-based mixing, you still have headroom and downmix limitations to factor in from a technical perspective. From a creative perspective, you simply have more power to create either a beautiful wall of noise or an intimate portrayal of negative space. I tend to follow the “less is more” school of thought and have found that, as fun as it is to fill the space, it can get overwhelming quickly. It actually becomes underwhelming: if you have a ton of stuff moving all around, or route everything to a 10-channel reverb to feed all the speakers, you lose perspective and contrast. Everything turns into unlocalized mush or fights for attention to the point of fatigue, which will make the listener tune out or turn the volume down!
For more recommendations about mixing music for Dolby Atmos, check out the excellent resources here.