What is Spatial Audio? Asbjoern Andersen


What is spatial audio? It's a term we hear a lot, but it can be hard to get your head around what exactly it entails. To help shed some light on it, Ira Bolden from Embody talks spatialization, immersive audio formats, binaural audio for games, using AI to create personalized HRTFs (head-related transfer functions), and how they approach spatial audio at Embody:
Written by Ira Bolden, photos courtesy of Ira Bolden

Spatial effects are a key element in creating clarity and dynamicity in any audio mix. In the world of stereo music, mix engineers will often create a sense of space and depth with techniques including panning, stereo widening, and the manipulation of time-based effects like reverb and delay. By shaping the space in which the music dwells, individual elements are allowed to breathe and interact in a way that creates a more engaging and polished listening experience.

The concept of spatialization in music has recently expanded thanks to the integration of binaural rendering engines into streaming services like Apple Music. Immersive audio formats such as Dolby Atmos, once the domain of the film industry and hi-fi aficionados with ample disposable income, are now beginning to supplant traditional stereo as advancing technology enables their consumption over headphones. No longer limited to a stereo sound field or an audience restricted by access to surround sound speaker arrays, mix engineers now have the freedom and incentive to explore a sense of space that expands around, above, and behind the listener.

The rise of Spatial Audio in popular music signals a shift in the public consciousness that game designers would do well to take note of. As awareness of and demand for immersive sound increase, so too will quality expectations, applying pressure on game developers to come up with innovative new ways of improving the depth, accuracy, and engagement of their spatial sound design. Notable, too, is the fact that headphones are dominating as the primary medium by which listeners are consuming spatial audio. This means that as sound designers create in immersive mediums, they need to pay very close attention to how their work is going to translate within a binaural sound field.

 

BINAURAL AUDIO IN GAMES: A BRIEF HISTORY

Binaural Audio is a stereo format meant to emulate the way human ears experience sound in real-world environments. It can be produced by recording an audio source using a head-shaped microphone array or through digital rendering such as that found in Apple Music, Dolby Atmos for Headphones, and Immerse Gaming software. Due to its effectiveness at translating an enveloping sense of sonic space with just two channels, as well as its assumption of a listening position at the center of the sound field, binaural audio is ideally suited to headphone playback. While approaches to binaural rendering vary widely from case to case, all of them rely on a Head-Related Transfer Function (HRTF) at their core.
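To make the role of the HRTF a little more concrete, here is a minimal sketch of the core operation in digital binaural rendering: filtering one mono source with a pair of head-related impulse responses (HRIRs, the time-domain form of an HRTF), one per ear. The function names and the three-sample "HRIRs" are made up for illustration; real HRIRs are hundreds of samples long, and production renderers use fast FFT-based convolution and interpolate between directions.

```python
def convolve(signal, kernel):
    """Direct-form linear convolution of two sequences of floats."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def render_binaural(mono, hrir_left, hrir_right):
    """Return (left, right) headphone signals for one source direction."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy HRIRs for a source on the listener's left (made-up values):
# the left ear hears the sound immediately and at full level,
# the right ear hears it two samples later and quieter.
hrir_left = [1.0, 0.0, 0.0]
hrir_right = [0.0, 0.0, 0.5]

click = [1.0, 0.0, 0.0, 0.0]
left, right = render_binaural(click, hrir_left, hrir_right)
```

Even in this toy case, the two output channels differ in arrival time and level, which are two of the cues the brain uses to place a sound. Everything that makes one listener's HRTF different from another's lives inside those two filters.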

Game sound designers have been contending with 3D sound fields for a long time. Examples of binaural 3D audio implementation can be found as far back as 1998. In addition to being the year Britney Spears secured her immortality as a global pop icon with the release of her single, “…Baby One More Time”, 1998 also saw the release of a slightly less lusty but no less influential piece of entertainment media: Half-Life.

Half-Life remains widely regarded as one of the most influential FPS titles of all time, lauded by critics and fans alike for its innovative contributions to the genre. Perhaps it was that same spirit of innovation that inspired the developers to integrate the Aureal A3D 2.0 audio engine into the game. Although the accuracy of its spatial definition pales in comparison to more modern examples, Half-Life’s use of binaural audio was groundbreaking for its time. To hear Aureal A3D 2.0 in action and to learn more about its tragic history, check out this YouTube video.

The untimely demise of Aureal A3D would not be the end of binaural audio in games. Many other companies entered the fray with their own 3D audio engines. From the practically ancient DirectSound3D, RealSpace3D, and Phonon 3D to more recent examples like Windows Sonic, Steam Audio, and Resonance, immersive audio technology has seen continuous innovation and improvement throughout its sordid yet unrelenting history. Yet, with nearly every iteration and permutation, one common thread has persisted: the use of generic HRTFs.

 

YOU’RE NO DUMMY: THE NECESSITY OF HRTF PERSONALIZATION

HRTF is the scientific designation given to the set of data that explains how you hear stuff. HRTFs are completely unique to every individual and include a mind-boggling array of anatomical idiosyncrasies, all of which our wrinkled human brains somehow manage to take into consideration when localizing sound. They’ve been the subject of extensive research and the linchpin of every “3D audio for headphones” solution you’ve ever heard. In the context of video games, an HRTF functions as the mathematical representation of “You” within a 3D sound field.

Embody_sound-01

No two HRTFs are alike. Can you spot the differences?

It goes without saying that you are not a B&K Head and Torso Simulator. You are also not the anatomical equivalent of your nearest neighbor unless perhaps you live in an underground facility populated with clones of yourself. And yet, sound designers have been forced time and time again to rely on generic HRTF modeling techniques, including the popular Nearest Neighbor approach featured in the Sony Tempest engine, to enable their players to experience immersive audio. This is tantamount to forcing you to listen with someone else’s ears; they may be ears, sure, but they’re not yours, and your brain knows it.

Have you ever listened to spatial audio on headphones and thought, “This doesn’t sound right?” Maybe you’re having a hard time distinguishing between sounds in front of you and behind you. Maybe everything sounds like it’s underwater, or maybe everything sounds like it’s above you for some reason. Although HRTF issues are not the sole contributing factor to negative spatial audio experiences like these, they rank very high on the list. This presents a significant problem both for the design phase and the end user experience, affecting both the designer’s ability to create immersive audio that will accurately translate to headphones and the player’s ability to… well, enjoy it.

So, if they’re so problematic, why use generic HRTFs at all?
 

AVOIDING YOUR PROBLEMS: PERSONALIZATION BY HALF-MEASURES

HRTF Personalization is not as easy as it might seem. In fact, it requires such a specialized base of knowledge that most companies choose to circumvent it entirely. Even those solutions that tout some degree of personalization are not actually delivering unique HRTFs to every user. Instead, they’re personalizing by half measures.

Take the aforementioned Nearest Neighbor method, for example. With this method, a user selects or is assigned an HRTF from a finite pool of premade HRTFs. The number of available HRTFs can vary, as can the sophistication of the matching methodology. For example, users may be asked to perform a sort of spatial imaging listening quiz that funnels them to one of three possible HRTF options based on their responses. Other solutions could require a full face scan from which certain identifiers are extracted to match the closest available HRTF, but these solutions are not usually very transparent with respect to what those identifiers are and how accurate the whole process really is across demographics.
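As a rough illustration of the matching step, the sketch below picks the closest entry from a small pool by comparing a handful of measurements. The pool, the three features, and all of the numbers are hypothetical; as noted above, real products rarely disclose which identifiers they use or how they weigh them.

```python
import math

# Hypothetical pool: each premade HRTF is tagged with three measurements
# (in centimetres) of the subject it was originally captured from.
HRTF_POOL = {
    "subject_a": (15.2, 6.1, 3.0),
    "subject_b": (14.1, 5.6, 2.7),
    "subject_c": (16.0, 6.8, 3.4),
}


def nearest_hrtf(user_features, pool=HRTF_POOL):
    """Return (name, distance) of the pool entry closest to the user."""
    name = min(pool, key=lambda n: math.dist(user_features, pool[n]))
    return name, math.dist(user_features, pool[name])


match, residual = nearest_hrtf((14.3, 5.7, 2.8))
```

Note the residual distance: unless the user happens to be one of the measured subjects, it never reaches zero. The listener always ends up with somebody else's ears, just a better-chosen somebody.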

Another half measure is to start with an artificial HRTF, usually based on one of those HATS dummy heads, and then allow the user to modify a limited set of parameters based on their personal measurements. Waves NX is one example of such an approach, where users were able to enter their own head circumference and interaural arc in place of the generic values. These kinds of approaches, while better than no personalization at all, still fall short of true HRTF personalization because they operate using a very limited number of parameters that aren’t sufficient to cover the natural variety that occurs between individual HRTFs.
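To see what a single parameter like head circumference can and cannot do, consider the classic Woodworth spherical-head formula for interaural time difference, ITD = (r / c) · (θ + sin θ), where r is the head radius, c the speed of sound, and θ the source azimuth in radians. The sketch below is that textbook model only; it is not a description of how Waves NX or any other product computes its adjustments, and the function name and example circumferences are assumptions for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second in air at roughly 20 °C


def itd_seconds(head_circumference_cm, azimuth_deg):
    """Woodworth spherical-head estimate of interaural time difference.

    azimuth_deg: 0 is straight ahead, 90 is directly to one side.
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("this sketch only covers azimuths from 0 to 90 degrees")
    # Treat the head as a sphere and recover its radius from the circumference.
    radius_m = (head_circumference_cm / 100.0) / (2.0 * math.pi)
    theta = math.radians(azimuth_deg)
    return (radius_m / SPEED_OF_SOUND) * (theta + math.sin(theta))


generic = itd_seconds(57.0, 90.0)   # assumed "generic" head size
personal = itd_seconds(60.0, 90.0)  # a listener with a larger head
```

Entering a larger circumference lengthens the delay between the ears, which helps with left-right placement. What the model has no way to express is the shape of the outer ear, and that is where much of the front/back and height information comes from.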

Granted, the traditional methods for actual personalization are quite cumbersome and in no way scalable for mass market distribution. Binaural microphones and anechoic chambers are cool and everything, but nobody wants to sit in a chair for hours while acoustic researchers measure impulse responses. Well, maybe not nobody – but you get the point.

Embody_sound-02

Nikhil Javeri, Manager – Machine Learning R&D at Embody, demonstrating his enthusiasm for anechoic chambers and mass market viability of traditional HRTF measurement processes.

 

ARTIFICIAL INTELLIGENCE: UNLOCKING SPATIAL AUDIO’S TRUE POTENTIAL

In addition to plagiarizing artists on a mass scale hitherto unimaginable, Artificial Intelligence is enabling us to achieve some pretty miraculous things. Among the list of AI’s accomplishments is the ability to generate entirely unique personalized HRTFs for anyone within 30 seconds, using just a smartphone and cloud computing. This relatively recent development has been deployed for the mass market in FINAL FANTASY XIV.

With the release of Immerse Gamepack, FINAL FANTASY XIV became the first MMORPG ever to integrate personalized binaurally rendered Higher Order Ambisonics (HOA) audio. With the exception of being limited to PC only (for now), this immersive experience is hardware agnostic – meaning players can hear it using any headphones. This was accomplished with the help of the Immerse™ AI Engine, developed by Embody – that’s us!

“With standard stereo audio technology, it can be difficult to fully grasp where the audio is coming from. Since each person has a unique head shape and ear positioning, audio coming from behind or above us is perceived differently by everyone. To address this, the Immerse Gamepack uses AI to analyze a photo of a person’s ear to tailor sounds specifically for that individual.” – Go “Kinugo” Kinuya, FFXIV Sound Team

Immerse employs a novel 3D reconstruction algorithm modeled on the geometry of the human ear. A 2D image or video capture is fed into the algorithm, which extrapolates a 3D model from it. The 3D output is then run through an Acoustic Scattering Neural Network (ASNN) designed on the principles of the Boundary Element Method (BEM), which replicates how sound reflects and refracts off your ear from any direction and then outputs your HRTF. This algorithm deviates from other state-of-the-art 3D reconstruction techniques in that it’s specifically trained to analyze complex abstract ear structures as opposed to more common shape estimation problems.


If that information caused your eyes to glaze over, you’re not alone. If you’re interested in learning more about this machine learning-based approach to creating personalized HRTFs, grab a cup of coffee, settle in, and dig into this research paper, which covers the topic in extensive technical detail.

It may not seem like it at first, but this development is actually a pretty big deal. With the HRTF personalization problem finally solved, sound designers now have access to a powerful tool that will help them create a new generation of binaural audio experiences that are far more accurate in their articulation of spatial detail – particularly in the height dimension.

 

PERCEPTUAL TUNING: EXPANDING THE CREATIVE ROLE OF HRTF TECHNOLOGY

Scalable HRTF personalization infrastructure isn’t the only advancement required to truly elevate quality standards for immersive soundscapes. HRTFs have traditionally been used as a generic variable for achieving a baseline binaural spatialization effect. However, new techniques and technologies are being developed that dramatically expand the creative possibilities for HRTF implementation.

Much like HRTFs themselves, no two games’ mixes or sound field requirements are exactly the same. As such, spatial sound fields and HRTF tunings should be designed by taking into account factors including environment, fluctuations in object density, dynamic changes in camera perspective, front vs. rear vs. height imaging, gameplay benefits, and audio mix. In most scenarios, HRTFs are treated as a sort of one-size-fits-all tool and are not customized in a way that is specifically designed to reinforce and complement these critical gameplay considerations.


Imagine that your sound team included a dedicated spatial audio mastering engineer whose primary responsibility was to tune the HRTF and spatial sound field to your exact requirements. The audiovisual field for on-screen vs off-screen action could be reinforced through adjustments to transitional angles and curves based on differing amounts of spatialization applied to front, side, rear, and height channels or to any available angle in the case of Ambisonic and object-based implementation. The amount and quality of early reflections in the HRTF could be customized in order to significantly aid in localization accuracy across the board. Finally, more traditional mastering options like EQ and gain adjustment could be applied for each angle or spatial region, helping to blend the immersive sound design more effectively with the game’s overall mix.

This is the process we undertook together with the Square Enix sound team when designing Immerse Gamepack. Our Immerse HRTF generation pipeline includes hundreds of customizable variables allowing for such granular control over spatial rendering, which we couple with our patented Clearfield™ technology to control the amount of spatialization applied to different parts of the sound field. By introducing sound designers to this technology and assisting them in its use and implementation, we hope to expand the industry’s understanding of the true creative potential of immersive sound.
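To give a feel for what "different amounts of spatialization for different parts of the sound field" might look like in practice, here is a deliberately simple sketch: a tuning table maps azimuth to a spatialization amount, and that amount crossfades between an unprocessed signal and its HRTF-processed counterpart. This is a hypothetical illustration only. It is not the Clearfield™ algorithm or the Immerse pipeline, and the table values and function names are invented.

```python
# Hypothetical tuning table: (azimuth in degrees, spatialization amount 0..1).
# 0 = straight ahead, 90 = directly to the side, 180 = directly behind.
TUNING = [(0.0, 0.3), (90.0, 0.8), (180.0, 1.0)]


def spatial_amount(azimuth_deg, tuning=TUNING):
    """Linearly interpolate the spatialization amount for an azimuth."""
    # Fold any angle into 0..180 so left and right are treated symmetrically.
    a = abs((azimuth_deg + 180.0) % 360.0 - 180.0)
    for (a0, v0), (a1, v1) in zip(tuning, tuning[1:]):
        if a0 <= a <= a1:
            t = (a - a0) / (a1 - a0)
            return v0 + t * (v1 - v0)
    return tuning[-1][1]


def mix(dry, wet, amount):
    """Crossfade sample by sample between unprocessed and HRTF-processed audio."""
    return [(1.0 - amount) * d + amount * w for d, w in zip(dry, wet)]


# On-screen action stays closer to the original mix; rear sources get the
# full binaural treatment.
front = spatial_amount(0.0)
rear = spatial_amount(180.0)
```

With a table like this, a sound team could keep on-screen dialogue and effects tonally close to the stereo mix while pushing off-screen and rear sources further into the binaural field. A real tuning would also cover elevation and could attach EQ and gain to each region.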

 

INTEGRATION: ADDING PERSONALIZED SPATIAL AUDIO TO YOUR GAME

We currently have two options available for integrating Immerse personalized spatial audio technology into your game: plugins for Wwise and game engines, and custom API integration. These tools provide the means for developers to both monitor channel or object-based audio on headphones with their own personalized HRTFs, and implement that same support on the front-end for their players. This allows you to accurately design and monitor immersive audio on headphones for a wide range of output formats within a binaural virtual environment that’s consistent with the environment in which the majority of your players will hear it. If you’d like to learn more about this technology, you can visit our website at https://embody.co/gamedev.

Our mission is to empower sound designers to create better, more immersive spatial audio experiences. We hope you’ve found this article to be informative and that you’re left feeling inspired to further explore personalized, immersive sound design. We encourage you to contact us at dev@embodyvr.co if you’re interested in exploring how our technology can be integrated into your next project.

 

A big thanks to Ira Bolden for sharing insights on Creating Better Spatial Audio Experiences

 
