Composing video game music for Virtual Reality: 3D versus 2D

Welcome!  I’m video game composer Winifred Phillips, and this is the continuation of our four-part discussion of the role that music can play in Virtual Reality video games.  These articles are based on the presentation I gave at this year’s Game Developers Conference in San Francisco, entitled Music in Virtual Reality (I’ve included the official description of my talk at the end of this article).  If you missed the first article exploring the history and significance of positional audio, please go check that article out first.

Are you back?  Great!  Let’s continue!

During my GDC talk, I addressed three questions which are important to video game music composers working in VR:

  • Do we compose our music in 3D or 2D?
  • Do we structure our music to be Diegetic or Non-Diegetic?
  • Do we focus our music on enhancing player Comfort or Performance?

While investigating these topics, we looked at some examples from VR games that provide great demonstrations, including four of my own VR projects – the Bebylon: Battle Royale arena combat game from Kite & Lightning, the Dragon Front strategy game from High Voltage Software, the Fail Factory comedy game from Armature Studio, and the Scraper: First Strike shooter/RPG from Labrodex Inc. In these articles, I’ll be sharing the discussions and conclusions that formed the basis of my GDC talk, including the best examples from these four VR game projects.  So now let’s turn our attention to the first of our three top questions:

Should our music be 3D or 2D?

We know that spatial delivery of sound design is critical, but does that extend to the music? Do most listeners care if the music is 3D?  It’s vital that we keep listener impact in mind – and some scholarly studies from expert researchers can help us throw light on that subject.

Measuring galvanic skin responses, a study conducted at the University of Hull in the UK tested for emotional reactions to both spatially treated music and standard stereo music recordings. The researchers found that spatial treatment had no effect on the emotional impact and enjoyment of music. So, using a traditional stereo mix for a VR game’s music isn’t necessarily a bad thing… but spatial positioning for music can be beneficial, and fun too.

For instance:

  • We can use 3D elements to help integrate a 2D musical score into the VR world.
  • We can use 3D music to grab the player’s attention.
  • We can have music transition from 2D to 3D for dramatic effect.

3D music elements can help the musical score feel better connected to the environment in VR. Let’s take a look at an example from the Fail Factory VR game, which demonstrates how 3D music elements can share the stage with a conventional stereo music mix.

In this comedic video game, players go to work in a zany robot factory. Players build massive robots, while keeping up with the ever-increasing complexity and speed of the assembly line. The result is often a series of hilarious failures, inspiring the game’s name – Fail Factory. When Armature Studio hired me to compose the music for their Fail Factory game for the Samsung Gear VR, they described a project in which music took center stage.

By necessity, Fail Factory is set on a gigantic factory floor – but what makes this factory uniquely awesome is the musical nature of the environment.  All the machinery in the factory moves rhythmically with the musical score. So, the dev team asked me to create a jazzy score for this music-driven gameplay. Apart from the score, all of the sound design of Fail Factory is also created specifically to be musical. The bleeps and bloops are pitched to integrate with the score, and the bangs and clangs are timed to emphasize the tempo. While much of the music is delivered to the player in traditional stereo, there are also lots of separate rhythmic and pitched elements that are spatially positioned on the game’s factory floor. The sound design team and I worked hard on getting the balance right between these 2D and 3D components.

For instance, in one minigame, heavy machinery slams down onto a conveyor belt – this became the central downbeat for the music on this level. We tried just having that big metallic bang issue solely in 3D from its in-game position, but that didn’t work. Even though the spatialized sound was rhythmically synced to the 2D music, the 3D metallic bang felt disconnected from the rest of the 2D score – plus, the bang just needed more oomph. The team and I went back and forth with iterations on this until we settled on both a spatialized impact sound and a simultaneous metallic clang integrated into the stereo music mix. Here’s how that sounded during gameplay:

So you can see that 3D music and audio in VR can be a complicated issue.  While the majority of the music in Fail Factory is mixed in stereo, there are percussive and tonal components (such as that big clang) that are spread out in 3D across the VR space. These elements in 3D allow us to have a nice stereo music mix that also integrates well into the three-dimensional soundscape.

Now let’s take a look at a different example that shows how music in VR can transition from 2D to 3D for dramatic effect.

The popular Dragon Front VR strategy game for Oculus Rift is a mix of the famous tradition of high fantasy storytelling with a dieselpunk, World War II-inspired aesthetic. Each game session is a self-contained battle on a playing field loaded with monsters, missiles and the machinery of war. With all this in mind, the music of Dragon Front had to convey a suitably bold and dramatic style.  When High Voltage Software hired me to compose the music for Dragon Front, one of their biggest priorities was an epic main theme. So, I composed a big victorious anthem, with the stereo mix piped directly to the player’s headphones. The theme music was designed to continue into the hub, but bombastic music in the hub area could be distracting. So at that point the music moves from a direct channel to the player and takes up a position in the environment, as if it were issuing from in-game speakers. Here’s how that worked:
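To make the idea of a 2D-to-3D handoff a little more concrete, here’s a minimal Python sketch of one common way such a transition could be done: an equal-power crossfade between a head-locked stereo bus and a spatialized in-world emitter playing the same music. This is an illustration under assumptions, not the implementation used in Dragon Front; the function names and the four-second duration are hypothetical.

```python
import math

def transition_progress(elapsed_s, duration_s):
    """Linear progress of the 2D-to-3D transition, clamped to [0, 1]."""
    if duration_s <= 0:
        return 1.0
    return min(1.0, max(0.0, elapsed_s / duration_s))

def crossfade_gains(progress):
    """Return (gain_2d, gain_3d) for an equal-power crossfade.

    progress = 0.0: only the head-locked stereo mix is audible.
    progress = 1.0: only the spatialized in-world emitter is audible.
    The squared gains always sum to 1, so perceived loudness stays
    roughly constant while the music "moves" into the environment.
    """
    p = min(1.0, max(0.0, progress))
    angle = p * math.pi / 2.0
    return math.cos(angle), math.sin(angle)

if __name__ == "__main__":
    # Hypothetical four-second handoff as the player enters the hub.
    for t in (0.0, 1.0, 2.0, 3.0, 4.0):
        g2d, g3d = crossfade_gains(transition_progress(t, 4.0))
        print(f"t={t:.1f}s  2D gain={g2d:.3f}  3D gain={g3d:.3f}")
```

In a game engine, these two gains would typically be applied each frame to the stereo music bus and to the positional audio source that stands in for the in-game speakers.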

So we’ve now taken a closer look at the first of the three important questions for video game composers creating music for VR games:

  • Do we compose our music in 3D or 2D?
  • Do we structure our music to be Diegetic or Non-Diegetic?
  • Do we focus our music on enhancing player Comfort or Performance?

We’ve just explored what it means to compose music with both 2D and 3D considerations in mind.  The next article will focus on the second of the three questions: whether music in VR should be diegetic or non-diegetic.  Thanks for reading, and please feel free to leave your comments in the space below!

 

 


 

Music in Virtual Reality

This lecture presented ideas for creating a musical score that complements an immersive VR experience. Composer Winifred Phillips shared tips from several of her VR projects. Beginning with a historical overview of positional audio technologies, Phillips addressed several important problems facing composers in VR.

Topics included 3D versus 2D music implementation, and the role of spatialized audio in a musical score for VR. The use of diegetic and non-diegetic music was explored, including methods that blur the distinction between the two categories.

The discussion also included an examination of the VIMS phenomenon (Visually Induced Motion Sickness), and the role of music in alleviating its symptoms.  Phillips’ talk offered techniques for composers and audio directors looking to utilize music in the most advantageous way within a VR project.

Takeaway

Through examples from several VR games, Phillips provided an analysis of music composition strategies that help music integrate successfully in a VR environment. The talk included concrete examples and practical advice that audience members can apply to their own games.

Intended Audience

This session provided composers and audio directors with strategies for designing music for VR. It included an overview of the history of positional sound and the VIMS problem (useful knowledge for designers).

The talk was intended to be approachable for all levels (advanced composers may better appreciate the specific composition techniques discussed).

 

 

Winifred Phillips is an award-winning video game music composer whose most recent projects are the triple-A first person shooter Homefront: The Revolution and the Dragon Front VR game for Oculus Rift. Her credits include games in five of the most famous and popular franchises in gaming: Assassin’s Creed, LittleBigPlanet, Total War, God of War, and The Sims. She is the author of the award-winning bestseller A COMPOSER’S GUIDE TO GAME MUSIC, published by the MIT Press. As a VR game music expert, she writes frequently on the future of music in virtual reality games. Follow her on Twitter @winphillips.

Composing video game music for Virtual Reality: The role of music in VR

By Winifred Phillips

Hey everybody!  I’m video game composer Winifred Phillips.  At this year’s Game Developers Conference in San Francisco, I was pleased to give a presentation entitled Music in Virtual Reality (I’ve included the official description of my talk at the end of this article). While I’ve enjoyed discussing the role of music in virtual reality in previous articles that I’ve posted here, the talk I gave at GDC gave me the opportunity to pull a lot of those ideas together and present a more concentrated exploration of the practice of music composition for VR games.  It occurred to me that such a focused discussion might be interesting to share in this forum as well. So, with that in mind, I’m excited to begin a four-part article series based on my GDC 2018 presentation!

The Virtual Reality Game Music Composer

Project Morpheus headset.

Ready or not, virtual reality is coming!  Three virtual reality headsets are on their way to market and expected to hit retail in either late 2015 or sometime in 2016.  These virtual reality systems are:

  • Oculus Rift
  • Project Morpheus (Sony)
  • HTC Vive

VR is expected to make a big splash in the gaming industry, with many studios already well underway with development of games that support the new VR experience.  Clearly, VR will have a profound impact on the visual side of game development, and certainly sound design and voice performances will be impacted by the demands of such an immersive experience… but what about music?  How does music fit into VR?

At GDC 2015, a presentation entitled “Environmental Audio and Processing for VR” laid out the technology of audio design and implementation for Sony’s Project Morpheus system.  While the talk concentrated mainly on sound design concerns, speaker Nicholas Ward-Foxton (audio programmer for Sony Computer Entertainment) touched upon voice-over and music issues as well.  Let’s explore his excellent discussion of audio implementation for a virtual space, and ponder how music fits into this brave new virtual world.

Nicholas Ward-Foxton, during his GDC 2015 talk.

But first, let’s get a brief overview on audio in VR:

3D Positional Audio

All three VR systems feature some sort of positional audio, meant to achieve a full 3D Audio Effect.  With the application of the principles of 3D Audio, sounds will always seem to be originating from the virtual world in a realistic way, according to the location of the sound-creating object, the force/loudness of the sound being emitted, the acoustic character of the space in which the sound is occurring, and the influences of obstructing, reflecting and absorbing objects in the surrounding environment.  The goal is to create a soundscape that seems perfectly fused with the visual reality presented to the player.  Everything the player hears seems to issue from the virtual world with acoustic qualities that consistently confirm an atmosphere of perfect realism.

All three VR systems address the technical issues behind achieving this effect with built-in headphones that deliver spatial audio consistent with the virtual world.  The Oculus Rift developers licensed the VisiSonics RealSpace 3D Audio plugin to calculate acoustic spatial cues, then built their own 3D Audio plugin based on the RealSpace technology, allowing their new Oculus Audio SDK to generate the system’s impressive three-dimensional sound.  According to Sony, Project Morpheus creates its 3D sound by virtue of binaural recording techniques (in which two microphones are positioned to mimic natural ear spacing), implemented into the virtual environment with a proprietary audio technology developed by Sony.  The HTC Vive has only recently added built-in headphones to its design, but the developers plan to offer full 3D audio as part of the experience.

To get a greater appreciation of the power of 3D audio, let’s listen to the famous “Virtual Barber Shop” audio illusion, created by QSound Labs to demonstrate the power of binaural audio.

Head Tracking and Head-Related Transfer Function

According to Nicholas Ward-Foxton’s GDC talk, to make the three-dimensional audio more powerful in a virtual space, the VR systems need to keep track of the player’s head movements and adjust the audio positioning accordingly.  With this kind of head tracking, sounds swing around the player when the player turns or looks about.  This effect helps to offset an issue of concern with regard to the differences in head size and ear placement between individuals.  In short, people have differently sized noggins, and their perception of audio (including the 3D positioning of sounds) will differ as a result.  This dependence on the unique anatomical details of the individual listener is described by what’s known as the Head-Related Transfer Function (HRTF).  There’s an excellent article explaining Head-Related Transfer Function on the “How Stuff Works” site.

Head-Related Transfer Function can complicate things when trying to create a convincing three-dimensional soundscape.  When listening to identical binaural audio content, one person may not interpret aural signals the same way another would, and might estimate that sounds are positioned differently.  Fortunately, head tracking comes to the rescue here.  As Ward-Foxton explained during his talk, when we move our heads about and then listen to the way that the sounds shift in relation to our movements, our brains are able to adjust to any differences in the way that sounds are reaching us, and our estimation of the spatial origination of individual sounds becomes much more reliable.  So the personal agency of the gaming experience is a critical element in completing the immersive aural world.
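As a rough sketch of what head tracking has to compute, the Python snippet below works out the direction of a sound source relative to where the listener’s head is currently facing. It covers only the horizontal plane and only the geometry, not any HRTF filtering, and the coordinate convention and function name are assumptions made for illustration rather than anything taken from Ward-Foxton’s talk or a specific VR SDK.

```python
import math

def relative_azimuth_deg(source_xz, listener_xz, head_yaw_deg):
    """Azimuth of a sound source relative to the direction the head faces.

    Positions are (x, z) pairs on the horizontal plane. At a yaw of 0 the
    listener faces +z, and +x is to the listener's right. A positive yaw
    means the head has turned to the right.

    Returns degrees in [-180, 180): positive = source is to the right,
    negative = source is to the left.
    """
    dx = source_xz[0] - listener_xz[0]
    dz = source_xz[1] - listener_xz[1]
    world_azimuth = math.degrees(math.atan2(dx, dz))
    relative = world_azimuth - head_yaw_deg
    # Wrap the result into the range [-180, 180).
    return (relative + 180.0) % 360.0 - 180.0

if __name__ == "__main__":
    radio = (0.0, 2.0)      # hypothetical in-world source, straight ahead
    listener = (0.0, 0.0)
    for yaw in (0.0, 45.0, 90.0):
        az = relative_azimuth_deg(radio, listener, yaw)
        print(f"head yaw {yaw:5.1f} deg -> source at {az:6.1f} deg")
```

As the head turns to the right, the source that started straight ahead drifts toward the listener’s left, which is the “swinging around” behavior that lets the brain refine its estimate of where a sound is coming from.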

Music, Narration, and the Voice of God

Now, here’s where we start talking about problems relating directly to music in a VR game.  Nicholas Ward-Foxton’s talk touched briefly on the issues facing music in VR by exploring the two classifications that music may fall into. When we’re playing a typical video game, we usually encounter both diegetic and non-diegetic audio content.  Diegetic audio consists of sound elements that are happening in the fictional world of the game, such as environment sounds, sound effects, and music being emitted by in-game sources such as radios, public address systems, NPC musicians, etc.  On the other hand, non-diegetic audio consists of sound elements that we understand to be outside the world of the story and its characters, such as a voice-over narration, or the game’s musical score.  We know that the game characters can’t hear these things, but it doesn’t bother us that we can hear them.  That’s just a part of the narrative.

VR changes all that.  When we hear a disembodied, floating voice from within a virtual environment, we sometimes feel, according to Ward-Foxton, as though we are hearing the voice of God.  Likewise, when we hear music in a VR game, we may sometimes perceive it as though it were God’s underscore.  I wrote about the problems of music breaking immersion as it related to mixing game music in surround sound in Chapter 13 of my book, A Composer’s Guide to Game Music, but the problem becomes even more pronounced in VR.  When an entire game is urging us to suspend our disbelief fully and become completely immersed, the sudden intrusion of the voice of the Almighty supported by the beautiful strains of the holy symphony orchestra has the potential to be pretty disruptive.

The harpist of the Almighty, hovering somewhere in the VR world…

So, what can we do about it?  For non-diegetic narration, Ward-Foxton suggested that the voice would have to be contextualized within the in-game narrative in order for the “voice of God” effect to be averted.  In other words, the narration needs to come from some explainable in-game source, such as a radio, a telephone, or some other logical sound conveyance that exists in the virtual world.  That solution, however, doesn’t work for music, so it’s time to start thinking outside the box.

Voice in our heads

During the Q&A portion of Ward-Foxton’s talk, an audience member asked a very interesting question.  When the player is assuming the role of a specific character in the game, and that character speaks, how can the audio system make the resulting spoken voice sound the way it would to the ears of the speaker?  After all, whenever any of us speak aloud, we don’t hear our voices the way others do.  Instead, we hear our own voice through the resonant medium of our bodies, rising from our larynx and reverberating throughout our own unique formant or acoustical vocal tract.  That’s why most of us perceive our voices as being deeper and richer than they sound when we hear them in a recording.

Ward-Foxton suggested that processing and pitch alteration might create the effect of a lower, deeper voice, helping to make the sound seem more internal and resonant (the way it would sound to the actual speaker).  However, he also mentioned another approach to this issue earlier in his talk, and I think this particular approach might be an interesting solution for the “music of God” problem as well.

Proximity Effect

“I wanted to talk about proximity,” said Ward-Foxton, “because it’s a really powerful effect in VR, especially audio-wise.”  Referencing the Virtual Barber Shop audio demo from QSound Labs, Ward-Foxton talked about the power of sounds that seem to be happening “right in your personal space.”  In order to give sounds that intensely intimate feeling when they become very close, Ward-Foxton’s team would apply dynamic compression and bass boost to the sounds, in order to simulate the Proximity Effect.
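To give a sense of what “dynamic compression and bass boost” driven by closeness might look like in practice, here’s a deliberately simplified Python sketch. It is not Ward-Foxton’s implementation: the function names, the near and far distances, the 200 Hz cutoff, and the compressor settings are all hypothetical, and the compressor is a crude instantaneous one with no attack or release smoothing.

```python
import math

def proximity_amount(distance_m, near_m=0.3, far_m=1.5):
    """1.0 at or inside near_m, 0.0 at or beyond far_m, linear in between."""
    if distance_m <= near_m:
        return 1.0
    if distance_m >= far_m:
        return 0.0
    return (far_m - distance_m) / (far_m - near_m)

def bass_boost(samples, sample_rate, cutoff_hz=200.0, boost=1.0):
    """Crude low shelf: add a one-pole low-passed copy back onto the signal."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    low = 0.0
    out = []
    for x in samples:
        low += alpha * (x - low)       # running low-pass of the input
        out.append(x + boost * low)    # original plus boosted lows
    return out

def compress(samples, threshold=0.5, ratio=4.0):
    """Instantaneous compressor: scale down whatever exceeds the threshold."""
    out = []
    for x in samples:
        level = abs(x)
        if level > threshold:
            level = threshold + (level - threshold) / ratio
        out.append(math.copysign(level, x))
    return out

def simulate_proximity(samples, sample_rate, distance_m):
    """Apply more bass boost and heavier compression as the source nears."""
    amount = proximity_amount(distance_m)
    boosted = bass_boost(samples, sample_rate, boost=amount)
    threshold = 1.0 - 0.5 * amount     # lower threshold when very close
    return compress(boosted, threshold=threshold)
```

With these assumed settings, a sound beyond 1.5 meters passes through untouched, while a sound within 30 centimeters gets the full low-end lift and the tightest dynamic range.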

The Proximity Effect is a phenomenon related to the physical construction of microphones, making them prone to add extra bass and richness when the source of the recording draws very close to the recording apparatus.  This concept is demonstrated and explained in much more depth in this video produced by Dr. Alexander J. Turner for the blog Nerds Central:

So, if simulating the Proximity Effect can make a voice sound like it’s coming from within, as Ward-Foxton suggests, can applying some of the principles of the Proximity Effect make the music sound like it’s coming from within, too?

Music in our heads

This was the thought that crossed my mind during this part of Ward-Foxton’s talk on “Environmental Audio and Processing for VR.”  In traditional music recording, instruments are assigned a position on the stereo spectrum, and the breadth from left to right can feel quite wide.  Meanwhile, the instruments (especially in orchestral recordings) are often recorded in an acoustic space that would be described as “live,” or reverberant to some degree.  This natural reverberance is widely regarded as desirable for an acoustic or orchestral recording, since it creates a sensation of natural space and allows the sounds of the instruments to blend with the assistance of the sonic reflections from the recording environment.  However, it also creates a sensation of distance between the listener and the musicians.  The music doesn’t seem to be invading our personal space.  It’s set back from us, and the musicians are also spread out around us in a large arc shape.

So, in VR, these musicians would be invisibly hovering in the distance, their sounds emitting from defined positions in the stereo spectrum. Moreover, the invisible musicians would fly around as we turn our heads, maintaining their position in relation to our ears, even as the sound design elements of the in-game environment remain consistently true to their places of origin in the VR world.  Essentially, we’re listening to the Almighty’s holy symphony orchestra.  So, how can we fix this?

One possible approach might be to record our music with a much more intimate feel.  Instead of choosing reverberant spaces, we might record in perfectly neutral spaces and then add very subtle amounts of room reflection to assist in a proper blend without disrupting the sensation of intimacy.  Likewise, we might somewhat limit the stereo positioning of our instruments, moving them a bit more towards the center.  Finally, a bit of prudently applied compression and EQ might add the extra warmth and intimacy needed in order to make the music feel close and personal.  Now, the music isn’t “out there” in the game world.  Now, the music is in our heads.
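One simple way to experiment with pulling a stereo mix “a bit more towards the center” is mid/side processing, where the difference between the left and right channels is scaled down. The Python sketch below shows that technique under assumptions: the function name and the width values are hypothetical, and this is only one of several ways to narrow a stereo image.

```python
def narrow_stereo(left, right, width=0.5):
    """Scale the side (left minus right) component of a stereo signal.

    width = 1.0 leaves the mix unchanged; width = 0.0 collapses it to mono.
    Values in between move every instrument proportionally toward center.
    """
    out_left, out_right = [], []
    for l, r in zip(left, right):
        mid = 0.5 * (l + r)    # what both channels share
        side = 0.5 * (l - r)   # what differs between the channels
        out_left.append(mid + width * side)
        out_right.append(mid - width * side)
    return out_left, out_right

if __name__ == "__main__":
    # A sound panned hard left, then narrowed to half width.
    print(narrow_stereo([1.0], [0.0], width=0.5))
```

Because the mid component is untouched, anything already panned to the center keeps its level, while hard-panned elements lose some of their separation.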

Music in VR

It will be interesting to see the audio experimentation that is sure to take place in the first wave of VR games.  So far, we’ve only been privy to tech demos showing the power of the VR systems, but the music in these tech demos has given us a brief peek at what music in VR might be like in the future.  So far, it’s been fairly sparse and subtle… possibly a response to the “music of the Almighty” problem.  It is interesting to see how this music interacts with the gameplay experience.  Ward-Foxton mentioned two particular tech demos during his talk.  Here’s the first, called “Street Luge.”

The simple music of this demo, while quite sparse, does include some deep, bassy tones and some dry, close-recorded percussion.  The stereo breadth appears to be a bit narrow as well, but this may not have been intentional.

The second tech demo mentioned during Ward-Foxton’s talk was “The Deep.”

The music of this tech demo is limited to a few atmospheric synth tones and a couple of jump-scare stingers, underscored by a deep low pulse.  Again, the music doesn’t seem to have a particularly wide stereo spectrum, but this may not have been a deliberate choice.

I hope you enjoyed this exploration of some of the concepts included in Nicholas Ward-Foxton’s talk at GDC 2015, along with my own speculation about possible approaches to problems related to non-diegetic music in virtual reality.  Please let me know what you think in the comments!