Video Game Composers: The Art of Music in Virtual Reality (GDC 2018)

Video game composer Winifred Phillips, pictured in her music production studio.

 

By Winifred Phillips

Once again, the Game Developers Conference is almost upon us!  GDC 2018 promises to be an awesome event, chock full of great opportunities for us to learn and grow as video game music composers.  I always look forward to the comprehensive sessions on offer in the popular GDC audio track, and for the past few years I’ve been honored to be selected as a GDC speaker.  Last year I presented a talk that explored how I built suspense and tension through music I composed for such games as God of War and Homefront: The Revolution.  This year, I’m tremendously excited that I’ll be presenting the talk, “Music in Virtual Reality.” The subject matter is very close to my heart!  Throughout 2016 and 2017, I composed music for many virtual reality projects, some of which have hit retail over the past year, and some of which will be released very soon.  I’ve learned a lot about the process of composing music for a VR experience, and I’ve given a lot of thought to what makes music for VR unique.  During my GDC talk in March, I’ll be taking my audience through my experiences composing music for four very different VR games: the Bebylon: Battle Royale arena combat game from Kite & Lightning, the Dragon Front strategy game from High Voltage Software, the Fail Factory comedy game from Armature Studio, and the Scraper: First Strike Shooter/RPG from Labrodex Inc.  I’ll talk about some of the top problems that came up, the solutions that were tried, and the lessons that were learned.  Virtual Reality is a brave new world for game music composers, and there will be a lot of ground for me to cover in my presentation!

In preparing my talk for GDC, I kept my focus squarely on composition techniques for VR music creation, while making sure to supply an overview of the technologies that would help place these techniques in context.  With these considerations in mind, I had to prioritize the information I intended to offer, and some interesting topics simply wouldn’t fit within the time constraints of my GDC presentation.  So I thought it would be worthwhile to include some of these extra materials in a couple of articles that would precede my talk in March.  In this article, I’ll explore some theoretical ideas from experts in the field of VR, and I’ll include some of my own musings about creative directions we might pursue with VR music composition.  In the next article, I’ll talk about some practical considerations relating to the technology of VR music.

So, let’s get started!

VIMS

No discussion of virtual reality is complete without some time spent on the perils of Visually Induced Motion Sickness (a.k.a. VIMS).  My upcoming GDC talk will include research on this topic pointing to a specific music approach that can play an important role in alleviating VIMS symptoms. However, there is more to consider about the general role that music plays in relation to the famous VIMS phenomenon, apart from the technique that I’ll be describing in my GDC presentation.  So let’s take a look at the general relationship between music and VIMS, starting with the most basic question:

What causes VIMS?

Let’s picture ourselves sitting in a movie theater.  We’re watching a silent film that shows a first-person perspective of a high-speed bicycle ride full of wild twists and turns.  It looks stressful, but as we sit and watch the visuals, we’re not really all that stressed.  Okay, so now let’s imagine that the film isn’t silent anymore.  We can hear the bumps and jogs in the road, the air whooshing by, the aurally chaotic soundscape.  It’s a bit more exciting to watch, but we’re still feeling comfortable in our movie-theater seats.  Now, let’s imagine that we aren’t looking at a flat 2D screen anymore.  Now it’s a 3D stereoscopic image of that wild bicycle ride.  Oncoming traffic leaps off the screen at us.  Obstacles seem to whip by our heads as the road before us corkscrews madly.  Are we still comfortable?  Or could all that dizzying 3D motion be finally getting to us?

In their study to better understand the causes of motion sickness, professors Behrang Keshavarz and Heiko Hecht gathered 69 experimental test subjects and exposed them to the visual presentation I described above.  There were two variables: viewing mode (2D or 3D stereoscopic) and sound (on or off).  The 2D film didn’t cause a problem.  Likewise, the presence (or absence) of sound wasn’t an issue.  But when 3D visuals were introduced, motion sickness became a big problem.  The findings of the study support the conclusion that immersive visual stimuli have the potential to negatively impact our sense of balance and equilibrium.  However, there’s also a secondary conclusion that’s equally interesting to us as game audio folks: sound doesn’t seem to have anything to do with it.  Yes, the 3D bicycle ride with sound was pretty nausea-inducing, but according to the study, a silent 3D bike ride has just as much potential to cause motion sickness.

So what does that mean for audio and music in virtual reality?  Does it mean that the aural spectrum simply doesn’t matter?  Or does it present us with some interesting creative opportunities?  Let’s explore that idea a bit further.

Music and VIMS

First, let’s dispense with the notion that music and sound design don’t matter when it comes to VIMS.  In fact, the presence of music has a powerful influence on the VIMS state, but that influence is therapeutic rather than harmful.

GDC 2018 Presentation Preview

In my upcoming GDC talk, I’ll be exploring the specific type of music that exerts the most beneficial effects when it comes to Visually Induced Motion Sickness.  By virtue of both my own experiences with multiple VR projects and the results of relevant scientific studies, I’ll be showing how video game composers can best alleviate the effects of VIMS through their musical compositions, and under which circumstances those compositions should be deployed.

While there’s a certain musical strategy that has the most beneficial effect (which I’ll define in my GDC talk), the mere presence of music is a proven therapeutic agent that has been shown to diminish nausea symptoms.  In a study conducted by the Arthur G. James Cancer Hospital and Research Center at Ohio State University, researchers found that the use of music during high-dose chemotherapy sessions led to a significant reduction in symptoms of nausea.  Music acts both as a diversion and a targeted therapeutic agent, shifting the listener’s attention away from physical discomfort while at the same time acting to reduce the symptoms.

In my talk I’ll be exploring how we can best employ an effective music strategy within the constructs of virtual reality in order to cushion VR players and make them more comfortable in the immersive environment.  There is, however, an additional dimension to the relationship between music, audio and VR exploration, which I didn’t have time to include in my upcoming GDC talk.  I’d like to share my thoughts on that here.

Presence

What makes virtual reality so real?  It can’t be just the encompassing imagery, because then we wouldn’t need VR, we could just go to a 3D movie.  No, in order for VR to engage us, it has to make us feel as though we are personally present in the virtual world.  This phenomenon can be alternately called telepresence or virtual presence, but the end result is the same.  We feel as though we’re physically occupying the same world as the imaginary visuals we’re encountering.  How does the game make us feel this sense of presence?

According to MIT Professor Thomas B. Sheridan, the sensation of presence depends on the operation of three important factors: a “sufficiently high-fidelity display, a mental attitude of willing acceptance, and a modicum of motor ‘participation.’”  In other words, we need to find the visuals to be sufficiently convincing, we have to be willing to be convinced that they’re real, and we must be able to move about freely and interact with the environment.  Unfortunately, it’s that third factor that causes the VIMS problem.  Moving around in VR opens us up to motion sickness.  How is this problem typically addressed?

According to Steve Bowler, cofounder of the VR game company CloudGate Studio, the community of VR game developers has “zero tolerance for user motion sickness.”  In an interview with ScienceNews.org, Bowler describes the way in which developers typically solve the problem.  By virtue of a system of in-game navigation that relies on a type of teleportation, developers allow us to wander their VR worlds.  We point our controllers where we want to be, we hit the teleport button and zip!  We’re there in a flash.  It’s highly effective in avoiding the perils of VIMS.  However, it also sharply curtails our sensation of being able to “move about freely and interact with the environment.”

So, if developers are forced to limit the personal agency of players in wandering around the environment, is it possible for game audio folks to compensate by making it seem as though the environment is wandering around us?  This is a thought I’ve been considering lately, as I contemplate the movement limitations we experience in VR environments.  After all, before we had visual virtual reality, we had a kind of audio VR in the form of audio-only games like Papa Sangre.  In games like Papa Sangre, the environment presents a busy soundscape that invisibly drifts around us.  If we close our eyes, we’re suddenly fully enveloped in the world that the game developers have created.  Merely turning around becomes a radically dramatic act of personal agency as the sonic universe reacts to our movement.  I’ve included a non-interactive video clip below that demonstrates some gameplay from Papa Sangre.  In this clip, you can watch a gamer interacting with the game’s interface.  Notice the somewhat exaggerated nature of the sound design as the player is instructed how to play the game:

The effect of this audio-only universe can be very immersive, and its power depends on the sensation of an active soundscape that surrounds, enfolds and interacts with the player in ways that exaggerate and heighten reality.  Could these techniques make VR players feel more of a sense of presence in virtual reality, even if their physical mobility is limited?  Should we be thinking about opportunities to present a soundscape with moving components and an exaggerated sonic palette?

Conversely, if we decide to exaggerate our audio world, would this disrupt the impression of realism that virtual reality attempts to convey?  Here’s where we come to another interesting concept that I wasn’t able to include in my GDC talk, but that has some bearing on the train of thought we’re currently pursuing.

The Uncanny Aural Valley

Back in 2015, I wrote an article for Gamasutra about the “Uncanny Valley” – a concept that’s been a long-time fixture in the visual arts but has just started to be discussed in connection with audio.  When applied to the visual world, the “Uncanny Valley” pertains to representations of living things (most often humans, pictured left) that are impressively close to the real thing but that subtly miss the mark.  This imperfection leads to a deeply unsettling impression of wrongness.  In my Gamasutra article, I discussed how audio in the world of VR may be in danger of dipping into the “Uncanny Aural Valley,” in which sound gets impressively close to perfect realism but misses the mark.  Within the context of virtual reality, this subtle imperfection has the potential to impact players in a far more pronounced way than it would in a traditional video game.  That barely-perceptible sonic wrongness can (theoretically) pull virtual reality players out of the immersive experience.

So, is VR audio in danger of dipping into the “Uncanny Aural Valley” anytime soon?  Not according to Sean Earley, the Executive Editor of AR/VR Magazine.  “The visual uncanny valley will still be around for a while,” observes Earley. “Unless you are a super audiophile, however, digital audio has progressed to the point where a good engineer can make a recording that is very hard to distinguish from reality.  Spatial audio, when mixed with simple VR can add a totally new level of realism to an experience.”

That takes us back to our previous train of thought: if perfect sonic realism is achievable in VR, is it desirable?  Or would we rather aim for a form of hyperrealism that emphasizes aural motion and more fully envelops the player?  Do we want to dip into that Uncanny Aural Valley?

“Pure, super smooth and natural spatialized sound may be not immersive enough to get the sort of user experience/effect needed for VR,” writes Gabor Szanto, the creator of the Superpowered audio software development kit for mobile. “You don’t want the most natural chirping bird sound, you actually want the cleanest and most 3D-like bird sound. You want to amaze the listener.”

So, in a virtual reality environment in which we’re forced to limit the physical mobility of players, is it possible that by making the aural environment hyperreal and supremely encompassing, we can compensate for any loss of presence that players might feel when they can’t move around exactly as they please?  I think it’s an interesting idea to ponder, and one to which we should give some consideration as VR audio moves forward and becomes more ambitious.  Also, as video game composers, we might want to consider how our music mixes can more fully surround players.  For more surreal, synthetic or ambient-driven musical scores, we might even introduce spatial motion into our musical mixes, letting sounds float around players to convey an even greater level of sonic immersion.
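
To make the idea of sounds floating around the player a little more concrete, here is a minimal sketch of one way a single musical element could circle the listener.  It is only an illustration under some loud assumptions: the function name orbit_pan_gains and the period_s parameter are hypothetical, and the sketch uses plain stereo constant-power panning, which can only express left/right movement.  A real VR project would more likely hand the source position to the spatializer in its audio engine.

```python
import math

def orbit_pan_gains(t, period_s=8.0):
    """Return (left, right) gains at time t (seconds) for a sound that
    circles the listener once every period_s seconds.

    Hypothetical helper for illustration only: stereo panning cannot
    distinguish front from back, so this captures just the left/right
    part of the orbit.
    """
    # Azimuth in radians: 0 = straight ahead, pi/2 = hard right.
    azimuth = 2.0 * math.pi * (t / period_s)
    # Left/right component of the position, in the range [-1, 1].
    pan = math.sin(azimuth)
    # Constant-power pan law: left^2 + right^2 stays equal to 1,
    # so the perceived loudness holds steady as the sound moves.
    angle = (pan + 1.0) * math.pi / 4.0
    return math.cos(angle), math.sin(angle)
```

At t = 0 the sound sits straight ahead, so both gains are equal.  A quarter of the way through the orbit it is hard right, and three quarters of the way through it is hard left.  The constant-power law is a design choice: a simple linear crossfade would make the sound dip in loudness as it passes the center.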

Conclusion

In my next article, I’ll be discussing some more down-to-earth technical issues that pertain to music and audio in VR.  While my GDC presentation on Music in Virtual Reality will include several important technical issues and topics, there simply wasn’t enough time to include everything that might be of interest.  With that in mind, I’ll be happy to explore some of these technical concerns in my next article!  I’ve included the official GDC description of my upcoming talk below. Please feel free to share your thoughts and insights in the comments section!

 


 

Music in Virtual Reality

This lecture will present ideas for creating a musical score that complements an immersive VR experience. Composer Winifred Phillips will share tips from several of her VR projects. Beginning with a historical overview of positional audio technologies, Phillips will address several important problems facing composers in VR.

Topics will include 3D versus 2D music implementation, and the role of spatialized audio in a musical score for VR. The use of diegetic and non-diegetic music will be explored, including methods that blur the distinction between the two categories.

The discussion will also include an examination of the VIMS phenomenon (Visually Induced Motion Sickness), and the role of music in alleviating its symptoms.  Phillips’ talk will offer techniques for composers and audio directors looking to utilize music in the most advantageous way within a VR project.

Takeaway

Through examples from several VR games, Phillips will provide an analysis of music composition strategies that help music integrate successfully in a VR environment. The talk will include concrete examples and practical advice that audience members can apply to their own games.

Intended Audience

This session will provide composers and audio directors with strategies for designing music for VR. It will include an overview of the history of positional sound and the VIMS problem (useful knowledge for designers).

The talk will be approachable for all levels (advanced composers may better appreciate the specific composition techniques discussed).

 


Winifred Phillips is an award-winning video game music composer whose most recent projects are the triple-A first person shooter Homefront: The Revolution and the Dragon Front VR game for Oculus Rift. Her credits include games in five of the most famous and popular franchises in gaming: Assassin’s Creed, LittleBigPlanet, Total War, God of War, and The Sims. She is the author of the award-winning bestseller A COMPOSER’S GUIDE TO GAME MUSIC, published by the MIT Press. As a VR game music expert, she writes frequently on the future of music in virtual reality games. Follow her on Twitter @winphillips.

 

 

 

 

 

Virtual Reality in the Uncanny Aural Valley

Most visual artists in the game industry are familiar with a concept known as the “Uncanny Valley,” but it isn’t a problem that typically occupies the attention of sound designers and game music composers.  However, with the imminent arrival of virtual reality, that situation may drastically change.  Audio folks may have to begin wrestling with the problem right alongside their visual arts counterparts. I’ll explore that issue during the course of this blog, but first let’s start with a basic definition: what is the Uncanny Valley?

Here’s the graphic that is typically shown to illustrate the Uncanny Valley concept.  The idea is this: human physical attributes can be endearing.  We like human qualities when we see them attached to inhuman things like robots.  It makes them cute and relatable. However, as they start getting more and more human in appearance, the cuteness starts going away, and the skin-crawling creepiness begins.  The ick-factor reaches maximum in an amorphous no-man’s land right before absolute realism would theoretically be attained.  In this realm of horrors known as the “Uncanny Valley,” we see that the appearance of the human-like creature is not close enough to be real, but close enough to be really disturbing.  Don’t take my word for it, though.  Here’s a great video from the Extra Credits video series that explores the meaning of the Uncanny Valley in more detail:

So, now we’ve explored what the Uncanny Valley means to visual artists, but how does this phenomenon impact the realm of audio?

Spatial Audio – Reconstructing Reality or Creating Illusion?

The idea of an audio equivalent for the Uncanny Valley was suggested by Francis Rumsey during a presentation he gave in May 2014 at the Audio Engineering Society Chicago Section Meeting, which took place at Shure Incorporated in Niles, Illinois.  Francis Rumsey holds a PhD in Audio Engineering from the University of Surrey and is currently the chair of the Technical Council of the Audio Engineering Society.  His talk was entitled “Spatial Audio – Reconstructing Reality or Creating Illusion?”

Francis Rumsey, chair of the AES Technical Council

In his excellent 90-minute presentation (available for viewing in its entirety by AES members), Francis Rumsey explores the history of spatial audio in detail, examining the long-term effort to reach perfect simulations of natural acoustic spaces.  He examines the divergent philosophies of top audio engineers who approach the problem from a creative/artistic point of view, and acousticians who want to solve the dilemma mathematically by virtue of a perfect wave field synthesis technique. Along the way, he asks whether spatial audio is really meant to recreate the best version of reality, or instead to conjure up an entertaining artistic illusion.  This leads him to the main thesis of his talk:

Sound Design in VR: Almost Perfect Isn’t Perfect Enough

Rumsey suggests that as spatial audio approaches the top-most levels of realism, it begins to stimulate a more critical part of the brain.  Why does it do this?  Because human listeners react very strongly to a quality we call “naturalness.”  We have a great depth of experience in the way environmental sound behaves in the world.  We know how it reflects and reverberates, how objects may obstruct the sound or change its perceived timbre. As a simulated aural environment approaches perfect spatial realism and timbral fidelity, our brains begin to compare the simulation to our own remembered experiences of real audio environments, and we start to react negatively to subtle defects in an otherwise perfect simulation.  “It sounds almost real,” we think, “but something about it is strange.  It’s just wrong, it doesn’t add up.”

Take as an example this Oculus VR video demonstrating GenAudio’s AstoundSound 3D RTI positional 3D audio plugin.  While the audio positioning is awesome and impressive, the demo does not incorporate any obstruction or occlusion effects (as the plugin makers readily admit).  This makes the demo useful for us in examining the effects of subtle imperfections in an otherwise convincing 3D aural environment.  The imperfections become especially pronounced when the gamer walks into the Tuscan house, but the sound of the outdoor fountain continues without any of the muffling obstruction effects one would expect to hear in those circumstances.
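
To give a rough sense of what the missing muffling effect involves, here is a minimal sketch of an obstruction effect built from a simple one-pole low-pass filter, which dulls the high frequencies of a sound once something stands between the source and the listener.  This is my own illustration and not how the AstoundSound plugin or any particular engine works.  The function name occlude and its parameters are hypothetical, and a production system would also account for volume loss, geometry and reverberation.

```python
import math

def occlude(samples, cutoff_hz, sample_rate=48000):
    """Muffle a mono signal with a one-pole low-pass filter.

    Hypothetical helper for illustration only. A lower cutoff_hz
    stands in for a thicker wall between the sound and the listener.
    """
    # Smoothing coefficient for the recurrence
    # y[n] = y[n-1] + a * (x[n] - y[n-1])
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out = []
    y = 0.0
    for x in samples:
        # Each output sample moves only part of the way toward the
        # input, which smooths away rapid (high-frequency) changes.
        y += a * (x - y)
        out.append(y)
    return out
```

In the Tuscan house example, the fountain's signal might be passed through something like occlude(fountain_samples, 500.0) while the player is indoors, and left untouched outdoors.  The 500 Hz figure and the fountain_samples name are placeholders for illustration, not measured values.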

Voice in VR: The Uncanny Valley of Spatial Voice

During the presentation, Rumsey shared some of the research from Glenn Dickins, the Technical Architect of the Convergence Team at Dolby Laboratories.  Dickins had applied the theory of the Uncanny Valley to vocal recordings. The sound of the human voice in a spatial environment is exceedingly familiar to us as human beings, much in the same way that human appearance and movement are both ingrained in our consciousness.  Because of this familiarity, vocal recordings in a spatial environment such as 3D positional audio can be particularly vulnerable to the Uncanny Valley effect.  Very small and subtle degradation in the audio output of a spatially localized voice recording may trigger a sense of deep-rooted unease.

Glenn Dickins of Dolby Laboratories

As we embark on three-dimensional audio environments for virtual reality games, the sorts of sound compression typically used in video game design may become problematic, particularly in relation to voice recordings in games.  While a typical gamer might not recognize that a vocal recording had been compressed, the gamer might nevertheless feel that there was something “not quite right” in the sound of the characters’ voices.  Compression of audio subtly changes the vocal sound in ways that are usually unnoticeable, but may become disruptive in a VR aural environment in which imperfections have the potential to nudge the audio into the Uncanny Valley.

Music in VR: Some Good News

While I’ve talked in this blog before about the importance of defining the role that music should play in the three-dimensional aural environment of a virtual reality game, Francis Rumsey offers an entirely different viewpoint in his talk.  He thinks that when it comes to music, listeners don’t really care about spatial audio.  That might be good news for game composers, because it suggests that music may play no role in the Uncanny Valley effect.

Describing a study that was conducted to determine how both naive and experienced listeners perceived spatial audio, Rumsey showed that when it came to listening to music, the spatial positioning wasn’t considered tremendously important.  Sound quality was held to be absolutely crucial, but this desire was neither heightened nor lessened by spatial considerations. So does this mean that when it comes to music, listeners have an enhanced suspension of disbelief?  Are they willing to accept music into their VR world, even if it isn’t realistically positioned within the 3D space?  If so, then this would mean that non-diegetic music (i.e. music that isn’t occurring within the fictional world of the game) may not need to be spatially positioned as carefully as either voice or sound design elements of the aural environment.  This may prove useful to audio teams, who may turn to music as a reassuring agent in the soundscape, binding the aural environment together and promoting emotional investment and immersion.  However, music’s role in virtual reality may not conform to the way in which listeners react to spatially positioned music in other situations.  At any rate, the issue certainly needs further study and experimentation to clarify the role that non-diegetic music should play in a VR game.

For other types of music in VR, the situation may be much simpler.  Music doesn’t always have to occupy the traditional “underscore” role that it typically serves during gameplay.  In a “music visualizer” VR experience, spatial positioning may become entirely unnecessary, because the music is serving the purpose of pure foreground entertainment (much the same way that music entertains listeners on its own).  Here’s a preview of a musically-reactive virtual world in the upcoming “music visualizer” game Harmonix Music VR, created by the developer of the famous and popular game series Rock Band and Dance Central:

In Conclusion

Rumsey concluded his talk with the observation that near accurate may be worse than not particularly accurate… in other words, if it’s supposed to sound real, then it had better sound perfectly real.  Otherwise, it might be better to opt for a stylized audio environment that exaggerates and heightens the world rather than faithfully reproducing it.  I hope you enjoyed this blog, and please let me know what you think in the comments below!

Winifred Phillips is an award-winning video game music composer whose most recent project is the triple-A first person shooter Homefront: The Revolution. Her credits include five of the most famous and popular franchises in video gaming: Assassin’s Creed, LittleBigPlanet, Total War, God of War, and The Sims. She is the author of the award-winning bestseller A COMPOSER’S GUIDE TO GAME MUSIC, published by the Massachusetts Institute of Technology Press. As a VR game music expert, she writes frequently on the future of music in virtual reality video games. Follow her on Twitter @winphillips.