
Why next-gen consoles need next-gen faces

DI4D's Colin Urquhart discusses the ever-higher expectations of realism in games for the Academy

The graphical capabilities of consoles have reached unprecedented levels of detail. The days of pixelated environments and two-dimensional characters feel like a distant memory as we come closer to truly lifelike, virtual recreations.

In fact, for many genres, graphical fidelity is as important as gameplay mechanics -- as shown by the intense battle between Sony and Microsoft, with both claiming to have built the most powerful console.

But the reality is, as game engines improve, it's the finer details -- such as character emotions portrayed by in-game animation -- that will become the most distinctive features, especially with narrative-driven experiences gaining a foothold. After all, what use is a photoreal character with robotic expressions?

With this in mind, I want to explain exactly why next-gen consoles need next-gen faces and how game developers can drive them. And not just photoreal faces of entirely fictional characters -- we're rapidly entering an era of ubiquitous, in-game, digi-doubles.

The power of modern game engines

In the past, the limiting factor in the quality of video game graphics was the capabilities of the engine driving them. But now, we've reached a point where game engines can portray so much graphical detail that the ability to create sufficiently realistic content for the engine is becoming the limiting factor.

Obtaining sufficiently realistic facial animation is particularly challenging, and can play a decisive role in making or breaking an in-game scene. It's much harder to cross the 'Uncanny Valley' with moving animation than it is for a still image.


As mentioned earlier, robotic or synthetic facial animation can suck the life out of a game character. Just like in movies, great writing can be overshadowed by a flat performance, or conversely, a great acting performance can transcend a mediocre script. This means game developers are taking facial performance and facial animation more seriously than ever before, and with incredible advancements in game engine technology -- such as the latest iterations of the Unreal Engine -- this should be straightforward, right? Well, not necessarily -- it depends on the approach taken.

The reality is that, with the traditional control rig-based approach to facial animation, it's becoming both more challenging and more time-consuming to achieve the required fidelity. To obtain greater levels of detail, rigs for in-game animation are approaching the complexity used for movie visual effects, with an increasing number of finer controls. As the rigs become more complex, it becomes ever more difficult to implement them for a particular character and then to achieve the level of animation "polish" required for modern graphics.

A radically different approach to facial animation, which is starting to gain traction with game developers, is to use advanced 4D facial capture technology. 4D capture does not require a traditional animation rig, but instead drives every vertex of a digital double character mesh directly from the actor's performance. It is therefore able to produce very precise animation without the need for extensive rigging or animation polish. It is becoming far easier and more effective to use 4D capture to animate digi-double characters than it is to create and animate traditional control rigs.
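To make the contrast concrete, here is a minimal, purely illustrative sketch -- not DI4D's actual pipeline; the tiny mesh, blendshape names and numbers are all invented. A rig-based approach poses the mesh by evaluating a handful of control weights, while 4D capture stores a complete set of vertex positions for every frame of the performance and simply plays them back, with no rig solve or re-targeting step.

```python
# Toy 3-vertex "face" mesh at its neutral pose (all values illustrative).
NEUTRAL = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.5, 1.0, 0.0)]

# Rig-based approach: a small set of named controls, each a per-vertex delta.
BLENDSHAPES = {
    "smile":   [(0.0, 0.1, 0.0), (0.0, 0.1, 0.0), (0.0, 0.0, 0.0)],
    "brow_up": [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.2, 0.0)],
}

def evaluate_rig(weights):
    """Pose the mesh as: neutral + weighted sum of blendshape deltas."""
    mesh = []
    for i, (x, y, z) in enumerate(NEUTRAL):
        for name, w in weights.items():
            dx, dy, dz = BLENDSHAPES[name][i]
            x, y, z = x + w * dx, y + w * dy, z + w * dz
        mesh.append((x, y, z))
    return mesh

# 4D capture approach: the performance IS the data -- one full set of
# vertex positions per frame, driven directly by the actor's face.
captured_frames = [
    NEUTRAL,
    evaluate_rig({"smile": 1.0, "brow_up": 0.5}),  # stands in for a scan
]

def play_frame(frames, t):
    """Playback is just indexing a frame -- no rig solve, no polish pass."""
    return frames[t]
```

The point of the sketch is the asymmetry: the rig must approximate any expression from its fixed set of controls, whereas the captured data already contains the exact per-vertex result for each frame.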

For example, DI4D's technology was used to reproduce actress Angela Bassett's performance in the Tom Clancy's Rainbow Six Siege trailer -- the facial capture data was used to drive her character's facial expressions.

Story-driven gameplay and an intersection with Hollywood

A strong, cinematic narrative has never been more important to video games. Pre-rendered cinematics are now even spliced together on YouTube by fans to watch as short films. With next-gen, cinematic content is increasingly being rendered in-engine and the distinction between cinematic and in-game animation is fading quickly.

God of War is a recent example of a game with cinematic experiences built into its core gameplay; its CG scenes even merge seamlessly with playable sequences. Likewise, The Last of Us series is celebrated for its story-driven universe and the detailed attention paid to the facial animation of its in-game characters.

Some studios, such as Supermassive Games, have opted to craft video games that resemble an interactive movie more than anything traditionally action-orientated. Its recent horror title House of Ashes even features offline "movie nights" for multiple players to immerse themselves in the game. Video game players want great stories.


At the other end, you've got animated productions using real-time game engine technology to create stunning features. The recent Netflix series Love, Death & Robots included several CG shorts that wouldn't look out of place running on your PlayStation or Xbox, all while achieving incredible quality.

It exemplifies the high fidelity possible with current technology and raises the bar of expectation among gamers. Game engines are even driving the expansion of on-set virtual production -- a solution set to expand significantly across entertainment in 2021.

To summarise, advancements in game engine technology merged with narrative-driven game experiences have put a far greater emphasis on the accurate portrayal of in-game characters. This is true for the design, the animation and the writing. By capturing the true performance of an actor with realistic 4D tracking technology, studios can achieve facial animation with a level of emotional authenticity approaching that of live action.

In-game acting and celebrity digital doubles

With 4D facial capture and photoreal animation, the performance of an actor becomes the most important aspect of a video game's facial animation. How long will it be before we talk about a lead game character's acting performance in the same way that we discuss actors in Hollywood? The crossover of Hollywood and video games is arguably closer than ever before.

With modern digital doubles, there's now an enhanced level of accuracy and faithfulness to the actor's portrayal. As a result, digital doubles of celebrity actors are starting to appear more frequently in video game titles. While Keanu Reeves' appearance in Cyberpunk 2077 is one of the most recent examples, there are several that predate him: Kit Harington in Call of Duty: Infinite Warfare, Shawn Ashmore in Quantum Break, and Norman Reedus and Mads Mikkelsen in Death Stranding.


It's likely that as more celebrity actors appear in video games, acting opportunities will increase, building interest in video game roles as a viable alternative to traditional cinematic acting.

Games can promote releases around their celebrity lead, just like cinema, and secondary characters will increasingly be played by traditionally trained actors. The key is quality. For actors to take these new roles seriously, the final result must be accurate and lifelike. This is why good facial capture is an integral part of the process.

As accurate 4D facial capture technology replaces traditional rig-based animation, actors will have more confidence that their exact performance will be recreated in-game. Each individual's acting ability will be on show, and actors will be able to feel as proud of their appearance in a video game as they might after completing a shoot for a film. There's a real incentive for up-and-coming actors to appear in video games.

The age of the digital double

Humans have evolved an amazingly sophisticated ability to perceive faces and facial expressions and we can recognise, often subconsciously, the most subtle emotional cues. This is why facial animation is so hard to get right!

Next-gen consoles have the power to render in-game characters with incredible levels of detail, but the pressure is on game developers to create assets that meet ever higher expectations of realism. It is becoming increasingly more effective to use 4D capture to apply the performance of an actor directly to their digital double than it is to re-target their performance to a traditional control rig.

In the future, we won't even think about facial animation: the exact performance captured will appear in the game. The realism debate will dissipate, and the result will become art in the same way an actor's performance is art. This is the age of the digital double, and game developers need to be ready.

Co-founder and CEO of DI4D, Colin Urquhart is a 20+ year industry veteran and innovator in the field of facial capture. The DI4D team has worked on a host of entertainment projects, including Blade Runner 2049, Call of Duty: Modern Warfare, Love, Death & Robots, and Quantum Break.
