Developing out loud for Amazon Alexa

Amazon head of voice design education Paul Cutsinger talks about how independent developers are rising to the unique challenges of making games for voice

You can't necessarily beat Amazon Alexa as a listener, but you can join it for one of over a thousand other games for the voice-activated device -- and that number is growing steadily.

As Amazon's head of voice design education, Paul Cutsinger works with a group that Amazon refers to as Skillbuilders, the people who create Amazon Alexa's equivalent of apps, Skills, for the voice device. His job is to help those who want to build Skills understand how to do so, an endeavor that sees him working with developers of varying experience levels, from veteran game developers to total coding novices.

Speaking to GamesIndustry.biz at GDC 2019, Cutsinger says that when it comes to creating games for Alexa through Skills, there are basically two types of developers: AAA companies with strong IP like Skyrim or Destiny that want to bring their brands to Alexa and make experiences that will supplement existing, traditional games; and independent developers pushing the limits of what voice games can be.

Cutsinger sees both groups as key to the growth of Alexa Skills, but he acknowledges that the independent developers are frequently the impetus behind new game forms arising and evolving. Voice, he says, is still relatively young, and developers are still going through a lot of trial and error to figure out what works well. He compares where voice is now to where mobile was over a decade ago.

"Rovio could have easily made a mobile app that looked like the web. But they didn't"

"I love to see people who have the kind of creativity that Angry Birds did in the day," he says. "Rovio could have easily made a mobile app that looked like the web, with a power up button and a power down button, and an angle up and angle down button and a shoot button. But they didn't.

"Rovio asked: 'What is touch? What is direct manipulation? How do you touch and drag?' In the web world that makes no sense, but in touch that makes all the sense. They were able to pull away and think differently, and that's what we're starting to see with smaller studios working with voice. They're thinking: 'What is voice? How is it different?' Instead of just trying to bring a website over."

The comparisons to mobile don't stop there. Cutsinger acknowledges that, at the moment, the gaming community (including developers, players, media, and others) has largely refused to acknowledge that voice-driven experiences like those on Alexa are proper games. There are, of course, lots of different Skills available on Alexa, and many of them aren't games at all. But for the ones that do seem to fall under that umbrella, Cutsinger is optimistic about their potential to be acknowledged as another viable game development medium, just as mobile games were. It is, after all, his job to spread that word.

"I remember mobile coming out and people said, 'Those aren't games.' Sometimes you ask, 'Is that interactive fiction or is that a game?' And that depends on how you define it. I think of a game as something where I think I can succeed, but there's a real chance I'll fail. Interactive fiction, for me, is entertainment less a game. I don't actually fail. If I 'die', it's part of the fun of finding all the dead ends."

When Skyrim was announced for Amazon Alexa, many regarded it as a joke. It turned out not to be, though the voice version doesn't look anything like this (it doesn't look like anything at all)

Though Cutsinger admits it seems a bit obvious, one of the major differences between a game made for voice and a game made for a console is that voice games use soundscapes instead of visual art. Cutsinger says there are clear differences users can hear between bigger budget Skills and those made with fewer resources, whether that's in the quality and profile of the voice talent or simply in background mixing.

"The other big difference is the way you think about the experience," he continues. "A lot of visual games have passive experiences, things glittering off to the side. That's not really the way it works with voice. It's more direct. You say something, Alexa responds, and you work through it. There is a lot more using your imagination."

"I think of a game as something where I think I can succeed, but there's a real chance I'll fail"

The straightforward shifts in thinking about a lack of visuals and the importance of sound aren't the only major ways developing games for Alexa is different from traditional development. Despite the removal of one entire sense from the equation, Cutsinger says that the specificity and range of voice commands can actually make certain components of game design easier for both the developers and the players.

"There are a few different ways developers can take advantage of voice technology," he says. "One is the extreme shortcut. Imagine you're in a game and you're immersed in it, then all of a sudden you have to go to a menu to teleport somewhere else. With Alexa, you can just say, 'Go to the Eiffel Tower,' and you can go there. Or 'Change my loadout," or 'Call my friends.' You shortcut all the menus, so Voice is really good for command and control.

"Another area that's really interesting is community. With my phone, if I start to play a game right now, I go into my own world and play my game. Or if I even look up some information, I go into my own world, get the information, then I come back and tell you. But with voice we're all here together. If you take Jeopardy, they're going to read a clue and we have to give an answer. We can do that together, even if it's not a multiplayer game. That communal aspect starts with quizzes, but there are also game buttons where you can buzz in.

"There's one called Bandit Buttons, where if all the colors are the same you hit the buzzer, but if they're red, you never hit the buzzer. It's a speed game, and you're trying to watch colors and be the first to buzz in. That's a traditional-style buzz-in game, but people have taken that and done things like putting one button on their hand and one on their chest, and running around the room and playing it as a sword fighting game. They're doing cool, interesting things that bring people together, and that communal piece is a cool part of what games could be."

"With Alexa, you can just say, 'Go to the Eiffel Tower,' and you can go there"

Co-op, in particular, has interesting implications, Cutsinger says. One is that co-op games on Alexa have the potential to be either synchronous or asynchronous, even across long distances. Local party games aside, Alexa Skills have the potential for two players hundreds of miles apart to take a narrative journey together and make decisions along the way, or play a trivia game with one another even if the two players give their individual answers days apart from one another.

Another way Alexa brings interesting potential to working with different types of people is in how it learns language, enabling it to respond correctly to the prompts players give it.

"If you build a screen that has two buttons on it, people are going to hit button one or two," Cutsinger says. "There's not much else they can do. But if you have an experience where people can say 'Yes' or 'No,' they might say, 'Yeah,' 'Mmhm,' 'Yup,' 'Sure,' 'Go for it,' 'Keep on going.'

Tellables includes a 'box of chocolates' that gives users a unique story every day, chosen from stories that anyone can submit on the game's website

"One job the developer has to do is train the natural language understanding engine. They provide training data that helps it understand what the synonyms are and what the valid prompts are. That's a change for developers, first because the variety is way larger, and second because it's not a definitive list. They're literally training the machine learning systems. If people say something outside the training data, it still might work because it's similar."

Another difference is memory, which Cutsinger says is necessarily a bigger factor in voice than in traditional games. Cutsinger compares the game 'remembering' where you are at any given point to save files, but also says it can be used at a smaller level -- to remember things the player has done before, and keep them from repeating those things ad nauseam.

"If you have an experience where people can say 'Yes' or 'No,' they might say, 'Yeah,' 'Mmhm,' 'Yup,' 'Sure,' 'Go for it,' 'Keep on going'"

"With memory, the Skill itself is remembering what's going on, keeping track of the score, keeping track of who you are, picking up where it left off. That one seems obvious and maybe not super important at first, but in voice it's really annoying to do the same thing over and over again. In a screen-based game, when you load up the game, it had better be very familiar to you. In voice, if the welcome is the exact same thing, you just want to skip through it. You need that variety, and the sense of memory does that. There's a lot we're doing with developers right now around that."

Cutsinger says that, right now, much of Alexa's appeal for games is the wide demographic range its Skills can attract. Unlike console, PC, or even mobile games, Alexa Skills attract all ages: parents, children, even grandparents. And that crowd of users is growing. At the last count, over 100 million Alexa-enabled devices had been purchased, and the games category on the platform had increased its offerings by over 160% year-over-year. (Consider that PlayStation VR has sold 4.2 million units, the Nintendo Switch has sold 35 million, and the PS4 is just now nearing 100 million.) Thousands of gaming Skills are present on Alexa now, and hundreds of thousands of developers are working on new Skills, Cutsinger says.

And Cutsinger wants to bring even more developers in, regardless of their level of experience coding or even building game ideas. He told me about multiple instances of Alexa's most popular Skills having been made by developers with little to no experience, such as Kids Court, a Skill that helps kids settle disputes, and a story-telling Skill called Tellables, which is made by professional developers but often features stories from writers who are just that -- writers, not coders or game makers.

To encourage more of these creatives to try their hand at Alexa, Amazon regularly holds live streams on its Alexa Twitch channel, featuring coders working on new projects and talking to the audience about their work. Cutsinger also encourages anyone who wants to give it a shot to reach out to Amazon Alexa via the company's website. After all, he believes that independent developers trying new and ridiculous things with the technology is exactly what Alexa needs to become a staple of the gaming industry, and a unique entity in its own right.

"I don't know what Alexa Skills are going to be like in five years," he says. "It's going to be crazy different."

For the record: The introduction to this piece has been amended slightly from its original version to better ensure the piece's topical clarity.

Rebekah Valentine: Rebekah arrived at GamesIndustry in 2018 after four years of freelance writing and editing across multiple gaming and tech sites. When she's not recreating video game foods in a real life kitchen, she's happily imagining herself as an Animal Crossing character.