The Ultimate Guide to Building a Voice Application


As voice technology rises to the forefront of how consumers are searching or asking questions, we take it upon ourselves to learn more about this shift and bring it to the marketers and technologists who need to know this information. We’ve also built out this functionality in our platform so when you’re ready to add voice technology to your marketing strategy, you can do it from the SaaS CMS you know and love—Zesty.io.

What is a Voice Application?

A Voice Application, or voice-based application, is any application that relies on speech requests to process a query and responds with the desired action. For our purposes, we're referring to voice applications such as Siri, Amazon Alexa, and Google Assistant. Voice assistants have taken off in recent years, and voice-first devices such as Amazon Echo or the Google Home are becoming more integrated into consumers' daily lives. Voice-enabled devices, and the apps that control them, are an exciting new field for marketers and developers alike. As brands continue to consider how they're going to take advantage of this new channel, they need to learn a few new device toolkits, the basics of Voice UI design, and some emerging best practices for building and deploying on these diverse platforms.

The History of Voice Activated Devices

Smart speakers, i.e. voice command devices, are not a new concept. The idea actually dates all the way back to 1952.

1952 - Bell Labs

There are a few inventions prior to 1952 in the way of machines speaking (understanding a non-voice query, like a calculation, and responding with vocalizing an answer), but in 1952, a team at Bell Labs designed “Audrey”, the Automatic Digit Recognition machine. Audrey was special because it was a machine capable of understanding spoken digits, which was a monumental breakthrough in voice technology and arguably the first ancestor to modern-day Siri and Alexa.

1961 - IBM Shoebox

While Audrey only understood digits, the next major breakthrough was IBM's. IBM introduces the Shoebox in 1961 and demonstrates it at the 1962 Seattle World's Fair: a machine that can understand up to 16 spoken words in English. Ok, Shoebox...

1968 - 2001: A Space Odyssey

In the classic film 2001: A Space Odyssey, “Kubrick [...] determined during '2001' planning that in the future the large majority of computer command and communication inputs would be via voice, rather than via typewriter.” [Source]

1982 - Knight Rider

Voice applications become more concrete in television classic Knight Rider. Arguably the first Smart Car, "KITT is an artificially intelligent electronic computer module in the body of a highly advanced, very mobile, robotic automobile." [Source]

2011 - Apple unveils Siri

On October 4, 2011, Apple unveils the personal assistant for the masses: Siri. With her advanced capabilities, Siri not only understands speech, but she also translates it into a command and is able to take appropriate action.

2014 - Amazon unveils Echo; Google follows with Assistant

A key difference between Siri and the Echo is that Siri is a program used in multiple Apple devices. Amazon is actually the first company to create a device whose sole purpose is to understand, interpret, and execute voice commands: Echo. Google follows suit in 2016 with Google Assistant and Google Home devices.

How are People Using Voice Applications and Smart Speakers?

Enterprises are building out voice applications to provide a unique experience to their customers because more customers are using voice devices on a daily basis. But just how many people actually use voice devices? 


Source: Our friends at Voicebot.

Examining the increase in adoption, it's clear this trend has grown and shows no signs of stopping. The voice device market is exploding, with tens of millions of devices in homes across America. 

Additionally, it's important to consider how consumers are actually using smart speakers and voice applications. What kinds of questions are they asking? Our friends at Voicebot provide a breakdown:

Source: Our friends at Voicebot.

From asking simple questions to playing games or finding recipes, there are numerous applications for voice technology. As brands capitalize on this channel to provide additional uses for voice applications, so too will consumers use these different functionalities as smart speakers become more embedded in daily life. 

Different Applications, Different Devices

As voice devices and smart speakers continue to grow in popularity, different brands are vying for market share. The Voice Applications people are using most include Amazon Alexa, Google Assistant, and Siri. The voice devices leading the race are Amazon Echo, Google Home, and Apple HomePod. As these devices become more commonplace in the home, the market share is expected to diversify:

Source: Our friends at Voicebot

Because this is only a projection, expect even more device diversification as new devices enter the space and niche products such as Amazon Echo Auto roll out.

The Argument for One Content Repository

As marketers continue to explore and plan for voice channels, IT and developers might be filled with dread. Providing the best customer experience means delivering the same content consistently across multiple voice platforms and devices. With one content repository, marketers can manage content without making constant requests to IT/development teams to make changes or send content live. Instead, IT/development teams can set up a few API calls to update content and focus their energy on other priorities. That's the beauty of headless content management.

Why Build a Voice Driven Application?

  1. Get ahead of (or compete directly with) competitors

Do a quick survey and see if your competitors are using voice as a channel. If they're not, you have a clear opportunity to gain a competitive advantage. As voice devices and smart speakers become more widely adopted, brands that build skills early have the first-mover advantage.

  2. Increase audience reach

Customers use voice devices because they are easy. Delight your audience by reaching them in a new channel they’re using, and potentially reach a new audience too.

  3. Create or build authority

Brands that adopt early establish authority. A Google browser search returns a marketplace of results and relies on the user to choose an authoritative source. With a voice query, users ask a question and are given one resource.

  4. "Free" publicity

Voice acts as a unique and exciting PR tool. Since not many brands have voice applications, and those that do have very simple ones, this is an opportunity to seize good press for your brand.

  5. Provide customers better experiences

Customers are actively engaged in using voice as a new channel - leverage this trend to continue providing value. Delight customers with a new experience, quickly provide value by answering their questions, and keep your customer base returning to you.

Voice Assistant Development

How to Map Content for a Voice Application

Once you've decided that building a voice application is the right move for your brand, it's important to map out the content prior to requesting the development team to build the application. As you map content for your application, consider what kind of voice application is the best fit for your brand.

Different Kinds of Voice Applications

By and large, you'll find that most voice applications fall under one of these four categories:

  1. Informational: Provides basic information such as hours of operation, locations, contact information, etc.
  2. Conversational: More complex than informational. Educate your clientele about how to use your product, rather than providing basic information.
  3. News: Is your market one that's always changing? Consider producing newsworthy, up-to-date content to stay contemporary with your target market.
  4. Fun Experiences: Think outside the box by encouraging your customers to engage with your product in new, exciting ways.

Download the Workbook

As you learn about different ways to craft your brand's voice application, check out our workbook. Our interactive workbook guides you through crafting your brand's story and delivering clients a unique experience while leveraging this new channel.

Informational Voice Applications

Prompt: Alexa, what are Burger King's hours today?

Voice Application: Burger King is open from 7am to 11:59pm today.

Informational voice applications cover the basics: they answer simple questions about your business. Simple informational content does not need a voice application to be built: you can make it available to voice devices by adding schema markup (structured data) to your website. Adding schema markup for Frequently Asked Questions gives voice devices content to answer prompts such as “What are your business hours?” or “Are there vegetarian options?”
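As a rough illustration of the schema approach, the sketch below builds schema.org FAQPage structured data (JSON-LD) from a list of question and answer pairs. The sample questions and answers and the helper name `build_faq_jsonld` are hypothetical, and whether a given voice assistant actually reads this markup is up to that platform.

```python
import json


def build_faq_jsonld(faqs):
    """Build a schema.org FAQPage document from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in faqs
        ],
    }


# Hypothetical sample content, for illustration only.
faqs = [
    ("What are your business hours?", "We are open from 7am to 11:59pm every day."),
    ("Are there vegetarian options?", "Yes, several menu items are vegetarian."),
]

faq_jsonld = build_faq_jsonld(faqs)

# The resulting JSON is placed inside a <script type="application/ld+json"> tag.
print(json.dumps(faq_jsonld, indent=2))
```

Generating the markup from a list keeps the FAQ content in one place, so the same questions and answers can later feed a full voice application.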

Conversational Voice Applications

Prompt: Alexa, how do I get red wine out of my white carpet?

Voice Application: First, pour club soda onto the wine stain. Then...

Conversational content is different from informational content. When planning out conversational voice interactivity, consider how people use your product, rather than discussing your product's features or attributes. The point of a conversational application is to provide value by teaching customers how to use your product. For example, Tide has a voice application that offers solutions to several pesky stains. By providing this content in a voice app, Tide not only establishes itself as a leader in getting stains out of clothing, but also provides value to its customers in a way that is neither intrusive nor "salesy".

News Voice Applications

Prompt: Alexa, what's the Allergycast in Mount Zion?

Voice Application: Today, the pollen count in Mount Zion is at a 3 out of 10. Looking good!

Newsworthy content needs to be refreshed on a regular basis. This is perhaps one of the trickiest kinds of applications, but if it suits your brand, it's gold. Voice channels provide you an amazing opportunity to be in front of your customers on a regular basis. For example, Zyrtec has a voice application called Allergycast. It tells you every day what the pollen counts are in your area and the likelihood that you'll be sensitive to them. Zyrtec stays current and valuable to its consumers because it gives them useful information every day.

Fun Voice Applications

Prompt: Alexa, let's taste the Gold Label Reserve.

Voice Application: Ok, let's get started. First, you'll want to observe the color of the whisky...

Fun experiences can be challenging to develop as voice applications, but they are a unique opportunity to delight your target market. The sky is the limit with fun voice applications. One example is Johnnie Walker's Guided Tasting, which leverages Amazon Alexa to guide customers through a Whisky 101 course, a tasting experience, and more.

Once you have your content fully mapped out, it's time to file a ticket with IT to develop your application using the Amazon Alexa Skills Kit (ASK). 
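Before handing a content map to developers, it can help to write it down in a structured form. The sketch below shows one possible shape, assuming each entry has an intent name, a few sample utterances, and a response. The field names, the intent names, and the `validate_content_map` helper are illustrative and not part of ASK.

```python
from dataclasses import dataclass, field


@dataclass
class ContentEntry:
    intent: str                                     # hypothetical intent name
    utterances: list = field(default_factory=list)  # sample prompts
    response: str = ""                              # what the app says back


def validate_content_map(entries):
    """Return a list of human-readable problems found in the content map."""
    problems = []
    seen = {}  # normalized utterance -> intent that first used it
    for entry in entries:
        if not entry.utterances:
            problems.append(f"{entry.intent}: no sample utterances")
        if not entry.response.strip():
            problems.append(f"{entry.intent}: empty response")
        for utterance in entry.utterances:
            key = utterance.lower().strip()
            if key in seen and seen[key] != entry.intent:
                problems.append(
                    f"{entry.intent}: utterance '{utterance}' already used by {seen[key]}"
                )
            seen.setdefault(key, entry.intent)
    return problems


# Hypothetical content map with two deliberate mistakes.
content_map = [
    ContentEntry("StoreHours", ["what are your hours", "when are you open"],
                 "We are open from 7am to 11:59pm today."),
    ContentEntry("StainHelp", ["how do I get red wine out of carpet"], ""),
    ContentEntry("Locations", ["when are you open"], "Find us downtown."),
]

for problem in validate_content_map(content_map):
    print(problem)
```

A check like this catches gaps (an intent with no answer) and collisions (the same phrase mapped to two intents) before development starts.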

Benefits of One Content Repository

We're going to take a moment here to step aside and reiterate that, although we're going to continue with this article using Amazon Alexa as an example, it's important to deploy the experience you're building across all voice channels and devices. This can seem like a daunting task if you don't have one content repository. 

What is a content repository?

Simply put, a content repository is one place where all of your content is stored. Each program your brand builds (Google Assistant, Amazon Alexa, Siri, etc.) is able to pull from one content repository. That way, when you make a change or add a new utterance, that change is syndicated across all of the platforms you're building experiences for, rather than asking a developer to make that one change multiple times in multiple systems.

  • Headless Empowerment: By separating content from the presentation, content can instantaneously be deployed anywhere with the press of a button. Having one content repository makes things much easier for the marketer to be able to control messaging to any device. 
  • Minimal IT Friction: Empower marketing teams to make changes to content across multiple channels, without filing a ticket with IT every time something needs to be done.
  • Cost and Resource Predictability: Every time developer resources are used, they are taking time and money away from the organization. Using a SaaS headless content management system such as Zesty.io means that costs become predictable. SaaSification and digital transformation typically lead to a much lower and more predictable monthly cost.
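To make the single-repository idea concrete, here is a minimal sketch in which one content store feeds two platform-specific response formats. The `CONTENT` dictionary stands in for a CMS and its key is made up. The Alexa envelope follows the custom skill response format; the `fulfillmentText` shape is an assumption based on Dialogflow-style webhooks and should be checked against the platform you target.

```python
# Stand-in for the content repository; in practice this comes from a CMS API.
CONTENT = {
    "store_hours": "We are open from 7am to 11:59pm today.",
}


def to_alexa(text):
    """Wrap text in the Alexa custom skill response envelope."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": True,
        },
    }


def to_dialogflow(text):
    """Wrap text in a Dialogflow-style webhook response (assumed shape)."""
    return {"fulfillmentText": text}


FORMATTERS = {"alexa": to_alexa, "google": to_dialogflow}


def render(content_key, platform):
    """Look up content once, then format it for the requested platform."""
    return FORMATTERS[platform](CONTENT[content_key])


# A single edit in the repository changes what every platform says.
CONTENT["store_hours"] = "We are open from 8am to 10pm today."
print(render("store_hours", "alexa")["response"]["outputSpeech"]["text"])
print(render("store_hours", "google")["fulfillmentText"])
```

The platform-specific code only wraps the text; the wording itself lives in one place, which is what lets marketers change it without touching each integration.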

Developing for Voice Applications

Building a Voice Application for Alexa with Amazon Alexa Skills Kit (ASK)

Building an Alexa skill seems like a pretty in-depth project, but connecting a skill to Zesty.io to manage the content within Amazon Alexa Skills Kit (ASK) can be done in one hour. Watch as Zesty.io engineer Simon Prickett guides users through how to build an Alexa skill and hook it up to Zesty.io using the Instant Content API.

In this one-hour webinar, Simon covers topics such as the high level architecture of Zesty.io and Amazon Alexa, showing the voice interface, building content in the CMS and organizing it, coding in Amazon Alexa Skills Kit (ASK), hooking the Amazon skill to Zesty.io, and showing a live demo of how to take content from a website and send it to that Amazon skill.

What this video covers in 60 minutes:

  • The high level architecture of Zesty.io and Amazon Alexa. Get a feel for building in both Amazon Alexa Skills Kit (ASK) and Zesty.io, and learn where the two applications overlap.
  • Building content in the CMS. Learn the ins and outs of Zesty.io and how to organize your content. Import content easily with a CSV import, if you're building out an application with loads of content. The content portion of this project can be done by the marketing team or someone without technical skills.
  • Coding in Amazon Alexa Skills Kit (ASK). Simon walks us through different options when building a skill with ASK. We choose JavaScript, since this is the most accessible to developers with different skill levels, but developers can choose whichever language is most comfortable.
  • Hooking the Amazon skill to Zesty.io. Create endpoints using the Content API within Zesty.io, and plug and play into your Amazon Skill in ASK. New content will push with each "Publish" within Zesty.io. 
  • Send it live. Simon wraps up the demonstration by showing a live demo of Zesty.io's horoscope site "ZestyScopes" and connecting that website to an Amazon Skill. The end result is a website and an Amazon Skill sharing the same content, though Zesty.io can also be used in an entirely headless way, without Zesty.io's built-in Web Engine.
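The hook-up step above can be sketched roughly as follows. The webinar uses JavaScript; this sketch uses Python to show the same flow: receive an Alexa request, fetch published content from a JSON endpoint, and return a spoken response. The URL, the `HoroscopeIntent` name, the `sign` slot, and the `{sign: text}` content shape are all hypothetical placeholders, not Zesty.io's actual API.

```python
import json
from urllib.request import urlopen

# Hypothetical placeholder URL; substitute the endpoint your CMS exposes.
CONTENT_URL = "https://example.com/horoscopes.json"


def fetch_content(url=CONTENT_URL):
    """Download and parse the JSON content published by the CMS."""
    with urlopen(url, timeout=5) as resp:
        return json.load(resp)


def speak(text, end_session=True):
    """Wrap text in the Alexa custom skill response envelope."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": end_session,
        },
    }


def handle_request(event, fetch=fetch_content):
    """Route an Alexa request; `fetch` is injectable so it can be faked."""
    request = event["request"]
    if request["type"] == "LaunchRequest":
        return speak("Welcome. Which sign would you like?", end_session=False)
    if request["type"] == "IntentRequest":
        intent = request["intent"]
        if intent["name"] == "HoroscopeIntent":  # hypothetical intent name
            sign = intent["slots"]["sign"]["value"].lower()
            horoscope = fetch().get(sign)  # assumes a {sign: text} JSON object
            if horoscope:
                return speak(horoscope)
            return speak("Sorry, I don't have a horoscope for that sign.")
    return speak("Sorry, I didn't understand that.")


# Local dry run with a fake fetcher, so no network call is made.
sample_event = {
    "request": {
        "type": "IntentRequest",
        "intent": {
            "name": "HoroscopeIntent",
            "slots": {"sign": {"name": "sign", "value": "Leo"}},
        },
    }
}
fake_fetch = lambda: {"leo": "Today is a good day to ship."}
reply = handle_request(sample_event, fetch=fake_fetch)
print(reply["response"]["outputSpeech"]["text"])
```

Because the handler fetches content at request time, newly published content is picked up without redeploying the skill. A production handler would also need to cope with missing slot values and network errors.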

Optimizing Voice Technology with Conversational Analytics

We partnered with our friends at Dashbot, an intuitive platform providing actionable bot analytics, to learn how to evaluate analytics for voice applications. Dashbot is a bot analytics platform that enables publishers and developers to increase engagement, user acquisition, and monetization through actionable data and tools. In addition to traditional analytics like engagement and retention, Dashbot provides bot-specific metrics like sentiment analysis, conversational analytics, and AI response effectiveness, as well as full chat transcripts for a wide variety of platforms.

Dashbot provides analytics from Facebook Messenger, Slack channels, Amazon Alexa, Google Home, Kik, and any API.

Why do Voice Applications need analytics?

Like any marketing initiative, a voice application has to be measurable and tied to results. Many voice application platforms provide very limited analytics out of the box. Dashbot offers deeper analytics, such as sentiment and missed utterances, to give marketers and data analysts a clear understanding of how their voice application is performing across any number of channels.
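To illustrate what a metric such as missed utterances measures, here is a minimal sketch that summarizes a log of interactions. The log format is hypothetical and this is not Dashbot's API; it assumes each event records the matched intent and, where the platform exposes it, the text of what the user said. `AMAZON.FallbackIntent` is the built-in Alexa intent triggered when no other intent matches.

```python
from collections import Counter


def summarize(events, fallback_intent="AMAZON.FallbackIntent"):
    """Compute intent counts and the share of requests that were missed."""
    intents = Counter(event["intent"] for event in events)
    total = sum(intents.values())
    missed = intents[fallback_intent]
    missed_phrases = Counter(
        event["utterance"]
        for event in events
        if event["intent"] == fallback_intent and event.get("utterance")
    )
    return {
        "total": total,
        "by_intent": dict(intents),
        "missed_rate": missed / total if total else 0.0,
        "top_missed": missed_phrases.most_common(3),
    }


# Hypothetical interaction log, for illustration only.
events = [
    {"intent": "StoreHours", "utterance": "when are you open"},
    {"intent": "StoreHours", "utterance": "what are your hours"},
    {"intent": "AMAZON.FallbackIntent", "utterance": "do you deliver"},
    {"intent": "Locations", "utterance": "where are you"},
]

report = summarize(events)
print(f"Missed rate: {report['missed_rate']:.0%}")
print("Top missed:", report["top_missed"])
```

Phrases that keep landing in the fallback bucket are candidates for new content or new sample utterances in the next iteration.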

How to Integrate Dashbot with my Voice Application

Integrating Dashbot with your voice application likely takes just a few minutes. Check out their documentation, or if you are building your app with Zesty.io, Zesty.io software engineer Simon Prickett will show you how to integrate Dashbot with his build in just five minutes.

Voice Application Analytics Dashboard in Dashbot

Dashbot provides a deep understanding of different aspects of your audience using Voice, such as:

  • Engagement and retention
  • Conversational analytics
  • User behavior
  • Audience demographics
  • Comparison metrics

Dashbot even helps generate tools to take action so you can continue iterating and optimizing your voice application. To learn more about Dashbot's capabilities, view their tour.

Bridging the Gap

Building a voice application certainly can be an exciting project for your brand, but like any other initiative, it requires a lot of preparation in order to execute well. Once you've planned your application and looked into developing, Zesty.io is the best headless content management system for your voice applications. Developers love using Zesty.io's content APIs, marketers can change content on the fly without filing a ticket with IT, and businesses are excited that they can execute on building a voice application in days, not weeks or months.

Ready to build?

If you need help getting started with using Zesty.io for your voice application, please reach out to us directly at hello@zesty.io

By Chloe Spilotro

Hooked onto the platform since first using it through the Zesty.io Incubator Program at the University of San Diego. Passionate about all things marketing, IoT, and helping businesses leverage technology to grow and become major players in their industries.
