

Give your Power Virtual Agents a voice with Speech and Telephony

When Power Virtual Agents was released, the goal was to let the people closest to their customers build bots for them. The team at Microsoft worked hard on the tooling so that users would not only find it easy to create conversational experiences but could also use features that keep the conversation natural. The latest features include, but are not limited to, rich media, natural language understanding, and publishing a bot to the channel of your choice.


As we moved along, we witnessed both advancements in bots and their adoption across different business verticals. These bots were built not only on Power Virtual Agents but on many other technologies as well, for example Microsoft Bot Framework, Google's Dialogflow, Rasa, Amazon Alexa and so on. Products built on each framework offer a unique set of capabilities, but very few do so consistently. Among the most demanded features in the Conversational AI space, voice support sits firmly in the top three. Adding voice capabilities does not just give your customers one more way to interact with you; it also preserves the natural experience they have become used to over the past couple of years. And it is backed by improved technology rather than a mere IVR.


If you're interested in learning why IVRs are so bad at customer service, I've covered that for you here.

Market analysts have already predicted that by 2024 the number of digital voice assistants will reach 8.4 billion units, a number higher than the world's population.


Voice lands in Power Virtual Agents


Finally! The time has come when your customers can call your business number and a bot built on Power Virtual Agents can respond, with the help of telephony (Azure Communication Services, or ACS) and speech services (Azure Cognitive Services). This opens up a whole new spectrum for businesses, especially those already using Power Virtual Agents to serve their customers.


Speech Authoring


First, let's have a look at speech. Among the many features released during Build 2022, the new functionality lets you author responses tailored for both text and voice. This means you can add multiple speech variations to a 'Send Message' or 'Question' node in your Power Virtual Agents bot. The new functionality also includes SSML support to control how the response is actually spoken.
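To give a feel for what SSML control looks like, here is a minimal sketch in Python that assembles a small SSML string and checks that it is well-formed XML. The function name, the sample sentences and the pause length are made up for illustration, and the `break` and `prosody` elements come from the W3C SSML standard. Exactly which tags the Power Virtual Agents authoring canvas accepts, and whether it expects the outer `speak` wrapper, is an assumption you should verify in the product.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Standard SSML namespace defined by the W3C specification.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_speech_variation(greeting: str, detail: str, pause_ms: int = 400) -> str:
    """Hypothetical helper: say `greeting`, pause, then say `detail` slowly."""
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang="en-US">'
        f"{escape(greeting)}"                      # escape &, < and > in user text
        f'<break time="{pause_ms}ms"/>'            # short silence between sentences
        f'<prosody rate="slow">{escape(detail)}</prosody>'
        "</speak>"
    )


ssml = build_speech_variation(
    "Thanks for calling Contoso.",
    "Your order number is 4 5 1 2.",
)
ET.fromstring(ssml)  # raises ParseError if the markup is malformed
print(ssml)
```

The text variation of the same message would stay plain ("Thanks for calling Contoso. Your order number is 4512."), while the speech variation carries the pause and the slower pace.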






With speech comes the ability to let customers interact over voice channels. One of the most used voice channels today is still the phone. The Power Virtual Agents product team timed things well so that the two related features could be released together. As per the latest Build 2022 announcement, new telephony-specific support, including Caller ID and keypad input, will be available, powered by Azure Communication Services.


Steps to setup


Before you publish to the Telephony channel, you may want to configure some settings. The first step is to choose the voice that users on the telephony channel will hear. Some voice fonts are already available, but you can also use any of the available neural voices, one of which is en-US-JennyNeural.


Some neural voices, like Jenny's, also offer speaking styles that you can choose to give your customers a more natural feel.
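In Power Virtual Agents you pick the voice and style in the settings, but it can help to see how the same choice is expressed in Azure speech SSML. The sketch below wraps a sentence in a `voice` element and the `mstts:express-as` extension. The helper name and the sample sentence are hypothetical, and `customerservice` is used here on the assumption that it is among the styles available for en-US-JennyNeural; check the voice's current style list before relying on it. Whether the bot's message editor accepts these tags directly is also an assumption.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

SSML_NS = "http://www.w3.org/2001/10/synthesis"   # W3C SSML namespace
MSTTS_NS = "https://www.w3.org/2001/mstts"        # Azure speech extension namespace


def styled_ssml(text: str,
                voice: str = "en-US-JennyNeural",
                style: str = "customerservice") -> str:
    """Hypothetical helper: wrap `text` in a neural voice with a speaking style."""
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" '
        f'xmlns:mstts="{MSTTS_NS}" xml:lang="en-US">'
        f"<voice name={quoteattr(voice)}>"
        f"<mstts:express-as style={quoteattr(style)}>"
        f"{escape(text)}"
        "</mstts:express-as>"
        "</voice>"
        "</speak>"
    )


ssml = styled_ssml("I can help you with that right away.")
ET.fromstring(ssml)  # raises ParseError if the markup is malformed
print(ssml)
```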






To work with the Telephony channel, you need an Azure subscription under the same tenant. Once you're in the Azure portal, create an Azure Communication Services resource. Go to the Voice Calling - PSTN section and get a new phone number. This phone number can be tied to just one bot.






Once the number is active, head back to Power Virtual Agents. Publish your bot and then go to the Channels tab. Click the Telephony (Preview) channel, add the phone number and guess what? It's voice ready!




Upcoming Features


There are certain advanced capabilities on the way that will make your voice experiences really stand out:


  1. Your bot will be able to send a note or perform an action when it detects silence.
  2. While Speech to Text is quite robust, it still falls short at recognising non-native English accents. Improvements to address this will be available in the coming months or years.
  3. Adding non-interruptible messages.
  4. Surfacing your existing bot in Dynamics 365 Omnichannel for Customer Service or any other channel that natively supports voice.

The latest speech and telephony capabilities make it so much easier and more accessible to surface your bot anywhere, on almost any smart device your customers demand. While this is a really new feature, I did not face any hiccups while setting it up for voice. In fact, I was able to use it for about 2-3 minutes in a conversation I built and it worked great. I wish other integrations could become this easy too, but time will tell.


Until next time.

