Nuance and DFKI help students create interactive appliances of the future with speech tools

Tools that allow developers to create speech interfaces and intelligent machines are contributing to the ongoing transformation of how people interact with the technology that surrounds them. By partnering with DFKI, we were able to place easy-to-use speech development tools (via Nuance Mix) and a multimodal dialogue platform capable of handling natural interactions (SIAM-dp) into the hands of some of the best and brightest students in computer science and computational linguistics. The results, as seen at hackathons like this one, offer a glimpse of the smart, speech-enabled devices that may well be ubiquitous in the years to come.


In my last blog post, I explained how we use different types of neural networks for both ASR and NLU. We already touched upon DNNs, RNNs, and NeuroCRF, and I did not even mention that we use CNNs (Convolutional Neural Networks) for the “intent” discovery aspect of NLU. Does this sound confusing? Fortunately for end-users everywhere, you don’t have to worry about keeping all of the terminology and machine learning concepts straight – you just see the added benefits of increasingly accurate ASR and NLU.
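To make the intent-discovery idea a little more concrete, here is a minimal sketch of a CNN text classifier in PyTorch. Everything in it (the vocabulary size, the intents, the tiny architecture) is a toy illustration of the general technique, not our production models:

```python
# A minimal, illustrative sketch of CNN-based intent classification.
# All sizes and labels below are toy stand-ins, not production values.
import torch
import torch.nn as nn
import torch.nn.functional as F

class IntentCNN(nn.Module):
    def __init__(self, vocab_size, embed_dim, num_intents,
                 kernel_sizes=(2, 3), num_filters=16):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # One 1-D convolution per kernel size, sliding over word positions
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, num_filters, k) for k in kernel_sizes
        )
        self.classifier = nn.Linear(num_filters * len(kernel_sizes), num_intents)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) -> embeddings: (batch, embed_dim, seq_len)
        x = self.embedding(token_ids).transpose(1, 2)
        # Max-pool each feature map over time, then concatenate
        pooled = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return self.classifier(torch.cat(pooled, dim=1))

# Toy usage: classify a 5-token utterance into one of 3 intents
model = IntentCNN(vocab_size=100, embed_dim=32, num_intents=3)
logits = model(torch.randint(0, 100, (1, 5)))
print(logits.argmax(dim=1))  # predicted intent index
```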

Now, here is even better news: if you are a developer who wants to create a great app for the Internet of Things using speech technology (such as ASR and NLU), you no longer have to worry about the mechanics behind advanced concepts like machine learning, because we have done the heavy lifting for you. Through Nuance Mix, we put our knowledge of the various types of neural networks, and of how to apply them to specific tasks, to work so that you can create intuitive spoken interactions.

This new developer platform provides everything you need to quickly create, assess and refine your own speech application ideas. Perhaps most importantly, it gives you an easy-to-use interface for setting up and maintaining your speech application’s ontology: you define what the app is to be used for and provide your own sample utterances as the nucleus of the training data. Once you’re past this stage, you can apply the machine learning training machinery with just the press of a button. With models trained specifically for your app (which are essentially the NNs we discussed earlier), you can then deploy to a cloud-based runtime environment and have your app up and running. Because you don’t have to be an expert in machine learning to use Mix, my colleague Kenn Harper recently called it “the democratisation of voice technology”.
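As a rough illustration of that workflow, the sketch below seeds a hypothetical smart-thermostat app with a few annotated sample utterances and then queries a deployed model. The field names and the endpoint URL are invented for illustration only; they are not the actual Mix training format or API:

```python
import requests  # assumption: the deployed model is reachable over HTTPS

# Hypothetical seed utterances for a smart-thermostat app. This schema is
# invented for illustration and is not the actual Mix data format.
training_samples = [
    {"utterance": "set the living room to 21 degrees",
     "intent": "SET_TEMPERATURE",
     "entities": {"room": "living room", "temperature": "21"}},
    {"utterance": "make it warmer in the bedroom",
     "intent": "ADJUST_TEMPERATURE",
     "entities": {"room": "bedroom", "direction": "warmer"}},
    {"utterance": "what is the temperature in the kitchen",
     "intent": "GET_TEMPERATURE",
     "entities": {"room": "kitchen"}},
]

# After training and deployment, the app sends user utterances to the cloud
# runtime and gets back structured results (placeholder URL, not a real
# Mix endpoint):
response = requests.post(
    "https://nlu.example.com/v1/interpret",
    json={"text": "make it warmer in the bedroom"},
)
print(response.json())  # e.g. {"intent": "ADJUST_TEMPERATURE", "entities": {...}}
```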

By taking much of the hard work out of integrating speech into your app, we let you focus your creativity on the app you want to create, an area in which you are the expert. And a creative approach is especially important now, as more and more devices that can make sense of speech and natural language enter the IoT sphere. To help spark that creativity, we are holding a series of “hackathons” and similar events that serve the needs of industrial users and give students room to experiment and innovate with speech technology.

We recently partnered with DFKI (the German Research Center for Artificial Intelligence), located on the Saarland University campus, to host a hackathon of our own. Having been a proud stakeholder in DFKI since 2014, and knowing how DFKI brings AI into German industry, we expected some exciting projects. On the first day, we saw great participation by industrial partners, who learned first-hand how to use Mix from Mix Masters Nirvana Tikku and Samuel Dion-Girardeau. After a thorough workshop, the participants had the chance to try out our web-based developer platform on their own.

The second portion of the event was a student hackathon, from which my colleagues Christian Gollan and Hendrik Zender, both DFKI alumni, have just returned. Running from 5:00 PM Friday until 5:00 PM Saturday, the students engaged in a 24-hour coding spree to speech-enable devices using Nuance Mix and SIAM-dp (DFKI’s own dialogue platform). Having seen university students create amazing, championship-winning inventions such as Lisa the robot, we had high expectations. We weren’t disappointed: every team came up with impressive solutions that addressed existing problems or areas of need using speech, natural language and DFKI’s multimodal dialogue platform.

Overall, the event produced a number of captivating applications that simplified the interactions between people and technology. Especially of note, however, were the prize-winning teams: in third place, a chatbot that could act as a personal assistant; in second place, a speech-enabled robot that could help children learn how to do math; and, in first place, an intelligent home solution that gives would-be houseguests a voicemail box to use when no one is home. For the announcement of the winning teams and the award ceremony, we were joined by Professor Wolfgang Wahlster, CEO and Scientific Director of DFKI. He congratulated the students on their excellent results and emphasised the importance of speech interfaces and artificial intelligence for the ongoing transformation of how people will interact with the technology that surrounds them. He also stressed the pivotal role that the collaboration between DFKI and Nuance plays in this transformation.

We agree, and we think this event gave students with an interest in speech technology the opportunity to learn and work with cutting-edge tools in a fun yet challenging environment. Beyond the prizes, the pizza and a lot of coffee, everybody involved showed how tools such as Nuance Mix and SIAM-dp could very well help build the intelligent, interactive solutions of our future.
