Continued progress in reinventing the relationship between people and technology.
Dragon Systems was founded in 1982 by Dr Jim Baker and Dr Janet Baker. They produced voice recognition software that could turn spoken words into text on screen, a major achievement considering the limitations of the computers of the time.
Early days
Voice capture predates computers. In 1881, for instance, Alexander Graham Bell (inventor of the telephone) had a hand in developing a system that cut grooves into a wax cylinder in response to the voice. Later, in the early 20th century, came the Dictaphone, which recorded onto wax, then plastic, then magnetic tape as technologies advanced.
But these were all just ways of recording speech to play it back. The big breakthrough, and what we would today understand as speech recognition, came with computing systems, where several parallel strands of development were under way in the 1950s and 1960s.
The systems of that era worked by matching each spoken word against stored voice patterns. They operated word by word and could not produce sentences.
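To make that concrete, here is a minimal sketch in Python of what word-by-word pattern matching looks like in principle. It is purely illustrative: the vocabulary, the feature numbers and the recognise_word helper are all invented for this article, and real systems of the era worked with analogue hardware or early digital signal processing rather than code like this.

```python
import math

# Invented reference patterns, one per word in a tiny vocabulary. In a real
# system these would be acoustic features captured from recorded speech.
TEMPLATES = {
    "zero": [0.1, 0.9, 0.2],
    "one":  [0.8, 0.4, 0.1],
    "nine": [0.5, 0.5, 0.9],
}

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def recognise_word(features):
    """Return the vocabulary word whose stored pattern is closest to the input."""
    return min(TEMPLATES, key=lambda word: distance(TEMPLATES[word], features))

print(recognise_word([0.7, 0.45, 0.15]))  # prints "one"
```

Because each word is matched in isolation, nothing in this approach can tell apart words that sound alike, or judge whether a string of matches forms a sensible sentence.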
The next big breakthrough came in 1971 with Harpy. It was funded by Darpa (the US Department of Defense's research agency) and was a joint effort that included Carnegie Mellon University, Stanford Research Institute and IBM. Harpy could work with ordinary speech and pick out individual words, but its vocabulary was only around 1,000 words.
Enter the Dragon
The biggest advance yet came in 1982, when Dr Jim Baker and Dr Janet Baker launched Dragon Systems and prototyped a voice recognition system built around mathematical models. The Bakers were mathematicians, and the system they came up with was based on a hidden Markov model, using statistics to predict words, phrases and sentences.
This allowed for much more than just identifying individual words: it also made it possible to work with syntax and context. That matters for efficient general-purpose speech recognition, which needs to produce meaningful sentences. To build a grammatically accurate sentence, the system has to know which word is meant when several share the same pronunciation but differ in meaning or spelling, such as "new" and "knew".
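The sketch below gives a rough feel for that statistical idea. It is not Dragon's method and not a full hidden Markov model: the pronunciation lexicon and the bigram probabilities are made up, and it scores every candidate sentence by brute force rather than with the dynamic-programming search a real system would rely on. What it does show is how probabilities over word sequences can pick the right written form among words that sound the same.

```python
from itertools import product

# Toy "pronunciation lexicon": each spoken sound maps to the written words
# that could have produced it. Sounds and spellings are invented for this example.
CANDIDATES = {
    "ay": ["I", "eye"],
    "nuw": ["new", "knew"],
    "dhax": ["the"],
    "anser": ["answer"],
}

# Made-up bigram probabilities P(word | previous word). "<s>" marks the start
# of the sentence; unseen pairs get a small floor so they are unlikely, not impossible.
BIGRAM = {
    ("<s>", "I"): 0.6, ("<s>", "eye"): 0.05,
    ("I", "knew"): 0.3, ("I", "new"): 0.02,
    ("eye", "knew"): 0.01, ("eye", "new"): 0.01,
    ("knew", "the"): 0.4, ("new", "the"): 0.3,
    ("the", "answer"): 0.2,
}
FLOOR = 1e-4

def sequence_probability(words):
    """Score a written-word sequence with the toy bigram model."""
    prob = 1.0
    for prev, word in zip(["<s>"] + words, words):
        prob *= BIGRAM.get((prev, word), FLOOR)
    return prob

def decode(sounds):
    """Pick the most probable written sentence for a sequence of spoken sounds."""
    options = [CANDIDATES[sound] for sound in sounds]
    return max((list(combo) for combo in product(*options)), key=sequence_probability)

print(" ".join(decode(["ay", "nuw", "dhax", "anser"])))  # prints "I knew the answer"
```

Running it prints "I knew the answer" rather than "eye new the answer", because the word-sequence statistics make the first reading far more probable.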
In 1990, DragonDictate was launched as the first general-purpose, large-vocabulary speech-to-text dictation system. It was a groundbreaking product for Dragon, but it required users to pause between individual words. By 1997 that limitation had been overcome: Dragon NaturallySpeaking v1 launched that year, allowing continuous speech recognition, so users could speak in their natural way without leaving pauses between words.
In Part 2, we will look at how Dragon has developed, embraced new technologies, improved accuracy, and enhanced productivity across a range of vertical sectors.