Talking to machines
What does it take?
State-of-the-art speech recognition technology is based on statistical models of human speech and on computations performed on digitized speech signals. Every machine that must respond to speech must be able to capture speech, digitize it, and perform these computations. Recognition can be done in software or in hardware. While the technology for software-based speech recognition is fairly advanced, only recently have researchers endeavoured to create silicon chips that can perform large-scale recognition. The scale of recognition is related to the number of words and the complexity of language that a speech recognition system can handle. So far, very small-scale systems have been deployed quite successfully in commercial applications (such as voice dialing on cellphones). These combine hardware and software, and are called embedded systems.
To enable a machine to understand and respond to speech, a suitable system that can be integrated with the machine needs to be developed. In addition, the response to speech can be a series of actions ranging from simple switching actions to complex task completion, or even a reply in synthesized speech. The response mechanism must also be designed and integrated into the machine. The strategies used for all of these vary from application to application, and can be quite subjective even for a given machine if it is to be used by people who speak different languages and speak them differently.
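As a rough illustration of a response mechanism, the sketch below maps already-recognized text to an action. It assumes the recognizer has produced a text string; the command phrases, the function names, and the fallback reply are all hypothetical and only stand in for whatever a real application would define.

```python
from typing import Callable, Dict

# Hypothetical actions: each returns a description of what the machine would do.
def switch_light_on() -> str:
    return "light: on"

def switch_light_off() -> str:
    return "light: off"

# Hypothetical command table linking recognized phrases to simple switching actions.
RESPONSES: Dict[str, Callable[[], str]] = {
    "light on": switch_light_on,
    "light off": switch_light_off,
}

def respond(recognized_text: str) -> str:
    """Pick a response for text that a recognizer has already produced."""
    # Normalize case and whitespace so small variations still match.
    key = " ".join(recognized_text.lower().split())
    action = RESPONSES.get(key)
    if action is None:
        # Fallback: a reply that a speech synthesizer could speak.
        return "say: Sorry, I did not understand."
    return action()
```

A complex task would replace a single function with a sequence of steps, but the lookup from recognized words to a response stays the same in this sketch.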
Machine categorization
The world of machines can be categorized and summarized in many ways. It all depends on the perspective. From the speech recognition and response perspective, machines can be categorized most coarsely into those that are capable of high-level computing and those that have very limited processing power. Each of these can be divided into those that are mobile and those that are immobile, and all of the above can be divided into those that communicate with other machines and those that don't. Why is this categorization useful? It helps us take the first step towards creating a design for an application-specific system. It provides a simple yet strong perspective from which we can work. A mobile machine with a very limited set of resources can still perform large-scale tasks if it can communicate with a powerful machine. In this particular case, we have an instance of distributed speech recognition. Imagine using your cellphone to browse the World Wide Web by voice: the categorization immediately suggests that this application must be designed for distributed speech recognition and response.
A clean categorization helps us create better designs and more efficient technologies
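The three-way categorization above can be sketched as a small data structure. This is only an illustration under stated assumptions: the class name, field names, and the design labels returned by `suggest_design` are hypothetical, and the decision rule is one simple reading of the text, not a definitive design procedure.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Machine:
    """A machine described along the three axes used in the text (names are illustrative)."""
    name: str
    high_compute: bool   # capable of high-level computing?
    mobile: bool         # mobile or immobile?
    networked: bool      # communicates with other machines?

def suggest_design(m: Machine) -> str:
    """Suggest a first-cut recognition design from the machine's category."""
    if m.high_compute:
        # Enough processing power to run recognition locally.
        return "on-device recognition"
    if m.networked:
        # Limited resources, but a powerful machine can be reached.
        return "distributed recognition"
    # Limited resources and no link to another machine.
    return "embedded small-scale recognition"

cellphone = Machine("cellphone", high_compute=False, mobile=True, networked=True)
print(cellphone.name, "->", suggest_design(cellphone))
```

In this sketch the mobility flag does not change the suggested design; it is kept because it would still shape other choices, such as how speech is captured and how the link to the remote machine is maintained.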