Since SpeechTEK is this week, I thought it was a good time to update a post I did several months ago on using speech recognition to capture a street address.

There are lots of reasons why you might want to collect a caller’s address over the phone.

In open government circles, there has been a lot of interest lately in using automated IVR systems to help gather non-emergency service requests for municipalities. This makes a lot of sense – many municipalities enable non-emergency service reporting through the use of a designated abbreviated dialing number – 3-1-1 – so there is a long history of reporting these issues using the telephone.

Address collection is common in IVR systems, but it typically relies on expensive proprietary or “black box” components that might not be suitable for all use cases. This is particularly true for municipalities and local governments that are under financial pressure and need to do more with less.

In this post, I’ll show how to build a sophisticated address collection system that can be used for almost any city or town, large or small. All of the code for this example is on GitHub (and under active development) and many of the components I will use are free or open source.

Here is a screencast demonstrating the system running on the Tropo platform.

How it works

The application demonstrated here relies on three primary ingredients:

  • A data source with all of the street names in a city (in this case, I used the San Francisco Basemap Street Centerlines file from DataSF.org)
  • A database that can store the data on street names and zip codes, and which can be queried to render a speech grammar. In this instance, I use CouchDB.
  • A telephony platform that supports speech recognition – in this case, Tropo.

The application first asks the caller for a zip code. Knowing the zip code constrains the number of choices in the speech grammar, which helps improve recognition accuracy.

The application then builds an SRGS grammar in XML format using CouchDB’s view and list functions. This grammar contains a list of all of the streets in a particular zip code and allows a caller to add additional details, like house number and street direction (if applicable).
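To make the shape of such a grammar concrete, here is a minimal sketch of a function that renders SRGS XML from a list of street names. It is not the code from the GitHub repo: the function names, the rule ids (address, digit, direction, street) and the choice of one to five spoken digits for the house number are all assumptions made for illustration.

```javascript
// Minimal sketch: render an SRGS XML grammar for the streets in one zip code.
// Rule ids and the 1-5 digit house number are illustrative assumptions.

function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Wrap a list of phrases in a <one-of> element, one <item> per phrase.
function oneOf(values, indent) {
  const lines = values.map(
    (v) => indent + "  <item>" + escapeXml(v) + "</item>"
  );
  return [indent + "<one-of>"].concat(lines, [indent + "</one-of>"]).join("\n");
}

function buildAddressGrammar(streets) {
  const digits = ["zero", "one", "two", "three", "four",
                  "five", "six", "seven", "eight", "nine"];
  const directions = ["north", "south", "east", "west"];
  // Normalize and de-duplicate street names before listing them.
  const names = Array.from(
    new Set(streets.map((s) => s.trim().toLowerCase()))
  ).sort();

  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"',
    '         xml:lang="en-US" mode="voice" root="address">',
    '  <rule id="address" scope="public">',
    '    <item repeat="1-5"><ruleref uri="#digit"/></item>',
    '    <item repeat="0-1"><ruleref uri="#direction"/></item>',
    '    <ruleref uri="#street"/>',
    '  </rule>',
    '  <rule id="digit">',
    oneOf(digits, "    "),
    '  </rule>',
    '  <rule id="direction">',
    oneOf(directions, "    "),
    '  </rule>',
    '  <rule id="street">',
    oneOf(names, "    "),
    '  </rule>',
    '</grammar>',
  ].join("\n");
}

// Example with made-up input; a real list would come from the database.
console.log(buildAddressGrammar(["Market St", "Mission St", "Howard St"]));
```

This sketch leaves out semantic interpretation tags, so a recognizer would hand back the raw utterance rather than separate house number and street fields.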

In the event that a successful match can’t be made (this is inevitable in some small percentage of calls), we ask the caller to say their full address and make a recording.

This recording can be transcribed after the call has ended to gather the caller’s address – this might be a manual step, or it could be automated using functionality provided by Tropo.
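The fallback logic can be sketched roughly as follows. The recognize and record callbacks are hypothetical stand-ins for whatever the telephony platform provides (they are not Tropo API calls), and the number of recognition attempts before falling back is an assumption.

```javascript
// Sketch of the recognize-then-record fallback described above.
// `recognize` and `record` are hypothetical callbacks, not Tropo functions:
//   recognize(attempt) returns the matched address, or null on no match
//   record() returns a reference to the caller's recording
function collectAddress(recognize, record, maxAttempts) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const match = recognize(attempt);
    if (match !== null) {
      return { source: "grammar", address: match, attempts: attempt };
    }
  }
  // No grammar match: keep a recording so the address can be transcribed later.
  return { source: "recording", recording: record(), attempts: maxAttempts };
}

// Example with stubbed callbacks: the second attempt succeeds.
const demoResult = collectAddress(
  (attempt) => (attempt === 2 ? "123 market st" : null),
  () => "recording-0001",
  3
);
console.log(demoResult.source, demoResult.address);
```

Keeping the platform calls behind callbacks makes the retry and fallback behavior easy to test without placing a call.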

Building for the cloud

The example shown here is built to run on the Tropo cloud communication platform, and uses a cloud-based instance of CouchDB.

This same basic approach could be replicated with a more conventional architecture, and could also use a standard relational database (as I did in my previous post on this subject).

But using cloud-based components has a number of advantages that might be attractive to smaller governments that want to employ this approach, or even larger governments that face fiscal constraints or challenges.

Using a cloud-based platform like Tropo makes deployment and scaling easy. It also means that you get access to current technology that supports the open specification for speech recognition grammars (SRGS). The folks who work at Voxeo (the company behind Tropo) help write these standards.

Using CouchDB has a number of advantages too. Populating a CouchDB instance with street data is easy with tools like shp2geocouch by Max Ogden. In addition, it’s pretty straightforward to write view and list functions to generate a speech grammar – after all, it’s just JavaScript.
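As a rough illustration of what such view and list functions might look like, here is a sketch. The calls emit, getRow, send and start are the functions CouchDB supplies to map and list functions. Everything else is an assumption: the field names doc.properties.ZIP_CODE and doc.properties.STREETNAME are guesses at how the imported street data is shaped, and the small stand-ins at the top exist only so the sketch can run in Node outside CouchDB.

```javascript
// ---- Stand-ins for what CouchDB supplies at query time (Node only) ----
let rows = [];
let output = "";
function emit(key, value) { rows.push({ key: key, value: value }); }
function getRow() { return rows.length ? rows.shift() : null; }
function send(chunk) { output += chunk; }
function start(response) { /* CouchDB would apply response.headers here */ }

// ---- Map function: index each street segment by zip code ----
// Field names are assumptions about the imported data.
function mapStreets(doc) {
  var p = doc.properties;
  if (p && p.ZIP_CODE && p.STREETNAME) {
    emit(String(p.ZIP_CODE), p.STREETNAME);
  }
}

// ---- List function: render the view rows as grammar <item> elements ----
function listStreets(head, req) {
  start({ headers: { "Content-Type": "application/srgs+xml" } });
  send("<one-of>\n");
  var seen = {};
  var row;
  while ((row = getRow())) {
    var name = String(row.value).toLowerCase();
    // A street has many segments, so skip names already sent.
    if (!Object.prototype.hasOwnProperty.call(seen, name)) {
      seen[name] = true;
      send("  <item>" +
           name.replace(/&/g, "&amp;").replace(/</g, "&lt;") +
           "</item>\n");
    }
  }
  send("</one-of>\n");
}

// ---- Simulate querying the view with ?key="<zip>" through the list ----
function renderForZip(docs, zip) {
  rows = [];
  output = "";
  docs.forEach(mapStreets);
  rows = rows.filter((r) => r.key === zip); // CouchDB does this via ?key=
  listStreets({}, { query: { key: zip } });
  return output;
}

// Made-up sample documents for illustration.
const sampleDocs = [
  { properties: { ZIP_CODE: "94103", STREETNAME: "MARKET ST" } },
  { properties: { ZIP_CODE: "94103", STREETNAME: "MISSION ST" } },
  { properties: { ZIP_CODE: "94109", STREETNAME: "GEARY ST" } },
];
console.log(renderForZip(sampleDocs, "94103"));
```

In a real design document, only the bodies of mapStreets and listStreets would be stored, and CouchDB would handle the key filtering when the list is requested with a key parameter.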

If you found this post and screencast useful, head on over to the GitHub repo for this solution and sign on as a watcher – I’m going to be actively developing this with the goal of deploying it for a municipality in the near future.

Stay tuned!

©2011 The Tropo Blog. All Rights Reserved.

Originally from All Voxeo Blogs Feed