So an interesting requirement popped up recently, and I was somewhat surprised that the VXML specification did not cover it. Why can't I record the utterance made within a field?

Say I want to store the result of a field in both text and audio formats. There is no way to do this right now. As anyone who has worked with it knows, speech recognition can be patchy at times, so wouldn't it be nice to compare the interpretation to the actual utterance?

Originally from VXML, CCXML and a touch of SALT