December 01, 2004

Getting ccxml & voicexml support into asterisk

I've been hacking with the open source voip / pbx system asterisk for a few months. One of the things which has always struck me as a big kludge is the way asterisk handles dial plans and call routing. It's basically a BASIC-like proto-programming language masquerading as a massive configuration file.

There are standards for how to handle call routing. They are clean and agreed upon by the w3c but, sadly, only supported by the proprietary vendors.

There is a project to develop a ccxml engine for asterisk using a java engine with JNI classes called jasterisk. It's still in development and not ready for deployment, but it's a positive step forward. Java isn't really my favorite language, and beyond that, I'd rather not deal with the deployment and server optimization headaches of coupling a jvm into asterisk servers.

When I first came across res_perl, a mod_perl-like system for asterisk, I was too busy to look closely at it. We were developing a call blast system for groups to use during the election, and had a very hard deadline. Now that some time has passed, I have had a chance to look back at asterisk.

I also came across AstGuiClient, a gtk based client for managing asterisk servers in a call center type setup. It's pretty good, but at its core it feels kludgy. That's no fault of the astguiclient folks; the asterisk system is still immature in the ways we can interact with it. The manager is just now getting to be stable, and even then I'm not sure it's the right way to drive an application platform like asterisk. Perhaps something like xml-rpc or soap would be better.

Right now res_perl lets you redirect dialplans from the extensions.conf file to a persistent perl process. This is similar to how asterisk has let you pass off call control to AGI, which is a cgi-like system. AGI is not very efficient or stable, in part because the user hears a noticeable delay while a new perl process is launched for each call. Unlike in web environments, on the phone a delay of a second or two is an unacceptable amount of silence.

ccxml using perl

First, to understand ccxml, you have to know that it's a call control system meant to be built upon a larger voicexml spec, which includes things like voice recognition and text to speech. Up until now, nobody in the free software community has taken on implementing the voicexml spec completely because there are no open source voice recognition applications. Just recently, IBM handed over its Reusable Dialog Components (RDC) to the Apache Jakarta Project. With this, it will eventually be possible to write a full voicexml/ccxml system using free software.

For now, I'm mostly interested in developing the ccxml application, because my applications for voice systems don't need the full power of commercial call center systems. Maybe in the future that'd be nice.

The primary stumbling block is that ccxml is built upon ECMAScript (i.e. JavaScript). There appear to be perl libraries, JavaScript::Runtime and JavaScript::SpiderMonkey, which both just act as wrappers around javascript libraries. Maybe in the future the perl6 gurus will create a native javascript interpreter in perl using Parrot.

Once we've got javascript interpretation covered, the next thing to deal with is dialogs. The ccxml standard is only for call flow, and doesn't provide any mechanism for interacting with the caller directly. This is both good and bad. What ccxml does is pass the calls off to a separate system for voice detection, dtmf, etc. Any usable ccxml system for asterisk would require at least a minimal dialog system to allow for dtmf detection and calling asterisk methods. An example, not to be copied, of what those tags might look like is in the SALT standard for embedding call xml in html.

CCXML is an event based system, which means you need to at least implement a callback system. In perl the dominant system for event handling is POE. I wrote a script which uses POE during the voip project for the election, and I don't like it. It's a messy kludge which is hard to debug and eats cpu cycles. Another option is the Event library, which is written in C (likely to be faster) and looks significantly simpler than POE. A third option is to drop the whole perl thing and use ruby, which I suspect would have much better callback support than perl. That option, unfortunately, would require rewriting res_perl, although there are good xml and javascript parsers in ruby already. For the moment, I'm not going to go down that path. Another option, probably a better one, is to use the Perl SAX event based xml parser.
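
To make the callback idea concrete, here is a rough sketch of the kind of event dispatcher a ccxml engine would need. It's in Python rather than perl just to keep it short, and it is not POE or the Event library. The EventLoop class, the handler functions and the "c1" connection id are all made up for illustration; only the event names (connection.alerting, connection.connected) are borrowed from the ccxml spec.

```python
from collections import defaultdict, deque


class EventLoop:
    """Tiny callback dispatcher: handlers are registered per event name."""

    def __init__(self):
        self.handlers = defaultdict(list)  # event name -> list of callbacks
        self.queue = deque()               # pending (name, data) events

    def on(self, name, handler):
        self.handlers[name].append(handler)

    def send(self, name, **data):
        self.queue.append((name, data))

    def run(self):
        # Drain the queue in order; handlers may queue further events.
        while self.queue:
            name, data = self.queue.popleft()
            for handler in self.handlers[name]:
                handler(self, data)


def on_alerting(loop, data):
    # Hypothetical handler: accept the inbound call, which in turn
    # raises a "connected" event for the same connection.
    data["log"].append("accept " + data["connid"])
    loop.send("connection.connected", **data)


def on_connected(loop, data):
    data["log"].append("connected " + data["connid"])


def demo():
    log = []
    loop = EventLoop()
    loop.on("connection.alerting", on_alerting)
    loop.on("connection.connected", on_connected)
    loop.send("connection.alerting", connid="c1", log=log)
    loop.run()
    return log
```

Calling demo() runs one inbound call through both handlers. A real engine would feed the queue from asterisk instead of from send() calls in the script.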

The primary task, beyond getting the parser and event callback system working, is mapping ccxml methods down to asterisk dialplan commands. Ccxml requires that, at a minimum, there be support for the JAIN Call Control (JCC) API; after that, other apis or extensions can be added at will. The basic methods wouldn't be too hard to do: just create a mapping from the JCC methods to asterisk. All the functionality is there.
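
A minimal sketch of what such a mapping could look like, again in Python. The table below is my own guess at a reasonable pairing, not something taken from the ccxml spec, JCC, or res_perl: the action names are ccxml-style, the right-hand side is asterisk dialplan applications, and the helper names (dialplan_line, render) are hypothetical.

```python
# Hypothetical mapping from call control actions to asterisk
# dialplan applications. The pairing is an assumption for illustration.
ACTION_TO_APP = {
    "accept": "Answer",
    "disconnect": "Hangup",
    "reject": "Busy",
    "createcall": "Dial",
}


def dialplan_line(exten, priority, action, arg=""):
    """Render one extensions.conf style line for a single action."""
    app = ACTION_TO_APP[action]
    return f"exten => {exten},{priority},{app}({arg})"


def render(exten, actions):
    """Turn an ordered list of (action, arg) pairs into dialplan lines."""
    return [
        dialplan_line(exten, priority, action, arg)
        for priority, (action, arg) in enumerate(actions, start=1)
    ]
```

For example, render("100", [("accept", ""), ("createcall", "SIP/200"), ("disconnect", "")]) gives three lines that answer, dial and hang up. An unknown action raises a KeyError, which is probably what you want while the mapping is still incomplete.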

VoiceXML, the key next step

Once we've got the event loop and parsing for the ccxml app done, we still don't have anything useful. For that we need at least partial VoiceXML support. The primary goals of VoiceXML are to provide: "audio dialogs that feature synthesized speech, digitized audio, recognition of spoken and DTMF key input, recording of spoken input, telephony, and mixed initiative conversations. Its major goal is to bring the advantages of Web-based development and content delivery to interactive voice response applications." Everything but the speech recognition is doable using free technology. Most of the VoiceXML examples seem to expect good and fast text to speech (tts) systems, but festival doesn't give you a very good set of voices, so in our applications we'd mostly use the audio tags, which are supported.

Part of why we're doing this is that building on standards means we can work off of existing tools instead of just directly editing the dial plan.

Getting VoiceXML support involves writing the sax parser, then walking through each method and mapping it to a dialplan application. This is very doable; it just requires time.
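
As a rough idea of the shape of that work, here is a sketch using Python's built-in xml.sax module (standing in for the Perl SAX parser). It only handles the audio, disconnect and exit tags, and the choice to map them to the asterisk Playback and Hangup applications is my assumption. The VxmlToDialplan class, the translate function and the SAMPLE document are all hypothetical.

```python
import xml.sax


class VxmlToDialplan(xml.sax.ContentHandler):
    """Collects (application, argument) steps while the parser walks the tags."""

    def __init__(self):
        super().__init__()
        self.steps = []

    def startElement(self, name, attrs):
        if name == "audio" and "src" in attrs:
            # Assumption: the src value can be handed to Playback as-is.
            self.steps.append(("Playback", attrs["src"]))
        elif name in ("disconnect", "exit"):
            self.steps.append(("Hangup", ""))
        # Every other tag is ignored in this sketch.


def translate(vxml_text):
    handler = VxmlToDialplan()
    xml.sax.parseString(vxml_text.encode("utf-8"), handler)
    return handler.steps


SAMPLE = """<vxml version="2.0">
  <form>
    <block>
      <audio src="welcome"/>
      <disconnect/>
    </block>
  </form>
</vxml>"""
```

translate(SAMPLE) returns the steps in document order, ready to be turned into dialplan calls. Supporting more of VoiceXML is then mostly a matter of adding branches to startElement, one tag at a time.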

Posted by rabble at December 1, 2004 10:12 PM
Comments

Is there something that these supposedly clean, W3C approved standards get you over the * style dial plans? I was rather taken w/ the dial plans when you showed them to us, and yet another W3C XML spec which no one has implemented except the major vendors raises a dozen or so red flags for me.

You sure it isn't a useless white elephant like SyncML?

Posted by: kellan at December 5, 2004 10:44 AM

Yeah, perhaps it's better to create a nice api for mod_perl-like scripting of IVRs instead of a cumbersome standard.

The AGI's are kludgish, so are the dial plans, i was hoping for something cleaner.

Posted by: rabble at December 5, 2004 10:10 PM

I tried to do a trackback but for some reason my posting did not make it to the comments page.

Anyway I have a list of the current opensource CCXML projects on my blog. If anyone knows of any that I am missing please let me know:

http://www.rjauburn.com/

Posted by: RJ Auburn at December 13, 2004 05:37 PM

Hey, I have been facing similar challenges controlling asterisk from an outside application. So far what I worked out was based on a POE::Asterisk proxy sharing information with a perl XML-RPC server via Cache::Cache (using hard disk space) and finally talking to a JavaScript front end (Firefox) via the JSOLAIT XMLRPC js libs. Right now we will be working with some Compiere/Java integrators. I'm looking at Jasterisk, although I'm also looking into dusting off the past solution and offering a clean SOAP interface. Nice site btw

Posted by: Dude at March 4, 2005 09:39 AM