The Artima Developer Community
Python Buzz Forum
Wibbly Wobbly WBXML

Ben Last

Wibbly Wobbly WBXML Posted: Jul 20, 2005 2:56 AM

This post originated from an RSS feed registered with Python Buzz by Ben Last.
Original Post: Wibbly Wobbly WBXML
Feed Title: The Law Of Unintended Consequences
Feed URL: http://benlast.livejournal.com/data/rss
Feed Description: The Law Of Unintended Consequences
Ok, so I couldn't think of a title that's as wilfully obscure as the usual ones.  Whatever.

For reasons of Commerce, I need to be able to generate WBXML messages within the guts of the mighty Python/Zope engine that powers the Mobile Phone Project[0].  What, I hear you ask, the blinking flip is WBXML?  Well, if you don't know, you probably want to keep it that way, but you did ask.

WBXML is a binary encoding of XML.  XML is, of course, a textual encoding of data... some of which may originally have been binary.  So it's sort of an extra level of complication added to something that's already complicated, but hey, that's what geeks do, isn't it?  The reason that it's a binary encoding is that XML is bulky.  Most of the time that bulk doesn't matter that much; I'll trade bandwidth, memory or CPU time for explicitness any day of the week.  But if you're trying to pack XML over a slow, laggy, prone-to-being-interrupted-by-trees-or-birds wireless link to a phone, bulk is bad.  It's even worse if you're trying to pack an XML Service Indication (essentially, a pushed URL) into the tiny size of a single SMS message.  Hence the binaryness.
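To put a rough number on "bulk is bad", here is a minimal sketch that measures a textual Service Indication against the 140 bytes of user data that a single 8-bit SMS can carry.  The URL and message text are made-up placeholders, and the real budget is smaller still once transport headers are added.

```python
# A made-up Service Indication document; the URL and text are placeholders.
SI_XML = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE si PUBLIC "-//WAPFORUM//DTD SI 1.0//EN" '
    '"http://www.wapforum.org/DTD/si.dtd">\n'
    '<si><indication href="http://www.example.com/ringtones">'
    'Your download is ready</indication></si>\n'
)

SMS_PAYLOAD_BYTES = 140  # user-data size of one 8-bit SMS


def fits_in_one_sms(payload):
    """Return True if the given bytes fit in a single 8-bit SMS payload."""
    return len(payload) <= SMS_PAYLOAD_BYTES


text_size = len(SI_XML.encode("utf-8"))
print("textual SI size in bytes:", text_size)
print("fits in one SMS:", fits_in_one_sms(SI_XML.encode("utf-8")))
```

Even this tiny document is too big as plain text, which is the gap the binary encoding is meant to close.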

WBXML isn't anything as simple as, say, a gzipped version of the XML stream.  Instead, it's a rigorous specification of how single-byte tokens map to XML tags, attributes or text strings.  For example, the XML <SI> maps to the binary value 0x05, and <INDICATION> maps to 0x06.  But it's clever; if the HREF attribute of the INDICATION starts with "http://", then the whole attribute-starting-http maps to 0x0C.  If the HREF starts with "http://www." then it's mapped to 0x0D, saving another four bytes, and so on.  The more common the string, the more likely it is to have a fixed mapping.  There's also a neat string-table option; commonly used strings can be folded into single-byte offsets into a string table (in effect, any repeated string longer than three bytes is worth string-table-izing).
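The prefix trick can be sketched in a few lines of Python.  This is only an illustration of the longest-prefix idea, not a WBXML encoder: the two token values are the ones quoted above, the function name is invented for the example, and writing the second prefix with a trailing dot is my assumption about how the SI token table spells it.

```python
# HREF attribute-start tokens quoted in the text (Service Indication only).
# Assumption: the 0x0D prefix includes the dot after "www".
HREF_PREFIX_TOKENS = {
    "http://": 0x0C,
    "http://www.": 0x0D,
}


def tokenize_href(url):
    """Return (token, remainder) for the longest known prefix of url.

    If no prefix matches, return (None, url) so the caller can fall back
    to writing the whole string inline.
    """
    best = None
    for prefix in HREF_PREFIX_TOKENS:
        if url.startswith(prefix) and (best is None or len(prefix) > len(best)):
            best = prefix
    if best is None:
        return None, url
    return HREF_PREFIX_TOKENS[best], url[len(best):]


token, rest = tokenize_href("http://www.example.com/x")
print(hex(token), rest)
```

The remainder is what still has to be sent as an inline string (or a string-table reference), so every byte swallowed by the token is a byte saved.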

This is non-trivial stuff to knock up in a hurry, so it's just as well that there's the libwbxml open-source library to handle it all.  That library, however, is in C, and I'm working in Python.  There appear to be no published Python bindings to libwbxml, so it was time to dust off my ancient experience of #include <Python.h> and get to it.

Here's the C code that allows a Python call to libwbxml's xml2wbxml function:
static PyObject *wbxml_xml2wbxml(PyObject *self, PyObject *args) {
    /* A WB_UTINY is an unsigned char, so we can allow conversion directly from the Python string */
    WB_UTINY *xml;
    WB_UTINY *wbxml = NULL;   /* stays NULL (returned as None) if the conversion fails */
    WB_ULONG wbxmllen = 0;
    int status;
    WBXMLConvXML2WBXMLParams params;
    WB_UTINY *errstr;
 
    /* Verify and read a string arg (xml) */
    if (!PyArg_ParseTuple(args, "s", &xml))
        return NULL;
 
    /* Pass that to libwbxml2 */
    params.keep_ignorable_ws = FALSE;
    params.use_strtbl = TRUE;
    params.wbxml_version = WBXML_VERSION_11;
    status = wbxml_conv_xml2wbxml(xml, &wbxml, &wbxmllen, &params);
    if(status == WBXML_OK) {
        errstr = NULL;  /*we return None to mean no error*/
    } else {
        errstr = (WB_UTINY *)wbxml_errors_string(status);
    }
 
    /* Build the return tuple of wbxml, error.
    The wbxml string is binary, so we need to convert it with a z#
    rather than a z.*/
 
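    {
        /* z# expects an int length, so cast the WB_ULONG explicitly.
        Py_BuildValue copies the data, so the buffer that libwbxml
        allocated can be released with wbxml_free afterwards. */
        PyObject *result = Py_BuildValue("(z#z)", wbxml, (int)wbxmllen, errstr);
        if (status == WBXML_OK)
            wbxml_free(wbxml);
        return result;
    }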
}
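
From the Python side, the function above hands back a (wbxml, error) tuple in which an error of None means success.  Here is a minimal sketch of a wrapper that turns that convention into an exception.  The names WBXMLError and convert_or_raise are invented for this example, and the converter is passed in as a parameter because the module name the extension is built under is not shown here.

```python
class WBXMLError(Exception):
    """Raised when the converter reports an error string."""


def convert_or_raise(xml2wbxml, xml):
    """Call a converter that returns (wbxml, error) and raise on error.

    xml2wbxml is any callable following the convention of the C function
    above: error is None on success, otherwise a description string.
    """
    wbxml, error = xml2wbxml(xml)
    if error is not None:
        raise WBXMLError(error)
    return wbxml


# Stand-in converter so the sketch runs without the extension installed.
def fake_converter(xml):
    if not xml.strip():
        return None, "empty document"
    return b"\x01\x05", None


print(convert_or_raise(fake_converter, "<si/>"))
```

With the real extension built, the idea would be to pass its xml2wbxml function in place of fake_converter.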


For details, I refer you to the excellent on-line reference to the Python/C API, specifically the section that explains conversion of values between C and Python.

You can download the whole source code (including a distutils setup.py) at http://hrsys.demon.co.uk/wbxml.zip.  There's a test.py included to show how it works.  Be aware that the setup.py contains a specific runtime-library path that my development boxes need; you may or may not need to delete that line.  Naturally, you'll also need libwbxml installed to have it work.

Right now, I don't need to convert WBXML back to XML, so there's no link to the reverse wbxml2xml function.  Like they say, open-source software scratches one's own itches.  And if you did want to send the resulting SI in one or more SMS messages to a mobile, you'd also need to wrap the binary WBXML in a WSP (Wireless Session Protocol) header, which is another topic entirely[1].

[0] I suppose I should start calling this the Mobile Phone Content Project, since that's a more accurate name.  Maybe when the tv adverts start...
[1] I do have working code to do this, so if anyone needs to wrangle WSP, feel free to drop me an email (or ask via blog comment) and I'll share what I know.
