MGrammar – Tokenizing

Published 08 August 09 09:52 PM | wmmihaa

While working on an EDI grammar, I came across some unexpected behavior when tokenizing the input.

UNA:+.? '
UNB+UNOC:3+123456789:ZZ+987654321:ZZ+090804:0758+491944'
UNH+464009+APERAK:D:07B:UN:2.0b'
BGM+313+464009'
DTM+137:200908040758:203'
RFF+ACE:100048193285'
DTM+171:200908040606:203'
NAD+MS+123456789::ZZ'
NAD+MR+987654321::ZZ'
ERC+Z06'
FTX+ABO+++9904383000003'
RFF+ACE:100048193285'
UNT+11+464009'
UNZ+1+491944'

The above sample is an APERAK message. I won’t go into any details about the structure other than that it consists of a number of segments (UNH, BGM, DTM, etc.). Each segment is terminated by an apostrophe ('). Every segment has elements separated by “+”, which in turn can have a number of component data elements separated by “:”. Some of the elements are optional and some are mandatory.
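To make the segment / element / component nesting concrete, here is a minimal Python sketch (not MGrammar, and not part of the grammar I'm writing) that splits a message along those separators. The function name parse_edifact is made up for illustration, and the sketch assumes the release character ("?") never appears in the data, so plain string splitting is enough.

```python
def parse_edifact(message):
    """Split an EDIFACT message into (tag, elements) pairs.

    Simplifying assumption: the release (escape) character never occurs
    in the data, so str.split is sufficient.
    """
    component_sep, element_sep, terminator = ":", "+", "'"
    if message.startswith("UNA"):
        # UNA is followed by six service characters: component separator,
        # element separator, decimal mark, release character, reserved,
        # and segment terminator.
        component_sep = message[3]
        element_sep = message[4]
        terminator = message[8]
        message = message[9:]

    segments = []
    for raw in message.split(terminator):
        raw = raw.strip()  # drop the line breaks between segments
        if not raw:
            continue
        elements = raw.split(element_sep)
        tag = elements[0]
        # Every element becomes a list of its component data elements.
        data = [element.split(component_sep) for element in elements[1:]]
        segments.append((tag, data))
    return segments


sample = "UNA:+.? '\nNAD+MS+123456789::ZZ'\nFTX+ABO+++9904383000003'\n"
segments = parse_edifact(sample)
for tag, data in segments:
    print(tag, data)
```

Note how the empty elements in the FTX segment and the empty component in NAD survive as empty strings. Those omitted optional values are exactly what makes the grammar tricky.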

My problem occurs when elements are optional. Have a look at the sample grammar below:

syntax Main =   a:A? del? b:B? del? c:C? =>{A=>a,B=>b,C=>c} ;
token del = ","; 
token A = ("A".."Z" | "a".."z" | "0".."9")+;
token B = ("A".."Z" | "a".."z" | "0".."9")+;
token C = ("A".."Z" | "a".."z" | "0".."9")+;

The syntax above states that there are three tokens (A, B and C), and they are all optional.

Given the input: a,b,c the output will be:

{
  A => "a",
  B => "b",
  C => "c"
}

However, given only a,b as input, the output is:

{
  A => null,
  B => "a",
  C => "b"
}

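This was somewhat unexpected for me. I would have expected the tokens to be filled from the left, leaving the “C” element empty. But since A, B and C are defined identically and every element is optional, the input a,b can match the rule in more than one way, and the parser evidently does not pick the leftmost assignment. To solve this you need to spell out the combinations you want to accept as separate alternatives: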

syntax Main =   a:A del b:B del c:C? =>{A=>a,B=>b,C=>c}
              | a:A del? b:B? =>{A=>a,B=>b}
              | a:A =>{A=>a} ;

With this grammar, the input a,b gives the following output:

{
  A => "a",
  B => "b"
}
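Looking back at the first grammar, it helps to list every way the rule a:A? del? b:B? del? c:C? can consume the tokens of a,b. The Python sketch below is only a model of that rule as five optional slots; it is not MGrammar and makes no claim about how the M parser chooses between candidates. The names SLOTS and candidate_parses are invented for this illustration.

```python
import re
from itertools import product

# The rule  a:A? del? b:B? del? c:C?  modelled as five optional slots.
# A, B and C all match the same text, so they share the kind "word".
SLOTS = [("A", "word"), ("del", "sep"), ("B", "word"), ("del", "sep"), ("C", "word")]


def candidate_parses(text):
    """Return every distinct A/B/C assignment the rule allows for text."""
    tokens = re.findall(r"[A-Za-z0-9]+|,", text)
    kinds = ["sep" if token == "," else "word" for token in tokens]

    results = []
    # Try every combination of "slot present" / "slot skipped".
    for used in product([True, False], repeat=len(SLOTS)):
        chosen = [slot for slot, keep in zip(SLOTS, used) if keep]
        if [kind for _, kind in chosen] != kinds:
            continue  # this combination does not match the token sequence
        binding = {"A": None, "B": None, "C": None}
        for (name, _), token in zip(chosen, tokens):
            if name in binding:
                binding[name] = token
        if binding not in results:
            results.append(binding)
    return results


for parse in candidate_parses("a,b"):
    print(parse)
```

For a,b this lists three distinct assignments, one of which is the A => null, B => "a", C => "b" result shown earlier, while a,b,c has only one. The rewritten grammar removes the choice by making A mandatory in every alternative.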


Comments

# Ola L, Logica Gbg said on August 28, 2009 12:28 PM:

Cool stuff!

If you ever finish this grammar

I would like to try it :-)
