Regex look-ahead and related expressions for grammars


    I'm trying to improve the XML grammar. Currently (v4.07 build 19761) it treats everything that isn't a comment, a string, or an angle bracket as plain text. I want to highlight element and attribute names. However, the regex engine has no look-ahead or look-behind support. For instance, when I try

     *[-A-Za-z0-9_]+ *(?=\=)
    to identify an attribute, I get an error reporting that the string I've given is not a valid regex. It is valid; it just uses a look-ahead so that the '=' must follow the name but is excluded from the match.

     *[-A-Za-z0-9]+ *\=
    does find all attributes, but the '=' is included in the match, so it gets the same highlighting as the attribute name itself. A similar problem exists for identifying elements: I need to look behind to check whether the text is preceded by '<\/?' to know it's an element. I thought that the ordering of the grammar items might make a difference, but it doesn't.
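For anyone comparing the two patterns outside BC, here is a minimal sketch using Python's `re` module, which does support look-around. It only illustrates what the look-ahead and look-behind versions would capture; it says nothing about the BC regex engine itself. The sample line and the `ELEMENT` pattern are made up for illustration (Python needs fixed-width look-behind, so '<' and '</' are written as two alternatives).

```python
import re

# Hypothetical sample line, not taken from any real file.
LINE = '<item id="42" data-kind = "x">text</item>'

# Pattern without look-ahead: the '=' is part of the match.
WITH_EQUALS = re.compile(r' *[-A-Za-z0-9_]+ *\=')

# Look-ahead variant: '=' must follow, but is not consumed.
LOOKAHEAD = re.compile(r' *[-A-Za-z0-9_]+ *(?=\=)')

# Assumed element-name pattern: a name preceded by '<' or '</'.
ELEMENT = re.compile(r'(?<=<)[-A-Za-z0-9_]+|(?<=</)[-A-Za-z0-9_]+')

attrs_with_equals = WITH_EQUALS.findall(LINE)
attrs_lookahead = LOOKAHEAD.findall(LINE)
elements = ELEMENT.findall(LINE)

print(attrs_with_equals)  # [' id=', ' data-kind =']
print(attrs_lookahead)    # [' id', ' data-kind ']
print(elements)           # ['item', 'item']
```

The look-ahead matches still carry the surrounding spaces because the pattern begins and ends with ' *'; dropping those would leave just the names.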

    Is look-around functionality coming to the BC regex engine? Or will I just have to accept that I won't always be able to make the grammar match exactly? There are changes I want to make to other grammars as well, for example to highlight method names differently.


  • #2

    Look-around in regular expressions is on our wishlist, but it is not a feature we'll be able to tackle soon. For XML files, the latter grammar item, the one that includes the '=', is probably your best bet. The ordering of the grammar items only has an impact if there is a tie; otherwise, position on the line decides: whichever item is found to match first as the line is scanned is the one used.
    Aaron P Scooter Software
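The precedence rule described in this reply can be sketched as a toy scanner in Python. This is only a model of the rule as stated (earliest match on the line wins, list order breaks ties), not the actual BC implementation; `tokenize`, the rule names, and the patterns are all hypothetical.

```python
import re

def tokenize(line, rules):
    """Split line into (rule_name, text) pairs.

    rules is an ordered list of (name, pattern). The match that starts
    earliest wins; if two start at the same position, the rule listed
    first wins. Unmatched stretches are reported as "text".
    """
    compiled = [(name, re.compile(pattern)) for name, pattern in rules]
    pos, out = 0, []
    while pos < len(line):
        best = None
        for order, (name, rx) in enumerate(compiled):
            m = rx.search(line, pos)
            if m and m.end() > m.start():  # ignore empty matches
                key = (m.start(), order)
                if best is None or key < best[0]:
                    best = (key, name, m)
        if best is None:
            out.append(("text", line[pos:]))
            break
        _, name, m = best
        if m.start() > pos:
            out.append(("text", line[pos:m.start()]))
        out.append((name, m.group()))
        pos = m.end()
    return out

# Hypothetical rule list; "attribute" is listed before "string".
RULES = [
    ("attribute", r'[-A-Za-z0-9_]+ *='),
    ("string", r'"[^"]*"'),
    ("name", r'[-A-Za-z0-9_]+'),
]

tokens = tokenize('id="a=b"', RULES)
print(tokens)  # [('attribute', 'id='), ('string', '"a=b"')]
```

In this model, "attribute" beats "name" at the start of the line because the two tie on position and "attribute" is listed first. The 'a=' inside the quotes is not treated as an attribute even though "attribute" is listed before "string", because the string match starts earlier on the line.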