Regex backreference functionality

    Question

  • Good afternoon, Regex Gurus. A quickie question if I may ...

    Recent observations and the MSDN documentation seem to suggest that when a backreference appears in a search & capture operation, the backreference is updated. I wondered if there is a way to prevent that update.

    Permit me to explain: Imagine that one is working with XML files coming across a serial line. One might wish to assure himself that the entire file made it across the serial link, and did so without extraneous characters attached.

    One way to do that would be to employ a regular expression in which one captures the first label, then searches the entire file (string) for a matching label, preceded by the forward slash.

    For instance, the file ...

    <Configuration ..... (plus 10 or 20 lines worth of xml statements, all in a single string) </Configuration>

    For such a task, one might use a regex such as the following ...

            Regex Continuity = new Regex(@"\<(?<opening>\w+).*(?<closing>\k<opening>)");
    

    One captures the closing label so that the indexing and length information are available, thus one can calculate the file length and compare that to the string length to detect extraneous characters 'tacked on' to the file.
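
    A minimal sketch of that length check, assuming the file is expected to end right after the '>' that follows the closing label. The `TrailingChars` helper and the `RegexOptions.Singleline` flag (so that '.' can cross line breaks) are assumptions added for illustration; they are not part of the original code:

```csharp
using System;
using System.Text.RegularExpressions;

static class LengthCheck
{
    // Same pattern as above; Singleline is an assumption so '.' also matches '\n'.
    static readonly Regex Continuity =
        new Regex(@"\<(?<opening>\w+).*(?<closing>\k<opening>)", RegexOptions.Singleline);

    // Hypothetical helper: returns how many characters follow the closing label's '>',
    // or -1 if the pattern does not match or no '>' follows the closing label.
    public static int TrailingChars(string received)
    {
        Match match = Continuity.Match(received);
        if (!match.Success) return -1;

        Group closing = match.Groups["closing"];
        int afterLabel = closing.Index + closing.Length;    // first char after the label
        if (afterLabel >= received.Length || received[afterLabel] != '>') return -1;

        int fileLength = afterLabel + 1;                    // include the final '>'
        return received.Length - fileLength;
    }

    static void Main()
    {
        Console.WriteLine(TrailingChars("<Configuration a=\"1\"><Item/></Configuration>"));    // 0
        Console.WriteLine(TrailingChars("<Configuration a=\"1\"><Item/></Configuration>xyz")); // 3
    }
}
```

    A result of 0 means nothing was tacked on after the closing tag; any positive value is the number of extraneous characters.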

    The regex above works, kinda, but there's a problem: Suppose that almost the entire file makes it across the serial link, but further suppose the final 'n' in the word /Configuration is missing ... then one has "/Configuratio". Close, but no cigar!

    And yet, although they do not match, the Regex says they do match, and worse, the <opening> capture to which the backreference refers HAS BEEN MODIFIED and now also contains "Configuratio" instead of "Configuration".
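
    The behaviour described above can be reproduced with a small sketch. As far as I understand the .NET engine, the capture is not changed after the fact; rather, when `\k<opening>` cannot find "Configuration" later in the string, the engine backtracks into `\w+`, gives back the final 'n', and retries with the shorter text. The `OpeningFor` helper and the sample strings are made up for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

static class BacktrackDemo
{
    // Hypothetical helper: returns what the 'opening' group captured,
    // or "(no match)" if the pattern fails.
    public static string OpeningFor(string received)
    {
        Match match = Regex.Match(received, @"\<(?<opening>\w+).*(?<closing>\k<opening>)");
        return match.Success ? match.Groups["opening"].Value : "(no match)";
    }

    static void Main()
    {
        // Intact closing label: \w+ keeps the whole word.
        Console.WriteLine(OpeningFor("<Configuration a=\"1\"></Configuration>")); // Configuration

        // Truncated closing label: \w+ backtracks by one character so that
        // the backreference can still find a match.
        Console.WriteLine(OpeningFor("<Configuration a=\"1\"></Configuratio>"));  // Configuratio
    }
}
```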

    As mentioned earlier, I found that update of the backreference mentioned in an example, so it doesn't come as a surprise. However, I am looking for a way to prevent that update. If the two labels do not match, I want to know that they don't match ... I don't want the backreference to be changed so that they do match.

    So, bottom line, is there a way to prevent that update?
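
    One possible approach, offered as a sketch rather than a confirmed answer from this thread: .NET supports atomic groups, `(?>...)`, which do not give characters back once they have matched. Wrapping `\w+` in one should keep the opening label from shrinking. The `\A` anchor is an added assumption (the string starts with the opening tag); without it the engine could retry at a later '<' and match some inner element instead. `IsIntact` is a hypothetical helper name:

```csharp
using System;
using System.Text.RegularExpressions;

static class AtomicDemo
{
    // (?>\w+) is an atomic group: once it has consumed the whole label,
    // the engine may not backtrack into it to try a shorter label.
    // \A (assumption) pins the match to the very start of the string.
    static readonly Regex Continuity =
        new Regex(@"\A\<(?<opening>(?>\w+)).*(?<closing>\k<opening>)", RegexOptions.Singleline);

    // Hypothetical helper: true only if the full opening label reappears later.
    public static bool IsIntact(string received)
    {
        return Continuity.IsMatch(received);
    }

    static void Main()
    {
        Console.WriteLine(IsIntact("<Configuration a=\"1\"></Configuration>")); // True
        Console.WriteLine(IsIntact("<Configuration a=\"1\"></Configuratio>"));  // False
    }
}
```

    Placing a word boundary after the group, as in `(?<opening>\w+)\b`, should have a similar effect, since a shortened label would no longer end at a word boundary.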

     

    • Moved by Caillen Wednesday, October 15, 2014 7:29 AM
    Monday, October 13, 2014 7:47 PM

Answers

All replies

  • UPDATE:

    I still can't prevent the update of the 'opening' capture, but I can get a history of the captures by adding a single character, a quantifier, to the capture.

            Regex Continuity = new Regex(@"\<(?<opening>\w+)+.*(?<closing>\k<opening>)");
    

    See the quantifier ('+')? With that, I can get the count as follows...

                captureCount = match.Groups["opening"].Captures.Count;
    

    If captureCount is one, the opening and closing labels are identical (at least, that has been my experience so far, but I don't understand how it works and so I'm reluctant to call this a 'fix'). If it's anything other than one, the file failed to be transmitted properly. SO FAR.

    If this is true for all instances, then this is the solution.  Regrettably, I have no faith that this 'captureCount' will hold true for all cases.
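
    A sketch that probes this, based on my reading of how the .NET engine backtracks: after `\w+` gives back the trailing characters, the '+' on the group lets a second iteration capture them, and the backreference then looks for that leftover text. If the leftover happens to occur later in the string, the count is two; if it does not, the engine drops the second iteration and the count goes back to one even though the closing label is truncated. The `CaptureCount` helper and the sample strings are made up for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

static class CaptureCountDemo
{
    // Hypothetical helper: number of captures recorded for 'opening',
    // or -1 if the pattern does not match at all.
    public static int CaptureCount(string received)
    {
        Match match = Regex.Match(received, @"\<(?<opening>\w+)+.*(?<closing>\k<opening>)");
        return match.Success ? match.Groups["opening"].Captures.Count : -1;
    }

    static void Main()
    {
        // Intact file: a single capture.
        Console.WriteLine(CaptureCount("<Config x=\"1\"></Config>"));               // 1

        // Truncated, and the lost 'n' occurs again later ("Con..."):
        // the second iteration captures "n", so two captures.
        Console.WriteLine(CaptureCount("<Configuration a=\"1\"></Configuratio>"));  // 2

        // Truncated, but the lost 'g' occurs nowhere later: the second
        // iteration is abandoned and the count is 1 despite the damage.
        Console.WriteLine(CaptureCount("<Config x=\"1\"></Confi>"));                // 1
    }
}
```

    If that last case behaves as traced here, a count of one is not proof that the labels are identical, which supports the doubt expressed above.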

    Monday, October 13, 2014 11:44 PM
  • Hello Lincoln_MA,

    I'm sorry, but your question is all about regular expressions, which is outside the scope of the Visual C# language forum. I recommend that you post in a forum dedicated to regular expressions, since the question goes fairly deep into Regex.

    There used to be a Regular Expression forum, but it has been archived, so I'm not sure where you should post. I'm moving this thread to the [where is the forum for ...] forum; the moderator will direct you to the right forum.

    Thanks for your understanding.

    Wednesday, October 15, 2014 7:29 AM
  • Retired notice on Regex forum suggests posting here.

    https://social.msdn.microsoft.com/Forums/vstudio/en-US/home?forum=netfxbcl

    Regards, Dave Patrick ....
    Microsoft Certified Professional
    Microsoft MVP [Windows]

    Disclaimer: This posting is provided "AS IS" with no warranties or guarantees , and confers no rights.

    Wednesday, October 15, 2014 9:44 AM
    Moderator