Re: [code] Regular expression \t does not behave as expected.

From: Mitchell <m.att.foicica.com>
Date: Fri, 16 Mar 2018 16:38:16 -0400 (EDT)

Hi Danny,

On Wed, 1 Nov 2017, Danny MacMillan wrote:

> It seems that \t sometimes matches a literal t rather than a literal tab
> character (maybe?)
>
> I have a TSV file as such:
>
> DEMO_UPK_KNOW Procedure ADJUST_BLUESTONE_AU
> DEMO_UPK_KNOW Procedure ADJUST_MENTOR_PACKAGE
> DEMO_UPK_KNOW Procedure ADJUST_MENTOR_THREAD
> DEMO_UPK_KNOW Procedure ADJUST_NOTE
> DEMO_UPK_KNOW Procedure GETAUIDBYAUCODEANDAUSID
> DEMO_UPK_KNOW Procedure GETEXTERNALAUCHILDREN
> DEMO_UPK_KNOW Procedure GETEXTERNALTITLEAULIST
> DEMO_UPK_KNOW Procedure GETTITLEAULIST
> DEMO_UPK_KNOW Procedure GETUPKMAPAUS
> DEMO_UPK_KNOW Procedure RESETALLLOTRACKING
>
> If I do a search for the following regex in a file with the above contents,
> it finds nothing.
>
> ^([^\t]+)\t([^\t]+)\t([^\t]+)$
>
> The real file is much larger than the above example. It will eventually find
> a row - the PROJWBS row below.
>
> EAI_P6_SANDBOX_DASH Synonym PROJECT
> EAI_P6_SANDBOX_DASH Synonym PROJWBS
> EAI_P6_SANDBOX_DASH Synonym TASK
>
> My initial surmise was that the previous row ending with "T" accounted for
> this. But I don't believe this is so, or at least I think there must be
> something else wrong perhaps in addition to this. The next match in the file
> is composed of all but the first and last lines in the below (the match spans
> 5 lines, which should not happen with the ^ and $ in there).
>
> EAI_P6_SANDBOX_DASH Synonym TASKACTV
> EAI_P6_SANDBOX_DASH View EAI_GREEN_UP
> EAI_P6_SANDBOX_DASH View EAI_GREEN_UP_SCHED_VARIANCE
> EAI_P6_SANDBOX_DASH View EAI_GREEN_UP2
> EAI_P6_SANDBOX_DASH View EAI_SCHEDULE_VARIANCE_VIEW
> EAI_P6_SANDBOX_DASH View EAI_SCHEDULE_VARIANCE_VIEW_AVG
> EAI_P6_SANDBOX_DASH View INSPECTION_SUMMARY
>
> HOWEVER!!! The behaviour of "Find Next" and the behaviour of "Find Prev" are
> different. Find next will find the 5 middle lines as a single match. If I
> find next past this block, then find prev, it will find each of those 5 lines
> as its own match, which is what the behaviour should be. Unfortunately
> neither find next nor find prev is finding everything it should.

I wanted to follow up on this. The Textadept nightly builds now use C++11's built-in regex support, so TRE is no longer used. With a nightly build, your example behaves as expected for me.

Cheers,
Mitchell
