Initially found on HN: https://news.ycombinator.com/item?id=27324265
Specifically, this output is correct, since ripgrep always puts ^ in regex multi-line mode:
$ printf 'a\nbaz\nabc\n' | rg -U '^b'
baz
Using (?-m)^b or \Ab should not print baz as a match, because b is not at the start of the input. But both still do:
$ printf 'a\nbaz\nabc\n' | rg -U '(?-m)^b'
baz
$ printf 'a\nbaz\nabc\n' | rg -U '\Ab'
baz
The issue is that ripgrep isn't memory mapping the input here. When it doesn't, ripgrep tries to be "smart" and avoids reading the entire contents onto the heap if it knows the pattern can't match through a line terminator. But in this case we can't quite make that assumption, since anchors can match line terminators as look-around.
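To make the failure mode concrete, here is a minimal sketch using Python's re module rather than ripgrep's own regex engine or searcher. The helper names search_whole_buffer and search_line_by_line are hypothetical, and the line-by-line helper is only an assumed stand-in for searching without the full contents in memory. It shows why \A can appear to match mid-file when each line is handed to the regex on its own:

```python
import re

HAYSTACK = "a\nbaz\nabc\n"

def search_whole_buffer(pattern, text):
    """Return the lines where a match starts when the regex sees the full input."""
    matched = []
    for m in re.finditer(pattern, text):
        # Expand the match position to the boundaries of its enclosing line.
        start = text.rfind("\n", 0, m.start()) + 1
        end = text.find("\n", m.start())
        if end == -1:
            end = len(text)
        matched.append(text[start:end])
    return matched

def search_line_by_line(pattern, text):
    """Return the lines that match when each line is searched in isolation.

    Hypothetical stand-in for an incremental search: every line looks like
    the start of the input, so \\A is satisfied at the start of each line.
    """
    return [line for line in text.splitlines() if re.search(pattern, line)]

print(search_whole_buffer(r"(?m)^b", HAYSTACK))  # ['baz']
print(search_whole_buffer(r"\Ab", HAYSTACK))     # []
print(search_line_by_line(r"\Ab", HAYSTACK))     # ['baz']
```

With the whole buffer visible, \Ab correctly finds nothing. Searching one line at a time reproduces the unexpected baz, which is consistent with the behavior reported above.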