Regex possessive quantifier

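I was answering this question; here is a direct link to my answer.

You will notice that I use this pattern:

    (\\?)?&?(TXT\\{[^}]++})(&)?

in the following code (with some debugging added that is relevant to my question):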

    public static void main(final String[] args) throws Exception {
        final String[] loginURLs = {
            "http://ip:port/path?username=abcd&location={LOCATION}&TXT{UE-IP,UE-Username,UE-Password}&password={PASS}",
            "http://ip:port/path?username=abcd&location={LOCATION}&password={PASS}&TXT{UE-IP,UE-Username,UE-Password}",
            "http://ip:port/path?TXT{UE-IP,UE-Username,UE-Password}&username=abcd&location={LOCATION}&password={PASS}",
            "http://ip:port/path?TXT{UE-IP,UE-Username,UE-Password}",
            "http://ip:port/path?username=abcd&password={PASS}"};
        final Pattern patt = Pattern.compile("(\\?)?&?(TXT\\{[^}]++})(&)?");
        for (final String loginURL : loginURLs) {
            System.out.printf("%1$-10s %2$s%n", "Processing", loginURL);
            final StringBuffer sb = new StringBuffer();
            final Matcher matcher = patt.matcher(loginURL);
            while (matcher.find()) {
                final String found = matcher.group(2);
                System.out.printf("%1$-10s 1:%2$s,3:%3$s%n", "Groups", matcher.group(1), matcher.group(3));
                System.out.printf("%1$-10s %2$s%n", "Found", found);
                if (matcher.group(1) != null && matcher.group(3) != null) {
                    matcher.appendReplacement(sb, "$1");
                } else {
                    matcher.appendReplacement(sb, "$3");
                }
            }
            matcher.appendTail(sb);
            System.out.printf("%1$-10s %2$s%n%n", "Processed", sb.toString());
        }
    }

The output is:

    Processing http://ip:port/path?username=abcd&location={LOCATION}&TXT{UE-IP,UE-Username,UE-Password}&password={PASS}
    Groups     1:null,3:&
    Found      TXT{UE-IP,UE-Username,UE-Password}
    Processed  http://ip:port/path?username=abcd&location={LOCATION}&password={PASS}

    Processing http://ip:port/path?username=abcd&location={LOCATION}&password={PASS}&TXT{UE-IP,UE-Username,UE-Password}
    Groups     1:null,3:null
    Found      TXT{UE-IP,UE-Username,UE-Password}
    Processed  http://ip:port/path?username=abcd&location={LOCATION}&password={PASS}

    Processing http://ip:port/path?TXT{UE-IP,UE-Username,UE-Password}&username=abcd&location={LOCATION}&password={PASS}
    Groups     1:?,3:&
    Found      TXT{UE-IP,UE-Username,UE-Password}
    Processed  http://ip:port/path?username=abcd&location={LOCATION}&password={PASS}

    Processing http://ip:port/path?TXT{UE-IP,UE-Username,UE-Password}
    Groups     1:?,3:null
    Found      TXT{UE-IP,UE-Username,UE-Password}
    Processed  http://ip:port/path

    Processing http://ip:port/path?username=abcd&password={PASS}
    Processed  http://ip:port/path?username=abcd&password={PASS}

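Which is perfect.

Now, my question

When I change the first match group, (\\?)?, to use a possessive quantifier, i.e. (\\?)?+, the output for the first item becomes: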

    Processing http://ip:port/path?username=abcd&location={LOCATION}&TXT{UE-IP,UE-Username,UE-Password}&password={PASS}
    Groups     1:?,3:&
    Found      TXT{UE-IP,UE-Username,UE-Password}
    Processed  http://ip:port/path?username=abcd&location={LOCATION}?password={PASS}

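I cannot work out where the question mark in the first match group comes from.

I cannot see any way for the pattern to match the required string correctly and still capture a question mark in the first group.

Am I just missing something obvious?

In case it matters, I am running on OS X Mavericks: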

    java version "1.8.0"
    Java(TM) SE Runtime Environment (build 1.8.0-b132)
    Java HotSpot(TM) 64-Bit Server VM (build 25.0-b70, mixed mode)

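I think this has to do with the way possessive quantifiers work. At first they behave like greedy quantifiers, in the sense that they try to match as much as possible. But unlike greedy quantifiers, once they have matched something they do not give that match up when the engine backtracks.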
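The difference between greedy and possessive can be seen on a much smaller pattern than the one in the question. The following is a minimal sketch; the class name `QuantifierDemo` and the helper `fullMatch` are made up for illustration and are not part of the original code.

```java
import java.util.regex.Pattern;

class QuantifierDemo {
    // Hypothetical helper: true when the regex matches the entire input.
    static boolean fullMatch(final String regex, final String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(final String[] args) {
        // Greedy: a? first takes the only "a", the trailing a then fails,
        // so a? gives the character back and the trailing a matches it.
        System.out.println(fullMatch("a?a", "a"));   // true

        // Possessive: a?+ takes the only "a" and never gives it back,
        // so nothing is left for the trailing a.
        System.out.println(fullMatch("a?+a", "a"));  // false

        // When no backtracking is needed, both forms behave the same.
        System.out.println(fullMatch("a?+a", "aa")); // true
    }
}
```

Here the possessive form simply fails, which is the documented behaviour. The surprising part in the question is not the failure itself but what is left behind in group 1 afterwards.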

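So, take your regex:

    "(\\?)?+&?(TXT\\{[^}]++})(&)?"

It first finds the ? before username, so it matches it and stores it in group 1. It then finds that the next character, the u of username, matches neither the optional & nor the T of TXT. So it backtracks and stops at the ?. Since the ? was matched by a possessive quantifier, that match is not given up.

Now it proceeds further along the string. At this point, group 1 still contains the ?. Now it matches the part:

    &TXT{UE-IP,UE-Username,UE-Password}&

Since the ? is optional, it matches nothing this time. But that does not replace what is already stored in group 1.

That means the ? you are getting comes from group 1 of the first match attempt.

This seems to be a bug in the Java regex engine, since in Perl the group comes out undefined. Here is a fiddle.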
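If the possessive form has to be kept, one way to avoid depending on what group 1 holds is to compare positions instead of testing for null. This is only a sketch under an assumption: because group 1 is the first element of the pattern, it can only have taken part in the current match if it starts exactly where the match starts. The class `StaleGroupGuard` and the methods `leadingGroupParticipated` and `strip` are hypothetical names, not part of the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StaleGroupGuard {
    // The possessive variant of the pattern from the question.
    static final Pattern PATT = Pattern.compile("(\\?)?+&?(TXT\\{[^}]++})(&)?");

    // Hypothetical helper: group 1 leads the pattern, so it belongs to the
    // current match only when it begins at the match start. A leftover
    // capture from an earlier attempt, or no capture at all (-1), fails this.
    static boolean leadingGroupParticipated(final Matcher m) {
        return m.start(1) == m.start();
    }

    // Removes the TXT{...} parameter, following the replacement rules of the
    // original code but without reading the text of group 1.
    static String strip(final String url) {
        final Matcher m = PATT.matcher(url);
        final StringBuffer sb = new StringBuffer();
        while (m.find()) {
            final boolean hasQuestionMark = leadingGroupParticipated(m);
            final boolean hasTrailingAmp = m.group(3) != null;
            final String replacement;
            if (hasQuestionMark && hasTrailingAmp) {
                replacement = "?"; // more parameters follow, keep the "?"
            } else if (hasTrailingAmp) {
                replacement = "&"; // removed from the middle, keep one "&"
            } else {
                replacement = "";  // removed from the end
            }
            m.appendReplacement(sb, replacement);
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(final String[] args) {
        System.out.println(strip(
            "http://ip:port/path?username=abcd&location={LOCATION}&TXT{UE-IP,UE-Username,UE-Password}&password={PASS}"));
    }
}
```

The position check gives the same answer whether or not the engine leaves a leftover value in group 1, so the result should not depend on the Java version. The simpler fix is still the original greedy `(\\?)?`, which already produces the expected output.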
