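Java regular expression Matcher doesn't find all possible matches

I was looking at some code on TutorialsPoint, and something about it has been bothering me ever since. Take a look at this code: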

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class RegexMatches {
        public static void main(String[] args) {
            // String to be scanned to find the pattern.
            String line = "This order was placed for QT3000! OK?";
            String pattern = "(.*)(\\d+)(.*)";

            // Create a Pattern object
            Pattern r = Pattern.compile(pattern);

            // Now create matcher object.
            Matcher m = r.matcher(line);
            while (m.find()) {
                System.out.println("Found value: " + m.group(1));
                System.out.println("Found value: " + m.group(2));
                System.out.println("Found value: " + m.group(3));
            }
        }
    }

This code successfully prints:

    Found value: This order was placed for QT300
    Found value: 0
    Found value: ! OK?

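But given the regex "(.*)(\\d+)(.*)", why doesn't it return the other possible results, such as:

    Found value: This order was placed for QT30
    Found value: 00
    Found value: ! OK?

or

    Found value: This order was placed for QT
    Found value: 3000
    Found value: ! OK?

If this code isn't suited to doing that, how can I write code that finds all possible matches?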

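This is because of the greediness of * and backtracking.

String:

    This order was placed for QT3000! OK?

Regex:

    (.*)(\\d+)(.*)

We all know that .* is greedy and matches as many characters as possible. So the first .* matches every character up to and including the last one, the ?. It then backtracks, one character at a time, so that the rest of the pattern can match. The next piece of the regex is \d+, so the engine backtracks until it reaches a digit. As soon as it finds one, \d+ matches it, because the condition is already satisfied (\d+ matches one or more digits) and the engine has no reason to backtrack any further. So the first (.*) captures This order was placed for QT300, and the following (\\d+) captures the single digit 0 that sits just before the ! symbol.

The last (.*) then captures all the remaining characters, ! OK?. m.group(1) returns the characters captured by group 1, m.group(2) returns those captured by group 2, and so on.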
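To see where the backtracking stops, it can help to print the offsets of the digit group rather than its text. The following is a minimal sketch, not part of the original answer; the class name GreedyOffsets and the helper digitSpan are made up for illustration, while Matcher.start(int) and Matcher.end(int) are standard java.util.regex methods.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, introduced only for this illustration.
class GreedyOffsets {
    // Returns {start, end} of group 2 for the greedy pattern,
    // or null when the line contains no match at all.
    static int[] digitSpan(String line) {
        Matcher m = Pattern.compile("(.*)(\\d+)(.*)").matcher(line);
        if (!m.find()) {
            return null;
        }
        return new int[] { m.start(2), m.end(2) };
    }

    public static void main(String[] args) {
        String line = "This order was placed for QT3000! OK?";
        int[] span = digitSpan(line);
        // Show which slice of the input the digit group ended up with.
        System.out.println("group 2 spans [" + span[0] + ", " + span[1] + "): "
                + line.substring(span[0], span[1]));
    }
}
```

The reported span covers only the last digit of 3000, which is the first position where the greedy (.*) can give something back and still let \d+ succeed.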


To get the output you wanted:

    String line = "This order was placed for QT3000! OK?";
    String pattern = "(.*)(\\d{2})(.*)";

    // Create a Pattern object
    Pattern r = Pattern.compile(pattern);

    // Now create matcher object.
    Matcher m = r.matcher(line);
    while (m.find()) {
        System.out.println("Found value: " + m.group(1));
        System.out.println("Found value: " + m.group(2));
        System.out.println("Found value: " + m.group(3));
    }

Output:

    Found value: This order was placed for QT30
    Found value: 00
    Found value: ! OK?

With (.*)(\\d{2}), the engine has to backtrack until exactly two digits are available before it can provide a match.
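The same idea can be generalized by varying the digit count. This is only a sketch under assumptions: the class DigitCountSweep and its split helper are hypothetical names, and the loop simply rebuilds the pattern with \d{n} for n = 1, 2, 3, ... until nothing matches.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, introduced only for this illustration.
class DigitCountSweep {
    // Applies (.*)(\d{n})(.*) and returns the three groups,
    // or null when the line has no run of n consecutive digits.
    static String[] split(String line, int n) {
        Matcher m = Pattern.compile("(.*)(\\d{" + n + "})(.*)").matcher(line);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String line = "This order was placed for QT3000! OK?";
        // Grow the required digit count until the pattern stops matching.
        for (int n = 1; ; n++) {
            String[] parts = split(line, n);
            if (parts == null) {
                break;
            }
            System.out.println("n=" + n + ": [" + parts[0] + "] [" + parts[1] + "] [" + parts[2] + "]");
        }
    }
}
```

Because the leading (.*) is still greedy, each n yields the rightmost run of n digits, so this lists the splits 0, 00, 000 and 3000, but not every conceivable split.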

Change your pattern to this,

    String pattern = "(.*?)(\\d+)(.*)";

to get the output:

    Found value: This order was placed for QT
    Found value: 3000
    Found value: ! OK?

The ? after * forces the * to do a non-greedy match.
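The non-greedy marker can be applied to \d+ as well, which changes the result again. The sketch below compares the two variants; the class LazyVariants and the groups helper are assumed names used only for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, introduced only for this illustration.
class LazyVariants {
    // Returns the three captured groups joined by " | ",
    // or null when the regex does not match the line.
    static String groups(String regex, String line) {
        Matcher m = Pattern.compile(regex).matcher(line);
        if (!m.find()) {
            return null;
        }
        return m.group(1) + " | " + m.group(2) + " | " + m.group(3);
    }

    public static void main(String[] args) {
        String line = "This order was placed for QT3000! OK?";
        // Lazy prefix, greedy digits: the digit group takes the whole run.
        System.out.println(groups("(.*?)(\\d+)(.*)", line));
        // prints: This order was placed for QT | 3000 | ! OK?

        // Lazy prefix, lazy digits: the digit group stops after one digit.
        System.out.println(groups("(.*?)(\\d+?)(.*)", line));
        // prints: This order was placed for QT | 3 | 000! OK?
    }
}
```

In both cases the lazy (.*?) stops at the first digit; whether the digit group then takes one digit or all four depends only on whether \d+ itself is greedy.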

Use extra capturing groups to get both outputs from a single program.

    String line = "This order was placed for QT3000! OK?";
    String pattern = "((.*?)(\\d{2}))(?:(\\d{2})(.*))";
    Pattern r = Pattern.compile(pattern);
    Matcher m = r.matcher(line);
    while (m.find()) {
        // First split: prefix + "30", then "00", then the rest
        System.out.println("Found value: " + m.group(1));
        System.out.println("Found value: " + m.group(4));
        System.out.println("Found value: " + m.group(5));
        // Second split: prefix, then "30" + "00", then the rest
        System.out.println("Found value: " + m.group(2));
        System.out.println("Found value: " + m.group(3) + m.group(4));
        System.out.println("Found value: " + m.group(5));
    }

Output:

    Found value: This order was placed for QT30
    Found value: 00
    Found value: ! OK?
    Found value: This order was placed for QT
    Found value: 3000
    Found value: ! OK?

    (.*?)(\\d+)(.*)

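Make the greedy quantifier * non-greedy by writing it as *?.

Because your first group (.*) is greedy, it captures everything it can and leaves only a single 0 for (\\d+) to capture. If you make it non-greedy, it gives you the expected result. See the demo.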

https://regex101.com/r/tX2bH4/53
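A single Matcher.find() call reports one match per starting position, so none of the patterns above lists every split at once. If the goal really is every way to cut the line into prefix, digits and suffix, one option is to skip the regex and enumerate the digit substrings directly. This is a sketch under stated assumptions: the class AllSplits and its methods are hypothetical, only ASCII digits are counted (as \d does by default in Java), and the line is assumed to contain no line terminators, since . would not match those.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, introduced only for this illustration.
class AllSplits {
    // Mirrors the default meaning of \d in java.util.regex: ASCII 0-9 only.
    static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // Lists every {prefix, digits, suffix} triple where digits is a
    // non-empty run of digits and the three parts rebuild the whole line.
    static List<String[]> allSplits(String line) {
        List<String[]> result = new ArrayList<>();
        for (int start = 0; start < line.length(); start++) {
            // Extend the digit run one character at a time while it stays all digits.
            for (int end = start + 1;
                 end <= line.length() && isAsciiDigit(line.charAt(end - 1));
                 end++) {
                result.add(new String[] {
                    line.substring(0, start),
                    line.substring(start, end),
                    line.substring(end)
                });
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (String[] parts : allSplits("This order was placed for QT3000! OK?")) {
            System.out.println("[" + parts[0] + "] [" + parts[1] + "] [" + parts[2] + "]");
        }
    }
}
```

For the sample line this yields ten splits, one for each non-empty substring of 3000, including the three discussed in the question.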