How exactly does the String.split() method in Java work when a regular expression is provided?

I am preparing for the OCPJP exam and came across the following example:

    class Test {
        public static void main(String args[]) {
            String test = "I am preparing for OCPJP";
            String[] tokens = test.split("\\S");
            System.out.println(tokens.length);
        }
    }

This code prints 16. I was expecting something like no_of_characters + 1. Can someone explain what the split() method actually does in this case? I just don't get it...

It splits on every match of "\\S", which the regex engine sees as \S: a single non-whitespace character.
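To make the escaping concrete, here is a minimal sketch. The class name `NonWhitespaceRegex` and the helper `isSingleNonWhitespace` are made up for this illustration; only `String.length()` and `String.matches()` are JDK methods.

```java
class NonWhitespaceRegex {
    // Hypothetical helper: true when the whole input is exactly one
    // non-whitespace character, i.e. the entire string matches \S.
    static boolean isSingleNonWhitespace(String input) {
        return input.matches("\\S");
    }

    public static void main(String[] args) {
        // The Java literal "\\S" holds two characters: a backslash and an S.
        System.out.println("\\S".length());              // 2
        System.out.println(isSingleNonWhitespace("x"));  // true
        System.out.println(isSingleNonWhitespace(" "));  // false
    }
}
```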

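So let's try splitting "x x" on non-whitespace (\S). Since this regex matches a single character, let's iterate over the characters and mark the places where a split happens (we will use a pipe | for that).

  • Is 'x' non-whitespace? Yes, so we mark it: | x
  • Is ' ' non-whitespace? No, so we leave it as is
  • Is the last 'x' non-whitespace? Yes, so we mark it: | |

So we need to split our string at its start and at its end, which initially gives the result array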

 ["", " ", ""]
    ^    ^ - here we split

But since trailing empty strings are removed, the result will be

 [""," "]     <- result
         ,""] <- removed trailing empty string

So split returns the array ["", " "], which contains only two elements.
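The "x x" walk-through can be checked with a short sketch. The class name `SplitOnNonWhitespace` and the helper `tokens` are hypothetical names chosen for this example; the only JDK call involved is `String.split(String)`.

```java
class SplitOnNonWhitespace {
    // Hypothetical helper: split on every non-whitespace character (regex \S).
    static String[] tokens(String input) {
        return input.split("\\S");
    }

    public static void main(String[] args) {
        String[] result = tokens("x x");
        // Wrap each token in quotes so empty strings and spaces are visible.
        for (String token : result) {
            System.out.println("\"" + token + "\"");
        }
        System.out.println(result.length); // 2
    }
}
```

The leading empty string is kept because the first match has positive width and sits at the very start of the input; only the trailing one is dropped.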

BTW, to turn off the removal of trailing empty strings, use split(regex, limit) with a negative limit, e.g. split("\\S", -1).
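As a rough comparison of the two overloads, assuming the same "x x" input as before (the class name `SplitKeepTrailing` and the helper `tokensKeepingTrailing` are invented for this sketch):

```java
class SplitKeepTrailing {
    // Hypothetical helper: same regex, but a negative limit tells split
    // to keep trailing empty strings instead of discarding them.
    static String[] tokensKeepingTrailing(String input) {
        return input.split("\\S", -1);
    }

    public static void main(String[] args) {
        System.out.println("x x".split("\\S").length);            // 2
        System.out.println(tokensKeepingTrailing("x x").length);  // 3
    }
}
```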


Now let's get back to your example. With your data, you are splitting on each of

 I am preparing for OCPJP
 | || ||||||||| ||| |||||

which means

  ""|" "|""|" "|""|""|""|""|""|""|""|""|" "|""|""|" "|""|""|""|""|"" 

So this represents the array

 [""," ",""," ","","","","","","","",""," ","",""," ","","","","",""] 

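But trailing empty strings "" are removed (if their existence was caused by the split; more info at: Confusing output from String.split)

 [""," ",""," ","","","","","","","",""," ","",""," ","","","","",""]
                                                      ^^ ^^ ^^ ^^ ^^

so the resulting array contains only this part: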

 [""," ",""," ","","","","","","","",""," ","",""," "] 

which is exactly 16 elements.
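The counts can be reproduced with a small sketch. The class name `SplitCount` and the helper `nonWhitespaceCount` are hypothetical; `Pattern`, `Matcher` and `String.split` are standard JDK APIs. It also suggests where the "number of characters + 1" intuition does hold: with a negative limit, the array length is the number of non-whitespace characters plus one.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SplitCount {
    // Hypothetical helper: counts the matches of \S, i.e. the number of split points.
    static int nonWhitespaceCount(String input) {
        Matcher matcher = Pattern.compile("\\S").matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String test = "I am preparing for OCPJP";
        System.out.println(nonWhitespaceCount(test));      // 20 split points
        System.out.println(test.split("\\S", -1).length);  // 21 = 20 + 1, nothing removed
        System.out.println(test.split("\\S").length);      // 16, five trailing "" removed
    }
}
```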