Regex to split a nested coordinate string

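I have a string of the form "[(1, 2), (2, 3), (3, 4)]", with an arbitrary number of elements. I'm trying to split it on the commas that separate the coordinates, that is, to retrieve (1, 2), (2, 3), and (3, 4).

Can I do this with a Java regex? I'm a complete noob, but I'm hoping Java regex is powerful enough for it. If it isn't, could you suggest an alternative?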

You can use String#split() for this.

    String string = "[(1, 2), (2, 3), (3, 4)]";
    string = string.substring(1, string.length() - 1); // Get rid of the square brackets.
    String[] parts = string.split("(?<=\\))(,\\s*)(?=\\()");
    for (String part : parts) {
        part = part.substring(1, part.length() - 1); // Get rid of the parentheses.
        String[] coords = part.split(",\\s*");
        int x = Integer.parseInt(coords[0]);
        int y = Integer.parseInt(coords[1]);
        System.out.printf("x=%d, y=%d\n", x, y);
    }

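The (?<=\\)) is a positive lookbehind, meaning the match must be preceded by ). The (?=\\() is a positive lookahead, meaning the match must be followed by (. The (,\\s*) means the string is split on the comma and any whitespace after it. The \\ is there only to escape regex-specific characters.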
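To see what the lookarounds buy you, here is a small sketch that runs a plain comma split and the lookaround split on the sample string. The class and method names (SplitDemo, splitPairs) are made up for illustration.

```java
import java.util.Arrays;

public class SplitDemo {
    // Splits "[(1, 2), (2, 3)]" into the pieces "(1, 2)", "(2, 3)", ...
    static String[] splitPairs(String input) {
        String inner = input.substring(1, input.length() - 1); // drop [ and ]
        // Split only on a comma that sits between ")" and "(".
        return inner.split("(?<=\\)),\\s*(?=\\()");
    }

    public static void main(String[] args) {
        String input = "[(1, 2), (2, 3), (3, 4)]";
        // A plain comma split also cuts inside each pair.
        System.out.println(Arrays.toString(input.split(",\\s*")));
        // The lookaround split keeps each pair in one piece.
        System.out.println(Arrays.toString(splitPairs(input)));
    }
}
```

The plain split produces six fragments such as "[(1" and "2)", while the lookaround split produces the three pairs intact.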

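That said, this particular String is recognizable as the output of List#toString(). Are you sure you're going about this the right way? ;)

Update, as per the comments: you can indeed also go the other way round and get rid of the non-digits: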

    String string = "[(1, 2), (2, 3), (3, 4)]";
    String[] parts = string.split("\\D.");
    for (int i = 1; i < parts.length; i += 3) {
        int x = Integer.parseInt(parts[i]);
        int y = Integer.parseInt(parts[i + 1]);
        System.out.printf("x=%d, y=%d\n", x, y);
    }

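Here \\D means it must split on any non-digit (\\d stands for digit). The . after it is meant to eliminate the empty matches after the digits. Each ", " inside a pair is consumed by a single match, while each "), (" between pairs takes two matches and so leaves one empty string behind; together with the empty string in front of the first number, that is why the loop starts at index 1 and advances by 3. I must admit, however, that I'm not sure how to eliminate the empty matches before the digits. I'm not a trained regex guru yet. Hey, Bart K, can you do it better?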
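One possible way around the empty matches, sketched here as an assumption rather than as the approach the answer settled on, is to strip the leading non-digits first and then split on runs of non-digits. The class and method names (DigitSplit, numbers) are illustrative, and the sketch only handles non-negative integers.

```java
import java.util.Arrays;

public class DigitSplit {
    // Returns the digit runs of the input, in order, with no empty strings
    // (as long as the input contains at least one digit).
    static String[] numbers(String input) {
        // Removing the leading non-digits avoids an empty first element;
        // split() already discards trailing empty strings on its own.
        return input.replaceFirst("^\\D+", "").split("\\D+");
    }

    public static void main(String[] args) {
        String[] parts = numbers("[(1, 2), (2, 3), (3, 4)]");
        System.out.println(Arrays.toString(parts));
        // Consecutive elements now form the pairs, so the loop can step by 2.
        for (int i = 0; i + 1 < parts.length; i += 2) {
            int x = Integer.parseInt(parts[i]);
            int y = Integer.parseInt(parts[i + 1]);
            System.out.printf("x=%d, y=%d%n", x, y);
        }
    }
}
```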

All in all, it's ultimately better to use a parser for this. See Hubert's answer on this topic.

As of Java 5:

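    // "input" is the coordinate string; Scanner has no no-argument constructor.
    // Coord is assumed to be a class with an (int x, int y) constructor.
    Scanner sc = new Scanner(input);
    sc.useDelimiter("\\D+"); // skip everything that is not a digit
    List<Coord> result = new ArrayList<Coord>();
    while (sc.hasNextInt()) {
        result.add(new Coord(sc.nextInt(), sc.nextInt()));
    }
    return result;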

Edit: we don't know how many coords are passed in the string coords.
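For completeness, here is a self-contained sketch of the Scanner approach. The Coord class and the names ScannerCoords and parse are hypothetical, since the answer only assumes that some Coord type exists. Because the number of coordinates is unknown, the sketch also rejects input with an odd number of integers instead of letting the second nextInt() fail.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class ScannerCoords {
    // Hypothetical value class standing in for the answer's Coord.
    static final class Coord {
        final int x;
        final int y;

        Coord(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    static List<Coord> parse(String input) {
        List<Coord> result = new ArrayList<>();
        try (Scanner sc = new Scanner(input)) {
            sc.useDelimiter("\\D+"); // every run of non-digits is a delimiter
            while (sc.hasNextInt()) {
                int x = sc.nextInt();
                if (!sc.hasNextInt()) {
                    throw new IllegalArgumentException("Odd number of integers in: " + input);
                }
                result.add(new Coord(x, sc.nextInt()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (Coord c : parse("[(1, 2), (2, 3), (3, 4)]")) {
            System.out.printf("x=%d, y=%d%n", c.x, c.y);
        }
    }
}
```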

If you don't need the expression to validate the syntax around the coordinates, this should do:

 \(\d+,\s\d+\)

This expression will return multiple matches (three for the input in your example).
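As a rough sketch, collecting those matches in Java could look like the following. The class and method names (PairMatches, findPairs) are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PairMatches {
    // Same expression as above, with the backslashes doubled for a Java string literal.
    private static final Pattern PAIR = Pattern.compile("\\(\\d+,\\s\\d+\\)");

    // Returns every substring that looks like "(digits, digits)".
    static List<String> findPairs(String input) {
        List<String> pairs = new ArrayList<>();
        Matcher matcher = PAIR.matcher(input);
        while (matcher.find()) {
            pairs.add(matcher.group());
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println(findPairs("[(1, 2), (2, 3), (3, 4)]"));
    }
}
```

Note that \s requires exactly one whitespace character after the comma, so a pair written as "(1,2)" would be skipped rather than reported as an error.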

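In your question you state that you want to "retrieve (1, 2), (2, 3), and (3, 4)". If you actually need the pair of values associated with each coordinate, you can drop the parentheses and modify the regex to do some capturing: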

 (\d+),\s(\d+)

The Java code would look something like this:

    import java.util.regex.*;

    public class Test {
        public static void main(String[] args) {
            Pattern pattern = Pattern.compile("(\\d+),\\s(\\d+)");
            Matcher matcher = pattern.matcher("[(1, 2), (2, 3), (3, 4)]");
            while (matcher.find()) {
                int x = Integer.parseInt(matcher.group(1));
                int y = Integer.parseInt(matcher.group(2));
                System.out.printf("x=%d, y=%d\n", x, y);
            }
        }
    }

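If you use a regex, you'll get poor error reporting, and things get much more complicated if your requirements change (for instance, if you have to parse the sets in different square brackets into different groups).

I recommend that you just write the parser by hand; it's about 10 lines of code and shouldn't be brittle. Track everything you encounter: open parens, close parens, open brackets and close brackets. It's like a switch statement with 5 options (and a default), really not that bad.

For a minimal approach, the open parens and open brackets can be ignored, so there are really only 3 cases.

This would be the bare minimum.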

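    // Java-like sketch: "input" is the string to parse, and "output" is a
    // hypothetical collector object with an addResult(int, int) method.
    int valueA = 0;
    String lastValue = "";
    StringTokenizer tokens = new StringTokenizer(input, "[](),", true);
    while (tokens.hasMoreTokens()) {
        String token = tokens.nextToken();
        if (token.equals(")")) {
            // The token before the ) is the second int of the pair, and the
            // first should already be stored.
            output.addResult(valueA, Integer.parseInt(lastValue.trim()));
        } else if (token.equals(",")) {
            // The token before the comma is the first int of the pair. (A comma
            // between pairs just re-stores the previous value, which is harmless.)
            valueA = Integer.parseInt(lastValue.trim());
        } else {
            // Just store this token and deal with it when we hit the proper delimiter.
            lastValue = token;
        }
    }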

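This is no better than a minimal regex-based solution, except that it will be much easier to maintain and enhance (adding error checking, adding a stack for paren and square-bracket matching, and checking for misplaced commas and other invalid syntax).

As an example of extensibility, if you had to place different square-bracket-delimited groups into different output sets, the addition is as simple as: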

    // When we close the square bracket, start a new output group.
    else if (token.equals("]")) {
        output.startNewGroup();
    }
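Putting the pieces together, a runnable version of this tokenizer approach might look like the sketch below. It replaces the hypothetical output collector with nested lists, and the names TokenParser and parse are assumptions of this sketch. It still does no real error checking: a stray comma at the very start, for example, surfaces as a NumberFormatException.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenParser {
    // Parses "[(1, 2), (2, 3)] [(5, 6)]" into one list of pairs per bracketed group.
    static List<List<int[]>> parse(String input) {
        List<List<int[]>> groups = new ArrayList<>();
        List<int[]> current = new ArrayList<>();
        int first = 0;
        String lastValue = "";
        // Passing true makes the tokenizer return the delimiters as tokens too.
        StringTokenizer tokens = new StringTokenizer(input, "[](),", true);
        while (tokens.hasMoreTokens()) {
            String token = tokens.nextToken().trim();
            if (token.isEmpty()) {
                continue; // whitespace between two delimiters
            }
            switch (token) {
                case ")": // lastValue is the second int; the first is already stored
                    current.add(new int[] {first, Integer.parseInt(lastValue)});
                    break;
                case ",": // lastValue is the first int (or a harmless repeat between pairs)
                    first = Integer.parseInt(lastValue);
                    break;
                case "]": // closing bracket: finish this group and start a new one
                    groups.add(current);
                    current = new ArrayList<>();
                    break;
                case "[":
                case "(":
                    break; // the minimal approach ignores the opening delimiters
                default:
                    lastValue = token;
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        for (List<int[]> group : parse("[(1, 2), (2, 3)] [(5, 6)]")) {
            for (int[] pair : group) {
                System.out.printf("x=%d, y=%d%n", pair[0], pair[1]);
            }
            System.out.println("end of group");
        }
    }
}
```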

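Checking the parens is as easy as creating a stack of chars and pushing each [ or ( onto the stack; then, when you get a ] or ), pop the stack and assert that it matches. Also, when you are done, make sure that stack.size() == 0.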
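The stack check described above can be sketched as follows. The names BracketCheck and isBalanced are illustrative, and the sketch returns a boolean rather than asserting, which is a design choice of this example and not something the answer prescribes.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class BracketCheck {
    // Returns true if every ( and [ is closed by the matching ) or ] in the right order.
    static boolean isBalanced(String input) {
        Deque<Character> stack = new ArrayDeque<>();
        for (char c : input.toCharArray()) {
            switch (c) {
                case '[':
                case '(':
                    stack.push(c); // remember the opener
                    break;
                case ']':
                case ')':
                    if (stack.isEmpty()) {
                        return false; // closer without any opener
                    }
                    char open = stack.pop();
                    if ((c == ')' && open != '(') || (c == ']' && open != '[')) {
                        return false; // closer does not match the most recent opener
                    }
                    break;
                default:
                    break; // digits, commas and whitespace are not checked here
            }
        }
        return stack.isEmpty(); // leftover openers mean something was never closed
    }

    public static void main(String[] args) {
        System.out.println(isBalanced("[(1, 2), (2, 3), (3, 4)]"));
        System.out.println(isBalanced("[(1, 2), (2, 3]"));
    }
}
```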

Will there always be 3 sets of coordinates to parse?

You could try:

\[(\(\d+,\s*\d+\)), (\(\d+,\s*\d+\)), (\(\d+,\s*\d+\))\]

With a regex, you can split on "(?<=\)),", which uses a positive lookbehind so that only a comma preceded by ) is matched:

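    // Backslashes are doubled for Java string literals; \\s* also consumes the
    // whitespace after each separating comma.
    String[] subs = str.replaceAll("\\[", "").replaceAll("\\]", "").split("(?<=\\)),\\s*");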

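With simple string functions, you can remove the [ and ], use string.split("),"), and then put the ) back on each piece afterwards. (In Java, split takes a regex, so the parenthesis has to be escaped there, for example split("\\),").)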
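A minimal sketch of that plain-string variant in Java follows. The names PlainSplit and splitPairs are made up for illustration, and Pattern.quote is used here as one way to make split treat ")," literally.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class PlainSplit {
    static String[] splitPairs(String input) {
        // Remove the surrounding square brackets.
        String inner = input.replace("[", "").replace("]", "");
        // Pattern.quote makes split() treat ")," as literal text, not as a regex.
        String[] parts = inner.split(Pattern.quote("),"));
        for (int i = 0; i < parts.length; i++) {
            parts[i] = parts[i].trim();
            // The split swallowed the ")" of every piece except the last; put it back.
            if (!parts[i].endsWith(")")) {
                parts[i] += ")";
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitPairs("[(1, 2), (2, 3), (3, 4)]")));
    }
}
```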
