How to get the text content contained in a CDATA section from XML in Java

I have the following XML:

    application/xml
    local-C++
    200 <![CDATA[]]>

I want to parse the following text out of the content node:

<![CDATA[]]>

Note that the content is wrapped in a CDATA section. How can I do this in Java, by any means?

Here is my code:

    @Test
    public void testGetDoOrchResponse() throws IOException {
        String path = "/Users/haddad/Git/Tools/ContentUtils/src/test/resources/testdata/doOrch_testfiles/doOrch_response.xml";
        File f = new File(path);
        String response = FileUtils.readFileToString(f);
        String content = getDoOrchResponse(response, "content");
        System.out.println("Content: " + content);
    }

// Output: Content: blank

    static String getDoOrchResponse(String xml, String tagFragment) throws FileNotFoundException {
        String content = new String();
        try {
            Document doc = getDocumentXML(xml);
            NodeList nlNodeExplanationList = doc.getElementsByTagName("response");
            for (int i = 0; i < nlNodeExplanationList.getLength(); i++) {
                Node explanationNode = nlNodeExplanationList.item(i);
                List<String> titleList = getTextValuesByTagName((Element) explanationNode, tagFragment);
                content = titleList.get(0);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return content;
    }

    static List<String> getTextValuesByTagName(Element element, String tagName) {
        NodeList nodeList = element.getElementsByTagName(tagName);
        ArrayList<String> list = new ArrayList<String>();
        for (int i = 0; i < nodeList.getLength(); i++) {
            String textValue = getTextValue(nodeList.item(i));
            if (textValue.equalsIgnoreCase("")) {
                textValue = "blank";
            }
            list.add(textValue);
        }
        return list;
    }

    static String getTextValue(Node node) {
        StringBuffer textValue = new StringBuffer();
        int length = node.getChildNodes().getLength();
        for (int i = 0; i < length; i++) {
            Node c = node.getChildNodes().item(i);
            if (c.getNodeType() == Node.TEXT_NODE) {
                textValue.append(c.getNodeValue());
            }
        }
        return textValue.toString().trim();
    }

    static Document getDocumentXML(String xml) throws FileNotFoundException {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db;
        Document doc = null;
        try {
            db = dbf.newDocumentBuilder();
            doc = db.parse(new InputSource(new ByteArrayInputStream(xml.getBytes("utf-8"))));
            doc.getDocumentElement().normalize();
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } catch (SAXException e) {
            e.printStackTrace();
        }
        return doc;
    }

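What exactly am I doing wrong? Why is my output blank? I just can't see it…

If you want to extract the content of an Element node, use the getTextContent() method. If you really need or want the CDATA section markup itself, you need to serialize the node with LSSerializer or something similar: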

    DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
    docFactory.setNamespaceAware(true);
    DocumentBuilder docBuilder = docFactory.newDocumentBuilder();
    Document doc = docBuilder.parse(new File("doc1.xml"));
    Element content = (Element) doc.getElementsByTagNameNS("http://comResponse.engine/response", "content").item(0);
    if (content != null) {
        System.out.println(content.getTextContent());
        LSSerializer ser = ((DOMImplementationLS) doc.getImplementation()).createLSSerializer();
        if (content.getFirstChild() != null) {
            System.out.println(ser.writeToString(content.getFirstChild()));
        }
    }

That is the theory. For me, Java JRE 1.8 outputs the CDATA section without its end marker, so it looks like LSSerializer does not work properly on a lone CDATA section node.
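
For what it's worth, here is a minimal self-contained sketch of why the question's code prints "blank" and of two ways around it. With a default (non-coalescing) DocumentBuilderFactory, a CDATA section shows up as a child of type Node.CDATA_SECTION_NODE, and the question's getTextValue only accepts Node.TEXT_NODE. The class name CdataExample, the SAMPLE document, and the helper names textOf, textOfChildren and cdataMarkup are all made up for illustration; the real response file is not shown in the question, so the element layout here is an assumption. The last helper rebuilds the CDATA markup by hand, which may be a usable workaround if LSSerializer misbehaves as described above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class CdataExample {

    // Hypothetical sample input; element names and payload are assumptions.
    static final String SAMPLE =
        "<response><status>200</status>"
        + "<content><![CDATA[<html>hello</html>]]></content></response>";

    // Parses an XML string with a default (non-coalescing) factory.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        return db.parse(new InputSource(new StringReader(xml)));
    }

    // Option 1: getTextContent() concatenates text and CDATA children alike.
    // Returns null when no element with that tag name exists.
    static String textOf(Document doc, String tagName) {
        Node node = doc.getElementsByTagName(tagName).item(0);
        return node == null ? null : node.getTextContent();
    }

    // Option 2: the question's manual child scan, extended to accept
    // CDATA_SECTION_NODE in addition to TEXT_NODE.
    static String textOfChildren(Node node) {
        StringBuilder sb = new StringBuilder();
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node c = children.item(i);
            short type = c.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                sb.append(c.getNodeValue());
            }
        }
        return sb.toString().trim();
    }

    // Rebuilds the CDATA markup by hand for every CDATA child of the node,
    // without going through LSSerializer.
    static String cdataMarkup(Node node) {
        StringBuilder sb = new StringBuilder();
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node c = children.item(i);
            if (c.getNodeType() == Node.CDATA_SECTION_NODE) {
                sb.append("<![CDATA[").append(c.getNodeValue()).append("]]>");
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE);
        Node content = doc.getElementsByTagName("content").item(0);
        System.out.println(textOf(doc, "content"));   // <html>hello</html>
        System.out.println(textOfChildren(content));  // <html>hello</html>
        System.out.println(cdataMarkup(content));     // <![CDATA[<html>hello</html>]]>
    }
}
```

Calling setCoalescing(true) on the factory should also let the original TEXT_NODE check pass, since the parser then converts CDATA sections into ordinary text nodes, but after that the CDATA boundaries can no longer be recovered from the DOM.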
