Java, XML, XSLT: Prevent DTD validation

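I'm using the Java (6) XML API to apply an XSLT transformation to an HTML document from the web. The document is well-formed XHTML and therefore contains a valid DTD declaration. Now the problem: during the transformation, the XSLT processor tries to download the DTD, and the W3 server refuses the request with an HTTP 503 error (due to W3's bandwidth limits).

How can I prevent the XSLT processor from downloading the DTD? I don't need my input document to be validated.

The source is: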

    import java.io.StringReader;
    import javax.xml.transform.Source;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.stream.StreamResult;
    import javax.xml.transform.stream.StreamSource;

    String xslt =
        "" +
        "" +
        " " +
        "  " +
        " " +
        " " +
        " " +
        "";
    try {
        Source xmlSource = new StreamSource("http://de.wikipedia.org/wiki/Right_Livelihood_Award");
        Source xsltSource = new StreamSource(new StringReader(xslt));
        TransformerFactory ft = TransformerFactory.newInstance();
        Transformer trans = ft.newTransformer(xsltSource);
        trans.transform(xmlSource, new StreamResult(System.out));
    } catch (Exception e) {
        e.printStackTrace();
    }

I have read the following questions here, but they all use a different XML API:

  • "DTD download error while parsing XHTML document in XOM"

Thanks!

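I recently ran into this problem while unmarshalling XML with JAXB. The answer was to create a SAXSource from an XMLReader and an InputSource, then pass it to the unmarshal() method of the JAXB Unmarshaller. To avoid loading the external DTD, I set a custom EntityResolver on the XMLReader.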

    SAXParserFactory spf = SAXParserFactory.newInstance();
    SAXParser sp = spf.newSAXParser();
    XMLReader xmlr = sp.getXMLReader();
    xmlr.setEntityResolver(new EntityResolver() {
        public InputSource resolveEntity(String pid, String sid) throws SAXException {
            // Serve the known DTD from a local copy instead of fetching it
            if (sid.equals("your remote dtd url here"))
                return new InputSource(new StringReader("actual contents of remote dtd"));
            throw new SAXException("unable to resolve remote entity, sid = " + sid);
        }
    });
    SAXSource ss = new SAXSource(xmlr, myInputSource);

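As written, this custom entity resolver throws an exception if it is ever asked to resolve any entity other than the one you expect. If you would rather have it go ahead and load the remote entity, replace the "throw" line with "return null;", which tells the parser to fall back to its default resolution.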
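Serving real DTD contents, rather than an empty stub, matters when the document uses entities that the DTD declares. Here is a minimal sketch of the same idea with the fallback variant. The system ID, the one-line DTD, and the class and method names are all made up for illustration:

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class LocalDtdResolverDemo {
    // Hypothetical system ID and DTD contents, used only for this illustration.
    static final String DTD_URL = "http://example.invalid/dtd/sample.dtd";
    static final String LOCAL_DTD = "<!ENTITY nbsp \"&#160;\">";

    /** Parses xml and returns its concatenated text, serving DTD_URL from memory. */
    static String textOf(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setEntityResolver(new EntityResolver() {
            public InputSource resolveEntity(String publicId, String systemId) {
                if (DTD_URL.equals(systemId)) {
                    // Known DTD: hand the parser a local copy, no network access
                    return new InputSource(new StringReader(LOCAL_DTD));
                }
                // Anything else: let the parser resolve it the default way
                return null;
            }
        });
        final StringBuilder text = new StringBuilder();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE p SYSTEM \"" + DTD_URL + "\"><p>a&nbsp;b</p>";
        // The &nbsp; reference is expanded using the locally supplied DTD
        System.out.println(textOf(xml));
    }
}
```

Because the resolver supplies the entity declaration, the `&nbsp;` reference expands without the parser ever contacting the remote host.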

Try setting a feature on the DocumentBuilderFactory:

    URL url = new URL(urlString);
    InputStream is = url.openStream();
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
    DocumentBuilder db;
    db = dbf.newDocumentBuilder();
    Document result = db.parse(is);
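To try this feature without touching the network, the sketch below parses an in-memory document whose DOCTYPE points at an unreachable host. It assumes a Xerces-based parser such as the one bundled with the JDK; other parsers may reject the feature name with a ParserConfigurationException. The class name, method name, and sample document are invented for this example:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class SkipExternalDtdDemo {
    // Xerces-specific feature name, taken from the snippet above
    static final String LOAD_EXTERNAL_DTD =
        "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    /** Parses xml into a DOM tree without fetching the external DTD. */
    static Document parseWithoutDtd(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setFeature(LOAD_EXTERNAL_DTD, false);
        DocumentBuilder db = dbf.newDocumentBuilder();
        return db.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical document; the DTD URL is deliberately unreachable
        String xml = "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" "
            + "\"http://example.invalid/xhtml1-strict.dtd\">"
            + "<html><head><title>Hello</title></head><body/></html>";
        Document doc = parseWithoutDtd(xml);
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

If the feature were ignored, the parse would fail trying to reach the host, so a clean run is a quick way to check that your parser honors it.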

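I just ran into the same problem in XSLT 2 when calling the document function to parse external XHTML pages.

The previous answers led me to the solution, but it wasn't obvious to me, so here is a complete answer: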

    private void convert(InputStream xsltInputStream, InputStream srcInputStream, OutputStream destOutputStream)
            throws SAXException, ParserConfigurationException, TransformerFactoryConfigurationError,
                   TransformerException, IOException {
        // create a parser with a fake entity resolver to disable DTD download and validation
        XMLReader xmlReader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        xmlReader.setEntityResolver(new EntityResolver() {
            public InputSource resolveEntity(String pid, String sid) throws SAXException {
                return new InputSource(new ByteArrayInputStream(new byte[] {}));
            }
        });

        // create the transformer
        Source xsltSource = new StreamSource(xsltInputStream);
        Transformer transformer = TransformerFactory.newInstance().newTransformer(xsltSource);

        // create the source for the XML document which uses the reader with fake entity resolver
        Source xmlSource = new SAXSource(xmlReader, new InputSource(srcInputStream));
        transformer.transform(xmlSource, new StreamResult(destOutputStream));
    }
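The approach can be exercised end to end with in-memory strings, as in the following sketch. The stylesheet, the sample document, and the class and method names are made up for illustration. One deliberate difference from the method above: the parser factory is set to namespace-aware, which seems the safer choice when the reader feeds an XSLT processor.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

class NoDtdTransformDemo {
    /** Applies xslt to xml, answering every external entity request with empty content. */
    static String transform(String xslt, String xml) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true); // assumption: safer default for XSLT input
        XMLReader reader = spf.newSAXParser().getXMLReader();
        reader.setEntityResolver(new EntityResolver() {
            public InputSource resolveEntity(String publicId, String systemId) {
                // Empty stand-in for the DTD, so nothing is downloaded
                return new InputSource(new StringReader(""));
            }
        });
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(
            new SAXSource(reader, new InputSource(new StringReader(xml))),
            new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical stylesheet that extracts the page title as plain text
        String xslt = "<xsl:stylesheet version=\"1.0\" "
            + "xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"text\"/>"
            + "<xsl:template match=\"/\"><xsl:value-of select=\"/html/head/title\"/></xsl:template>"
            + "</xsl:stylesheet>";
        // Hypothetical document; the DTD URL is deliberately unreachable
        String xml = "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" "
            + "\"http://example.invalid/xhtml1-strict.dtd\">"
            + "<html><head><title>Hello</title></head><body/></html>";
        System.out.println(transform(xslt, xml));
    }
}
```

Keep in mind that an empty DTD also means no entity declarations, so documents that rely on DTD-declared entities such as `&nbsp;` need a resolver that returns real DTD contents instead.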

If you use

    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();

you can try disabling DTD validation with the following code:

    dbf.setValidating(false);

You need to use javax.xml.parsers.DocumentBuilderFactory:

    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    factory.setValidating(false);
    DocumentBuilder builder = factory.newDocumentBuilder();
    InputSource src = new InputSource("http://de.wikipedia.org/wiki/Right_Livelihood_Award");
    // Parse the InputSource itself; getByteStream() is null when only a system ID was given
    Document xmlDocument = builder.parse(src);
    DOMSource source = new DOMSource(xmlDocument);
    TransformerFactory tf = TransformerFactory.newInstance();
    // xsltSource is the stylesheet Source from the question
    Transformer transformer = tf.newTransformer(xsltSource);
    transformer.transform(source, new StreamResult(System.out));
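
One caveat: setValidating(false) is already the default and only switches off validation, so a non-validating parser may still fetch the external DTD. A possible way to combine the DOM route with the entity resolver idea from the earlier answers is sketched below. The class and method names are invented for illustration, and the resolver simply returns empty content for every external entity:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

class DomNoDtdTransformDemo {
    /** Builds a DOM without downloading the DTD, then applies xslt to it. */
    static String transform(String xslt, String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // XSLT processors expect a namespace-aware DOM
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setEntityResolver(new EntityResolver() {
            public InputSource resolveEntity(String publicId, String systemId) {
                // Empty stand-in for the DTD, so nothing is downloaded
                return new InputSource(new StringReader(""));
            }
        });
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical stylesheet and document, for illustration only
        String xslt = "<xsl:stylesheet version=\"1.0\" "
            + "xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"text\"/>"
            + "<xsl:template match=\"/\"><xsl:value-of select=\"/html/head/title\"/></xsl:template>"
            + "</xsl:stylesheet>";
        String xml = "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" "
            + "\"http://example.invalid/xhtml1-strict.dtd\">"
            + "<html><head><title>Hello</title></head><body/></html>";
        System.out.println(transform(xslt, xml));
    }
}
```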