Trying to make a simple PDF document with Apache POI

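I see the internet is riddled with people complaining about Apache's PDF products, but I cannot find my particular use case here. I am trying to do a simple Hello World with Apache POI. Right now my code is as follows: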

    public ByteArrayOutputStream export() throws IOException {

        //Blank Document
        XWPFDocument document = new XWPFDocument();

        //Write the Document in file system
        ByteArrayOutputStream out = new ByteArrayOutputStream();

        //create table
        XWPFTable table = document.createTable();
        XWPFStyles styles = document.createStyles();
        styles.setSpellingLanguage("English");

        //create first row
        XWPFTableRow tableRowOne = table.getRow(0);
        tableRowOne.getCell(0).setText("col one, row one");
        tableRowOne.addNewTableCell().setText("col two, row one");
        tableRowOne.addNewTableCell().setText("col three, row one");

        //create second row
        XWPFTableRow tableRowTwo = table.createRow();
        tableRowTwo.getCell(0).setText("col one, row two");
        tableRowTwo.getCell(1).setText("col two, row two");
        tableRowTwo.getCell(2).setText("col three, row two");

        //create third row
        XWPFTableRow tableRowThree = table.createRow();
        tableRowThree.getCell(0).setText("col one, row three");
        tableRowThree.getCell(1).setText("col two, row three");
        tableRowThree.getCell(2).setText("col three, row three");

        PdfOptions options = PdfOptions.create();
        PdfConverter.getInstance().convert(document, out, options);
        out.close();
        return out;
    }

And the code that calls it is:

    public ResponseEntity convertToPDFPost(
            @ApiParam(value = "DTOs passed from the FE", required = true)
            @Valid @RequestBody ExportEnvelopeDTO exportDtos) {

        if (exportDtos.getProdExportDTOs() != null) {
            try {
                FileOutputStream out = new FileOutputStream("/Users/kornhaus/Desktop/test.pdf");
                out.write(exporter.export().toByteArray());
                out.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
            return new ResponseEntity(responseFile, responseHeaders, HttpStatus.OK);
        }

        return new ResponseEntity(HttpStatus.INTERNAL_SERVER_ERROR);
    }
    }

At this line: out.write(exporter.export().toByteArray()); the code throws an exception:

 org.apache.poi.xwpf.converter.core.XWPFConverterException: java.io.IOException: Unable to parse xml bean

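I have no idea what is causing this, or even where to look for documentation on this kind of problem. I have been coding for more than a decade and have never had this much difficulty with what should be a simple Java library. Any help would be great.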

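The main problem here is that PdfOptions and PdfConverter are not part of the Apache POI project. They are developed by opensagres, and the first versions were named org.apache.poi.xwpf.converter.pdf.PdfOptions and org.apache.poi.xwpf.converter.pdf.PdfConverter, even though they do not belong to Apache POI. Those old classes have not been updated since 2014 and require version 3.9 of Apache POI.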

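However, the same developers also provide fr.opensagres.poi.xwpf.converter.pdf, which is more up to date and works with Apache POI 3.17, the latest stable release at the time of writing. So that is what we should use.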

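But even these newer PdfOptions and PdfConverter classes are not part of the Apache POI project, so Apache POI does not test against them in its releases. As a result, the default *.docx documents created by Apache POI lack some content that PdfConverter needs: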

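There must be a styles document, even if it is empty.
There must be section properties for the page, with at least the page size set.
Tables must have a table grid set.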

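To fulfill these requirements we have to add some extra code to our program. Unfortunately, this requires the full jar of all the schemas, ooxml-schemas-1.3.jar, as mentioned in Faq-N10025.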

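And because we are changing the underlying low-level objects, the document must be written out so that those underlying objects are committed. Otherwise the XWPFDocument we hand over to PdfConverter will be incomplete.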

Example:

    import java.io.*;
    import java.math.BigInteger;

    //needed jars: fr.opensagres.poi.xwpf.converter.core-2.0.1.jar,
    //             fr.opensagres.poi.xwpf.converter.pdf-2.0.1.jar,
    //             fr.opensagres.xdocreport.itext.extension-2.0.1.jar,
    //             itext-2.1.7.jar
    import fr.opensagres.poi.xwpf.converter.pdf.PdfOptions;
    import fr.opensagres.poi.xwpf.converter.pdf.PdfConverter;

    //needed jars: apache poi and its dependencies
    //             and additionally: ooxml-schemas-1.3.jar
    import org.apache.poi.xwpf.usermodel.*;
    import org.apache.poi.util.Units;
    import org.openxmlformats.schemas.wordprocessingml.x2006.main.*;

    public class XWPFToPDFConverterSampleMin {

        public static void main(String[] args) throws Exception {

            XWPFDocument document = new XWPFDocument();

            // there must be a styles document, even if it is empty
            XWPFStyles styles = document.createStyles();

            // there must be section properties for the page having at least the page size set
            CTSectPr sectPr = document.getDocument().getBody().addNewSectPr();
            CTPageSz pageSz = sectPr.addNewPgSz();
            pageSz.setW(BigInteger.valueOf(12240)); //12240 Twips = 12240/20 = 612 pt = 612/72 = 8.5"
            pageSz.setH(BigInteger.valueOf(15840)); //15840 Twips = 15840/20 = 792 pt = 792/72 = 11"

            // filling the body
            XWPFParagraph paragraph = document.createParagraph();

            //create table
            XWPFTable table = document.createTable();

            //create first row
            XWPFTableRow tableRowOne = table.getRow(0);
            tableRowOne.getCell(0).setText("col one, row one");
            tableRowOne.addNewTableCell().setText("col two, row one");
            tableRowOne.addNewTableCell().setText("col three, row one");

            //create CTTblGrid for this table with widths of the 3 columns.
            //necessary for Libreoffice/Openoffice and PdfConverter to accept the column widths.
            //values are in unit twentieths of a point (1/1440 of an inch)
            //first column = 2 inches width
            table.getCTTbl().addNewTblGrid().addNewGridCol().setW(BigInteger.valueOf(2*1440));
            //other columns (2 in this case) also each 2 inches width
            for (int col = 1; col < 3; col++) {
                table.getCTTbl().getTblGrid().addNewGridCol().setW(BigInteger.valueOf(2*1440));
            }

            //create second row
            XWPFTableRow tableRowTwo = table.createRow();
            tableRowTwo.getCell(0).setText("col one, row two");
            tableRowTwo.getCell(1).setText("col two, row two");
            tableRowTwo.getCell(2).setText("col three, row two");

            //create third row
            XWPFTableRow tableRowThree = table.createRow();
            tableRowThree.getCell(0).setText("col one, row three");
            tableRowThree.getCell(1).setText("col two, row three");
            tableRowThree.getCell(2).setText("col three, row three");

            paragraph = document.createParagraph();

            //trying picture
            XWPFRun run = paragraph.createRun();
            run.setText("The picture in line: ");
            InputStream in = new FileInputStream("samplePict.jpeg");
            run.addPicture(in, Document.PICTURE_TYPE_JPEG, "samplePict.jpeg", Units.toEMU(100), Units.toEMU(30));
            in.close();
            run.setText(" text after the picture.");

            paragraph = document.createParagraph();

            //document must be written so underlying objects will be committed
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            document.write(out);
            document.close();

            //re-read the committed document and convert it
            document = new XWPFDocument(new ByteArrayInputStream(out.toByteArray()));
            PdfOptions options = PdfOptions.create();
            PdfConverter converter = (PdfConverter)PdfConverter.getInstance();
            converter.convert(document, new FileOutputStream("XWPFToPDFConverterSampleMin.pdf"), options);

            document.close();
        }
    }
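The page size and grid column widths in the example are given in twips (twentieths of a point, 1440 per inch), as the code comments note. As a minimal sketch of that arithmetic, a small helper could do the conversions. The class name TwipUnits and its methods are hypothetical and are not part of Apache POI or opensagres; only java.lang is used.

```java
// Hypothetical helper (not part of Apache POI): unit conversions for twips.
final class TwipUnits {

    /** A twip is one twentieth of a typographic point. */
    static final int TWIPS_PER_POINT = 20;

    /** There are 72 points in an inch. */
    static final int POINTS_PER_INCH = 72;

    /** 20 * 72 = 1440 twips per inch. */
    static final int TWIPS_PER_INCH = TWIPS_PER_POINT * POINTS_PER_INCH;

    private TwipUnits() {}

    /** Converts inches to twips, rounding to the nearest whole twip. */
    static long inchesToTwips(double inches) {
        return Math.round(inches * TWIPS_PER_INCH);
    }

    /** Converts twips to points. */
    static double twipsToPoints(long twips) {
        return twips / (double) TWIPS_PER_POINT;
    }

    /** Converts twips to inches. */
    static double twipsToInches(long twips) {
        return twips / (double) TWIPS_PER_INCH;
    }

    public static void main(String[] args) {
        // US Letter page, 8.5" x 11", as used for CTPageSz in the example
        System.out.println(inchesToTwips(8.5));    // 12240
        System.out.println(inchesToTwips(11));     // 15840
        // a 2-inch table grid column
        System.out.println(inchesToTwips(2));      // 2880
        System.out.println(twipsToPoints(12240));  // 612.0
    }
}
```

With such a helper, the page width in the example could be written as pageSz.setW(BigInteger.valueOf(TwipUnits.inchesToTwips(8.5))), which keeps the intended physical size visible in the code.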

Using XDocReport

Another approach is to use the newest version of opensagres/xdocreport, as described in "Converter only with ConverterRegistry":

    import java.io.*;
    import java.math.BigInteger;

    //needed jars: xdocreport-2.0.1.jar,
    //             odfdom-java-0.8.7.jar,
    //             itext-2.1.7.jar
    import fr.opensagres.xdocreport.converter.Options;
    import fr.opensagres.xdocreport.converter.IConverter;
    import fr.opensagres.xdocreport.converter.ConverterRegistry;
    import fr.opensagres.xdocreport.converter.ConverterTypeTo;
    import fr.opensagres.xdocreport.core.document.DocumentKind;

    //needed jars: apache poi and its dependencies
    //             and additionally: ooxml-schemas-1.3.jar
    import org.apache.poi.xwpf.usermodel.*;
    import org.apache.poi.util.Units;
    import org.openxmlformats.schemas.wordprocessingml.x2006.main.*;

    public class XWPFToPDFXDocReport {

        public static void main(String[] args) throws Exception {

            XWPFDocument document = new XWPFDocument();

            // there must be a styles document, even if it is empty
            XWPFStyles styles = document.createStyles();

            // there must be section properties for the page having at least the page size set
            CTSectPr sectPr = document.getDocument().getBody().addNewSectPr();
            CTPageSz pageSz = sectPr.addNewPgSz();
            pageSz.setW(BigInteger.valueOf(12240)); //12240 Twips = 12240/20 = 612 pt = 612/72 = 8.5"
            pageSz.setH(BigInteger.valueOf(15840)); //15840 Twips = 15840/20 = 792 pt = 792/72 = 11"

            // filling the body
            XWPFParagraph paragraph = document.createParagraph();

            //create table
            XWPFTable table = document.createTable();

            //create first row
            XWPFTableRow tableRowOne = table.getRow(0);
            tableRowOne.getCell(0).setText("col one, row one");
            tableRowOne.addNewTableCell().setText("col two, row one");
            tableRowOne.addNewTableCell().setText("col three, row one");

            //create CTTblGrid for this table with widths of the 3 columns.
            //necessary for Libreoffice/Openoffice and PdfConverter to accept the column widths.
            //values are in unit twentieths of a point (1/1440 of an inch)
            //first column = 2 inches width
            table.getCTTbl().addNewTblGrid().addNewGridCol().setW(BigInteger.valueOf(2*1440));
            //other columns (2 in this case) also each 2 inches width
            for (int col = 1; col < 3; col++) {
                table.getCTTbl().getTblGrid().addNewGridCol().setW(BigInteger.valueOf(2*1440));
            }

            //create second row
            XWPFTableRow tableRowTwo = table.createRow();
            tableRowTwo.getCell(0).setText("col one, row two");
            tableRowTwo.getCell(1).setText("col two, row two");
            tableRowTwo.getCell(2).setText("col three, row two");

            //create third row
            XWPFTableRow tableRowThree = table.createRow();
            tableRowThree.getCell(0).setText("col one, row three");
            tableRowThree.getCell(1).setText("col two, row three");
            tableRowThree.getCell(2).setText("col three, row three");

            paragraph = document.createParagraph();

            //trying picture
            XWPFRun run = paragraph.createRun();
            run.setText("The picture in line: ");
            InputStream in = new FileInputStream("samplePict.jpeg");
            run.addPicture(in, Document.PICTURE_TYPE_JPEG, "samplePict.jpeg", Units.toEMU(100), Units.toEMU(30));
            in.close();
            run.setText(" text after the picture.");

            paragraph = document.createParagraph();

            //document must be written so underlying objects will be committed
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            document.write(out);
            document.close();

            // 1) Create options DOCX 2 PDF to select the right converter from the registry
            Options options = Options.getFrom(DocumentKind.DOCX).to(ConverterTypeTo.PDF);

            // 2) Get the converter from the registry
            IConverter converter = ConverterRegistry.getRegistry().getConverter(options);

            // 3) Convert DOCX 2 PDF
            InputStream docxin = new ByteArrayInputStream(out.toByteArray());
            OutputStream pdfout = new FileOutputStream(new File("XWPFToPDFXDocReport.pdf"));
            converter.convert(docxin, pdfout, options);

            docxin.close();
            pdfout.close();
        }
    }

October 2018: This code works with Apache POI 3.17. It does not work with Apache POI 4.0.0, because the changes in Apache POI 4.0.0 have so far not been taken into account in fr.opensagres.poi.xwpf.converter or in fr.opensagres.xdocreport.converter.
