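How do I open and manipulate Word documents/templates in Java?

I need to open a .doc/.dot/.docx/.dotx document (I'm not picky, I just want it to work), parse it for placeholders (or something similar), put in my own data, and then return a generated .doc/.docx/.dotx/.pdf file.

Above all, the tools I use need to be free.

I have searched for something that fits my needs but found nothing. Tools such as Docmosis, Javadocx, and Aspose are commercial. From what I have read, Apache POI is nowhere near implementing this successfully (they currently have no official developer working on the Word part of the framework).

The only thing that looks like it could do the trick is the OpenOffice UNO API. But that is quite a big bite for someone who has never used this API (like me).

So if I am going to jump into this, I need to make sure I am on the right path.

Can someone give me some advice on this?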

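I know it has been a long time since I posted this question, and I said I would post my solution once I finished. So here it is.

I hope it helps someone someday. This is a complete working class: all you have to do is put it in your application and place the TEMPLATE_DIRECTORY_ROOT directory, containing your .docx templates, in your root directory.

Usage is simple. You put placeholders (keys) in your .docx file, then pass the file name and a Map containing the corresponding key-value pairs for that file.

Enjoy!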

    import java.io.BufferedInputStream;
    import java.io.BufferedOutputStream;
    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.io.OutputStream;
    import java.net.URI;
    import java.nio.charset.StandardCharsets;
    import java.util.Deque;
    import java.util.Enumeration;
    import java.util.LinkedList;
    import java.util.Map;
    import java.util.UUID;
    import java.util.zip.ZipEntry;
    import java.util.zip.ZipFile;
    import java.util.zip.ZipOutputStream;

    import javax.faces.context.ExternalContext;
    import javax.faces.context.FacesContext;
    import javax.servlet.http.HttpServletResponse;

    public class DocxManipulator {

        private static final String MAIN_DOCUMENT_PATH = "word/document.xml";
        private static final String TEMPLATE_DIRECTORY_ROOT = "TEMPLATES_DIRECTORY/";

        /* PUBLIC METHODS */

        /**
         * Generates a .docx document from the given template and substitution data.
         *
         * @param templateName     file name of the template inside TEMPLATE_DIRECTORY_ROOT
         * @param substitutionData key-value pairs: placeholder key to replacement value
         * @return true on success, false if an I/O error occurred
         */
        public static boolean generateAndSendDocx(String templateName, Map<String, String> substitutionData) {
            String templateLocation = TEMPLATE_DIRECTORY_ROOT + templateName;
            // Each call works in its own randomly named temporary directory.
            String userTempDir = TEMPLATE_DIRECTORY_ROOT + UUID.randomUUID().toString() + "/";
            try {
                // Unzip the .docx file
                unzip(new File(templateLocation), new File(userTempDir));
                // Change the data
                changeData(new File(userTempDir + MAIN_DOCUMENT_PATH), substitutionData);
                // Rezip the .docx file
                zip(new File(userTempDir), new File(userTempDir + templateName));
                // Send the HTTP response
                sendDOCXResponse(new File(userTempDir + templateName), templateName);
                // Clean up the temporary data
                deleteTempData(new File(userTempDir));
            } catch (IOException ioe) {
                System.out.println(ioe.getMessage());
                return false;
            }
            return true;
        }

        /* PRIVATE METHODS */

        /**
         * Unzips the specified ZIP file into the specified directory.
         *
         * @param zipfile   source ZIP file
         * @param directory destination directory
         * @throws IOException if the archive cannot be read or a file cannot be written
         */
        private static void unzip(File zipfile, File directory) throws IOException {
            try (ZipFile zfile = new ZipFile(zipfile)) {
                Enumeration<? extends ZipEntry> entries = zfile.entries();
                while (entries.hasMoreElements()) {
                    ZipEntry entry = entries.nextElement();
                    File file = new File(directory, entry.getName());
                    if (entry.isDirectory()) {
                        file.mkdirs();
                    } else {
                        file.getParentFile().mkdirs();
                        try (InputStream in = zfile.getInputStream(entry)) {
                            copy(in, file);
                        }
                    }
                }
            }
        }

        /**
         * Substitutes the keys found in the target file with the corresponding data.
         *
         * @param targetFile       file to rewrite (word/document.xml)
         * @param substitutionData key-value pairs: placeholder key to replacement value
         * @throws IOException if the file cannot be read or written
         */
        private static void changeData(File targetFile, Map<String, String> substitutionData) throws IOException {
            StringBuilder content = new StringBuilder();
            try (BufferedReader br = new BufferedReader(
                    new InputStreamReader(new FileInputStream(targetFile), StandardCharsets.UTF_8))) {
                String line;
                // Line terminators are dropped, as in the original version.
                while ((line = br.readLine()) != null) {
                    content.append(line);
                }
            }

            String docxTemplate = content.toString();
            for (Map.Entry<String, String> pair : substitutionData.entrySet()) {
                // "NEDOSTAJE" marks a key for which no value was supplied.
                String value = pair.getValue() != null ? pair.getValue() : "NEDOSTAJE";
                docxTemplate = docxTemplate.replace(pair.getKey(), value);
            }

            // Opening the FileOutputStream truncates the old file before writing.
            try (FileOutputStream fos = new FileOutputStream(targetFile)) {
                fos.write(docxTemplate.getBytes(StandardCharsets.UTF_8));
            }
        }

        /**
         * Zips the specified directory and all its subdirectories.
         *
         * @param directory directory to zip
         * @param zipfile   output ZIP file
         * @throws IOException if a file cannot be read or the archive cannot be written
         */
        private static void zip(File directory, File zipfile) throws IOException {
            URI base = directory.toURI();
            Deque<File> queue = new LinkedList<File>();
            queue.push(directory);
            try (ZipOutputStream zout = new ZipOutputStream(new FileOutputStream(zipfile))) {
                while (!queue.isEmpty()) {
                    directory = queue.pop();
                    for (File kid : directory.listFiles()) {
                        String name = base.relativize(kid.toURI()).getPath();
                        if (kid.isDirectory()) {
                            queue.push(kid);
                            name = name.endsWith("/") ? name : name + "/";
                            zout.putNextEntry(new ZipEntry(name));
                            zout.closeEntry();
                        } else {
                            // The output .docx lives in this same directory; do not zip it into itself.
                            if (kid.getName().contains(".docx")) {
                                continue;
                            }
                            zout.putNextEntry(new ZipEntry(name));
                            copy(kid, zout);
                            zout.closeEntry();
                        }
                    }
                }
            }
        }

        /**
         * Sends an HTTP response containing the .docx file to the client.
         *
         * @param generatedFile generated .docx file
         * @param fileName      file name presented to the user
         * @throws IOException if the file cannot be read or the response cannot be written
         */
        private static void sendDOCXResponse(File generatedFile, String fileName) throws IOException {
            FacesContext facesContext = FacesContext.getCurrentInstance();
            ExternalContext externalContext = facesContext.getExternalContext();
            HttpServletResponse response = (HttpServletResponse) externalContext.getResponse();

            response.reset();
            // MIME type registered for .docx (the original sent "application/msword").
            response.setHeader("Content-Type",
                    "application/vnd.openxmlformats-officedocument.wordprocessingml.document");
            response.setHeader("Content-Disposition", "attachment; filename=\"" + fileName + "\"");
            response.setHeader("Content-Length", String.valueOf(generatedFile.length()));

            try (BufferedInputStream input = new BufferedInputStream(new FileInputStream(generatedFile), 10240);
                 BufferedOutputStream output = new BufferedOutputStream(response.getOutputStream(), 10240)) {
                byte[] buffer = new byte[10240];
                for (int length; (length = input.read(buffer)) > 0;) {
                    output.write(buffer, 0, length);
                }
                output.flush();
            }

            // Inform JSF not to proceed with the rest of the life cycle
            facesContext.responseComplete();
        }

        /**
         * Deletes a file, or a directory and everything below it.
         *
         * @param file file or directory to delete
         * @throws IOException declared for callers; not thrown by this implementation
         */
        public static void deleteTempData(File file) throws IOException {
            if (file.isDirectory()) {
                String[] children = file.list();
                if (children != null) {
                    for (String child : children) {
                        // Recursive delete
                        deleteTempData(new File(file, child));
                    }
                }
            }
            // Succeeds for files and for directories that are now empty.
            file.delete();
        }

        private static void copy(InputStream in, OutputStream out) throws IOException {
            byte[] buffer = new byte[1024];
            while (true) {
                int readCount = in.read(buffer);
                if (readCount < 0) {
                    break;
                }
                out.write(buffer, 0, readCount);
            }
        }

        private static void copy(File file, OutputStream out) throws IOException {
            try (InputStream in = new FileInputStream(file)) {
                copy(in, out);
            }
        }

        private static void copy(InputStream in, File file) throws IOException {
            try (OutputStream out = new FileOutputStream(file)) {
                copy(in, out);
            }
        }
    }
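The class above unpacks the template into a temporary directory on disk. As a rough sketch of the same unzip, replace, rezip idea done entirely in memory, something like the following might work. The class name `InMemoryDocxFiller`, its methods, and the `${name}` placeholder style used in the usage note are assumptions made for illustration; they are not part of the answer's class. Unlike the class above, this sketch XML-escapes the values (otherwise a value containing `&` or `<` would produce invalid XML) and replaces null values with an empty string.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical helper for illustration; not part of the answer's DocxManipulator. */
final class InMemoryDocxFiller {

    private static final String MAIN_DOCUMENT_PATH = "word/document.xml";

    /** Copies every ZIP entry unchanged, except the main document part, where keys are replaced. */
    static byte[] fill(byte[] templateDocx, Map<String, String> data) throws IOException {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(templateDocx));
             ZipOutputStream zout = new ZipOutputStream(result)) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                // A fresh ZipEntry avoids carrying over the old size and CRC values.
                zout.putNextEntry(new ZipEntry(entry.getName()));
                if (!entry.isDirectory()) {
                    byte[] content = readAll(zin);
                    if (MAIN_DOCUMENT_PATH.equals(entry.getName())) {
                        String xml = new String(content, StandardCharsets.UTF_8);
                        content = substitute(xml, data).getBytes(StandardCharsets.UTF_8);
                    }
                    zout.write(content);
                }
                zout.closeEntry();
            }
        }
        return result.toByteArray();
    }

    /** Replaces each key with its XML-escaped value; null values become an empty string. */
    static String substitute(String xml, Map<String, String> data) {
        for (Map.Entry<String, String> pair : data.entrySet()) {
            String value = pair.getValue() == null ? "" : escapeXml(pair.getValue());
            xml = xml.replace(pair.getKey(), value);
        }
        return xml;
    }

    /** Escapes the characters that are not allowed verbatim in XML text content. */
    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Reads the current ZIP entry to its end. */
    private static byte[] readAll(ZipInputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        return buf.toByteArray();
    }
}
```

A call might look like `byte[] out = InMemoryDocxFiller.fill(templateBytes, Collections.singletonMap("${name}", "Ana"));`. Either approach only finds a key that Word has stored as one contiguous piece of text in document.xml.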

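Since a docx file is merely a zip archive of XML files (plus any binary files for embedded objects such as images), we met that requirement by unpacking the zip archive, feeding document.xml to a template engine (we used FreeMarker) that does the merging for us, and then zipping the output document to get the new docx file.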
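To see the "zip archive of XML files" structure for yourself, a small sketch like the one below lists the parts inside a docx using only the JDK. The class name `DocxParts` and its method are hypothetical, chosen only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/** Hypothetical helper for illustration: lists the entries of a docx (ZIP) archive. */
final class DocxParts {

    /** Returns the entry names in the order in which they are stored in the archive. */
    static List<String> list(byte[] docx) throws IOException {
        List<String> names = new ArrayList<String>();
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(docx))) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }
}
```

For a real document, `DocxParts.list(Files.readAllBytes(Paths.get("template.docx")))` (with a file name of your own) would typically include entries such as `[Content_Types].xml` and `word/document.xml`; the latter is the part handed to the template engine.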

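The template document is then simply an ordinary docx with embedded FreeMarker expressions/directives, and it can be edited in Word.

Since (un)zipping can be done with the JDK and FreeMarker is open source, you incur no license fees, not even for Word itself.

The limitation is that this approach can only emit docx or rtf files, and the output document will have the same file type as the template. If you need to convert the document to another format (such as pdf), you will have to solve that problem separately.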

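I ended up relying on Apache POI 3.12 and processing paragraphs (separately extracting the paragraphs from tables, headers/footers, and footnotes as well, since XWPFDocument.getParagraphs() does not return them).

The processing code (~100 lines) and unit tests are on GitHub.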

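I was in more or less the same situation as you: I had to modify a whole bunch of MS Word merge templates at once. After a lot of googling in search of a Java solution, I finally installed the free Visual Studio 2010 Express and did the job in C#.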

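I recently dealt with a similar problem: "A tool that accepts a template '.docx' file, processes the file by evaluating the passed parameter context, and outputs a '.docx' file as the result of the process."

Finally, god brought us scriptlet4dox :). The key features of this product are:
1. Groovy code injection as scripts in the template file (parameter injection, etc.)
2. Looping over collection items in a table

There are many other features. However, when I checked, the last commit to the project had been made about a year earlier, so the project is probably not receiving new features or bug fixes. Whether to use it or not is your choice.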
