How can I check with Java and iText that all fonts used in a PDF are embedded?

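How can I use Java and iText to check that all fonts used in a PDF file are embedded in that file? I have some existing PDF documents, and I want to validate that they use only embedded fonts.

This requires checking whether any PDF standard fonts are used and whether the other fonts in use are embedded in the file.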

Take a look at the ListUsedFonts example from iText in Action.

http://itextpdf.com/examples/iia.php?id=287

It looks like this prints the fonts used in a PDF and whether or not each one is embedded.

    /*
     * This class is part of the book "iText in Action - 2nd Edition"
     * written by Bruno Lowagie (ISBN: 9781935182610)
     * For more info, go to: http://itextpdf.com/examples/
     * This example only works with the AGPL version of iText.
     */
    package part4.chapter16;

    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.PrintWriter;
    import java.util.Set;
    import java.util.TreeSet;

    import part3.chapter11.FontTypes;

    import com.itextpdf.text.DocumentException;
    import com.itextpdf.text.pdf.PdfDictionary;
    import com.itextpdf.text.pdf.PdfName;
    import com.itextpdf.text.pdf.PdfReader;

    public class ListUsedFonts {

        /** The resulting text file. */
        public static String RESULT = "results/part4/chapter16/fonts.txt";

        /**
         * Creates a Set containing information about the fonts in the src PDF file.
         * @param src the path to a PDF file
         * @throws IOException
         */
        public Set<String> listFonts(String src) throws IOException {
            Set<String> set = new TreeSet<String>();
            PdfReader reader = new PdfReader(src);
            PdfDictionary resources;
            for (int k = 1; k <= reader.getNumberOfPages(); ++k) {
                resources = reader.getPageN(k).getAsDict(PdfName.RESOURCES);
                processResource(set, resources);
            }
            reader.close();
            return set;
        }

        /**
         * Extracts the font names from page or XObject resources.
         * @param set the set with the font names
         * @param resource the resources dictionary
         */
        public static void processResource(Set<String> set, PdfDictionary resource) {
            if (resource == null)
                return;
            PdfDictionary xobjects = resource.getAsDict(PdfName.XOBJECT);
            if (xobjects != null) {
                for (PdfName key : xobjects.getKeys()) {
                    processResource(set, xobjects.getAsDict(key));
                }
            }
            PdfDictionary fonts = resource.getAsDict(PdfName.FONT);
            if (fonts == null)
                return;
            PdfDictionary font;
            for (PdfName key : fonts.getKeys()) {
                font = fonts.getAsDict(key);
                String name = font.getAsName(PdfName.BASEFONT).toString();
                if (name.length() > 8 && name.charAt(7) == '+') {
                    // A six-character tag followed by '+' marks a subset font.
                    name = String.format("%s subset (%s)", name.substring(8), name.substring(1, 7));
                } else {
                    // Drop the leading slash of the PDF name.
                    name = name.substring(1);
                    PdfDictionary desc = font.getAsDict(PdfName.FONTDESCRIPTOR);
                    if (desc == null)
                        name += " nofontdescriptor";
                    else if (desc.get(PdfName.FONTFILE) != null)
                        name += " (Type 1) embedded";
                    else if (desc.get(PdfName.FONTFILE2) != null)
                        name += " (TrueType) embedded";
                    else if (desc.get(PdfName.FONTFILE3) != null)
                        name += " (" + font.getAsName(PdfName.SUBTYPE).toString().substring(1) + ") embedded";
                }
                set.add(name);
            }
        }

        /**
         * Main method.
         *
         * @param args no arguments needed
         * @throws DocumentException
         * @throws IOException
         */
        public static void main(String[] args) throws IOException, DocumentException {
            new FontTypes().createPdf(FontTypes.RESULT);
            Set<String> set = new ListUsedFonts().listFonts(FontTypes.RESULT);
            PrintWriter out = new PrintWriter(new FileOutputStream(RESULT));
            for (String fontname : set)
                out.println(fontname);
            out.flush();
            out.close();
        }
    }
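
The listing above treats a BaseFont value such as /ABCDEF+Arial as a subset font by looking for a '+' at index 7. As a minimal sketch, that check can be isolated and tightened slightly by also requiring the six-character tag to consist of uppercase letters, which is how subset tags are formed. The class and method names here are hypothetical, and the code uses only the Java standard library, not iText; it assumes the input string may still carry the leading slash that the listing strips with substring(1).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of iText): interprets a BaseFont name string. */
final class SubsetFontName {
    // Optional leading slash, a six-letter uppercase tag, a plus sign,
    // then the actual font name.
    private static final Pattern SUBSET = Pattern.compile("^/?([A-Z]{6})\\+(.+)$");

    private SubsetFontName() { }

    /** Returns true if the name carries a subset tag such as "ABCDEF+". */
    static boolean isSubset(String baseFont) {
        return baseFont != null && SUBSET.matcher(baseFont).matches();
    }

    /** Returns the font name without the leading slash and without a subset tag. */
    static String plainName(String baseFont) {
        Matcher m = SUBSET.matcher(baseFont);
        if (m.matches()) {
            return m.group(2);
        }
        return baseFont.startsWith("/") ? baseFont.substring(1) : baseFont;
    }

    public static void main(String[] args) {
        System.out.println(isSubset("/ABCDEF+Arial"));  // true
        System.out.println(plainName("/ABCDEF+Arial")); // Arial
        System.out.println(isSubset("/Helvetica"));     // false
    }
}
```

Unlike the index-based test, the pattern does not flag a name that merely happens to have a '+' in the eighth position after lowercase letters or digits.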

    import java.io.IOException;
    import java.util.Set;
    import java.util.TreeSet;

    import com.itextpdf.text.pdf.PdfArray;
    import com.itextpdf.text.pdf.PdfDictionary;
    import com.itextpdf.text.pdf.PdfName;
    import com.itextpdf.text.pdf.PdfReader;

    // Wrapper class added so that the methods compile; its name is illustrative.
    public class ListNotEmbeddedFonts {

        /**
         * Creates a set containing information about the not-embedded fonts within the src PDF file.
         * @param src the path to a PDF file
         * @throws IOException
         */
        public Set<String> listFonts(String src) throws IOException {
            Set<String> set = new TreeSet<String>();
            PdfReader reader = new PdfReader(src);
            PdfDictionary resources;
            for (int k = 1; k <= reader.getNumberOfPages(); ++k) {
                resources = reader.getPageN(k).getAsDict(PdfName.RESOURCES);
                processResource(set, resources);
            }
            reader.close();
            return set;
        }

        /**
         * Finds out if the font is an embedded subset font.
         * @param name the font name, including the leading slash
         * @return true if the name denotes an embedded subset font
         */
        private boolean isEmbeddedSubset(String name) {
            return name != null && name.length() > 8 && name.charAt(7) == '+';
        }

        private void processFont(PdfDictionary font, Set<String> set) {
            if (font == null)
                return;
            PdfName baseFont = font.getAsName(PdfName.BASEFONT);
            // Fonts without a BaseFont entry (e.g. Type 3, whose glyphs are defined
            // in the PDF itself) are skipped to avoid a NullPointerException.
            if (baseFont == null)
                return;
            String name = baseFont.toString();
            if (isEmbeddedSubset(name))
                return;
            PdfDictionary desc = font.getAsDict(PdfName.FONTDESCRIPTOR);
            if (desc == null) {
                // No font descriptor: look for one on the descendant fonts instead.
                PdfArray descendant = font.getAsArray(PdfName.DESCENDANTFONTS);
                if (descendant == null) {
                    set.add(name.substring(1));
                } else {
                    for (int i = 0; i < descendant.size(); i++) {
                        processFont(descendant.getAsDict(i), set);
                    }
                }
            } else if (desc.get(PdfName.FONTFILE) == null      // Type 1
                    && desc.get(PdfName.FONTFILE2) == null     // TrueType
                    && desc.get(PdfName.FONTFILE3) == null) {  // other, see the font's Subtype
                // A descriptor without any font file means the font is not embedded.
                set.add(name.substring(1));
            }
        }

        /**
         * Extracts the names of the not-embedded fonts from page or XObject resources.
         * @param set the set with the font names
         * @param resource the resources dictionary
         */
        public void processResource(Set<String> set, PdfDictionary resource) {
            if (resource == null)
                return;
            PdfDictionary xobjects = resource.getAsDict(PdfName.XOBJECT);
            if (xobjects != null) {
                for (PdfName key : xobjects.getKeys()) {
                    processResource(set, xobjects.getAsDict(key));
                }
            }
            PdfDictionary fonts = resource.getAsDict(PdfName.FONT);
            if (fonts == null)
                return;
            for (PdfName key : fonts.getKeys()) {
                processFont(fonts.getAsDict(key), set);
            }
        }
    }

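The code above can be used to retrieve the fonts that are not embedded in a given PDF file. I improved the code from iText in Action so that it also handles a font's DescendantFonts node.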
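
To turn the set returned by listFonts into a pass/fail validation, one option is to separate the standard Type 1 font names from everything else that is not embedded. The following is only a sketch under that assumption: the class and method names are hypothetical, it uses only the Java standard library, and it expects the names in the form produced above (no leading slash). The 14 names listed are the standard Type 1 fonts.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper (not part of iText): evaluates a set of not-embedded font names. */
final class EmbeddingReport {
    /** Names of the 14 standard Type 1 fonts. */
    static final Set<String> STANDARD_14 = Collections.unmodifiableSet(
        new HashSet<String>(Arrays.asList(
            "Courier", "Courier-Bold", "Courier-Oblique", "Courier-BoldOblique",
            "Helvetica", "Helvetica-Bold", "Helvetica-Oblique", "Helvetica-BoldOblique",
            "Times-Roman", "Times-Bold", "Times-Italic", "Times-BoldItalic",
            "Symbol", "ZapfDingbats")));

    private EmbeddingReport() { }

    /** The document passes only if no font was reported as not embedded. */
    static boolean allEmbedded(Set<String> notEmbedded) {
        return notEmbedded.isEmpty();
    }

    /** Not-embedded fonts that are standard fonts. */
    static Set<String> standardFonts(Set<String> notEmbedded) {
        Set<String> result = new TreeSet<String>(notEmbedded);
        result.retainAll(STANDARD_14);
        return result;
    }

    /** Not-embedded fonts that are not standard fonts. */
    static Set<String> otherFonts(Set<String> notEmbedded) {
        Set<String> result = new TreeSet<String>(notEmbedded);
        result.removeAll(STANDARD_14);
        return result;
    }

    public static void main(String[] args) {
        Set<String> notEmbedded = new TreeSet<String>(Arrays.asList("Helvetica", "Arial"));
        System.out.println(allEmbedded(notEmbedded));   // false
        System.out.println(standardFonts(notEmbedded)); // [Helvetica]
        System.out.println(otherFonts(notEmbedded));    // [Arial]
    }
}
```

Both helper methods copy the input before filtering, so the set returned by listFonts is left unchanged.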

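When you create a Chunk, you declare the font it uses.
Create a BaseFont from the font you want to use and declare it as BaseFont.EMBEDDED.
Note that if you do not set the subset option to true, the whole font will be embedded.

Be aware that embedding a font may violate the font author's rights.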

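I don't think this is an "iText" use case. Use PDFBox or jPod. These implement the PDF model, so you can:

Open the file
Descend the object tree from the document root
Check whether the current object is a font object
Check whether the font file is available

Checking whether only embedded fonts are actually used is much more complicated (that is, fonts that are not embedded but also not used are fine).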

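The simplest answer is to open the PDF file in Adobe Acrobat and then:

Click File
Select Properties
Click the Fonts tab

This shows a list of all fonts in the document. Any font that is embedded has "(Embedded)" next to its name.

For example:

ACaslonPro-Bold (Embedded)

where ACaslonPro-Bold is derived from the name of the file you embedded it from (e.g. FontFactory.register("/path/to/ACaslonPro-Bold.otf",...)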
