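How do I handle multi-line comments in a live syntax highlighter?

I'm writing my own text editor with syntax highlighting in Java. At the moment it simply parses and highlights the current line every time the user types a single character. That is probably not the most efficient approach, but it is good enough and causes no noticeable performance problems. In pseudo-Java, this is the core idea of my code: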

    public void textUpdated(String wholeText, int updateOffset, int updateLength) {
        // Expand the edited range to whole lines.
        int lineStart = getFirstLineStart(wholeText, updateOffset);
        int lineEnd = getLastLineEnd(wholeText, updateOffset + updateLength);

        // Re-tokenize only those lines and restyle each token found.
        List<Token> foundTokens = tokenizeText(wholeText, lineStart, lineEnd);
        for (Token token : foundTokens) {
            highlightText(token.offset, token.length, token.tokenType);
        }
    }

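The real problem is multi-line comments. To check whether a typed character is inside a multi-line comment, the program has to parse back to the most recent occurrence of "/*", while also knowing whether that occurrence is itself inside a literal or another comment. With a small amount of text this would not be a problem, but if the text contains 20,000 lines of code, it might have to scan and re-highlight all 20,000 lines on every keypress, which would be very inefficient.

So my ultimate question is: how do I handle multi-line tokens/comments in a syntax highlighter while keeping it efficient?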

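I tried doing this about 10 years ago (or more), just for fun. The code is so old that I no longer remember all of its details or the logic behind its conditions. All the code here is basically a brute-force solution. It makes no attempt to keep the state of each line, as rici suggests.

I'll try to explain the high-level concept of the code. Hopefully it will make some sense to you.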

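"At the moment it simply parses and highlights the current line every time the user types a single character."

That is the basic premise of my code as well. However, it does handle multi-line comments.

"How do I handle multi-line tokens/comments in a syntax highlighter while keeping it efficient?"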

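In my solution, when you type "/*" to start a multi-line comment, I mark all the following lines of code as comment until I find the end of the comment, the start of another multi-line comment, or the end of the file. Then, when you type the matching "*/" to end the multi-line comment, I re-highlight the following lines up to the next multi-line comment or the end of the document.

So the amount of highlighting done depends on how much code lies between multi-line comments.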

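That is a quick overview of how it works. I doubt it is 100% accurate, since I only played with it a little. Note that this code was written when I was just learning Java, so I don't claim it is the best approach, only the best approach I knew at the time.

Here is the code for your amusement :)

Just run the code and click the button to get started.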

    import java.awt.*;
    import java.awt.event.*;
    import java.io.*;
    import java.net.*;
    import java.util.*;
    import javax.swing.*;
    import javax.swing.event.*;
    import javax.swing.text.*;

    class SyntaxDocument extends DefaultStyledDocument
    {
        private DefaultStyledDocument doc;
        private Element rootElement;

        private boolean multiLineComment;
        private MutableAttributeSet normal;
        private MutableAttributeSet keyword;
        private MutableAttributeSet comment;
        private MutableAttributeSet quote;

        private Set keywords;

        private int lastLineProcessed = -1;

        public SyntaxDocument()
        {
            doc = this;
            rootElement = doc.getDefaultRootElement();
            putProperty( DefaultEditorKit.EndOfLineStringProperty, "\n" );

            normal = new SimpleAttributeSet();
            StyleConstants.setForeground(normal, Color.black);

            comment = new SimpleAttributeSet();
            StyleConstants.setForeground(comment, Color.gray);
            StyleConstants.setItalic(comment, true);

            keyword = new SimpleAttributeSet();
            StyleConstants.setForeground(keyword, Color.blue);

            quote = new SimpleAttributeSet();
            StyleConstants.setForeground(quote, Color.red);

            keywords = new HashSet();
            keywords.add( "abstract" );
            keywords.add( "boolean" );
            keywords.add( "break" );
            keywords.add( "byte" );
            keywords.add( "byvalue" );
            keywords.add( "case" );
            keywords.add( "cast" );
            keywords.add( "catch" );
            keywords.add( "char" );
            keywords.add( "class" );
            keywords.add( "const" );
            keywords.add( "continue" );
            keywords.add( "default" );
            keywords.add( "do" );
            keywords.add( "double" );
            keywords.add( "else" );
            keywords.add( "extends" );
            keywords.add( "false" );
            keywords.add( "final" );
            keywords.add( "finally" );
            keywords.add( "float" );
            keywords.add( "for" );
            keywords.add( "future" );
            keywords.add( "generic" );
            keywords.add( "goto" );
            keywords.add( "if" );
            keywords.add( "implements" );
            keywords.add( "import" );
            keywords.add( "inner" );
            keywords.add( "instanceof" );
            keywords.add( "int" );
            keywords.add( "interface" );
            keywords.add( "long" );
            keywords.add( "native" );
            keywords.add( "new" );
            keywords.add( "null" );
            keywords.add( "operator" );
            keywords.add( "outer" );
            keywords.add( "package" );
            keywords.add( "private" );
            keywords.add( "protected" );
            keywords.add( "public" );
            keywords.add( "rest" );
            keywords.add( "return" );
            keywords.add( "short" );
            keywords.add( "static" );
            keywords.add( "super" );
            keywords.add( "switch" );
            keywords.add( "synchronized" );
            keywords.add( "this" );
            keywords.add( "throw" );
            keywords.add( "throws" );
            keywords.add( "transient" );
            keywords.add( "true" );
            keywords.add( "try" );
            keywords.add( "var" );
            keywords.add( "void" );
            keywords.add( "volatile" );
            keywords.add( "while" );
        }

        /*
         *  Override to apply syntax highlighting after the document has been updated
         */
        public void insertString(int offset, String str, AttributeSet a) throws BadLocationException
        {
            if (str.equals("{"))
                str = addMatchingBrace(offset);

            super.insertString(offset, str, a);
            processChangedLines(offset, str.length());
        }

        /*
         *  Override to apply syntax highlighting after the document has been updated
         */
        public void remove(int offset, int length) throws BadLocationException
        {
            super.remove(offset, length);
            processChangedLines(offset, 0);
        }

        /*
         *  Determine how many lines have been changed,
         *  then apply highlighting to each line
         */
        public void processChangedLines(int offset, int length)
            throws BadLocationException
        {
            String content = doc.getText(0, doc.getLength());

            //  The lines affected by the latest document update

            int startLine = rootElement.getElementIndex(offset);
            int endLine = rootElement.getElementIndex(offset + length);

            if (startLine > endLine)
                startLine = endLine;

            //  Make sure all comment lines prior to the start line are commented
            //  and determine if the start line is still in a multi line comment

            if (startLine != lastLineProcessed
            &&  startLine != lastLineProcessed + 1)
            {
                setMultiLineComment( commentLinesBefore( content, startLine ) );
            }

            //  Do the actual highlighting

            for (int i = startLine; i <= endLine; i++)
            {
                applyHighlighting(content, i);
            }

            //  Resolve highlighting to the next end multi line delimiter

            if (isMultiLineComment())
                commentLinesAfter(content, endLine);
            else
                highlightLinesAfter(content, endLine);
        }

        /*
         *  Highlight lines when a multi line comment is still 'open'
         *  (ie. matching end delimiter has not yet been encountered)
         */
        private boolean commentLinesBefore(String content, int line)
        {
            int offset = rootElement.getElement( line ).getStartOffset();

            //  Start of comment not found, nothing to do

            int startDelimiter = lastIndexOf( content, getStartDelimiter(), offset - 2 );

            if (startDelimiter < 0)
                return false;

            //  Matching start/end of comment found, nothing to do

            int endDelimiter = indexOf( content, getEndDelimiter(), startDelimiter );

            if (endDelimiter < offset && endDelimiter != -1)
                return false;

            //  End of comment not found, highlight the lines

            doc.setCharacterAttributes(startDelimiter, offset - startDelimiter + 1, comment, false);
            return true;
        }

        /*
         *  Highlight comment lines to matching end delimiter
         */
        private void commentLinesAfter(String content, int line)
        {
            int offset = rootElement.getElement( line ).getStartOffset();

            //  End of comment and Start of comment not found
            //  highlight until the end of the Document

            int endDelimiter = indexOf( content, getEndDelimiter(), offset );

            if (endDelimiter < 0)
            {
                endDelimiter = indexOf( content, getStartDelimiter(), offset + 2);

                if (endDelimiter < 0)
                {
                    doc.setCharacterAttributes(offset, content.length() - offset + 1, comment, false);
                    return;
                }
            }

            //  Matching start/end of comment found, comment the lines

            int startDelimiter = lastIndexOf( content, getStartDelimiter(), endDelimiter );

            if (startDelimiter < 0 || startDelimiter >= offset)
            {
                doc.setCharacterAttributes(offset, endDelimiter - offset + 1, comment, false);
            }
        }

        /*
         *  Highlight lines to start or end delimiter
         */
        private void highlightLinesAfter(String content, int line)
            throws BadLocationException
        {
            int offset = rootElement.getElement( line ).getEndOffset();

            //  Start/End delimiter not found, nothing to do

            int startDelimiter = indexOf( content, getStartDelimiter(), offset );
            int endDelimiter = indexOf( content, getEndDelimiter(), offset );

            if (startDelimiter < 0)
                startDelimiter = content.length();

            if (endDelimiter < 0)
                endDelimiter = content.length();

            int delimiter = Math.min(startDelimiter, endDelimiter);

            if (delimiter < offset)
                return;

            //  Start/End delimiter found, reapply highlighting

            int endLine = rootElement.getElementIndex( delimiter );

            for (int i = line + 1; i <= endLine; i++)
            {
                Element branch = rootElement.getElement( i );
                Element leaf = doc.getCharacterElement( branch.getStartOffset() );
                AttributeSet as = leaf.getAttributes();

                if ( as.isEqual(comment) )
                {
                    applyHighlighting(content, i);
                }
            }
        }

        /*
         *  Parse the line to determine the appropriate highlighting
         */
        private void applyHighlighting(String content, int line)
            throws BadLocationException
        {
            lastLineProcessed = line;
            int startOffset = rootElement.getElement( line ).getStartOffset();
            int endOffset = rootElement.getElement( line ).getEndOffset() - 1;

            int lineLength = endOffset - startOffset;
            int contentLength = content.length();

            if (endOffset >= contentLength)
                endOffset = contentLength - 1;

            //  check for multi line comments
            //  (always set the comment attribute for the entire line)

            if (endingMultiLineComment(content, startOffset, endOffset)
            ||  isMultiLineComment()
            ||  startingMultiLineComment(content, startOffset, endOffset) )
            {
                doc.setCharacterAttributes(startOffset, endOffset - startOffset + 1, comment, false);
                lastLineProcessed = -1;
                return;
            }

            //  set normal attributes for the line

            doc.setCharacterAttributes(startOffset, lineLength, normal, true);

            //  check for single line comment

            int index = content.indexOf(getSingleLineDelimiter(), startOffset);

            if ( (index > -1) && (index < endOffset) )
            {
                doc.setCharacterAttributes(index, endOffset - index + 1, comment, false);
                endOffset = index - 1;
            }

            //  check for tokens

            checkForTokens(content, startOffset, endOffset);
        }

        /*
         *  Does this line contain the start delimiter
         */
        private boolean startingMultiLineComment(String content, int startOffset, int endOffset)
            throws BadLocationException
        {
            int index = indexOf( content, getStartDelimiter(), startOffset );

            if ( (index < 0) || (index > endOffset) )
                return false;
            else
            {
                setMultiLineComment( true );
                return true;
            }
        }

        /*
         *  Does this line contain the end delimiter
         */
        private boolean endingMultiLineComment(String content, int startOffset, int endOffset)
            throws BadLocationException
        {
            int index = indexOf( content, getEndDelimiter(), startOffset );

            if ( (index < 0) || (index > endOffset) )
                return false;
            else
            {
                setMultiLineComment( false );
                return true;
            }
        }

        /*
         *  We have found a start delimiter
         *  and are still searching for the end delimiter
         */
        private boolean isMultiLineComment()
        {
            return multiLineComment;
        }

        private void setMultiLineComment(boolean value)
        {
            multiLineComment = value;
        }

        /*
         *  Parse the line for tokens to highlight
         */
        private void checkForTokens(String content, int startOffset, int endOffset)
        {
            while (startOffset <= endOffset)
            {
                //  skip the delimiters to find the start of a new token

                while ( isDelimiter( content.substring(startOffset, startOffset + 1) ) )
                {
                    if (startOffset < endOffset)
                        startOffset++;
                    else
                        return;
                }

                //  Extract and process the entire token

                if ( isQuoteDelimiter( content.substring(startOffset, startOffset + 1) ) )
                    startOffset = getQuoteToken(content, startOffset, endOffset);
                else
                    startOffset = getOtherToken(content, startOffset, endOffset);
            }
        }

        /*
         *  Highlight a quoted literal starting at startOffset
         *  and return the offset at which to resume scanning
         */
        private int getQuoteToken(String content, int startOffset, int endOffset)
        {
            String quoteDelimiter = content.substring(startOffset, startOffset + 1);
            String escapeString = getEscapeString(quoteDelimiter);

            int index;
            int endOfQuote = startOffset;

            //  skip over the escape quotes in this quote

            index = content.indexOf(escapeString, endOfQuote + 1);

            while ( (index > -1) && (index < endOffset) )
            {
                endOfQuote = index + 1;
                index = content.indexOf(escapeString, endOfQuote);
            }

            //  now find the matching delimiter

            index = content.indexOf(quoteDelimiter, endOfQuote + 1);

            if ( (index < 0) || (index > endOffset) )
                endOfQuote = endOffset;
            else
                endOfQuote = index;

            doc.setCharacterAttributes(startOffset, endOfQuote - startOffset + 1, quote, false);

            return endOfQuote + 1;
        }

        /*
         *  Highlight the token starting at startOffset if it is a keyword
         *  and return the offset at which to resume scanning
         */
        private int getOtherToken(String content, int startOffset, int endOffset)
        {
            int endOfToken = startOffset + 1;

            while ( endOfToken <= endOffset )
            {
                if ( isDelimiter( content.substring(endOfToken, endOfToken + 1) ) )
                    break;

                endOfToken++;
            }

            String token = content.substring(startOffset, endOfToken);

            if ( isKeyword( token ) )
            {
                doc.setCharacterAttributes(startOffset, endOfToken - startOffset, keyword, false);
            }

            return endOfToken + 1;
        }

        /*
         *  Assume the needle will be found at the start/end of the line
         */
        private int indexOf(String content, String needle, int offset)
        {
            int index;

            while ( (index = content.indexOf(needle, offset)) != -1 )
            {
                String text = getLine( content, index ).trim();

                if (text.startsWith(needle) || text.endsWith(needle))
                    break;
                else
                    offset = index + 1;
            }

            return index;
        }

        /*
         *  Assume the needle will be found at the start/end of the line
         */
        private int lastIndexOf(String content, String needle, int offset)
        {
            int index;

            while ( (index = content.lastIndexOf(needle, offset)) != -1 )
            {
                String text = getLine( content, index ).trim();

                if (text.startsWith(needle) || text.endsWith(needle))
                    break;
                else
                    offset = index - 1;
            }

            return index;
        }

        private String getLine(String content, int offset)
        {
            int line = rootElement.getElementIndex( offset );
            Element lineElement = rootElement.getElement( line );
            int start = lineElement.getStartOffset();
            int end = lineElement.getEndOffset();
            return content.substring(start, end - 1);
        }

        /*
         *  Override for other languages
         */
        protected boolean isDelimiter(String character)
        {
            String operands = ";:{}()[]+-/%<=>!&|^~*";

            if (Character.isWhitespace( character.charAt(0) ) ||
                operands.indexOf(character) != -1 )
                return true;
            else
                return false;
        }

        /*
         *  Override for other languages
         */
        protected boolean isQuoteDelimiter(String character)
        {
            String quoteDelimiters = "\"'";

            if (quoteDelimiters.indexOf(character) < 0)
                return false;
            else
                return true;
        }

        /*
         *  Override for other languages
         */
        protected boolean isKeyword(String token)
        {
            return keywords.contains( token );
        }

        /*
         *  Override for other languages
         */
        protected String getStartDelimiter()
        {
            return "/*";
        }

        /*
         *  Override for other languages
         */
        protected String getEndDelimiter()
        {
            return "*/";
        }

        /*
         *  Override for other languages
         */
        protected String getSingleLineDelimiter()
        {
            return "//";
        }

        /*
         *  Override for other languages
         */
        protected String getEscapeString(String quoteDelimiter)
        {
            return "\\" + quoteDelimiter;
        }

        /*
         *  Build the text inserted when "{" is typed: the brace, an indented
         *  empty line and a closing brace at the current line's indentation
         */
        protected String addMatchingBrace(int offset) throws BadLocationException
        {
            StringBuffer whiteSpace = new StringBuffer();
            int line = rootElement.getElementIndex( offset );
            int i = rootElement.getElement(line).getStartOffset();

            while (true)
            {
                String temp = doc.getText(i, 1);

                if (temp.equals(" ") || temp.equals("\t"))
                {
                    whiteSpace.append(temp);
                    i++;
                }
                else
                    break;
            }

            return "{\n" + whiteSpace.toString() + "\t\n" + whiteSpace.toString() + "}";
        }

        /*
        public void setCharacterAttributes(int offset, int length, AttributeSet s, boolean replace)
        {
            super.setCharacterAttributes(offset, length, s, replace);
        }
        */

        public static void main(String a[])
        {
            EditorKit editorKit = new StyledEditorKit()
            {
                public Document createDefaultDocument()
                {
                    return new SyntaxDocument();
                }
            };

            // final JEditorPane edit = new JEditorPane()
            final JTextPane edit = new JTextPane();
            // LinePainter painter = new LinePainter(edit, Color.cyan);
            // LinePainter2 painter = new LinePainter2(edit, Color.cyan);
            // edit.setEditorKitForContentType("text/java", editorKit);
            // edit.setContentType("text/java");
            edit.setEditorKit(editorKit);

            JButton button = new JButton("Load SyntaxDocument.java");
            button.addActionListener( new ActionListener()
            {
                public void actionPerformed(ActionEvent e)
                {
                    try
                    {
                        long startTime = System.currentTimeMillis();
                        FileReader fr = new FileReader( "SyntaxDocument.java" );
                        // FileReader fr = new FileReader( "C:\\Java\\j2sdk1.4.2\\src\\javax\\swing\\JComponent.java" );
                        BufferedReader br = new BufferedReader(fr);
                        edit.read( br, null );

                        System.out.println("Load: " + (System.currentTimeMillis() - startTime));
                        System.out.println("Document contains: " + edit.getDocument().getLength() + " characters");
                        edit.requestFocus();
                    }
                    catch(Exception e2) {}
                }
            });

            JFrame frame = new JFrame("Syntax Highlighting");
            frame.getContentPane().add( new JScrollPane(edit) );
            frame.getContentPane().add(button, BorderLayout.SOUTH);
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            frame.setSize(800,300);
            frame.setVisible(true);
        }
    }

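Note: this code does not check whether a comment delimiter is inside a literal, so it needs to be improved in that respect.

I don't really expect you to use this code, but I thought it might give you an idea of the performance you can expect from a brute-force approach.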
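To illustrate the caveat in the note above about delimiters inside literals, here is a minimal sketch of a scanner that ignores "/*" when it appears inside a string literal, a character literal or a line comment. It is not part of the original code: the names Context, DelimiterScanner and contextAt are made up for this example, and it is still a brute-force scan from the start of the text.

```java
/** Lexical context at a position (hypothetical names, for illustration only). */
enum Context { CODE, LINE_COMMENT, BLOCK_COMMENT, STRING, CHAR }

final class DelimiterScanner {
    /** Returns the context in effect after scanning text[0, end). */
    static Context contextAt(CharSequence text, int end) {
        Context ctx = Context.CODE;
        int i = 0;
        while (i < end) {
            char c = text.charAt(i);
            char next = (i + 1 < end) ? text.charAt(i + 1) : '\0';
            switch (ctx) {
                case CODE:
                    if (c == '/' && next == '*') { ctx = Context.BLOCK_COMMENT; i += 2; continue; }
                    if (c == '/' && next == '/') { ctx = Context.LINE_COMMENT; i += 2; continue; }
                    if (c == '"') ctx = Context.STRING;
                    else if (c == '\'') ctx = Context.CHAR;
                    break;
                case LINE_COMMENT:
                    // A line comment ends at the line break.
                    if (c == '\n') ctx = Context.CODE;
                    break;
                case BLOCK_COMMENT:
                    if (c == '*' && next == '/') { ctx = Context.CODE; i += 2; continue; }
                    break;
                case STRING:
                case CHAR:
                    // Skip the escaped character so that \" or \' does not end the literal.
                    if (c == '\\') { i += 2; continue; }
                    // The literal ends at its closing quote, or at the line break if unterminated.
                    if (c == '\n' || c == (ctx == Context.STRING ? '"' : '\'')) ctx = Context.CODE;
                    break;
            }
            i++;
        }
        return ctx;
    }
}
```

For example, contextAt applied to the whole of `String s = "/* not a comment";` yields CODE, whereas a plain search for "/*" would report a comment start.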

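A common approach is to save the lexer state at the start of each line. (Usually the lexer state will be a small integer or an enum; for a Java-like language it might be limited to three values: normal, inside a multi-line comment, and inside a multi-line string constant.)

A change to a line may change the lexer state at the start of the next line, but it cannot change the state at the start of the current line, so the line can be re-tokenized from its beginning, using the saved lexer state for that line as the starting condition. Keeping a per-line lexer state also makes it easy to handle the case where the cursor moves to another line, possibly a distant one.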
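The per-line state idea can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the names LexState, LineStateCache, scanLine and replaceLine are invented here, only two states are tracked (no multi-line strings), and quoted literals are assumed to end on the line where they start.

```java
import java.util.ArrayList;
import java.util.List;

enum LexState { NORMAL, IN_BLOCK_COMMENT }

final class LineStateCache {
    private final List<String> lines;
    // startStates.get(i) is the lexer state at the start of line i.
    private final List<LexState> startStates = new ArrayList<>();

    LineStateCache(List<String> initialLines) {
        lines = new ArrayList<>(initialLines);
        LexState state = LexState.NORMAL;
        for (String line : lines) {
            startStates.add(state);
            state = scanLine(line, state);
        }
    }

    /** Scans one line from the given start state and returns the state at its end. */
    static LexState scanLine(String line, LexState state) {
        int i = 0;
        while (i < line.length()) {
            if (state == LexState.IN_BLOCK_COMMENT) {
                int close = line.indexOf("*/", i);
                if (close < 0) return LexState.IN_BLOCK_COMMENT;
                state = LexState.NORMAL;
                i = close + 2;
            } else if (line.startsWith("//", i)) {
                return LexState.NORMAL;            // rest of the line is a line comment
            } else if (line.startsWith("/*", i)) {
                state = LexState.IN_BLOCK_COMMENT;
                i += 2;
            } else if (line.charAt(i) == '"' || line.charAt(i) == '\'') {
                char quote = line.charAt(i++);
                while (i < line.length() && line.charAt(i) != quote) {
                    i += (line.charAt(i) == '\\') ? 2 : 1;   // skip escaped characters
                }
                i++;                               // step past the closing quote
            } else {
                i++;
            }
        }
        return state;
    }

    /** Replaces one line and returns how many lines had to be rescanned. */
    int replaceLine(int index, String newText) {
        lines.set(index, newText);
        LexState state = startStates.get(index);   // unaffected by the edit
        int rescanned = 0;
        for (int i = index; i < lines.size(); i++) {
            state = scanLine(lines.get(i), state);
            rescanned++;
            if (i + 1 == lines.size()) break;
            if (startStates.get(i + 1) == state) break;   // later lines are unchanged
            startStates.set(i + 1, state);
        }
        return rescanned;
    }

    LexState startState(int index) { return startStates.get(index); }
}
```

With this layout an ordinary keystroke rescans a single line; the scan only continues past the edited line when the state at the end of that line actually changes, and it stops again as soon as a stored start state matches.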

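If an edit changes the lexer state at the end of the line (that is, at the start of the next line), the rest of the file can be rescanned. However, doing that immediately is really annoying for the user, because it means that every time a quote is typed, the whole screen gets repainted as it becomes part of a multi-line string (for example). Since most of the time the user will go on to close the string (or comment), it is usually better to delay the rescan. For example, you might wait until the user moves the cursor, completes the lexical element, or gives some other such signal. Another common approach is to insert a "ghost" closing symbol at the cursor, which keeps the lexer in sync. The ghost is removed if the user explicitly types it or explicitly deletes it.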
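One way to picture the delayed rescan is a small bookkeeping class like the sketch below. Everything in it is hypothetical: DeferredRescan, endStateChanged, endStateRestored and flush are names invented for this example, the rescanFrom callback stands for whatever routine re-lexes from a given line, and line numbers are assumed not to shift between the edit and the flush.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import java.util.function.IntConsumer;

final class DeferredRescan {
    // Lines whose stored start state may be stale, in ascending order.
    private final TreeSet<Integer> pending = new TreeSet<>();
    // Hypothetical callback that re-lexes the document starting at a line.
    private final IntConsumer rescanFrom;

    DeferredRescan(IntConsumer rescanFrom) {
        this.rescanFrom = rescanFrom;
    }

    /** An edit on this line changed the lexer state at its end. */
    void endStateChanged(int line) {
        pending.add(line + 1);
    }

    /** A later edit put the end state of this line back (e.g. the string was closed). */
    void endStateRestored(int line) {
        pending.remove(line + 1);
    }

    boolean isPending() {
        return !pending.isEmpty();
    }

    /** Call on a signal such as a cursor move to another line or an idle timer. */
    void flush() {
        List<Integer> lines = new ArrayList<>(pending);
        pending.clear();
        for (int line : lines) {
            rescanFrom.accept(line);
        }
    }
}
```

If the user types an opening quote and then the closing one before any flush, the pending entry is cancelled and nothing beyond the current line is rescanned or repainted.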

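You seem to be keeping the whole program as a single string. IMHO it is better to keep it as a list of lines, to avoid copying the entire string whenever a character is inserted or deleted. Otherwise, editing very long files becomes really annoying.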
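A list-of-lines buffer could look roughly like this. It is a sketch only: LineBuffer and its methods are invented names, lines are assumed to be separated by "\n", and no bounds checking is done.

```java
import java.util.ArrayList;
import java.util.List;

final class LineBuffer {
    private final List<StringBuilder> lines = new ArrayList<>();

    LineBuffer(String text) {
        // The -1 limit keeps trailing empty lines.
        for (String line : text.split("\n", -1)) {
            lines.add(new StringBuilder(line));
        }
    }

    /** Inserts text (which may contain line breaks) at the given line and column. */
    void insert(int line, int column, String text) {
        StringBuilder current = lines.get(line);
        String tail = current.substring(column);
        current.setLength(column);

        String[] parts = text.split("\n", -1);
        current.append(parts[0]);
        for (int i = 1; i < parts.length; i++) {
            current = new StringBuilder(parts[i]);
            lines.add(line + i, current);
        }
        current.append(tail);   // text after the insertion point ends up on the last new line
    }

    /** Deletes the characters [from, to) within a single line. */
    void delete(int line, int from, int to) {
        lines.get(line).delete(from, to);
    }

    int lineCount() {
        return lines.size();
    }

    String line(int index) {
        return lines.get(index).toString();
    }
}
```

Typing a character then touches only the StringBuilder of one line. Inserting a line break still shifts references in the ArrayList, but it does not copy the characters of the other lines.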

Finally, you should never tokenize text that is not visible. Avoiding that limits the damage done by a massive re-tokenization.
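In its simplest form, restricting work to visible text means intersecting the range of lines that need re-tokenizing with the range of lines currently on screen. The helper below is a hypothetical sketch (VisibleLines and toTokenize are invented names); it assumes the editor can report its first and last visible line, and that the lexer state at the first visible line is available by some other means, for instance a per-line state cache.

```java
final class VisibleLines {
    /**
     * Returns {first, last} (inclusive) of the lines that are both dirty and
     * visible, or an empty array when the two ranges do not overlap.
     */
    static int[] toTokenize(int dirtyFirst, int dirtyLast, int visibleFirst, int visibleLast) {
        int first = Math.max(dirtyFirst, visibleFirst);
        int last = Math.min(dirtyLast, visibleLast);
        return (first <= last) ? new int[] { first, last } : new int[0];
    }
}
```

So if an unclosed "/*" on line 10 invalidates lines 10 through 19,999 while lines 40 through 80 are on screen, only those 41 lines are styled right away; the remaining lines can be handled when they scroll into view.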