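Why does this Java method leak, and why does inlining it fix the leak?

I wrote a minimal, somewhat lazy (int) sequence class, GarbageTest.java, as an experiment to see whether I could process very long lazy sequences in Java the way I can in Clojure.

Given a naturals() method that returns a lazy, infinite sequence of natural numbers; a drop(n, sequence) method that drops the first n elements of sequence and returns the rest of sequence; and an nth(n, sequence) method that simply returns drop(n, lazySeq).head(), I wrote two tests: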

    static int N = (int)1e6;

    // succeeds @ N = (int)1e8 with java -Xmx10m
    @Test
    public void dropTest() {
        assertThat( drop(N, naturals()).head(), is(N+1));
    }

    // fails with OutOfMemoryError @ N = (int)1e6 with java -Xmx10m
    @Test
    public void nthTest() {
        assertThat( nth(N, naturals()), is(N+1));
    }

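Note that the body of dropTest() was generated by copying the body of nthTest() and then invoking IntelliJ's "inline" refactoring on the nth(N, naturals()) call. So it seems to me that dropTest() should behave exactly like nthTest().

But it doesn't! dropTest() runs to completion with N up to 1e8, whereas nthTest() fails with an OutOfMemoryError for N as small as 1e6.

I have avoided inner classes. I have tried a variant of my code, ClearingArgsGarbageTest.java, that nulls out method parameters before calling other methods. I have applied the YourKit profiler. I have looked at the bytecode. I just can't find the leak that causes nthTest() to fail.

Where is the "leak"? And why does nthTest() have the leak while dropTest() does not?

Here is the rest of the code from GarbageTest.java, in case you don't want to click through to the GitHub project: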

    /**
     * a not-perfectly-lazy lazy sequence of ints. see LazierGarbageTest for a lazier one
     */
    static class LazyishSeq {
        final int head;

        volatile Supplier<LazyishSeq> tailThunk;
        LazyishSeq tailValue;

        LazyishSeq(final int head, final Supplier<LazyishSeq> tailThunk) {
            this.head = head;
            this.tailThunk = tailThunk;
            tailValue = null;
        }

        int head() {
            return head;
        }

        LazyishSeq tail() {
            // double-checked locking: force the thunk at most once
            if (null != tailThunk)
                synchronized(this) {
                    if (null != tailThunk) {
                        tailValue = tailThunk.get();
                        tailThunk = null;
                    }
                }
            return tailValue;
        }
    }

    static class Incrementing implements Supplier<LazyishSeq> {
        final int seed;

        private Incrementing(final int seed) { this.seed = seed;}

        public static LazyishSeq createSequence(final int n) {
            return new LazyishSeq( n, new Incrementing(n+1));
        }

        @Override
        public LazyishSeq get() {
            return createSequence(seed);
        }
    }

    static LazyishSeq naturals() {
        return Incrementing.createSequence(1);
    }

    static LazyishSeq drop(
            final int n,
            final LazyishSeq lazySeqArg) {
        LazyishSeq lazySeq = lazySeqArg;
        for( int i = n; i > 0 && null != lazySeq; i -= 1) {
            lazySeq = lazySeq.tail();
        }
        return lazySeq;
    }

    static int nth(final int n, final LazyishSeq lazySeq) {
        return drop(n, lazySeq).head();
    }

In your method

    static int nth(final int n, final LazyishSeq lazySeq) {
        return drop(n, lazySeq).head();
    }

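the parameter variable lazySeq holds a reference to the first element of the sequence for the entire duration of the drop operation. This prevents any part of the sequence from being garbage collected.

In contrast, in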

    public void dropTest() {
        assertThat( drop(N, naturals()).head(), is(N+1));
    }

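the first element of the sequence is returned by naturals() and passed directly to the call to drop, so it is removed from the operand stack and the caller holds no reference to it while drop executes.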
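One way to illustrate this observation is to let nth receive a source for the sequence instead of the sequence itself, so that the head only ever passes through nth's operand stack. The following is a minimal sketch, not code from the original post: the class SupplierNthSketch, the simplified single-threaded Seq, and the helper from are all hypothetical names introduced for illustration, and whether this avoids the OutOfMemoryError on a given JVM is not guaranteed.

```java
import java.util.function.Supplier;

// Hypothetical sketch: all names here are illustrative, not from the original post.
class SupplierNthSketch {

    // Simplified, single-threaded stand-in for the question's LazyishSeq.
    static final class Seq {
        final int head;
        private Supplier<Seq> tailThunk;
        private Seq tailValue;

        Seq(int head, Supplier<Seq> tailThunk) {
            this.head = head;
            this.tailThunk = tailThunk;
        }

        Seq tail() {
            if (tailThunk != null) {
                tailValue = tailThunk.get();
                tailThunk = null; // the thunk is not needed once it has been forced
            }
            return tailValue;
        }
    }

    // Lazy sequence n, n+1, n+2, ...
    static Seq from(int n) {
        return new Seq(n, () -> from(n + 1));
    }

    static Seq naturals() {
        return from(1);
    }

    static Seq drop(int n, Seq seqArg) {
        Seq seq = seqArg;
        for (int i = n; i > 0 && seq != null; i--) {
            seq = seq.tail();
        }
        return seq;
    }

    // nth takes a supplier, so its own frame never stores the head in a local variable.
    static int nth(int n, Supplier<Seq> source) {
        return drop(n, source.get()).head;
    }

    public static void main(String[] args) {
        System.out.println(nth(1000, SupplierNthSketch::naturals)); // prints 1001
    }
}
```

Note that drop still receives the head in its own parameter variable, so this only removes the extra reference held by nth's frame.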

Your attempt to set the parameter variable to null, i.e.

    static int nth(final int n, /*final*/ LazyishSeq lazySeqArg) {
        final LazyishSeq lazySeqLocal = lazySeqArg;
        lazySeqArg = null;
        return drop(n,lazySeqLocal).head();
    }

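doesn't help, because now the lazySeqArg variable is null, but lazySeqLocal holds a reference to the first element.

In general, a local variable does not prevent garbage collection: collecting an object that is otherwise unused is permitted. That does not mean, however, that a particular implementation is capable of doing so.

In the case of the HotSpot JVM, only optimized code gets rid of such unused references. But here, nth is not a hot spot, because the heavy lifting happens inside the drop method.

This is also why the same problem does not arise in the drop method, even though it too holds a reference to the first element in its parameter variable. The drop method contains the loop that does the actual work, so it is very likely to be optimized by the JVM, which may cause it to eliminate the unused variable and thereby allow the already processed part of the sequence to be collected.

Many factors can affect the JVM's optimizations. Besides the different shape of the code, it seems that rapid memory allocation during the unoptimized phase may also reduce what the optimizer can achieve. Indeed, when I run with -Xcomp to forbid interpreted execution altogether, both variants run successfully, and even int N = (int)1e9 is no longer a problem. Of course, forcing compilation increases the startup time.

I have to admit that I don't understand why mixed mode performs so much worse, and I will investigate further. But in general, you have to be aware that the effectiveness of the garbage collector is implementation dependent, so objects that are collected in one environment may stay in memory in another.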

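Clojure implements a strategy for dealing with this kind of scenario, known as "locals clearing". The compiler supports it, so it kicks in automatically where needed in pure Clojure code (unless it is disabled at compile time, which is sometimes useful for debugging). However, Clojure also clears locals in various places in its Java runtime, and the way it does so there can be used in Java libraries and even application code, although it would undoubtedly be somewhat cumbersome.

Before I get into what Clojure does, here is a short summary of what is going on in this example: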

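nth(int, LazyishSeq) is implemented in terms of drop(int, LazyishSeq) and LazyishSeq.head().
nth passes both of its arguments to drop and has no further use for them.
drop can easily be implemented so as to avoid holding on to the head of the sequence passed in.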

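Here nth still holds on to the head of its sequence argument. The runtime could discard that reference, but there is no guarantee that it will.

The way Clojure deals with this is to clear the reference to the sequence explicitly before handing off control. This is done by way of a rather elegant trick (the snippet below is as of Clojure 1.9.0, and the original post linked to it on GitHub):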

    // clojure/src/jvm/clojure/lang/Util.java

    /**
     *   Copyright (c) Rich Hickey. All rights reserved.
     *   The use and distribution terms for this software are covered by the
     *   Eclipse Public License 1.0 (http://opensource.org/licenses/eclipse-1.0.php)
     *   which can be found in the file epl-v10.html at the root of this distribution.
     *   By using this software in any fashion, you are agreeing to be bound by
     *   the terms of this license.
     *   You must not remove this notice, or any other, from this software.
     **/

    // … beginning of the file omitted …

    // the next line is the 190th in the file as of Clojure 1.9.0
    static public Object ret1(Object ret, Object nil){
        return ret;
    }

    static public ISeq ret1(ISeq ret, Object nil){
        return ret;
    }

    // …

Given the above, the call to drop inside nth can be changed to

    drop(n, ret1(lazySeq, lazySeq = null))

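Here lazySeq = null is evaluated as an expression before control is transferred to ret1; its value is null, and it has the side effect of setting the lazySeq reference to null. However, the first argument to ret1 will already have been evaluated by this point, so ret1 receives a reference to the sequence in its first argument and returns it as expected, and that value is then passed to drop.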
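The trick relies on Java evaluating method arguments from left to right. The following minimal sketch shows that ordering in isolation; EvalOrderSketch and passAndClear are hypothetical names, and ret1 here is a local generic helper modeled on the shape of Clojure's method, not the Clojure method itself.

```java
// Hypothetical demo class; ret1 is a local helper, not Clojure's Util.ret1.
class EvalOrderSketch {

    // Returns its first argument and ignores the second.
    static <T> T ret1(T ret, Object ignored) {
        return ret;
    }

    // Returns { value handed to ret1, value left in the local afterwards }.
    static Object[] passAndClear(Object value) {
        Object local = value;
        // The first argument is read before the assignment in the second one runs.
        Object passed = ret1(local, local = null);
        return new Object[] { passed, local };
    }

    public static void main(String[] args) {
        Object[] result = passAndClear("head");
        System.out.println(result[0]); // prints head
        System.out.println(result[1]); // prints null
    }
}
```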

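Thus drop receives the original value held in the lazySeq local, but the local itself is cleared before control is transferred to drop.

Consequently, nth no longer holds on to the head of the sequence.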
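Put together, an nth that clears its own local might look like the sketch below. This is an illustration under stated assumptions rather than the original project's code: ClearingNthSketch, the simplified single-threaded Seq, and from are hypothetical names, and ret1 is written as a local generic helper so that no cast is needed (Clojure's Object overload would return Object). The parameter also has to be non-final for the assignment to compile.

```java
import java.util.function.Supplier;

// Hypothetical sketch: Seq, from, and ClearingNthSketch are illustrative names,
// and ret1 is a local generic helper, not Clojure's own method.
class ClearingNthSketch {

    // Simplified, single-threaded stand-in for LazyishSeq.
    static final class Seq {
        final int head;
        private Supplier<Seq> tailThunk;
        private Seq tailValue;

        Seq(int head, Supplier<Seq> tailThunk) {
            this.head = head;
            this.tailThunk = tailThunk;
        }

        Seq tail() {
            if (tailThunk != null) {
                tailValue = tailThunk.get();
                tailThunk = null;
            }
            return tailValue;
        }
    }

    static Seq from(int n) {
        return new Seq(n, () -> from(n + 1));
    }

    static Seq naturals() {
        return from(1);
    }

    static Seq drop(int n, Seq seqArg) {
        Seq seq = seqArg;
        for (int i = n; i > 0 && seq != null; i--) {
            seq = seq.tail();
        }
        return seq;
    }

    static <T> T ret1(T ret, Object ignored) {
        return ret;
    }

    // The parameter is deliberately not final so that it can be cleared.
    static int nth(int n, Seq seq) {
        // seq is read first, then seq = null clears the parameter before drop runs.
        return drop(n, ret1(seq, seq = null)).head;
    }

    public static void main(String[] args) {
        System.out.println(nth(1000, naturals())); // prints 1001
    }
}
```

The caller must likewise avoid keeping the head in a variable of its own, which is why the call passes naturals() directly.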
