What is the best way to perform a DFS on a very large tree?

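Here is the situation:

  • The application's world consists of hundreds of thousands of states.
  • Given a state, I can compute a set of 3 or 4 other reachable states. A simple recursion can build a very large state tree very quickly.
  • I need to perform a DFS from the root state down to a specific depth in this tree, to find the subtree that contains the "minimal" state (how a node's value is computed is not relevant to the question).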

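Performing the DFS with a single thread works, but it is slow. Covering 15 levels can take a few minutes, and I need to improve this poor performance. Trying to assign a thread to each subtree created too many threads and caused an OutOfMemoryError. Using a ThreadPoolExecutor was not much better.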

My question: what is the most efficient way to traverse this large tree?

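I don't believe navigating the tree is your problem, since your tree has only about 36 million nodes. Instead, what you are doing at each node is more likely to be the expensive part.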

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.*;
    import java.util.concurrent.atomic.AtomicLong;

    public class Main {
        public static final int TOP_LEVELS = 2;

        enum BuySell {}

        private static final AtomicLong called = new AtomicLong();

        public static void main(String... args) throws InterruptedException {
            int maxLevels = 15;
            long start = System.nanoTime();
            method(maxLevels);
            long time = System.nanoTime() - start;
            System.out.printf("Took %.3f second to navigate %,d levels called %,d times%n",
                    time / 1e9, maxLevels, called.longValue());
        }

        public static void method(int maxLevels) throws InterruptedException {
            ExecutorService service = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
            try {
                int result = method(service, 0, maxLevels - 1, new int[maxLevels]).call();
            } catch (Exception e) {
                e.printStackTrace();
            }
            service.shutdown();
            service.awaitTermination(10, TimeUnit.MINUTES);
        }

        // single threaded process the highest levels of the tree.
        private static Callable<Integer> method(final ExecutorService service, final int level,
                                                final int maxLevel, final int[] options) {
            int choices = level % 2 == 0 ? 3 : 4;
            final List<Callable<Integer>> callables = new ArrayList<Callable<Integer>>(choices);
            for (int i = 0; i < choices; i++) {
                options[level] = i;
                Callable<Integer> callable = level < TOP_LEVELS
                        ? method(service, level + 1, maxLevel, options)
                        : method1(service, level + 1, maxLevel, options);
                callables.add(callable);
            }
            return new Callable<Integer>() {
                @Override
                public Integer call() throws Exception {
                    Integer min = Integer.MAX_VALUE;
                    for (Callable<Integer> result : callables) {
                        Integer num = result.call();
                        if (min > num) min = num;
                    }
                    return min;
                }
            };
        }

        // at this level, process the branches in separate threads.
        private static Callable<Integer> method1(final ExecutorService service, final int level,
                                                 final int maxLevel, final int[] options) {
            int choices = level % 2 == 0 ? 3 : 4;
            final List<Future<Integer>> futures = new ArrayList<Future<Integer>>(choices);
            for (int i = 0; i < choices; i++) {
                options[level] = i;
                final int[] optionsCopy = options.clone();
                Future<Integer> future = service.submit(new Callable<Integer>() {
                    @Override
                    public Integer call() {
                        return method2(level + 1, maxLevel, optionsCopy);
                    }
                });
                futures.add(future);
            }
            return new Callable<Integer>() {
                @Override
                public Integer call() throws Exception {
                    Integer min = Integer.MAX_VALUE;
                    for (Future<Integer> result : futures) {
                        Integer num = result.get();
                        if (min > num) min = num;
                    }
                    return min;
                }
            };
        }

        // at these levels each task processes in its own thread.
        private static int method2(int level, int maxLevel, int[] options) {
            if (level == maxLevel) {
                return process(options);
            }
            int choices = level % 2 == 0 ? 3 : 4;
            int min = Integer.MAX_VALUE;
            for (int i = 0; i < choices; i++) {
                options[level] = i;
                int n = method2(level + 1, maxLevel, options);
                if (min > n) min = n;
            }
            return min;
        }

        private static int process(final int[] options) {
            int min = options[0];
            for (int i : options)
                if (min > i) min = i;
            called.incrementAndGet();
            return min;
        }
    }

prints

 Took 1.273 second to navigate 15 levels called 35,831,808 times 
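The call count in that output can be checked by hand. The code branches 3 ways on even levels and 4 ways on odd levels, and with maxLevels = 15 it branches on levels 0 through 13, so process is reached 3^7 × 4^7 = 12^7 times. A minimal sketch of that arithmetic (LeafCount and leaves are names made up for illustration, not part of the code above):

```java
// Hypothetical helper: counts the leaves of a tree whose branching factor
// alternates between 3 (even levels) and 4 (odd levels).
final class LeafCount {
    static long leaves(int branchingLevels) {
        long count = 1;
        for (int level = 0; level < branchingLevels; level++) {
            count *= (level % 2 == 0) ? 3 : 4;
        }
        return count;
    }

    public static void main(String[] args) {
        // maxLevels = 15 gives 14 branching levels (0..13).
        System.out.println(leaves(14)); // 35831808
    }
}
```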

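I suggest you limit the number of threads and use separate threads only for the highest levels of the tree. How many cores do you have? Once you have enough threads to keep every core busy, there is no need to create more, because extra threads only add overhead.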
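As a rough sketch of that advice, assuming a made-up tree (the children and value functions here are hypothetical stand-ins for your real state expansion and scoring), the following walks the top few levels in the calling thread, hands each subtree found at splitDepth to a fixed pool with one thread per core, and lets every task run an ordinary recursive DFS:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class TopLevelFanOut {
    // Hypothetical state expansion: 3 children on even depths, 4 on odd depths.
    static long[] children(long state, int depth) {
        int n = (depth % 2 == 0) ? 3 : 4;
        long[] next = new long[n];
        for (int i = 0; i < n; i++) {
            next[i] = state * 31 + i + 1;
        }
        return next;
    }

    // Hypothetical node value; replace with the real computation.
    static int value(long state) {
        return (int) Math.floorMod(state * 2654435761L, 1_000_003L);
    }

    // Plain single-threaded recursive DFS; returns the smallest value found at maxDepth.
    static int sequentialMin(long state, int depth, int maxDepth) {
        if (depth == maxDepth) {
            return value(state);
        }
        int min = Integer.MAX_VALUE;
        for (long child : children(state, depth)) {
            min = Math.min(min, sequentialMin(child, depth + 1, maxDepth));
        }
        return min;
    }

    // Collects the subtree roots at splitDepth; each one becomes a single task.
    static void collectRoots(long state, int depth, int splitDepth, List<Long> out) {
        if (depth == splitDepth) {
            out.add(state);
            return;
        }
        for (long child : children(state, depth)) {
            collectRoots(child, depth + 1, splitDepth, out);
        }
    }

    // Assumes 0 <= splitDepth <= maxDepth.
    static int parallelMin(long root, int splitDepth, int maxDepth) {
        ExecutorService pool =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        try {
            List<Long> roots = new ArrayList<>();
            collectRoots(root, 0, splitDepth, roots);
            List<Future<Integer>> futures = new ArrayList<>();
            for (long subtreeRoot : roots) {
                futures.add(pool.submit(() -> sequentialMin(subtreeRoot, splitDepth, maxDepth)));
            }
            int min = Integer.MAX_VALUE;
            for (Future<Integer> future : futures) {
                min = Math.min(min, future.get());
            }
            return min;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

With splitDepth = 3 this submits 3 × 4 × 3 = 36 tasks, so the number of threads stays fixed at the core count no matter how deep the search goes.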

Java has a built-in Stack class, which is thread-safe, but I would just use an ArrayList, which is more efficient.
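A minimal sketch of using an ArrayList as a stack: push appends to the end, and pop removes the last element. The helper names push and pop are made up for illustration. java.util.Stack extends Vector, whose methods are synchronized, which is where the extra cost comes from when only one thread touches the stack.

```java
import java.util.ArrayList;
import java.util.List;

final class ListStackDemo {
    // Push: append at the end of the list.
    static <T> void push(List<T> stack, T item) {
        stack.add(item);
    }

    // Pop: remove and return the last element (the list must not be empty).
    static <T> T pop(List<T> stack) {
        return stack.remove(stack.size() - 1);
    }

    public static void main(String[] args) {
        List<String> stack = new ArrayList<>();
        push(stack, "root");
        push(stack, "child");
        System.out.println(pop(stack)); // child
        System.out.println(pop(stack)); // root
    }
}
```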

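You will definitely have to use an iterative approach. The simplest one is a stack-based DFS, whose pseudocode looks something like this: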

    STACK.push(root)
    while (STACK.nonempty)
        current = STACK.pop
        if (current.done) continue
        // ... do something with node ...
        current.done = true
        FOREACH (neighbor n of current)
            if (! n.done)
                STACK.push(n)

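The time complexity is O(n + m), where n and m are the number of nodes and edges in the graph. Since you have a tree, this is O(n), and it should easily run quickly for n > 1,000,000...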
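In a tree there are no cycles, so the done flag from the pseudocode can be dropped. Here is a minimal Java sketch of the stack-based version, limited to a fixed depth and tracking the smallest leaf value. The children and value functions are hypothetical placeholders for the real state expansion and scoring, and ArrayDeque is just one reasonable choice of stack.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class IterativeDfs {
    // Hypothetical state expansion: 3 children on even depths, 4 on odd depths.
    static long[] children(long state, int depth) {
        int n = (depth % 2 == 0) ? 3 : 4;
        long[] next = new long[n];
        for (int i = 0; i < n; i++) {
            next[i] = state * 31 + i + 1;
        }
        return next;
    }

    // Hypothetical node value; replace with the real computation.
    static int value(long state) {
        return (int) Math.floorMod(state * 2654435761L, 1_000_003L);
    }

    // Iterative DFS with an explicit stack; returns the smallest value found at maxDepth.
    static int minLeafValue(long root, int maxDepth) {
        Deque<long[]> stack = new ArrayDeque<>(); // each entry is {state, depth}
        stack.push(new long[] {root, 0});
        int min = Integer.MAX_VALUE;
        while (!stack.isEmpty()) {
            long[] current = stack.pop();
            long state = current[0];
            int depth = (int) current[1];
            if (depth == maxDepth) {
                min = Math.min(min, value(state));
                continue;
            }
            for (long child : children(state, depth)) {
                stack.push(new long[] {child, depth + 1});
            }
        }
        return min;
    }

    // Recursive reference version, kept only to compare results.
    static int recursiveMin(long state, int depth, int maxDepth) {
        if (depth == maxDepth) {
            return value(state);
        }
        int min = Integer.MAX_VALUE;
        for (long child : children(state, depth)) {
            min = Math.min(min, recursiveMin(child, depth + 1, maxDepth));
        }
        return min;
    }
}
```

The explicit stack never holds more than a few entries per level of depth, so memory use stays small even when the tree itself has tens of millions of nodes.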
