Performance issue with newFixedThreadPool vs newSingleThreadExecutor

I am trying to benchmark my client code, so I wrote a multithreaded program to do it. I want to measure how much time (95th percentile) the method below takes:

attributes = deClient.getDEAttributes(columnsList);

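Below is the multithreaded code I wrote to benchmark that method. I am seeing a lot of variation between two scenarios:

1) First, I run the multithreaded code with 20 threads for 15 minutes. I get a 95th percentile of 37ms. I am using: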

 ExecutorService service = Executors.newFixedThreadPool(20);

2) But if I run the same program for 15 minutes using:

ExecutorService service = Executors.newSingleThreadExecutor();

instead of

ExecutorService service = Executors.newFixedThreadPool(20);

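then I get a 95th percentile of 7ms, which is far lower than the number I get when running my code with newFixedThreadPool(20).

Can anyone tell me what could cause such a large performance difference between: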

newSingleThreadExecutor vs newFixedThreadPool(20)

In both cases I run my program for 15 minutes.

Below is my code:

    public static void main(String[] args) {
        try {
            // create thread pool with given size
            // ExecutorService service = Executors.newFixedThreadPool(20);
            ExecutorService service = Executors.newSingleThreadExecutor();

            long startTime = System.currentTimeMillis();
            long endTime = startTime + (15 * 60 * 1000); // Running for 15 minutes

            for (int i = 0; i < threads; i++) {
                service.submit(new ServiceTask(endTime, serviceList));
            }

            // wait for termination
            service.shutdown();
            service.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);
        } catch (InterruptedException e) {

        } catch (Exception e) {

        }
    }

Below is the class that implements the Runnable interface:

    class ServiceTask implements Runnable {

        private static final Logger LOG = Logger.getLogger(ServiceTask.class.getName());
        private static Random random = new SecureRandom();

        public static volatile AtomicInteger countSize = new AtomicInteger();

        private final long endTime;
        private final LinkedHashMap tableLists;

        // latency in ms -> number of calls observed with that latency
        // (type parameters restored; the raw type would not compile with the code below)
        public static ConcurrentHashMap<Long, Long> selectHistogram = new ConcurrentHashMap<Long, Long>();

        public ServiceTask(long endTime, LinkedHashMap tableList) {
            this.endTime = endTime;
            this.tableLists = tableList;
        }

        @Override
        public void run() {
            try {
                while (System.currentTimeMillis() <= endTime) {
                    double randomNumber = random.nextDouble() * 100.0;
                    ServiceInfo service = selectRandomService(randomNumber);

                    final String id = generateRandomId(random);
                    final List columnsList = getColumns(service.getColumns());

                    List<DEAttribute> attributes = null;

                    DEKey bk = new DEKey(service.getKeys(), id);
                    List list = new ArrayList();
                    list.add(bk);

                    Client deClient = new Client(list);

                    // time only the call being benchmarked
                    final long start = System.nanoTime();
                    attributes = deClient.getDEAttributes(columnsList);
                    final long end = System.nanoTime() - start;
                    final long key = end / 1000000L; // elapsed time in ms

                    // lock-free increment of the histogram bucket for this latency
                    boolean done = false;
                    while (!done) {
                        Long oldValue = selectHistogram.putIfAbsent(key, 1L);
                        if (oldValue != null) {
                            done = selectHistogram.replace(key, oldValue, oldValue + 1);
                        } else {
                            done = true;
                        }
                    }

                    countSize.getAndAdd(attributes.size());
                    handleDEAttribute(attributes);

                    if (BEServiceLnP.sleepTime > 0L) {
                        Thread.sleep(BEServiceLnP.sleepTime);
                    }
                }
            } catch (Exception e) {

            }
        }
    }
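For reference, a 95th percentile can be read off a latency histogram like selectHistogram roughly as follows. This is only a minimal sketch using the nearest-rank method; the class name PercentileSketch, the percentile helper, and the sample counts in main are hypothetical and not taken from the original program.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

public class PercentileSketch {

    /**
     * Nearest-rank percentile: returns the smallest latency bucket (in ms)
     * such that at least `percentile` percent of all samples are at or below it.
     */
    static long percentile(Map<Long, Long> histogram, double percentile) {
        if (histogram.isEmpty()) {
            throw new IllegalArgumentException("histogram is empty");
        }
        // Sort buckets by latency so counts can be accumulated in order.
        TreeMap<Long, Long> sorted = new TreeMap<>(histogram);

        long total = 0;
        for (long count : sorted.values()) {
            total += count;
        }

        // Rank of the sample we are looking for (1-based).
        long rank = Math.max(1, (long) Math.ceil(percentile * total / 100.0));

        long seen = 0;
        for (Map.Entry<Long, Long> entry : sorted.entrySet()) {
            seen += entry.getValue();
            if (seen >= rank) {
                return entry.getKey();
            }
        }
        return sorted.lastKey();
    }

    public static void main(String[] args) {
        // Hypothetical sample data: latency in ms -> number of calls.
        Map<Long, Long> histogram = new ConcurrentHashMap<>();
        histogram.put(5L, 90L);
        histogram.put(7L, 6L);
        histogram.put(37L, 4L);

        System.out.println("95th percentile (ms): " + percentile(histogram, 95.0));
    }
}
```

Because the buckets are whole milliseconds, the result is only accurate to 1ms, which matches the resolution of the histogram in the code above.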

Update:

Here are my processor specs. I am running my program on a Linux machine with 2 processors, each defined as:

    vendor_id       : GenuineIntel
    cpu family      : 6
    model           : 45
    model name      : Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
    stepping        : 7
    cpu MHz         : 2599.999
    cache size      : 20480 KB
    fpu             : yes
    fpu_exception   : yes
    cpuid level     : 13
    wp              : yes
    flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology tsc_reliable nonstop_tsc aperfmperf pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 popcnt aes hypervisor lahf_lm arat pln pts
    bogomips        : 5199.99
    clflush size    : 64
    cache_alignment : 64
    address sizes   : 40 bits physical, 48 bits virtual
    power management:

Can anyone tell me the reason for such a large performance difference between newSingleThreadExecutor and newFixedThreadPool(20)?

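If you run many more tasks in parallel (20 in this case) than you have processors (I doubt you have a box with 20+ processors), then yes, each individual task will take longer to complete. It is easier for the computer to execute one task at a time than to switch between multiple threads running at the same time. Even if you limit the number of threads in the pool to the number of CPUs you have, each task will probably run slower, albeit only slightly.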

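If, however, you compare the throughput (the amount of time needed to complete a given number of tasks) of your different-sized thread pools, you should see that the 20-thread pool has much higher throughput. If you execute 1000 tasks with 20 threads, they will finish much sooner overall than with just 1 thread. Each task may take longer, but the tasks execute in parallel. Given thread overhead and so on, it probably won't be 20 times faster, but it might be something like 15 times faster.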
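The difference between per-task latency and overall throughput can be seen with a rough sketch like the one below. It is an illustration only: the class name ThroughputSketch and the runTasks helper are hypothetical, and Thread.sleep stands in for a blocking call such as getDEAttributes, so the speedup it shows applies to tasks that mostly wait rather than compute.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ThroughputSketch {

    /**
     * Runs taskCount simulated blocking tasks on a fixed pool of poolSize
     * threads and returns the total wall-clock time in milliseconds.
     */
    static long runTasks(int poolSize, int taskCount, long taskMillis) {
        ExecutorService service = Executors.newFixedThreadPool(poolSize);
        try {
            long start = System.nanoTime();
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                futures.add(service.submit(() -> {
                    try {
                        // Stand-in for a blocking remote call (assumption).
                        Thread.sleep(taskMillis);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }));
            }
            // Wait for every task to finish before stopping the clock.
            for (Future<?> future : futures) {
                future.get();
            }
            return (System.nanoTime() - start) / 1_000_000L;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            service.shutdown();
        }
    }

    public static void main(String[] args) {
        int tasks = 100;
        long single = runTasks(1, tasks, 10);
        long pooled = runTasks(20, tasks, 10);
        System.out.println("1 thread:   " + single + " ms for " + tasks + " tasks");
        System.out.println("20 threads: " + pooled + " ms for " + tasks + " tasks");
    }
}
```

With purely CPU-bound tasks the gain would flatten out near the number of available cores, which is why the benchmark should report throughput alongside the per-call percentile.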

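You should not worry about the speed of individual tasks. Instead, try to maximize task throughput by tuning the number of threads in the pool. How many threads to use depends heavily on the amount of IO, the CPU cycles used by each task, locks, synchronized blocks, other applications running on the OS, and other factors.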

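People often use 1-2 times the number of CPUs as a good starting point for the number of threads in the pool to maximize throughput. If the tasks make more IO requests or do more thread-blocking operations, add more threads. If they are more CPU bound, reduce the number of threads so it is closer to the number of CPUs available. If your application is competing for OS cycles with other, more important applications on the server, even fewer threads may be appropriate.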
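As a starting point, the pool size can be derived from the CPU count the JVM reports, along these lines. This is just a sketch of the rule of thumb above: the class name PoolSizeSketch, the startingPoolSize helper, and the multiplier of 2.0 are assumptions to be tuned by measurement, not recommended values.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PoolSizeSketch {

    /**
     * Returns cpus * multiplier rounded to the nearest integer, never less than 1.
     * A multiplier between 1.0 and 2.0 matches the rule of thumb; IO-heavy
     * workloads may want more, CPU-bound workloads closer to 1.0.
     */
    static int startingPoolSize(int cpus, double multiplier) {
        return Math.max(1, (int) Math.round(cpus * multiplier));
    }

    public static void main(String[] args) {
        // Number of processors available to this JVM.
        int cpus = Runtime.getRuntime().availableProcessors();
        int poolSize = startingPoolSize(cpus, 2.0);

        ExecutorService service = Executors.newFixedThreadPool(poolSize);
        System.out.println("CPUs: " + cpus + ", starting pool size: " + poolSize);
        service.shutdown();
    }
}
```

From there, rerun the benchmark with a few different sizes and keep the one that completes the most calls in the 15-minute window.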
