插入排序,MergeSort和快速排序的测试用例

我已经实现了(在Java中)Insertion Sort,MergeSort,ModifiedMergeSort和Quick Sort:

ModifiedMergeSort具有元素“绑定”的变量。 当要排序的元素小于或等于“bound”时,请使用Insertion Sort对它们进行排序。

为什么版本1比版本3,4和5更好?

版本2和6的结果是否真实?

这是我的结果(以毫秒为单位):

Version 1 - Insertion Sort: Run-Times over 50 test runs Input Size Best-Case Worst-Case Average-Case N = 10000 14 19 14.96 N = 20000 59 60 59.3 N = 40000 234 277 243.1 Version 2 - Merge Sort: Run-Times over 50 test runs Input Size Best-Case Worst-Case Average-Case N = 10000 1 15 1.78 N = 20000 3 8 3.4 N = 40000 6 9 6.7 Version 3 - Merge Sort and Insertion Sort on 15 elements: Run-Times over 50 test runs Input Size Best-Case Worst-Case Average-Case N = 10000 41 52 45.42 N = 20000 170 176 170.56 N = 40000 712 823 728.08 Version 4 - Merge Sort and Insertion Sort on 30 elements: Run-Times over 50 test runs Input Size Best-Case Worst-Case Average-Case N = 10000 27 33 28.04 N = 20000 113 119 114.36 N = 40000 436 497 444.2 Version 5 - Merge Sort and Insertion Sort on 45 elements: Run-Times over 50 test runs Input Size Best-Case Worst-Case Average-Case N = 10000 20 21 20.68 N = 20000 79 82 79.7 N = 40000 321 383 325.64 Version 6 - Quick Sort: Run-Times over 50 test runs Input Size Best-Case Worst-Case Average-Case N = 10000 1 9 1.18 N = 20000 2 3 2.06 N = 40000 4 5 4.26 

这是我的代码:

 package com.testing; import com.sorting.InsertionSort; import com.sorting.MergeSort; import com.sorting.ModifiedMergeSort; import com.sorting.RandomizedQuickSort; /** * * @author mih1406 */ public class Main { private static final int R = 50; // # of tests private static int N = 0; // Input size private static int[] array; // Tests array private static int[] temp; // Tests array private static long InsertionSort_best = -1; private static long InsertionSort_worst = -1; private static double InsertionSort_average = -1.0; private static long MergeSort_best = -1; private static long MergeSort_worst = -1; private static double MergeSort_average = -1.0; private static long ModifiedMergeSort_V3_best = -1; private static long ModifiedMergeSort_V3_worst = -1; private static double ModifiedMergeSort_V3_average = -1.0; private static long ModifiedMergeSort_V4_best = -1; private static long ModifiedMergeSort_V4_worst = -1; private static double ModifiedMergeSort_V4_average = -1.0; private static long ModifiedMergeSort_V5_best = -1; private static long ModifiedMergeSort_V5_worst = -1; private static double ModifiedMergeSort_V5_average = -1.0; private static long RandomizedQuickSort_best = -1; private static long RandomizedQuickSort_worst = -1; private static double RandomizedQuickSort_average = -1.0; public static void main(String args[]) { StringBuilder InsertionSort_text = new StringBuilder( "Version 1 - Insertion Sort: Run-Times over 50 test runs\n"); StringBuilder MergeSort_text = new StringBuilder( "Version 2 - Merge Sort: Run-Times over 50 test runs\n"); StringBuilder ModifiedMergeSort_V3_text = new StringBuilder( "Version 3 - Merge Sort and Insertion Sort on 15 elements: Run-Times over 50 test runs\n"); StringBuilder ModifiedMergeSort_V4_text = new StringBuilder( "Version 4 - Merge Sort and Insertion Sort on 30 elements: Run-Times over 50 test runs\n"); StringBuilder ModifiedMergeSort_V5_text = new StringBuilder( "Version 5 - Merge Sort and Insertion Sort on 45 elements: Run-Times over 50 test runs\n"); StringBuilder RandomizedQuickSort_text = new StringBuilder( "Version 6 - Quick Sort: Run-Times over 50 test runs\n"); InsertionSort_text.append("Input Size\t\t" + "Best-Case\t\tWorst-Case\t\tAverage-Case\n"); MergeSort_text.append("Input Size\t\t" + "Best-Case\t\tWorst-Case\t\tAverage-Case\n"); ModifiedMergeSort_V3_text.append("Input Size\t\t" + "Best-Case\t\tWorst-Case\t\tAverage-Case\n"); ModifiedMergeSort_V4_text.append("Input Size\t\t" + "Best-Case\t\tWorst-Case\t\tAverage-Case\n"); ModifiedMergeSort_V5_text.append("Input Size\t\t" + "Best-Case\t\tWorst-Case\t\tAverage-Case\n"); RandomizedQuickSort_text.append("Input Size\t\t" + "Best-Case\t\tWorst-Case\t\tAverage-Case\n"); // Case N = 10000 N = 10000; fillArray(); testing_InsertionSort(); testing_MergeSort(); testing_ModifiedMergeSort(15); testing_ModifiedMergeSort(30); testing_ModifiedMergeSort(45); testing_RandomizedQuickSort(); InsertionSort_text.append("N = " + N + "\t\t" + InsertionSort_best + "\t\t\t" + InsertionSort_worst + "\t\t\t" + InsertionSort_average + "\n"); MergeSort_text.append("N = " + N + "\t\t" + MergeSort_best + "\t\t\t" + MergeSort_worst + "\t\t\t" + MergeSort_average + "\n"); ModifiedMergeSort_V3_text.append("N = " + N + "\t\t" + ModifiedMergeSort_V3_best + "\t\t\t" + ModifiedMergeSort_V3_worst + "\t\t\t" + ModifiedMergeSort_V3_average + "\n"); ModifiedMergeSort_V4_text.append("N = " + N + "\t\t" + ModifiedMergeSort_V4_best + "\t\t\t" + ModifiedMergeSort_V4_worst + "\t\t\t" + ModifiedMergeSort_V4_average + "\n"); ModifiedMergeSort_V5_text.append("N = " + N + "\t\t" + ModifiedMergeSort_V5_best + "\t\t\t" + ModifiedMergeSort_V5_worst + "\t\t\t" + ModifiedMergeSort_V5_average + "\n"); RandomizedQuickSort_text.append("N = " + N + "\t\t" + RandomizedQuickSort_best + "\t\t\t" + RandomizedQuickSort_worst + "\t\t\t" + RandomizedQuickSort_average + "\n"); // Case N = 20000 N = 20000; fillArray(); testing_InsertionSort(); testing_MergeSort(); testing_ModifiedMergeSort(15); testing_ModifiedMergeSort(30); testing_ModifiedMergeSort(45); testing_RandomizedQuickSort(); InsertionSort_text.append("N = " + N + "\t\t" + InsertionSort_best + "\t\t\t" + InsertionSort_worst + "\t\t\t" + InsertionSort_average + "\n"); MergeSort_text.append("N = " + N + "\t\t" + MergeSort_best + "\t\t\t" + MergeSort_worst + "\t\t\t" + MergeSort_average + "\n"); ModifiedMergeSort_V3_text.append("N = " + N + "\t\t" + ModifiedMergeSort_V3_best + "\t\t\t" + ModifiedMergeSort_V3_worst + "\t\t\t" + ModifiedMergeSort_V3_average + "\n"); ModifiedMergeSort_V4_text.append("N = " + N + "\t\t" + ModifiedMergeSort_V4_best + "\t\t\t" + ModifiedMergeSort_V4_worst + "\t\t\t" + ModifiedMergeSort_V4_average + "\n"); ModifiedMergeSort_V5_text.append("N = " + N + "\t\t" + ModifiedMergeSort_V5_best + "\t\t\t" + ModifiedMergeSort_V5_worst + "\t\t\t" + ModifiedMergeSort_V5_average + "\n"); RandomizedQuickSort_text.append("N = " + N + "\t\t" + RandomizedQuickSort_best + "\t\t\t" + RandomizedQuickSort_worst + "\t\t\t" + RandomizedQuickSort_average + "\n"); // Case N = 40000 N = 40000; fillArray(); testing_InsertionSort(); testing_MergeSort(); testing_ModifiedMergeSort(15); testing_ModifiedMergeSort(30); testing_ModifiedMergeSort(45); testing_RandomizedQuickSort(); InsertionSort_text.append("N = " + N + "\t\t" + InsertionSort_best + "\t\t\t" + InsertionSort_worst + "\t\t\t" + InsertionSort_average + "\n"); MergeSort_text.append("N = " + N + "\t\t" + MergeSort_best + "\t\t\t" + MergeSort_worst + "\t\t\t" + MergeSort_average + "\n"); ModifiedMergeSort_V3_text.append("N = " + N + "\t\t" + ModifiedMergeSort_V3_best + "\t\t\t" + ModifiedMergeSort_V3_worst + "\t\t\t" + ModifiedMergeSort_V3_average + "\n"); ModifiedMergeSort_V4_text.append("N = " + N + "\t\t" + ModifiedMergeSort_V4_best + "\t\t\t" + ModifiedMergeSort_V4_worst + "\t\t\t" + ModifiedMergeSort_V4_average + "\n"); ModifiedMergeSort_V5_text.append("N = " + N + "\t\t" + ModifiedMergeSort_V5_best + "\t\t\t" + ModifiedMergeSort_V5_worst + "\t\t\t" + ModifiedMergeSort_V5_average + "\n"); RandomizedQuickSort_text.append("N = " + N + "\t\t" + RandomizedQuickSort_best + "\t\t\t" + RandomizedQuickSort_worst + "\t\t\t" + RandomizedQuickSort_average + "\n"); System.out.println(InsertionSort_text); System.out.println(MergeSort_text); System.out.println(ModifiedMergeSort_V3_text); System.out.println(ModifiedMergeSort_V4_text); System.out.println(ModifiedMergeSort_V5_text); System.out.println(RandomizedQuickSort_text); } private static void fillArray() { // (re)creating the array array = new int[N]; // filling the array with random numbers // using for-loop and the given random generator instruction for(int i = 0; i < array.length; i++) { array[i] = (int)(1 + Math.random() * (N-0+1)); } } private static void testing_InsertionSort() { // run-times arrays long [] run_times = new long[R]; // start/finish times long start; long finish; for(int i = 0; i < R; i++) { copyArray(); // recording start time start = System.currentTimeMillis(); // testing the algorithm InsertionSort.sort(temp); // recording finish time finish = System.currentTimeMillis(); run_times[i] = finish-start; } InsertionSort_best = findMin(run_times); InsertionSort_worst = findMax(run_times); InsertionSort_average = findAverage(run_times); } private static void testing_MergeSort() { // run-times arrays long [] run_times = new long[R]; // start/finish times long start; long finish; for(int i = 0; i < R; i++) { copyArray(); // recording start time start = System.currentTimeMillis(); // testing the algorithm MergeSort.sort(temp); // recording finish time finish = System.currentTimeMillis(); run_times[i] = finish-start; } MergeSort_best = findMin(run_times); MergeSort_worst = findMax(run_times); MergeSort_average = findAverage(run_times); } private static void testing_ModifiedMergeSort(int bound) { // run-times arrays long [] run_times = new long[R]; // setting the bound ModifiedMergeSort.bound = bound; // start/finish times long start; long finish; for(int i = 0; i < R; i++) { copyArray(); // recording start time start = System.currentTimeMillis(); // testing the algorithm ModifiedMergeSort.sort(temp); // recording finish time finish = System.currentTimeMillis(); run_times[i] = finish-start; } if(bound == 15) { ModifiedMergeSort_V3_best = findMin(run_times); ModifiedMergeSort_V3_worst = findMax(run_times); ModifiedMergeSort_V3_average = findAverage(run_times); } if(bound == 30) { ModifiedMergeSort_V4_best = findMin(run_times); ModifiedMergeSort_V4_worst = findMax(run_times); ModifiedMergeSort_V4_average = findAverage(run_times); } if(bound == 45) { ModifiedMergeSort_V5_best = findMin(run_times); ModifiedMergeSort_V5_worst = findMax(run_times); ModifiedMergeSort_V5_average = findAverage(run_times); } } private static void testing_RandomizedQuickSort() { // run-times arrays long [] run_times = new long[R]; // start/finish times long start; long finish; for(int i = 0; i < R; i++) { copyArray(); // recording start time start = System.currentTimeMillis(); // testing the algorithm RandomizedQuickSort.sort(temp); // recording finish time finish = System.currentTimeMillis(); run_times[i] = finish-start; } RandomizedQuickSort_best = findMin(run_times); RandomizedQuickSort_worst = findMax(run_times); RandomizedQuickSort_average = findAverage(run_times); } private static long findMax(long[] times) { long max = times[0]; for(int i = 1; i < times.length; i++) { if(max < times[i]) { max = times[i]; } } return max; } private static long findMin(long[] times) { long min = times[0]; for(int i = 1; i  times[i]) { min = times[i]; } } return min; } private static double findAverage(long[] times) { long sum = 0; double avg; for(int i = 0; i < times.length; i++) { sum += times[i]; } avg = (double)sum/(double)times.length; return avg; } private static void copyArray() { temp = new int[N]; System.arraycopy(array, 0, temp, 0, array.length); } } 

您目前正在采取的方法似乎存在一些系统性错误。 我将陈述您面临的一些最明显的实验性问题,即使它们可能不会直接成为您实验结果的罪魁祸首。

JVM编译

正如我之前在评论中所述,JVM默认会以解释模式运行您的代码。 这意味着在您的方法中找到的每个字节码指令将在现场解释,然后运行。

这种方法的优点是,它允许您的应用程序比在每次运行启动时编译为本机代码的Java程序具有更快的启动时间。

不利的一面是,虽然没有启动性能,但是你会得到一个性能较慢的程序。

为了改善这两个问题,JVM团队进行了权衡。 最初您的程序将被解释,但JVM将收集有关哪些方法(或方法的一部分)被密集使用的信息,并且将仅编译那些似乎被大量使用的方法。 在编译时,你会获得很小的性能,但代码将比以前更快。

在进行测量时,您必须考虑这一事实。

标准方法是“预热JVM”,即稍微运行算法以确保JVM完成它需要执行的所有编译。 让JVM变暖的代码与您想要进行基准测试的代码相同非常重要,否则在您对代码进行基准测试时可能会进行一些编译。

测量时间

测量时间时,应使用System.nanoTime()而不是System.currentTimeMillis 。 我不会详细介绍,可以在这里访问。

你也应该小心。 以下代码块起初可能看起来相同,但在大多数情况下会产生不同的结果:

 totalDuration = 0; for (i = 0; i < 1000; ++i) startMeasure = now(); algorithm(); endMeasure = now(); duration = endMeasure - startMeasure; totalDuration += duration; } //... TRIALS_COUNT = 1000; startMeasure = now(); for (i = 0; i < TRIALS_COUNT; ++i) algorithm(); } endMeasure = now(); duration = endMeasure - startMeasure / TRIALS_COUNT; 

为什么? 因为特别是对于更快的algorithm()实现,它们越快,获得准确结果就越困难。

输入值大

渐近符号有助于理解不同算法如何针对n大值进行升级。 将它们应用于小输入值通常是荒谬的,因为在这种程度上,您通常需要的是计算精确的操作次数及其相关成本(类似于Jakub所做的那样)。 Big O表示法大多数时候只对大Os有意义。 Big O将告诉您算法如何处理难以忍受的输入值大小,而不是它对小数字的行为方式。 规范示例例如是QuickSort,对于大数组而言,它将是王者,但对于大小为4或5的数组而言,这通常比选择或插入排序更慢。 但是,您的输入大小似乎足够大。

最后一点

正如Chang先前所说,Jakub完成的数学练习是完全错误的,不应该被考虑在内。

自己计算复杂性。 我假设10000个样本用于以下计算:

插入排序: O(n ^ 2),10 000 * 10 000 = 100 000 000。

合并排序: O(nlogn),10 000 * log10 000 = 140 000。

合并插入(15): 15介于9(大小20的数组)和10(大小10的数组)2 ^ 10插入排序(大小10),然后2 ^ 10 * 10 000合并:1 024 * 10 * 10(插入)+ 1 024 * 10 000(合并)= 10 342 400

合并插入(30): 30介于8(大小40的数组)和9(大小20的数组)2 ^ 9插入排序(大小20),然后2 ^ 9 * 10 000合并:512 * 20 * 20 (插入)+ 512 * 10 000(合并)= 5 324 800

合并插入(45): 45介于7(大小为80的数组)和8(大小为40的数组)2 ^ 8插入排序(大小为40),然后2 ^ 8 * 10 000合并:256 * 40 * 40 (插入)+ * 10 000(合并)= 2 969 600

Quicksort:虽然最坏情况的快速排序需要O(n ^ 2),但最坏的情况必须经过精心设计才能达到这个极限。 大多数情况下,使用radomly生成的算法,平均为O(nlogn):10 000 * log10 000 = 140 000。

测量排序算法性能会变得非常痛苦,因为您需要在尽可能广泛的输入数据上有效地进行测量。

您在插入排序中看到的数字可能会因输入数字而大幅偏差。 如果在数组中仅使用0和1,并且数组是随机生成的,那么实际上您可以更容易地解决算法问题。 对于给定的情况,平均一半的数组已经排序,并且您不需要将0和1相互比较。 问题是减少到向左传输所有0,平均只需要(log(n / 2))!+ n时间。 对于10 000,实际时间是5 000!+10 000 = 133 888。