创建长度为N的新数组的最快方法,并通过重复给定数组来填充它

我想分配一个长度为N的新数组,并通过重复给定的数组来填充它。 界面如下所示:

 T[] repeat(T[] array, int n); 

澄清我的意思是一个小例子:

 String a = {"a", "b", "c"}; // b = {"a", "b", "c", "a", "b", "c", "a", "b", "c", "a"} String b = repeat(a, 10); 

大多数程序员都会提出以下解决方案(为了简化数组生成,选择了特定类型):

 public String[] repeat(String[] array, int n) { String[] repeated = new String[n]; for (int i = 0; i < n; i++) { repeated[i] = array[i % array.length]; } return repeated; } 

有更快的方法吗?

我想出了这个通用的解决方案:

 public static  T[] repeat(T[] arr, int newLength) { T[] dup = Arrays.copyOf(arr, newLength); for (long last = arr.length; last != 0 && last < newLength; last <<= 1) { System.arraycopy(dup, 0, dup, (int) last, (int) (Math.min(last << 1, newLength) - last)); } return dup; } 

理论

System.arraycopy是本机调用。 因此它非常快,但并不意味着它是最快的方式。

每个其他解决方案都按元素复制数组元素。 我的解决方案复制更大的块。 每次迭代都会复制数组中的现有元素,这意味着循环最多只能运行log2(n)次。

分析报告

输入

 TEST_ARRAY = { "a", "b", "c", "d", "e", "f" } NEW_LENGTH = 10000 

以下是重现结果的基准代码:

 import org.openjdk.jmh.annotations.Benchmark; import org.openjdk.jmh.annotations.BenchmarkMode; import org.openjdk.jmh.annotations.Fork; import org.openjdk.jmh.annotations.Measurement; import org.openjdk.jmh.annotations.Mode; import org.openjdk.jmh.annotations.OutputTimeUnit; import org.openjdk.jmh.annotations.Scope; import org.openjdk.jmh.annotations.State; import org.openjdk.jmh.annotations.Threads; import org.openjdk.jmh.annotations.Warmup; @Fork(3) @BenchmarkMode(Mode.AverageTime) @Measurement(iterations = 10, timeUnit = TimeUnit.NANOSECONDS) @State(Scope.Benchmark) @Threads(1) @Warmup(iterations = 5, timeUnit = TimeUnit.NANOSECONDS) @OutputTimeUnit(TimeUnit.NANOSECONDS) public class MyBenchmark { private static final String[] TEST_ARRAY = { "a", "b", "c", "d", "e", "f" }; private static final int NEW_LENGTH = 10_000; @Benchmark public String[] testNativeCall() { String[] dup = Arrays.copyOf(TEST_ARRAY, NEW_LENGTH); for (int last = TEST_ARRAY.length; last != 0 && last < NEW_LENGTH; last <<= 1) { System.arraycopy(dup, 0, dup, last, Math.min(last << 1, NEW_LENGTH) - last); } return dup; } @Benchmark public String[] testLoopModulo() { String[] arr = new String[NEW_LENGTH]; for (int i = 0; i < NEW_LENGTH; i++) { arr[i] = arr[i % TEST_ARRAY.length]; } return arr; } @Benchmark public String[] testArrayList() { List initialLetters = Arrays.asList(TEST_ARRAY); List results = new ArrayList<>(); int indexOfLetterToAdd = 0; for (int i = 0; i < 10000; i++) { results.add(initialLetters.get(indexOfLetterToAdd++)); if (indexOfLetterToAdd == initialLetters.size()) { indexOfLetterToAdd = 0; } } return results.toArray(new String[results.size()]); } @Benchmark public String[] testLoopReset() { String result[] = new String[NEW_LENGTH]; for (int i = 0, j = 0; i < NEW_LENGTH && j < TEST_ARRAY.length; i++, j++) { result[i] = TEST_ARRAY[j]; if (j == TEST_ARRAY.length - 1) { j = -1; } } return result; } @Benchmark public String[] testStream() { String[] result = Stream.iterate(TEST_ARRAY, x -> x).flatMap(x -> Stream.of(TEST_ARRAY)).limit(NEW_LENGTH) .toArray(String[]::new); return result; } } 

结果

 Benchmark Mode Cnt Score Error Units MyBenchmark.testNativeCall avgt 30 4154,553 ± 11,242 ns/op MyBenchmark.testLoopModulo avgt 30 19273,717 ± 235,547 ns/op MyBenchmark.testArrayList avgt 30 71079,139 ± 2686,136 ns/op MyBenchmark.testLoopReset avgt 30 18307,368 ± 202,520 ns/op MyBenchmark.testStream avgt 30 68898,278 ± 2488,104 ns/op 

如您所见,本机调用方法是重复数组的最快方法。

其他结果

此外,我被要求用各种输入对这些方法进行基准测试。

输入范围

 // Array size not fixed anymore - filled with random elements SIZE = { 100, 1000, 100000, 1000000 } NEW_LENGTH = { 100, 1000, 100000, 1000000 } 

这意味着有SIZE x NEW_LENGTH测试,结果如下:

 Benchmark (NEW_LENGTH) (SIZE) Mode Cnt Score Error Units MyBenchmark.testArrayList 100 100 avgt 30 706,274 ± 6,787 ns/op MyBenchmark.testArrayList 100 1000 avgt 30 692,586 ± 15,076 ns/op MyBenchmark.testArrayList 100 100000 avgt 30 685,214 ± 6,747 ns/op MyBenchmark.testArrayList 100 1000000 avgt 30 685,333 ± 5,493 ns/op MyBenchmark.testArrayList 1000 100 avgt 30 7170,897 ± 63,221 ns/op MyBenchmark.testArrayList 1000 1000 avgt 30 7180,612 ± 93,280 ns/op MyBenchmark.testArrayList 1000 100000 avgt 30 6818,585 ± 197,859 ns/op MyBenchmark.testArrayList 1000 1000000 avgt 30 6810,614 ± 139,456 ns/op MyBenchmark.testArrayList 100000 100 avgt 30 597614,173 ± 6446,318 ns/op MyBenchmark.testArrayList 100000 1000 avgt 30 580696,750 ± 5141,845 ns/op MyBenchmark.testArrayList 100000 100000 avgt 30 598657,608 ± 5126,519 ns/op MyBenchmark.testArrayList 100000 1000000 avgt 30 595529,027 ± 4981,095 ns/op MyBenchmark.testArrayList 1000000 100 avgt 30 6836746,484 ± 38848,467 ns/op MyBenchmark.testArrayList 1000000 1000 avgt 30 6745066,786 ± 57971,469 ns/op MyBenchmark.testArrayList 1000000 100000 avgt 30 7130391,072 ± 50583,914 ns/op MyBenchmark.testArrayList 1000000 1000000 avgt 30 8791342,042 ± 172323,938 ns/op MyBenchmark.testLoopModulo 100 100 avgt 30 301,252 ± 1,195 ns/op MyBenchmark.testLoopModulo 100 1000 avgt 30 301,988 ± 2,056 ns/op MyBenchmark.testLoopModulo 100 100000 avgt 30 299,892 ± 1,776 ns/op MyBenchmark.testLoopModulo 100 1000000 avgt 30 300,468 ± 2,569 ns/op MyBenchmark.testLoopModulo 1000 100 avgt 30 3277,018 ± 14,880 ns/op MyBenchmark.testLoopModulo 1000 1000 avgt 30 3275,648 ± 21,742 ns/op MyBenchmark.testLoopModulo 1000 100000 avgt 30 3258,570 ± 27,360 ns/op MyBenchmark.testLoopModulo 1000 1000000 avgt 30 3259,617 ± 28,747 ns/op MyBenchmark.testLoopModulo 100000 100 avgt 30 321483,331 ± 4320,938 ns/op MyBenchmark.testLoopModulo 100000 1000 avgt 30 326319,662 ± 2419,602 ns/op MyBenchmark.testLoopModulo 100000 100000 avgt 30 327027,966 ± 3174,011 ns/op MyBenchmark.testLoopModulo 100000 1000000 avgt 30 319201,057 ± 4472,220 ns/op MyBenchmark.testLoopModulo 1000000 100 avgt 30 3053122,364 ± 31814,342 ns/op MyBenchmark.testLoopModulo 1000000 1000 avgt 30 3134151,676 ± 108227,023 ns/op MyBenchmark.testLoopModulo 1000000 100000 avgt 30 3220082,188 ± 43925,401 ns/op MyBenchmark.testLoopModulo 1000000 1000000 avgt 30 3204777,236 ± 25365,542 ns/op MyBenchmark.testLoopReset 100 100 avgt 30 159,828 ± 1,107 ns/op MyBenchmark.testLoopReset 100 1000 avgt 30 125,461 ± 0,881 ns/op MyBenchmark.testLoopReset 100 100000 avgt 30 129,912 ± 7,801 ns/op MyBenchmark.testLoopReset 100 1000000 avgt 30 134,503 ± 7,602 ns/op MyBenchmark.testLoopReset 1000 100 avgt 30 1809,207 ± 93,642 ns/op MyBenchmark.testLoopReset 1000 1000 avgt 30 1728,705 ± 70,808 ns/op MyBenchmark.testLoopReset 1000 100000 avgt 30 1354,887 ± 9,631 ns/op MyBenchmark.testLoopReset 1000 1000000 avgt 30 1350,327 ± 15,886 ns/op MyBenchmark.testLoopReset 100000 100 avgt 30 159680,209 ± 2477,183 ns/op MyBenchmark.testLoopReset 100000 1000 avgt 30 162030,985 ± 1949,660 ns/op MyBenchmark.testLoopReset 100000 100000 avgt 30 149299,890 ± 1516,486 ns/op MyBenchmark.testLoopReset 100000 1000000 avgt 30 136059,242 ± 3090,410 ns/op MyBenchmark.testLoopReset 1000000 100 avgt 30 1407369,992 ± 12979,717 ns/op MyBenchmark.testLoopReset 1000000 1000 avgt 30 1447001,173 ± 14979,769 ns/op MyBenchmark.testLoopReset 1000000 100000 avgt 30 1463913,706 ± 12564,617 ns/op MyBenchmark.testLoopReset 1000000 1000000 avgt 30 1404701,860 ± 21587,436 ns/op MyBenchmark.testNativeCall 100 100 avgt 30 58,306 ± 0,669 ns/op MyBenchmark.testNativeCall 100 1000 avgt 30 57,441 ± 0,590 ns/op MyBenchmark.testNativeCall 100 100000 avgt 30 57,595 ± 0,386 ns/op MyBenchmark.testNativeCall 100 1000000 avgt 30 60,196 ± 1,995 ns/op MyBenchmark.testNativeCall 1000 100 avgt 30 450,808 ± 8,259 ns/op MyBenchmark.testNativeCall 1000 1000 avgt 30 558,079 ± 5,724 ns/op MyBenchmark.testNativeCall 1000 100000 avgt 30 557,246 ± 4,873 ns/op MyBenchmark.testNativeCall 1000 1000000 avgt 30 565,005 ± 9,696 ns/op MyBenchmark.testNativeCall 100000 100 avgt 30 73074,811 ± 3332,432 ns/op MyBenchmark.testNativeCall 100000 1000 avgt 30 70970,603 ± 2693,394 ns/op MyBenchmark.testNativeCall 100000 100000 avgt 30 69907,864 ± 2945,072 ns/op MyBenchmark.testNativeCall 100000 1000000 avgt 30 74041,205 ± 2599,841 ns/op MyBenchmark.testNativeCall 1000000 100 avgt 30 790679,353 ± 15672,480 ns/op MyBenchmark.testNativeCall 1000000 1000 avgt 30 812660,137 ± 25490,999 ns/op MyBenchmark.testNativeCall 1000000 100000 avgt 30 838094,181 ± 12374,194 ns/op MyBenchmark.testNativeCall 1000000 1000000 avgt 30 925567,535 ± 19091,943 ns/op MyBenchmark.testStream 100 100 avgt 30 810,262 ± 54,519 ns/op MyBenchmark.testStream 100 1000 avgt 30 1344,998 ± 14,792 ns/op MyBenchmark.testStream 100 100000 avgt 30 159901,562 ± 3453,210 ns/op MyBenchmark.testStream 100 1000000 avgt 30 1407506,571 ± 419985,287 ns/op MyBenchmark.testStream 1000 100 avgt 30 6464,099 ± 169,665 ns/op MyBenchmark.testStream 1000 1000 avgt 30 5869,457 ± 260,297 ns/op MyBenchmark.testStream 1000 100000 avgt 30 165394,656 ± 4943,362 ns/op MyBenchmark.testStream 1000 1000000 avgt 30 1352900,959 ± 412849,634 ns/op MyBenchmark.testStream 100000 100 avgt 30 423531,274 ± 3944,801 ns/op MyBenchmark.testStream 100000 1000 avgt 30 391727,181 ± 5341,826 ns/op MyBenchmark.testStream 100000 100000 avgt 30 427462,700 ± 7517,953 ns/op MyBenchmark.testStream 100000 1000000 avgt 30 981304,769 ± 10206,849 ns/op MyBenchmark.testStream 1000000 100 avgt 30 4528465,859 ± 72959,405 ns/op MyBenchmark.testStream 1000000 1000 avgt 30 4121720,516 ± 60283,781 ns/op MyBenchmark.testStream 1000000 100000 avgt 30 5920334,609 ± 63051,631 ns/op MyBenchmark.testStream 1000000 1000000 avgt 30 6227476,270 ± 84066,493 ns/op 

正如预期的那样,本机调用总是领先(大约比循环版本快2倍)。

你不需要在每个循环上使用Min ,你可以将其提取到之后。 您还可以在每个循环中将偏移量加倍。

然而,我的速度似乎并不快,有时速度更快,有时也不快,但它可以替代Min调用。

 Benchmark Mode Samples Score Score error Units ceMyBenchmark.testNativeCall avgt 30 4027.890 26.544 ns/op ceMyBenchmark.westonsRepeat avgt 30 4035.817 49.168 ns/op 

码:

 @Benchmark public String[] westonsRepeat() { String[] sourceArray = TEST_ARRAY; int newLength = NEW_LENGTH; int offset = sourceArray.length; String[] dup = Arrays.copyOf(sourceArray, newLength); if (offset == 0 || offset >= newLength) return dup; int next = offset << 1; while (next < newLength) { System.arraycopy(dup, 0, dup, offset, offset); offset = next; next <<= 1; } System.arraycopy(dup, 0, dup, offset, newLength - offset); return dup; } 

我尝试的另一件事是计算我可以安全地加倍的次数,而这是一些我认为有趣的显示死胡同的表现:

 int timesCanDouble = (int)(log2(newLength) - log2(offset)); => int timesCanDouble = (int)(log2(newLength/offset)); => int timesCanDouble = 31 - numberOfLeadingZeros(newLength/offset));