更快的哈希函数

我正在尝试实现自己的哈希函数，我使用java将每个字符串的ASCII编号相加。我通过查找哈希表大小和总和来找到哈希码。大小％之和。我想知道在搜索字符串时是否有办法使用相同的过程但减少冲突？

提前致谢。

我会查看String和HashMap的代码，因为它们具有较低的冲突率并且不使用%并处理负数。

来自String的来源

 public int hashCode() { int h = hash; if (h == 0 && value.length > 0) { char val[] = value; for (int i = 0; i < value.length; i++) { h = 31 * h + val[i]; } hash = h; } return h; }

来自HashMap的源代码

 /** * Retrieve object hash code and applies a supplemental hash function to the * result hash, which defends against poor quality hash functions. This is * critical because HashMap uses power-of-two length hash tables, that * otherwise encounter collisions for hashCodes that do not differ * in lower bits. Note: Null keys always map to hash 0, thus index 0. */ final int hash(Object k) { int h = 0; if (useAltHashing) { if (k instanceof String) { return sun.misc.Hashing.stringHash32((String) k); } h = hashSeed; } h ^= k.hashCode(); // This function ensures that hashCodes that differ only by // constant multiples at each bit position have a bounded // number of collisions (approximately 8 at default load factor). h ^= (h >>> 20) ^ (h >>> 12); return h ^ (h >>> 7) ^ (h >>> 4); }

由于HashMap总是可以使用2的大小

  hash = (null != key) ? hash(key) : 0; bucketIndex = indexFor(hash, table.length);

和

 /** * Returns index for hash code h. */ static int indexFor(int h, int length) { return h & (length-1); }

使用&比%快得多，只返回正数，因为长度是正数。

Java String.hashcode（）在一个非常好的哈希函数和尽可能高效的哈希函数之间进行权衡。简单地在字符串中添加字符值不是可靠的散列函数。

例如，考虑两个字符串dog和god 。由于它们都包含’d’，’g’和’o’，因此任何仅涉及添加的方法都不会产生不同的哈希码。

实现了Java的很好部分的Joshua Bloch在他的“ Effective Java”一书中讨论了String.hashCode（）方法，并讨论了在1.3之前的Java版本中，String.hashCode（）函数如何仅考虑16个字符在给定的字符串中。这比当前的实施运行得快一些，但在某些情况下导致的性能令人震惊。

通常，如果您的特定数据集定义非常明确并且可以利用其中的某些唯一性，那么您可能可以创建更好的散列函数。对于通用字符串，祝你好运。

更快的哈希函数

java xslt教程

（java）ObjectInputStream反序列化错误版本的对象

在persistence.xml中引用Tomcat JNDI数据源

Maven Groovy和Java + Lombok

如何安全地删除java中的文件

双重检查锁定没有波动

如何在Java中解析WSDL？

XML中的不需要的元素通过XSTREAM

增加tomcat 7中的并发连接数

保存到属性文件转义：