Here is a direct replacement for Java’s String.hashCode() method implemented in Javascript.
I wrote this function to fulfill a requirement at work. Apparently, the back-end engineers thought hashCode() was a standard function. One of the hurdles for this project was not only figuring out how to translate the mathematical formula used in Java to generate hashCode()’s but also how to force Javascript to use 32bit integer math (no small feat).
Fortunately, I discovered that Java supports bitwise operators which are constrained to 32bit integer math.
So here’s the resulting String prototype in Javascript. With this prototype you can simply call .hashCode() on any string, ie. “some string”.hashCode(), and receive a numerical hash code (more specifically, a Java equivalent) such as 1395333309.
String.prototype.hashCode = function(){ var hash = 0; if (this.length == 0) return hash; for (i = 0; i < this.length; i++) { char = this.charCodeAt(i); hash = ((hash<<5)-hash)+char; hash = hash & hash; // Convert to 32bit integer } return hash; }
A left shift would be faster than a multiply.
The length better be low enough that 31^length < 2^31.
The length should be approx. less than 10 characters, or it reaches maxint boundaries and shifts to a minus integer.
Java must handle this some other way.
code is undefined, not 0
declare your variables properly and you won’t get such bugs
I’m guessing this line
if (this.length == 0) return code;
Is meant to be
if (this.length == 0) return hash;
As you have no variable ‘code’ anywhere. And agree about the shift operations, usually a lot quicker than multiplies..
You’re right, that should be hash instead of code.
As for the negative numbers produced by long strings, that is desired behavior because Java calculates hashCodes in 32bit.
I’ve also updated the post to use a left shift.
I get negative hash with some strings. Is that OK? How can I make it positive in all cases?
Yes, this function was modeled on Java’s hashCode function: http://docs.oracle.com/javase/1.4.2/docs/api/java/lang/String.html#hashCode()
..which uses Java’s int data type which, being 32bit, means that large results will be negative:
http://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html
May be
for (var i = 0; i < this.length; i++) {
not
for (i = 0; i < this.length; i++) {
And I think if we cash this.length it was more faster, finally:
String.prototype.hashCode = function(){
var hash = 0, len = this.length;
if ( len === 0 ) return hash;
for( var i = 0; i < len; ++i ) {
char = this.charCodeAt(i);
hash = ((hash<<5)-hash)+char;
hash = hash & hash; // Convert to 32bit integer
}
return hash;
}
char is reserved keyword, and shouldn’t be used as variable name. For example yui-compressor fails to pack code that includes invalid variable names.
same function, 85 bytes:
function(e){for(var r=0,i=0;i<e.length;i++)r=(r<<5)-r+e.charCodeAt(i),r&=r;return r};
Thanks! Just what I was looking for!
Great little function. Needed this for a non-secure simple hash comparison function and it is an efficient way of doing it.
Pingback: Stub: 2013 Week 6 « andrewBridge
This demonstrates my lack of understanding hashes probably…but, how do you reverse the process. Once you’ve converted the string to a hash, how do you turn it back into a string?
Hey Adam, here is a similar question on StackOverflow. Basically the answer is that a hash is a one-way function and so there is no way to reverse it with any certainty that there aren’t possible collisions with another string. In fact, since java’s string hash calculation is 32bit there is a very good chance that any string you generate a hash for will find a collision. Here is more info on collision probabilities.
mileusna, wes: I was looking at the negatives too and discovered from http://stackoverflow.com/questions/1908492/unsigned-integer-in-javascript that operator >>> can be used to convert Number-wrapped int32’s to uint32, so change e.g. return hash; to return (hash >>> 0);
Another remark – while the actual difference may be rather small, ++i should be faster than i++ and at least looks faster to developers with C/C++ background. ++i also gives better ops/sec on ff and chromium @ http://jsperf.com/ipp-vs-ppi-2/2.
Pingback: Hash Function in Javascript
There are two additional improvements that could be made to this:
i is not being var’d.
So your var statement could look like this instead:
var hash = 0
i = 0;
And instead of this line:
hash = hash & hash;
You could change it to this:
hash &= hash;
Hey,
Thank you so much for sharing! I came here, as many others, via the StackOverflow Answer http://stackoverflow.com/a/8831937/841636
I have adapted the excellent original function so it
* doesn’t use the String’s Prototype,
* declares every variable used
* avoids the reserved word “char”
* takes #8 Andreys suggestion of caching string length
utils.checksum = function(s) {
var hash = 0,
strlen = s.length,
i,
c;
if ( strlen === 0 ) {
return hash;
}
for ( i = 0; i < strlen; i++ ) {
c = s.charCodeAt( i );
hash = ((hash << 5) – hash) + c;
hash = hash & hash; // Convert to 32bit integer
}
return hash;
};
Pingback: Implement Local Cache of Fetched AJAX Requests – Latest posts about technology