lucene-dev mailing list archives

From "Yonik Seeley (JIRA)" <>
Subject [jira] Commented: (LUCENE-460) hashCode improvements
Date Sun, 30 Oct 2005 15:04:55 GMT

Yonik Seeley commented on LUCENE-460:

A couple of guidelines off the top of my head...
- hash codes should strive to be unique across the whole Query hierarchy, not just unique
  within one specific subclass.  For example, TermQuery(t) and SpanTermQuery(t) will
  generate the exact same hash code.
- mix bits between different components that have any hashCode parts in common...
  for example, RangeQuery will produce the same hashCode whenever lowerTerm==upperTerm.
  Also, field[x TO y] will produce the same hashCode for *any* field, since the fieldname
  parts of the terms will always cancel each other out.  This will also cause the hashCode
  of field{x TO x} to equal that of field:x.
  The hashCode of FilteredQuery will also cause many collisions because the bits aren't
  mixed between the query and the filter.
  Remember that every query has a boost component... never just xor two query hashCodes
  together.
- make things position dependent.
  Currently, field[x TO y] will produce the same hashCode as field[y TO x]... not
  particularly important for RangeQuery, but you get the idea.
- don't be afraid of using "+" instead of "^".  They both take a single CPU cycle, but "+"
is not quite so easily (accidentally) reversed.
- flipping more than a single bit when hashing a boolean might be a good idea - it will make
  collisions harder.

 is an interesting link on integer hash codes (what we are in effect doing when we combine
multiple hash codes).  Especially interesting is the section "Parallel Operations".
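
To make the guidelines above concrete, here is a rough Java sketch. It is not the Lucene code: the class HashMixingSketch, the helpers termHash, xorRangeHash and mixedRangeHash, and all of the constants are made up for illustration, and termHash only assumes that a term's hash is its field hash xor'ed with its text hash. The first method shows how a plain xor cancels the field and ignores order; the second applies a per-class salt, position-dependent multiply-and-add mixing, a multi-bit boolean, and the boost.

```java
class HashMixingSketch {

    // Hypothetical stand-in for a term hash: field hash xor'ed with text hash.
    static int termHash(String field, String text) {
        return field.hashCode() ^ text.hashCode();
    }

    // Naive combination: xor of the two term hashes.
    // The field hash appears twice and cancels, and the order of the terms is lost.
    static int xorRangeHash(String field, String lower, String upper) {
        return termHash(field, lower) ^ termHash(field, upper);
    }

    // Mixed combination: each component is folded in with multiply-and-add,
    // so its position matters and repeated parts cannot cancel.
    static int mixedRangeHash(String field, String lower, String upper,
                              boolean inclusive, float boost) {
        int h = 0x5ca1ab1e;                              // arbitrary per-class salt
        h = 31 * h + field.hashCode();
        h = 31 * h + lower.hashCode();
        h = 31 * h + upper.hashCode();
        h = 31 * h + (inclusive ? 0x2742e74a : 0x1b3c5d7f); // flips many bits, not one
        h = 31 * h + Float.floatToIntBits(boost);        // the boost is part of the query
        return h;
    }

    public static void main(String[] args) {
        // xor: any field gives the same value, and swapping the terms changes nothing
        System.out.println(xorRangeHash("a", "x", "y") == xorRangeHash("b", "x", "y"));
        System.out.println(xorRangeHash("a", "x", "y") == xorRangeHash("a", "y", "x"));
        // mixed: field and order both change the value
        System.out.println(mixedRangeHash("a", "x", "y", true, 1.0f)
                == mixedRangeHash("b", "x", "y", true, 1.0f));
        System.out.println(mixedRangeHash("a", "x", "y", true, 1.0f)
                == mixedRangeHash("a", "y", "x", true, 1.0f));
    }
}
```

The multiplier 31 is just the conventional choice from String.hashCode; any odd multiplier keeps the mixing step invertible, so two inputs that differ in a single component cannot collapse to the same value at that step.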

> hashCode improvements
> ---------------------
>          Key: LUCENE-460
>          URL:
>      Project: Lucene - Java
>         Type: Improvement
>   Components: Search
>     Versions: CVS Nightly - Specify date in submission
>     Reporter: Yonik Seeley
>     Priority: Minor
>      Fix For: CVS Nightly - Specify date in submission

> It would be nice for all Query classes to implement hashCode and equals to enable them
> to be used as keys when caching.
