lucene-java-user mailing list archives

From Adrien Grand <jpou...@gmail.com>
Subject Re: any plan to support hash index?
Date Mon, 24 Apr 2017 08:19:01 GMT
On Mon, Apr 24, 2017 at 05:02, 马可阳 <makeyang@jd.com> wrote:

> Last week I did a test:
> Use a pk to search Elasticsearch, and its TPS is 16000+ (I use mmap fs,
> filter etc. to optimize search performance), while I use Redis to do it and
> its TPS is 80000+
>

If you want optimal performance, you should use the GET API. The search API
introduces overhead like opening index inputs, creating weights,
interacting with the query cache, etc. which you can skip if you use the
GET API. It can also apply deleted docs more efficiently since it knows
there is at most one match for any key.
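
To make the difference concrete, here is a rough sketch of the two request shapes for a primary-key lookup. It only builds the paths and the body and sends nothing. The class and method names are hypothetical, the index name and ID are made up, and the `_doc` path segment is an assumption that depends on the Elasticsearch version (older releases put a type name there instead).

```java
public class PkLookupRequests {
    // GET API: the key is part of the path, so no query has to be parsed or executed.
    // Assumes the ID is URL-safe; "_doc" is an assumed, version-dependent segment.
    public static String getPath(String index, String id) {
        return "/" + index + "/_doc/" + id;
    }

    // Search API: the same lookup goes through the regular query path.
    public static String searchPath(String index) {
        return "/" + index + "/_search";
    }

    // Body of an ids query matching a single document ID.
    // Assumes the ID contains no characters that need JSON escaping.
    public static String searchBody(String id) {
        return "{\"query\":{\"ids\":{\"values\":[\"" + id + "\"]}}}";
    }

    public static void main(String[] args) {
        System.out.println("GET  " + getPath("products", "42"));
        System.out.println("POST " + searchPath("products") + " " + searchBody("42"));
    }
}
```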

> There is a bunch of overhead like network and other stuff I’ll exclude
> later, but I wonder if Lucene can build a hash index to support this
> scenario?
>

I believe you could implement a custom PostingsFormat that looks up terms
using hashing. It would have some limitations, e.g. it would not be possible
to run prefix or range queries, but for a primary key this should not be a
problem. It could be an interesting addition to the Lucene/codecs module.
However, I need to warn you that we are quite unlikely to end up supporting
another postings format from a backward compatibility perspective: it is a
lot of effort, so we only want to support one.
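
To illustrate the idea only, here is a minimal sketch of a hash-based term lookup for a primary-key field. It is plain Java and does not use Lucene's PostingsFormat API; the class name, the open-addressing layout and the -1 "not found" value are all assumptions made for the example. It relies on each key matching at most one document, so a term maps to a single doc ID rather than a postings list.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch: an open-addressing hash table from primary-key term to doc ID.
public class HashTermDictSketch {
    private final byte[][] keys;   // term bytes, null marks an empty slot
    private final int[] docIds;    // doc ID stored for the term in the same slot
    private final int mask;
    private int size;

    public HashTermDictSketch(int expectedKeys) {
        // Power-of-two capacity of at least twice the expected key count.
        int needed = Math.max(1, expectedKeys) * 2;
        int capacity = Integer.highestOneBit(needed);
        if (capacity < needed) capacity <<= 1;
        keys = new byte[capacity][];
        docIds = new int[capacity];
        mask = capacity - 1;
    }

    private int slotFor(byte[] term) {
        int h = Arrays.hashCode(term);
        return (h ^ (h >>> 16)) & mask;
    }

    // Adds a key, or overwrites the doc ID if the key is already present.
    public void put(String key, int docId) {
        byte[] term = key.getBytes(StandardCharsets.UTF_8);
        int slot = slotFor(term);
        while (keys[slot] != null && !Arrays.equals(keys[slot], term)) {
            slot = (slot + 1) & mask; // linear probing
        }
        if (keys[slot] == null) {
            if (size * 2 >= keys.length) throw new IllegalStateException("table full");
            keys[slot] = term;
            size++;
        }
        docIds[slot] = docId;
    }

    // Exact-match lookup: returns the doc ID, or -1 if the key is absent.
    public int lookup(String key) {
        byte[] term = key.getBytes(StandardCharsets.UTF_8);
        int slot = slotFor(term);
        while (keys[slot] != null) {
            if (Arrays.equals(keys[slot], term)) return docIds[slot];
            slot = (slot + 1) & mask;
        }
        return -1;
    }

    public static void main(String[] args) {
        HashTermDictSketch dict = new HashTermDictSketch(4);
        dict.put("user-1", 0);
        dict.put("user-2", 1);
        System.out.println(dict.lookup("user-2"));
        System.out.println(dict.lookup("user-3"));
    }
}
```

Slots are ordered by hash value rather than by term, which is why prefix and range queries would not be possible without scanning every entry.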
