lucene-java-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Storing Json field in Lucene
Date Wed, 22 Apr 2020 12:09:44 GMT
"Is it good idea to store complete Json as string to Lucene DB. If we store as separate fields
then we have around 30 fields. There will be 30 seeks to get complete stored fields"

This is not true. Under the covers, all the stored fields are compressed and stored as a blob
and Lucene does the magic of un-compressing that blob and extracting the stored field when
you ask for it.
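
To make the "one blob, not 30 seeks" point concrete, here is a minimal sketch in plain Java. It is a toy model, not Lucene's on-disk format or API: the class name StoredBlobSketch, the pack/unpack helpers, and the use of java.util.zip are all assumptions made for illustration. It only shows that 30 fields written as one compressed unit come back from a single read.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Toy model only: NOT Lucene's on-disk format or API.
final class StoredBlobSketch {

    // Serialize every field of one document into a single compressed byte array.
    static byte[] pack(Map<String, String> fields) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(new DeflaterOutputStream(bytes))) {
            out.writeInt(fields.size());
            for (Map.Entry<String, String> e : fields.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeUTF(e.getValue());
            }
        }
        return bytes.toByteArray();
    }

    // Decompress the blob once and recover all fields from it.
    static Map<String, String> unpack(byte[] blob) throws IOException {
        Map<String, String> fields = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(
                new InflaterInputStream(new ByteArrayInputStream(blob)))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                String name = in.readUTF();
                String value = in.readUTF();
                fields.put(name, value);
            }
        }
        return fields;
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> doc = new LinkedHashMap<>();
        for (int i = 0; i < 30; i++) {
            doc.put("field" + i, "value" + i);
        }
        byte[] blob = pack(doc);                     // written once, as one unit
        Map<String, String> restored = unpack(blob); // read once, all fields recovered
        System.out.println(restored.size() + " fields, field7=" + restored.get("field7"));
    }
}
```

Whether the document has 1 stored field or 30, the read path in this sketch is the same: fetch one blob, decompress, pick out what you need.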

Further, while you’re right that storing lots of things will bloat the index, that’s not
very important. Stored data is kept in separate files in each segment (*.fdt for the data,
*.fdx for the index into it) and has little to no impact on search performance. That data is
not accessed unless you ask for the field to be returned, i.e. it’s not part of the data used
to get the top N documents. Say you have a search that has 10,000,000 hits and returns the
top 10. _Only_ the stored data for those top 10 hits is accessed, and that only after all the
scoring is done.
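
The two-phase behavior described above can be sketched in plain Java. Again, this is a toy model and not Lucene's API or internals: TopNSketch, score, loadStored, and the storedLoads counter are hypothetical names invented for the illustration. The point is only that ranking touches scores alone, and stored data is fetched afterwards for the N winners.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Toy model only: NOT Lucene's API or internals.
final class TopNSketch {

    // Counts how many documents had their stored data fetched.
    static int storedLoads = 0;

    // Hypothetical stand-in for reading one document's stored fields.
    static String loadStored(int docId) {
        storedLoads++;
        return "stored-data-for-doc-" + docId;
    }

    // Hypothetical scoring function; any deterministic function works here.
    static float score(int docId) {
        return (docId * 31 % 1000) / 1000f;
    }

    // Phase 1: rank every hit using scores only.
    // Phase 2: fetch stored data for the n best hits, highest score first.
    static List<String> search(int hitCount, int n) {
        // Min-heap on score: the head is the weakest of the current best n.
        PriorityQueue<Integer> best =
                new PriorityQueue<>((a, b) -> Float.compare(score(a), score(b)));
        for (int doc = 0; doc < hitCount; doc++) {
            best.add(doc);
            if (best.size() > n) {
                best.poll(); // drop the current lowest-scoring hit
            }
        }
        List<String> results = new ArrayList<>();
        while (!best.isEmpty()) {
            // poll() yields the lowest score first, so insert at the front.
            results.add(0, loadStored(best.poll()));
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> top = search(1_000_000, 10);
        System.out.println(top.size() + " results, stored loads = " + storedLoads);
    }
}
```

However many hits the query matches, the number of stored-data fetches in this sketch equals the number of results returned.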

I think this is premature optimization. Try the least complex way of organizing your data
and measure.

Best,
Erick

> On Apr 22, 2020, at 1:00 AM, ganesh m <emailgane@yahoo.co.in.INVALID> wrote:
> 
> Is it good idea to store complete Json as string to Lucene DB. If we store as separate
> fields then we have around 30 fields. There will be 30 seeks to get complete stored fields



