lucene-solr-user mailing list archives

From Silent Surfer <>
Subject Impact of compressed=true attribute (in schema.xml) on Indexing/Query
Date Sat, 29 Aug 2009 16:12:03 GMT

We observed that with the setting compressed="true", the index size is around 0.66 times
the size of the actual log file, whereas without the compressed="true" setting, the index
is almost 2.6 times the size of the log file.

Our sample Solr document size is approximately 1000 bytes. In addition to the text data, we
have around 9 metadata tags associated with each document.

We need to display all of the metadata values in the GUI, and hence we are setting stored=true
in our schema.xml.

Now the question is how the compressed=true flag impacts indexing and querying operations.
I am sure there will be CPU utilization spikes, since the stored field data has to be
compressed (during indexing) and uncompressed (during querying). I am mainly looking for
any benchmarks for this scenario.
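
As a rough starting point, the raw cost of compressing and uncompressing one document can be timed outside Solr. The sketch below is only an approximation under stated assumptions: it uses Python's zlib (DEFLATE) as a stand-in for whatever Solr/Lucene does internally for compressed stored fields, and the sample log line, the function name time_deflate, and the round count are all made up for illustration. It does not measure Solr itself.

```python
import time
import zlib


def time_deflate(doc: bytes, rounds: int = 1000, level: int = 9):
    """Return (size_ratio, compress_seconds, decompress_seconds) for one document.

    Assumption: DEFLATE via zlib stands in for the compression applied
    to stored fields; the real Solr code path may behave differently.
    """
    start = time.perf_counter()
    for _ in range(rounds):
        packed = zlib.compress(doc, level)
    compress_s = time.perf_counter() - start

    start = time.perf_counter()
    for _ in range(rounds):
        unpacked = zlib.decompress(packed)
    decompress_s = time.perf_counter() - start

    # Sanity check: the round trip must reproduce the original bytes.
    assert unpacked == doc
    return len(packed) / len(doc), compress_s, decompress_s


# Hypothetical log line, repeated and cut to about 1000 bytes to mimic
# the document size mentioned above. Real log data will compress differently.
line = b"2009-08-29 16:12:03 INFO host=app01 user=alice action=login status=ok\n"
sample = (line * 20)[:1000]

ratio, compress_s, decompress_s = time_deflate(sample)
print(f"size ratio:  {ratio:.2f}")
print(f"compress:    {compress_s:.4f} s for 1000 docs")
print(f"decompress:  {decompress_s:.4f} s for 1000 docs")
```

Running this against a sample of real documents instead of the synthetic line would give a more honest per-document figure, which could then be scaled to the expected daily document count.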

The expected volume of incoming data is approximately 400 GB per day, so it is very
important for us to evaluate compressed=true because of its effect on file system
utilization and index sizing.
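
To make the sizing concern concrete, here is the arithmetic as a small sketch. It assumes the two ratios we observed (0.66 and 2.6) hold unchanged at full volume, which has not been verified; the function name index_growth_gb is just for illustration.

```python
# Observed index-size-to-log-size ratios from our tests.
RATIO_COMPRESSED = 0.66
RATIO_UNCOMPRESSED = 2.6
DAILY_RAW_GB = 400  # expected incoming data per day


def index_growth_gb(raw_gb: float, ratio: float, days: int = 1) -> float:
    """Projected index size in GB, assuming the ratio stays constant."""
    return raw_gb * ratio * days


compressed_daily = index_growth_gb(DAILY_RAW_GB, RATIO_COMPRESSED)
uncompressed_daily = index_growth_gb(DAILY_RAW_GB, RATIO_UNCOMPRESSED)

print(f"compressed:   {compressed_daily:.0f} GB/day")
print(f"uncompressed: {uncompressed_daily:.0f} GB/day")
print(f"difference:   {uncompressed_daily - compressed_daily:.0f} GB/day")
```

Under that assumption the index would grow by roughly 264 GB per day with compression versus roughly 1040 GB per day without it.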

Any help would be greatly appreciated.


