lucene-java-user mailing list archives

From Erick Erickson <>
Subject Re: date issues
Date Thu, 23 Feb 2012 13:26:28 GMT
1> Don't use sint. It's being deprecated, and it'll take up more space
than a TrieDate.
2> Precision. Sure, use the coarsest time resolution you can; normalizing
everything to the day would be a good thing.

You won't get any space savings by storing at day resolution; it's
just a long under the covers. But depending on how you're doing your
query, you may get much lower memory usage, since some searches are
sensitive to the number of *unique* terms in a field and you'll reduce
that number.
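
A minimal sketch of the normalization described above, assuming timestamps arrive as strings in the `2012-02-22T21:11:14Z` form shown in the quoted message. The helper name `normalize_to_day` and the sample values are hypothetical, and this would run on the client before documents are sent to Solr. It only illustrates how truncating to the day collapses distinct values into fewer unique terms.

```python
from datetime import datetime

# Timestamp layout used in the quoted message (UTC, second resolution).
SOLR_FMT = "%Y-%m-%dT%H:%M:%SZ"


def normalize_to_day(ts: str) -> str:
    """Zero out the time portion of a timestamp string (hypothetical helper)."""
    parsed = datetime.strptime(ts, SOLR_FMT)
    return parsed.replace(hour=0, minute=0, second=0).strftime(SOLR_FMT)


# Made-up sample values: two on the same day, one on the next day.
timestamps = [
    "2012-02-22T21:11:14Z",
    "2012-02-22T08:03:59Z",
    "2012-02-23T00:00:01Z",
]
normalized = [normalize_to_day(t) for t in timestamps]

# Fewer distinct values means fewer unique terms in the indexed field.
unique_before = len(set(timestamps))
unique_after = len(set(normalized))
```

The stored value is still a full timestamp, so this does not shrink the field itself; it only reduces how many different values the field holds.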

But without some idea of the queries you're running it's hard to say whether
this will help.
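
For the per-day counts described in the quoted message, one option that may be worth trying is a single range-facet request in place of ~30 separate queries. The sketch below only builds the request parameters and assumes a Solr version that supports the `facet.range` parameters. The field names `text` and `created_at`, the search word, and the helper `daily_count_params` are all hypothetical. Note that the facet counts are matching documents per day, not total occurrences of the word, and whether this uses less memory depends on the setup.

```python
from urllib.parse import urlencode


def daily_count_params(word, text_field, date_field, start, end):
    """Build parameters for one range-facet request (hypothetical helper)."""
    return {
        "q": f"{text_field}:{word}",
        "rows": 0,                    # only the facet counts are needed
        "facet": "true",
        "facet.range": date_field,
        "facet.range.start": start,
        "facet.range.end": end,
        "facet.range.gap": "+1DAY",   # one bucket per day
    }


# Hypothetical field names and date range, for illustration only.
params = daily_count_params(
    "lucene", "text", "created_at",
    "2012-02-01T00:00:00Z", "2012-03-01T00:00:00Z",
)
query_string = urlencode(params)  # the "+" in the gap is percent-encoded
```

The resulting string would be appended to the select handler URL of the Solr instance in question.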


On Thu, Feb 23, 2012 at 1:25 AM, Jason Toy <> wrote:
> I have a Solr instance with about 400m docs. For text searches it is perfectly fine.
> When I do searches that calculate the number of times a word appeared in the doc set for
> every day of a month, it usually causes Solr to crash with out-of-memory errors.
> I calculate this by running ~30 queries, one for each day, to see the count for that day.
> Is there a better way I could do this?
> Currently the date fields are stored as:
> <fieldType name="date" class="solr.TrieDateField" omitNorms="true" precisionStep="0"
> and the timestamps are stored in the format of:
> 2012-02-22T21:11:14Z
> We have no need to store anything beyond the date. Will just changing the time portion
> to zeros make things faster:
> 2012-02-22T00:00:00Z
> I thought that to optimize this, there would be an actual date type that doesn't store
> the time component, but looking through the Solr docs, I don't see anything specifically for
> a date as opposed to a timestamp. Would it be faster for me to store dates in an sint format?
> What is the optimal format I should use? If the format is to continue to use TrieDateField,
> is it not a waste to store the hour/minute/seconds even if they are not being used?
> Is there anything else I can do to make this more efficient?
> I have looked around on the mailing list and on Google and am not sure what to use. I would
> appreciate any pointers. Thanks.
> Jason
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:
