lucene-solr-user mailing list archives

From Jamie Johnson <jej2...@gmail.com>
Subject Re: Solr Lucene Index Version
Date Thu, 08 Dec 2011 13:50:48 GMT
Mark,

Agreed that replication wouldn't help; I was dreaming that there was
some intermediate format used in replication.

Ideally you are right: I could just reindex the data and go on with
life, but my case is not so simple.  Currently we have a set of
processes that run against the raw artifact to index things of
interest within the text document.  I don't believe (and I need to
check with the folks who wrote this) that I have an easy way to
reindex currently, though that would be my preference.

Andrzej,

Isn't the codec stuff merged with trunk now?  Admittedly I know very
little about Lucene's index format but I'd be willing to be a guinea
pig if you needed a tester.


On Thu, Dec 8, 2011 at 5:34 AM, Andrzej Bialecki <ab@getopt.org> wrote:
> On 08/12/2011 05:00, Mark Miller wrote:
>>
>> Replication just copies the index, so I'm not sure how this would help
>> offhand?
>>
>> With SolrCloud this is a breeze - just fire up another replica for a shard
>> and the current index will replicate to it.
>>
>> If you were willing to export the data to some portable format and then
>> pull it back in, why not just store the original data and reindex?
>
>
> This was actually one of the situations that motivated that jira issue -
> there are scenarios where reindexing, or keeping the original data, is very
> costly, in terms of space, time, I/O, pre-processing costs, curating,
> merging, etc, etc...
>
> The good news is that once the recent work on the codecs is merged with the
> trunk then we can revisit this issue and implement it with much less effort
> than before - we could even start by modifying SimpleTextCodec to be more
> lenient, and proceed from there.
>
> --
> Best regards,
> Andrzej Bialecki     <><
>  ___. ___ ___ ___ _ _   __________________________________
> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> ___|||__||  \|  ||  |  Embedded Unix, System Integration
> http://www.sigram.com  Contact: info at sigram dot com
>
