lucene-solr-user mailing list archives

From Midas A <test.mi...@gmail.com>
Subject Re: data import
Date Fri, 20 Mar 2015 04:36:21 GMT
Hi Shawn,

Thanks for replying. I need clarity on the following points:
a) Will setting stored="false" in the schema for a few fields improve indexing time?
b) Do the soft commit and hard commit configurations depend on the hardware?
c) Should I configure mergeFactor and ramBufferSizeMB? If so, how should
I decide on these values?


We are doing a full index and it takes around 4.5 hours (20 million documents).
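For reference, those figures can be turned into a baseline rate to compare against after any configuration change. This is only a sketch of the arithmetic; the function name is hypothetical and the inputs are the approximate numbers quoted above.

```python
def docs_per_second(total_docs, hours):
    """Average indexing rate over the whole run."""
    return total_docs / (hours * 3600)


# Approximate figures from this message: 20 million documents in 4.5 hours.
rate = docs_per_second(20_000_000, 4.5)
print(f"{rate:.0f} docs/sec")
```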

Regards,
MA

On Fri, Mar 20, 2015 at 1:57 AM, Shawn Heisey <apache@elyograg.org> wrote:

> On 3/19/2015 11:47 AM, abhishek tiwari wrote:
> > <autoSoftCommit> <maxTime>500</maxTime> </autoSoftCommit>
>
> You're doing soft commits as often as twice a second.  You have
> configured 500 milliseconds here.  This might have something to do with
> your slow indexing speed.  A soft commit is less expensive than a full
> hard commit, but soft commits are *NOT* free, and they aren't even cheap.
>
> I doubt that you *need* your documents to be visible within half a
> second of indexing them ... and there's a good chance that even with
> this config they won't be visible that soon, because each commit is
> probably going to take longer than half a second to complete.  With a
> 500 millisecond autoSoftCommit configuration, your server may be doing
> commit operations close to 100% of the time while indexing is happening.
>
>
> http://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>
> Also, the dataimport handler is single threaded, so if you are only
> using one handler definition in solrconfig.xml, there is no parallel
> indexing.  You'll need to write your own multi-threaded indexing program
> if you want parallel indexing.
>
> Thanks,
> Shawn
>
>

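The suggestion above to write a multi-threaded indexing program could look roughly like the sketch below. It is not the approach of any poster in this thread, only an illustration under stated assumptions: the update URL, core name, batch size, and worker count are placeholders, the helper names are hypothetical, and it assumes the /update handler accepts a JSON array of documents. No commit is sent from the client, so visibility is left to the autoCommit and autoSoftCommit settings on the server.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from itertools import islice

# Assumed endpoint for illustration; replace host and core name with your own.
SOLR_UPDATE_URL = "http://localhost:8983/solr/collection1/update"


def batches(docs, size):
    """Yield successive lists of at most `size` documents."""
    it = iter(docs)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def post_batch(batch, url=SOLR_UPDATE_URL):
    """POST one batch to Solr as a JSON array, without an explicit commit."""
    req = urllib.request.Request(
        url,
        data=json.dumps(batch).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        resp.read()
    return len(batch)


def index_parallel(docs, send=post_batch, workers=4, batch_size=1000):
    """Send batches from several threads; return the number of documents sent."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Note: map() submits every batch up front, so a very large input
        # would need a bounded queue instead of this simple form.
        return sum(pool.map(send, batches(docs, batch_size)))
```

Calling index_parallel(rows) with an iterable of dicts, one per document, sends them in parallel batches. The send parameter exists so the batching logic can be exercised without a running Solr instance.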