lucene-solr-user mailing list archives

From Alexey Serba <ase...@gmail.com>
Subject Re: Importing large datasets
Date Mon, 07 Jun 2010 09:23:54 GMT
What is the relation between the items and item_descriptions tables? I.e., is
there exactly one item_descriptions record for every id?

If it is 1-1, then you can merge all your data into a single database and use
the following query:

 <entity name="item"
         dataSource="single_datasource"
         query="select * from items inner join item_descriptions on item_descriptions.id=items.id">
 </entity>
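
On the batching question: a fuller sketch of how the entity above could sit in a complete data-config.xml. This assumes MySQL, which the thread does not confirm; the driver, url, user, password and the description column are placeholders, not values from your setup. DIH's JdbcDataSource accepts a batchSize attribute that is passed to the JDBC driver as the fetch size, and with the MySQL driver batchSize="-1" asks for rows to be streamed rather than buffered in memory, so a single large join should not need to be split by hand.

```xml
<dataConfig>
  <!-- Hypothetical connection details: replace driver, url, user and password with your own. -->
  <!-- batchSize="-1" assumes the MySQL driver; other databases would take a positive fetch size. -->
  <dataSource name="single_datasource"
              type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/items_db"
              user="solr"
              password="secret"
              batchSize="-1"/>
  <document>
    <!-- One joined query replaces the item_description sub-entity. -->
    <!-- "description" is an assumed column name. -->
    <entity name="item"
            dataSource="single_datasource"
            query="select i.*, d.description
                   from items i
                   inner join item_descriptions d on d.id = i.id">
    </entity>
  </document>
</dataConfig>
```

Listing the columns explicitly, rather than using select *, avoids returning the id column twice from the join.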

HTH,
Alex

On Thu, Jun 3, 2010 at 6:34 AM, Blargy <zmanods@hotmail.com> wrote:
>
>
> Erik Hatcher-4 wrote:
>>
>> One thing that might help indexing speed - create a *single* SQL query
>> to grab all the data you need without using DIH's sub-entities, at
>> least the non-cached ones.
>>
>>       Erik
>>
>> On Jun 2, 2010, at 12:21 PM, Blargy wrote:
>>
>>>
>>>
>>> As a data point, I routinely see clients index 5M items on normal hardware
>>> in approx. 1 hour (give or take 30 minutes).
>>>
>>> Also wanted to add that our main entity (item) consists of 5 sub-entities
>>> (i.e., joins). 2 of those 5 are fairly small, so I am using
>>> CachedSqlEntityProcessor for them, but the other 3 (which include
>>> item_description) are normal.
>>>
>>> All the entities except item_description connect to datasource1. They
>>> currently point to one physical machine, although we do have a pool of 3 DBs
>>> that could be used if it helps. The other entity, item_description, uses
>>> datasource2, which has a pool of 2 DBs that could potentially be used. Not
>>> sure if that would help or not.
>>>
>>> I might as well add that the item description will have indexed, stored and
>>> term vectors set to true.
>>> --
>>> View this message in context:
>>> http://lucene.472066.n3.nabble.com/Importing-large-datasets-tp863447p865219.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
>>
>
> I can't find any example of creating a massive sql query. Any out there?
> Will batching still work with this massive query?
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Importing-large-datasets-tp863447p866506.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
