lucene-solr-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chantal Ackermann <chantal.ackerm...@btelligent.de>
Subject Re: best way to model 1-N
Date Fri, 30 Oct 2009 16:44:47 GMT
This looks all right to me, but I might be missing something.
Which version/build of SOLR are you using?

Chantal

Joel Nylund schrieb:
> Thanks Chantal, I will keep that in mind for tuning,
> 
> for sql I figured  way to combine them into one row using concat, but
> I still seem to be having an issue splitting them:
> 
> Db now returns as one column categoryType:
> TOPIC,LANGUAGE
> 
> but my solr result, if you note the item in categoryType  all seem to
> be within one str, I would expect it to be in multiple strings within
> the array, is this assumption wrong?
> 
> <doc>
> −
> <arr name="categoryType">
> <str>TOPIC,LANGUAGE</str>
> </arr>
> <str name="id">40</str>
> <str name="title">feed title</str>
> </doc>
> 
> 
> Here is my import:
>    <document name="doc">
>          <entity name="item"
>         query="SELECT f.id, f.title
>                 FROM Feed f
>             <field column="id" name="id" />
>              <field column="title" name="title" />
>                         <entity name="category" query="select cfcr.feedId,
> group_concat(cfcr.categoryType) as categoryType
>                                                 from CFR cfcr
>                                                 where
>                                                 cfcr.feedId = '${item.id}' AND
>                                                 group by cfcr.feedId">
>                                         <field column="categoryType" name="categoryType"
> splityBy="," />
>                     </entity>
> 
>       </entity>
> 
> In schema:
>         <field name="categoryType" type="text" indexed="true" stored="true"
> required="false" multiValued="true"/>
>         <field name="categoryName" type="text" indexed="true" stored="true"
> required="false" multiValued="true"/>
> 
> 
> what am I missing?
> 
> thanks
> Joel
> 
> 
> On Oct 30, 2009, at 10:00 AM, Chantal Ackermann wrote:
> 
>> That depends a bit on your database, but it is tricky and might not
>> be performant.
>>
>> If you are more of a Java developer, you might prefer retrieving
>> mutliple rows per SOLR document from your dataSource (join on your
>> category and main table), and aggregate them in your custom
>> EntityProcessor. I got a far(!) better performance retrieving
>> everything in one query and doing the aggregation in Java. But this
>> is, of course, depending on your table structure and data.
>>
>> Noble Paul helped me with the custom EntityProcessor, and it turned
>> out quite easy. Have a look at the thread with the heading from this
>> mailing list (SOLR-USER):
>> DataImportHandler / Import from DB : one data set comes in multiple
>> rows
>>
>> Cheers,
>> Chantal
>>
>>
>> Joel Nylund schrieb:
>>> thanks, but im confused how I can aggregate across rows, I dont know
>>> of any easy way to get my db to return one row for all the categories
>>> (given the hint from your other email), I have split the category
>>> query into a separate entity, but its returning multiple rows, how do
>>> I combine multiple rows into 1 index entity?
>>> thanks
>>> Joel
>>> On Oct 29, 2009, at 8:58 PM, Avlesh Singh wrote:
>>>>> In the database this is modeled a a 1-N where category table has
>>>>> the
>>>>> mapping of feed to category
>>>>> I need to be able to query , give me all the feeds in any given
>>>>> category.
>>>>> How can I best model this in solr?
>>>>> Seems like multiValued field might help, but how would I populate
>>>>> it, and
>>>>> would the query above work?.
>>>>>
>>>> Yes you are right. A multivalued field for "categories" is the
>>>> answer.
>>>>
>>>> For populating in the index -
>>>>
>>>>  1. If you use DIH to populate your indexes and your datasource is a
>>>>  database then you can use DIH's RegexTransformer on an aggregated
>>>> list of
>>>>  categories. e.g. if your database query retruns "a,b,c,d" in a
>>>> column called
>>>>  "db_categories", this is how you would put it in DIH's data-config
>>>> file -
>>>>  <field column="db_categories" name="categories" splityBy="," />.
>>>>  2. If you "add" documents to Solr yourself  multiple values for
>>>> the field
>>>>  can be specified as an array or list of values in the
>>>> SolrInputDocument.
>>>>
>>>> A multivalued field provides the same faceting and searching
>>>> capabilites
>>>> like regular fields. There is no special syntax.
>>>>
>>>> Cheers
>>>> Avlesh
>>>>
>>>> On Fri, Oct 30, 2009 at 4:55 AM, Joel Nylund <jnylund@yahoo.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I have one index so far which contains feeds.  I have been able to
>>>>> de-normalize several tables and map this data onto the feed entity.
>>>>> There is
>>>>> one tricky problem that I need help on.
>>>>>
>>>>> Feeds have 1 - many categories.
>>>>>
>>>>> So Lets say we have Category1, Category2 and Category3
>>>>>
>>>>> Feed 1 - is in Category 1
>>>>> Feed 2 is in category2 and category3
>>>>> Feed 3 is in category2
>>>>> Feed 4 has no category
>>>>>
>>>>> In the database this is modeled a a 1-N where category table has
>>>>> the
>>>>> mapping of feed to category
>>>>>
>>>>> I need to be able to query , give me all the feeds in any given
>>>>> category.
>>>>>
>>>>> How can I best model this in solr?
>>>>>
>>>>> Seems like multiValued field might help, but how would I populate
>>>>> it, and
>>>>> would the query above work?.
>>>>>
>>>>> thanks
>>>>> Joel
>>>>>
>>>>>
> 

Mime
View raw message