lucene-solr-user mailing list archives

From Joel Nylund <jnyl...@yahoo.com>
Subject Re: best way to model 1-N
Date Fri, 30 Oct 2009 19:17:39 GMT
I'm using apache-solr-1.3.0.

I got it to work using a JavaScript function instead.

Thanks,
Joel
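
The function itself was not posted, so the following is only a sketch of what a JavaScript row transformer for this case might look like. It assumes the row object exposes get(name) and put(name, value), as the java.util.Map handed to DIH script transformers does. The names splitValues, splitCategoryType and makeRow are made up for illustration; makeRow is just a stand-in so the sketch runs outside Solr. Also worth noting: RegexTransformer's attribute is spelled splitBy (the quoted config below has splityBy), and the entity has to declare transformer="RegexTransformer" for the split to be applied at all.

```javascript
// Pure helper: turn "TOPIC,LANGUAGE" into ["TOPIC", "LANGUAGE"].
// Empty pieces and surrounding whitespace are dropped.
function splitValues(raw, separator) {
  if (raw === null || raw === undefined) {
    return [];
  }
  var parts = String(raw).split(separator);
  var values = [];
  for (var i = 0; i < parts.length; i++) {
    var value = parts[i].trim();
    if (value.length > 0) {
      values.push(value);
    }
  }
  return values;
}

// Transformer-style function: takes one row, replaces the concatenated
// column with a list of values, and returns the row.
// Assumption: `row` has get(name) and put(name, value).
function splitCategoryType(row) {
  row.put('categoryType', splitValues(row.get('categoryType'), ','));
  return row;
}

// Hypothetical stand-in for the row object, for running outside Solr.
function makeRow(data) {
  return {
    get: function (name) { return data[name]; },
    put: function (name, value) { data[name] = value; }
  };
}

var row = splitCategoryType(makeRow({ feedId: 40, categoryType: 'TOPIC,LANGUAGE' }));
console.log(row.get('categoryType'));
```

Inside DIH, a function like this would go into a <script> element of the data config and be referenced from the entity as transformer="script:splitCategoryType". There the value would probably need to be a java.util.ArrayList rather than a JavaScript array; that part is untested here.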

On Oct 30, 2009, at 12:44 PM, Chantal Ackermann wrote:

> This looks all right to me, but I might be missing something.
> Which version/build of SOLR are you using?
>
> Chantal
>
> Joel Nylund wrote:
>> Thanks Chantal, I will keep that in mind for tuning.
>> For the SQL, I figured out a way to combine them into one row using
>> concat, but I still seem to be having an issue splitting them.
>> The DB now returns them as one column, categoryType:
>> TOPIC,LANGUAGE
>> But in my Solr result, the values in categoryType all seem to be
>> within one str. I would expect them to be in multiple strings within
>> the array. Is this assumption wrong?
>> <doc>
>> <arr name="categoryType">
>> <str>TOPIC,LANGUAGE</str>
>> </arr>
>> <str name="id">40</str>
>> <str name="title">feed title</str>
>> </doc>
>> Here is my import:
>>   <document name="doc">
>>     <entity name="item"
>>             query="SELECT f.id, f.title
>>                    FROM Feed f">
>>       <field column="id" name="id" />
>>       <field column="title" name="title" />
>>       <entity name="category"
>>               query="select cfcr.feedId,
>>                             group_concat(cfcr.categoryType) as categoryType
>>                      from CFR cfcr
>>                      where cfcr.feedId = '${item.id}'
>>                      group by cfcr.feedId">
>>         <field column="categoryType" name="categoryType" splityBy="," />
>>       </entity>
>>     </entity>
>>   </document>
>> In schema:
>>   <field name="categoryType" type="text" indexed="true" stored="true" required="false" multiValued="true"/>
>>   <field name="categoryName" type="text" indexed="true" stored="true" required="false" multiValued="true"/>
>> what am I missing?
>> thanks
>> Joel
>> On Oct 30, 2009, at 10:00 AM, Chantal Ackermann wrote:
>>> That depends a bit on your database, but it is tricky and might not
>>> be performant.
>>>
>>> If you are more of a Java developer, you might prefer retrieving
>>> multiple rows per SOLR document from your dataSource (join on your
>>> category and main table), and aggregating them in your custom
>>> EntityProcessor. I got far(!) better performance retrieving
>>> everything in one query and doing the aggregation in Java. But this
>>> depends, of course, on your table structure and data.
>>>
>>> Noble Paul helped me with the custom EntityProcessor, and it turned
>>> out to be quite easy. Have a look at the thread on this mailing list
>>> (SOLR-USER) with the subject:
>>> DataImportHandler / Import from DB : one data set comes in multiple
>>> rows
>>>
>>> Cheers,
>>> Chantal
>>>
>>>
>>> Joel Nylund wrote:
>>>> Thanks, but I'm confused about how I can aggregate across rows. I
>>>> don't know of any easy way to get my DB to return one row for all
>>>> the categories (given the hint from your other email). I have split
>>>> the category query into a separate entity, but it's returning
>>>> multiple rows. How do I combine multiple rows into one index entity?
>>>> thanks
>>>> Joel
>>>> On Oct 29, 2009, at 8:58 PM, Avlesh Singh wrote:
>>>>>> In the database this is modeled as a 1-N, where the category table
>>>>>> has the mapping of feed to category.
>>>>>> I need to be able to query: give me all the feeds in any given
>>>>>> category.
>>>>>> How can I best model this in Solr?
>>>>>> Seems like a multiValued field might help, but how would I
>>>>>> populate it, and would the query above work?
>>>>>>
>>>>> Yes, you are right. A multivalued field for "categories" is the
>>>>> answer.
>>>>>
>>>>> For populating the index:
>>>>>
>>>>> 1. If you use DIH to populate your indexes and your datasource is a
>>>>> database, then you can use DIH's RegexTransformer on an aggregated
>>>>> list of categories. E.g. if your database query returns "a,b,c,d"
>>>>> in a column called "db_categories", this is how you would put it in
>>>>> DIH's data-config file:
>>>>> <field column="db_categories" name="categories" splitBy="," />
>>>>> 2. If you "add" documents to Solr yourself, multiple values for the
>>>>> field can be specified as an array or list of values in the
>>>>> SolrInputDocument.
>>>>>
>>>>> A multivalued field provides the same faceting and searching
>>>>> capabilities as regular fields. There is no special syntax.
>>>>>
>>>>> Cheers
>>>>> Avlesh
>>>>>
>>>>> On Fri, Oct 30, 2009 at 4:55 AM, Joel Nylund <jnylund@yahoo.com>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have one index so far which contains feeds. I have been able to
>>>>>> de-normalize several tables and map this data onto the feed entity.
>>>>>> There is one tricky problem that I need help on.
>>>>>>
>>>>>> Feeds have 1 - many categories.
>>>>>>
>>>>>> So let's say we have Category1, Category2 and Category3:
>>>>>>
>>>>>> Feed 1 is in Category1
>>>>>> Feed 2 is in Category2 and Category3
>>>>>> Feed 3 is in Category2
>>>>>> Feed 4 has no category
>>>>>>
>>>>>> In the database this is modeled as a 1-N, where the category table
>>>>>> has the mapping of feed to category.
>>>>>>
>>>>>> I need to be able to query: give me all the feeds in any given
>>>>>> category.
>>>>>>
>>>>>> How can I best model this in Solr?
>>>>>>
>>>>>> Seems like a multiValued field might help, but how would I
>>>>>> populate it, and would the query above work?
>>>>>>
>>>>>> thanks
>>>>>> Joel
>>>>>>
>>>>>>
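
Chantal's suggestion above (fetch the joined rows in one query and aggregate them in code) was implemented in her case as a custom Java EntityProcessor, which is not shown in this thread. The aggregation step on its own can be sketched in JavaScript as follows, using the feeds from the original question. The row shape ({ id, title, categoryType }) and the name groupRowsByFeed are assumptions for illustration; a null categoryType stands for a feed without categories, as a LEFT JOIN would return it.

```javascript
// Collapse joined (feed, category) rows into one document per feed,
// collecting the category values into an array (the multiValued field).
// Assumption: each row looks like { id, title, categoryType }.
function groupRowsByFeed(rows) {
  var docsById = {};
  var order = []; // remembers the order in which feeds were first seen
  for (var i = 0; i < rows.length; i++) {
    var r = rows[i];
    var key = String(r.id);
    if (!Object.prototype.hasOwnProperty.call(docsById, key)) {
      docsById[key] = { id: r.id, title: r.title, categoryType: [] };
      order.push(key);
    }
    // Feeds without a category contribute no value.
    if (r.categoryType !== null && r.categoryType !== undefined) {
      docsById[key].categoryType.push(r.categoryType);
    }
  }
  return order.map(function (k) { return docsById[k]; });
}

// Sample rows modeled on the feeds in the original question.
var joinedRows = [
  { id: 1, title: 'Feed 1', categoryType: 'Category1' },
  { id: 2, title: 'Feed 2', categoryType: 'Category2' },
  { id: 2, title: 'Feed 2', categoryType: 'Category3' },
  { id: 3, title: 'Feed 3', categoryType: 'Category2' },
  { id: 4, title: 'Feed 4', categoryType: null }
];

var docs = groupRowsByFeed(joinedRows);
console.log(JSON.stringify(docs, null, 2));
```

Each resulting object corresponds to one Solr document, with categoryType holding zero or more values. Whether this is faster than a sub-entity query per feed depends on the table structure and data, as noted in the thread.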

