lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: data-import problem
Date Wed, 05 Jun 2013 21:02:39 GMT
My usual admonishment is that Solr isn't a database, and when
you try to use it like one you're just _asking_ for problems. That
said....

Consider two options:
1> Use a different core for each table.
2> In schema.xml, remove the id field (required="true" _might_ be specified)
   and remove the <uniqueKey> definition.
   You'll have to re-index, of course.
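
As a rough sketch of option 2: assuming your schema.xml still looks like the
stock example schema (the attribute values below are an assumption, your file
may differ), these are the two definitions that would be removed:

```xml
<!-- schema.xml: the two definitions to delete for option 2.
     Attribute values are assumed from the stock example schema. -->

<!-- 1. the id field; required="true" is what triggers
        "missing required field id" during import -->
<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false"/>

<!-- 2. the uniqueKey declaration that points at that field -->
<uniqueKey>id</uniqueKey>
```

Without a <uniqueKey>, Solr has nothing to overwrite by, so re-running an
import adds documents again unless the index is cleared first.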

But do note that while Solr does not _require_ a <uniqueKey> definition,
almost all Solr installations have one.
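
If you would rather keep the <uniqueKey>, one possible approach (a sketch only,
not tested against your setup) is to let DIH's TemplateTransformer build a
prefixed id per entity so that ids from the two tables cannot collide. The
table names name1/name2 are the placeholders from Gora's mail; the aliases
name_id/title_id and the "name-"/"title-" prefixes are made up for
illustration:

```xml
<document>
  <!-- Each entity selects the raw id under an alias, then the template
       writes a prefixed value into the "id" field, e.g. "name-42". -->
  <entity name="name" transformer="TemplateTransformer"
          query="SELECT id AS name_id, name FROM name1">
    <field column="id" template="name-${name.name_id}"/>
  </entity>
  <!-- Same pattern with a different prefix, e.g. "title-42". -->
  <entity name="title" transformer="TemplateTransformer"
          query="SELECT id AS title_id, title FROM name2">
    <field column="id" template="title-${title.title_id}"/>
  </entity>
</document>
```

The original database id can be recovered from a search hit by stripping the
prefix, which also tells you which table the hit came from.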

Best
Erick

On Wed, Jun 5, 2013 at 3:19 PM, Stavros Delisavas <stavros@delisavas.de> wrote:
> Thanks for the hints.
> I am not sure how to solve this issue. I previously made a typo; there are
> definitely two different tables.
> Here is my real configuration:
>
> http://pastebin.com/JUDzaMk0
>
> For testing purposes I added "LIMIT 10" to the SQL statements because my
> tables are huge and tests would take too long (about 5 GB, 6.5 million
> rows). I included my whole data-config and what I have changed from the
> default schema.xml. I don't know how to solve the "all ids have to be
> unique" problem. I cannot believe that Solr does not offer any solution at
> all to handle multiple data sources with their own individual ids. Maybe it's
> possible to have Solr create its own ids while importing the data?
>
> Actually there is no direct relation between my "name" table and my
> "title" table. All I want is to be able to do fast text search in those two
> tables in order to find the ids of the matching entries.
>
> Let me know if you need more information.
>
> Thank you!
>
>
>
>
>
> Am 05.06.2013 20:54, schrieb Gora Mohanty:
>
>> On 6 June 2013 00:09, Stavros Delisavas <stavros@delisavas.de> wrote:
>>>
>>> Thanks so far.
>>>
>>> This change makes Solr work over the title entries too, yay! Unfortunately
>>> they don't get processed (skipped rows). In my log it says
>>> "missing required field id" for every entry.
>>>
>>> I checked my schema.xml. In there "id" is not set as a required field.
>>> Removing the uniqueKey property also leads to no improvement.
>>
>> [...]
>>
>> There are several things wrong with your problem statement.
>> You say that you have two tables, but both SELECTs seem
>> to use the same table. I am going to assume that you really
>> have two different tables.
>>
>> Unless you have changed the default schema.xml, "id" should
>> be defined as the uniqueKey for the document. You probably
>> do not want to remove that, and even if you just remove the
>> uniqueKey property, the field "id" remains defined as a required
>> field.
>>
>> The issue is with your SELECT for the second entity:
>> <entity name="title" query="SELECT id AS titleid, title FROM
>> name"></entity>
>> This renames "id" to titleid, and hence the required field
>> "id" in schema.xml is missing.
>>
>> What you need is something like:
>> <document>
>>        <entity name="name" query="SELECT id, name FROM name1"></entity>
>>        <entity name="title" query="SELECT id, title FROM name2"></entity>
>> </document>
>>
>> However, you will need to ensure that the ids are unique
>> in the two tables, else entries from the second entity will
>> overwrite matching ids from the first.
>>
>> Also, do you have field definitions within the entities? Please
>> share the complete schema.xml and the DIH configuration
>> file with us, rather than snippets: Use pastebin.com if they
>> are large.
>>
>> Regards,
>> Gora
>>
>
