lucene-solr-user mailing list archives

From "Jack Krupansky" <j...@basetechnology.com>
Subject Re: Unique key error while indexing pdf files
Date Mon, 01 Jul 2013 12:59:19 GMT
It's really 100% up to you how you want to come up with the unique key 
values for your documents. What would you like them to be? Just use that. 
Within reason, anything goes.

But it also comes back to your data model. You absolutely must come up with 
a data model for how you expect to index and query data in Solr before you 
just start throwing random data into Solr.

1. Design your data model.
2. Produce a Solr schema from that data model.
3. Map the raw data from your data sources (e.g., PDF files) to the fields 
in your Solr schema.

That last step includes the ID/key field, but your data model will imply any 
requirements for what the ID/key should be.

To be absolutely clear, it is 100% up to you to design the ID/key for every 
document; Solr does NOT do that for you.
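
As one illustration of step 3, here is a minimal sketch of what "designing the key yourself" could look like for PDF files. Everything in it is an assumption made for the example: the field names (id, file_name, content), the helper names, and the choice of a SHA-1 hash of the file path as the key. Your own data model might call for a document number, a URL, or something else entirely.

```python
import hashlib
from pathlib import PurePosixPath


def make_doc_id(path: str) -> str:
    """Derive a stable key from the file path: the same path always yields the same key."""
    return hashlib.sha1(path.encode("utf-8")).hexdigest()


def map_pdf_to_doc(path: str, extracted_text: str) -> dict:
    """Map one PDF onto the fields of a hypothetical schema (id, file_name, content)."""
    return {
        "id": make_doc_id(path),                  # the unique key, chosen by us
        "file_name": PurePosixPath(path).name,    # last path component
        "content": extracted_text,                # text extracted elsewhere
    }


# Hypothetical path and text, for illustration only.
doc = map_pdf_to_doc("/data/pdfs/report-2013.pdf", "text extracted from the PDF")
print(doc["id"], doc["file_name"])
```

A path-derived key means re-indexing the same file replaces the earlier document instead of adding a duplicate, but moving or renaming the file changes its key. Whether that trade-off is acceptable is exactly the kind of question the data model has to answer.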

Even if you are just "exploring", at least come up with an "exploratory" 
data model - which includes what expectations you have about the unique 
ID/key for each document.

So, for that first PDF file, what expectation (according to your data model) 
do you have for what its ID/key should be?

-- Jack Krupansky

-----Original Message----- 
From: archit2112
Sent: Monday, July 01, 2013 8:30 AM
To: solr-user@lucene.apache.org
Subject: Re: Unique key error while indexing pdf files

I'm new to Solr. I'm just trying to understand and explore the various features
offered by Solr and their implementations. I would be very grateful if you
could solve my problem with any example of your choice. I just want to learn
how I can index PDF documents using the Data Import Handler.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Unique-key-error-while-indexing-pdf-files-tp4074314p4074327.html
Sent from the Solr - User mailing list archive at Nabble.com. 

