manifoldcf-user mailing list archives

From Shinichiro Abe <shinichiro.ab...@gmail.com>
Subject Re: Data query of JDBC repo
Date Mon, 01 Aug 2011 11:20:34 GMT
Thank you. Indexing VARCHAR data worked well. My solrconfig setting was incorrect.

Shinichiro

On 2011/07/29, at 19:06, Karl Wright wrote:

> Oh, FWIW, content data of type VARCHAR should also work.
> Karl
> 
> On Fri, Jul 29, 2011 at 6:05 AM, Karl Wright <daddywri@gmail.com> wrote:
>> I believe the end-user documentation talks about this to some extent.
>> Nevertheless, the JDBC handler is designed to pull all the necessary
>> information for a document, including the content data, out of a
>> single database table.  So it presumes the content is stored as either
>> CLOB data or BLOB data in one column of the table.
>> 
>> The url field is necessary because that is what ManifoldCF uses for
>> the "id" in the target search engine.  It needs this to be able to
>> remove or replace the document in the target on subsequent job runs.
>> It might as well be a URL because it presumes that the search user
>> will need some way to get to the content of the indexed document.
>> 
>> Hope that answers your question.
>> 
>> Karl
>> 
>> 2011/7/29 Shinichiro Abe <shinichiro.abe.1@gmail.com>:
>>> Hello.
>>> 
>>> I used the JDBC Repository Connection and created
>>> the following view[1] on PostgreSQL.
>>> I kept the default settings on the Queries tab of the job.
>>> I ran the job, but in Solr only urlfield was indexed, as the id field.
>>> 
>>> 1) I also want to index datafield. What do I need to set?
>>> Can I use it like Solr's DataImportHandler?
>>> For example, can it index datafield1, datafield2, datafield…?
>>> 
>>> 2) Why does the source code need to know whether the URL is valid before ingesting datafield?
>>> I want to index datafield without urlfield.
>>> 
>>> My usage may be wrong; I assumed that the string data in datafield would be indexed as content.
>>> I want to know what kind of table the data query assumes.
>>> 
>>> [1] view: documenttable
>>> | idfield      | versionfield | urlfield        | datafield    | modifydatefield |
>>> | char varying | char varying | char varying    | char varying | bigint          |
>>> ---------------------------------------------------------------------------------
>>> | 1            | 1            | file:///dummy/1 | test string  | 1               |
>>> | 2            | 1            | file:///dummy/2 | test info    | 1               |
>>> 
>>> Thank you,
>>> Shinichiro Abe
>> 
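
For anyone reading this thread later, here is a minimal sketch of the table shape Karl describes: one row per document, with the id, the URL, and the content all coming from a single table or view. It uses Python with an in-memory SQLite database standing in for PostgreSQL. The `$(IDCOLUMN)`, `$(URLCOLUMN)`, `$(DATACOLUMN)` and `$(IDLIST)` tokens follow the form of the connector's default data query as I recall it, so please check them against the Queries tab. The alias names they expand to here (`doc_id`, `doc_url`, `doc_data`) and the helper functions are hypothetical and are not ManifoldCF's actual implementation.

```python
import sqlite3

# Hypothetical alias names; the names the real connector substitutes
# for these tokens are an assumption and may differ.
SUBSTITUTIONS = {
    "$(IDCOLUMN)": "doc_id",
    "$(URLCOLUMN)": "doc_url",
    "$(DATACOLUMN)": "doc_data",
}

# Shape of a data query over the view from the original question.
DATA_QUERY = (
    "SELECT idfield AS $(IDCOLUMN), urlfield AS $(URLCOLUMN), "
    "datafield AS $(DATACOLUMN) "
    "FROM documenttable WHERE idfield IN $(IDLIST)"
)


def expand(query, ids):
    """Replace the template tokens with aliases and bound-parameter slots."""
    for token, name in SUBSTITUTIONS.items():
        query = query.replace(token, name)
    placeholders = ", ".join("?" for _ in ids)
    return query.replace("$(IDLIST)", "(" + placeholders + ")")


def fetch_documents(conn, ids):
    """Return {url: content}; the URL is what identifies the document downstream."""
    rows = conn.execute(expand(DATA_QUERY, ids), list(ids)).fetchall()
    return {url: data for (_doc_id, url, data) in rows}


def make_demo_db():
    """Build an in-memory table with the two sample rows from the thread."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documenttable ("
        "idfield TEXT, versionfield TEXT, urlfield TEXT, "
        "datafield TEXT, modifydatefield INTEGER)"
    )
    conn.executemany(
        "INSERT INTO documenttable VALUES (?, ?, ?, ?, ?)",
        [
            ("1", "1", "file:///dummy/1", "test string", 1),
            ("2", "1", "file:///dummy/2", "test info", 1),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    for url, data in fetch_documents(conn, ["1", "2"]).items():
        print(url, "->", data)
```

Keying the result by URL mirrors the point above that the URL serves as the "id" in the target search engine, which is why a row without a usable urlfield cannot be replaced or removed on later job runs.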

