lucene-solr-user mailing list archives

From "Jack Krupansky" <j...@basetechnology.com>
Subject Re: embedded documents
Date Mon, 25 Aug 2014 20:29:59 GMT
And a comparison to Elasticsearch would be helpful, since ES gets a lot of 
mileage from their super-easy JSON support. IOW, how much of the ES 
"advantage" is eliminated.

-- Jack Krupansky

-----Original Message----- 
From: Noble Paul
Sent: Monday, August 25, 2014 1:59 PM
To: solr-user@lucene.apache.org
Subject: Re: embedded documents

The simplest use case is to dump the entire JSON using split=/&f=/** . I am
planning to add an alias for this (SOLR-6343).
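
For readers who want to try the parameters mentioned above, here is a minimal sketch that only builds the request URL. The host, the collection name `collection1`, and the helper name `json_docs_url` are assumptions for illustration. The `/update/json/docs` path is assumed to be the endpoint introduced by SOLR-6304, so check it against your Solr version.

```python
from urllib.parse import urlencode

# Hypothetical host and collection name; adjust for your installation.
SOLR_BASE = "http://localhost:8983/solr/collection1"


def json_docs_url(base=SOLR_BASE):
    """Build an update URL carrying the split and f parameters from this thread.

    The /update/json/docs path is an assumption about where SOLR-6304
    exposes custom JSON indexing; verify it for your Solr release.
    """
    params = {"split": "/", "f": "/**"}
    # Keep "/" and "*" unescaped so the query string stays readable.
    return f"{base}/update/json/docs?{urlencode(params, safe='/*')}"


url = json_docs_url()
print(url)
```

The JSON document would then be sent as the POST body to that URL with a Content-Type of application/json.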

Nested docs support is missing for now and we will need to add it. A ticket
needs to be opened.


On Mon, Aug 25, 2014 at 6:45 AM, Jack Krupansky <jack@basetechnology.com>
wrote:

> Thanks, Erik, but... I've read that Jira several times over the past
> month, and it is far too cryptic for me to make any sense of what it is
> really trying to do. A simpler approach is clearly needed.
>
> My perception of SOLR-6304 is not that it indexes a single JSON object as
> a single Solr document, but that it generates a collection of separate
> documents, somewhat analogous to Lucene block/child documents, but... not
> quite.
>
> I understood the request on this message thread to be the flattening of a
> single nested JSON object to a single Solr document.
>
> IMHO, we need to be trying to make Solr more automatic and more
> approachable, not an even more complicated "toolkit".
>
> -- Jack Krupansky
>
> -----Original Message----- From: Erik Hatcher
> Sent: Monday, August 25, 2014 9:32 AM
>
> To: solr-user@lucene.apache.org
> Subject: Re: embedded documents
>
> Jack et al - there’s now this, which is available in the any-minute
> release of Solr 4.10: https://issues.apache.org/jira/browse/SOLR-6304
>
> Erik
>
> On Aug 25, 2014, at 5:01 AM, Jack Krupansky <jack@basetechnology.com>
> wrote:
>
>> That's a completely different concept, I think - the ability to return a
>> single field value as a structured JSON object in the "writer", rather than
>> simply "loading" from a nested JSON object and distributing the key values
>> to normal Solr fields.
>>
>> -- Jack Krupansky
>>
>> -----Original Message----- From: Bill Bell
>> Sent: Sunday, August 24, 2014 7:30 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: embedded documents
>>
>> See my Jira. It supports it via json.fsuffix=_json&wt=json
>>
>> http://mail-archives.apache.org/mod_mbox/lucene-dev/201304.mbox/%3CJIRA.12641293.1365394604231.125944.1365397875874@arcas%3E
>>
>> Bill Bell
>> Sent from mobile
>>
>>
>> On Aug 24, 2014, at 6:43 AM, "Jack Krupansky" <jack@basetechnology.com>
>> wrote:
>>>
>>> Indexing and query of raw JSON would be a valuable addition to Solr, so
>>> maybe you could simply explain more precisely your data model and
>>> transformation rules. For example, when multi-level nesting occurs, what
>>> does your loader do?
>>>
>>> Maybe if the field names were derived by concatenating the full path of
>>> JSON key names, like titles_json.FR, field naming for nesting could be
>>> handled in a fully automated manner.
>>>
>>> I had been thinking of filing a Jira proposing exactly that, so that
>>> even the most deeply nested JSON maps could be supported, although
>>> combinations of arrays and maps would be problematic.
>>>
>>> -- Jack Krupansky
>>>
>>> -----Original Message----- From: Michael Pitsounis
>>> Sent: Wednesday, August 20, 2014 7:14 PM
>>> To: solr-user@lucene.apache.org
>>> Subject: embedded documents
>>>
>>> Hello everybody,
>>>
>>> I had a requirement to store complicated JSON documents in Solr.
>>>
>>> I have modified the JsonLoader to accept complicated JSON documents with
>>> arrays/objects as values.
>>>
>>> It stores the object/array, then flattens it and indexes the fields.
>>>
>>> e.g. a basic example document:
>>>
>>> {
>>>      "titles_json": {"FR": "This is the FR title", "EN": "This is the EN title"},
>>>      "id": 1000003,
>>>      "guid": "3b2f2998-85ac-4a4e-8867-beb551c0b3c6"
>>> }
>>>
>>> It will store titles_json:{"FR":"This is the FR title" , "EN":"This is the EN title"}
>>> and then index the fields
>>>
>>> titles.FR:"This is the FR title"
>>> titles.EN:"This is the EN title"
>>>
>>>
>>> Do you see any problems with this approach?
>>>
>>>
>>>
>>> Regards,
>>> Michael Pitsounis
>>>
>>
>>
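
The flattening discussed in the quoted messages (field names derived by concatenating the full path of JSON key names, such as titles_json.FR) could be sketched roughly as below. This is an illustration, not the modified JsonLoader from the original post: the function name `flatten` is hypothetical, arrays are simply kept as multi-valued leaves, and the full key path is used as the field name, whereas the original example appears to index titles.FR without the _json suffix.

```python
def flatten(obj, prefix=""):
    """Flatten nested JSON objects into {dotted.path: leaf_value} pairs.

    Illustrative only: lists are kept whole as multi-valued leaves,
    since mixing arrays and maps is the problematic case noted above.
    """
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse so that arbitrarily deep nesting yields a.b.c style names.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


doc = {
    "titles_json": {"FR": "This is the FR title", "EN": "This is the EN title"},
    "id": 1000003,
    "guid": "3b2f2998-85ac-4a4e-8867-beb551c0b3c6",
}

flat = flatten(doc)
for name, value in flat.items():
    print(f"{name}: {value!r}")
```

With this naming the nested titles become the fields titles_json.FR and titles_json.EN, while id and guid pass through unchanged.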


-- 
-----------------------------------------------------
Noble Paul 

