lucene-solr-user mailing list archives

From Erik Hatcher <erik.hatc...@gmail.com>
Subject Re: DIH XPathEntityProcessor XPath subset?
Date Wed, 03 Jan 2018 16:50:52 GMT
Stefan -

If you pre-transform the XML, I’d personally recommend transforming it into straight-up
Solr XML (docs/fields/values), or some other format Solr accepts, and posting it directly
to Solr. Avoid DIH when things get complicated.

	Erik
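
One way the pre-transform suggested above could look, as a minimal Python sketch rather than
anything posted in this thread. It assumes every member value is a <string>, that the
truncated sample response is closed in the obvious way, and that "mytitle" (taken from the
DIH config quoted below) is the target field for post_title. The function name and the
FIELD_NAMES mapping are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from WordPress member names to Solr field names.
# "mytitle" comes from the quoted DIH config; unmapped names are kept as-is.
FIELD_NAMES = {"post_title": "mytitle"}

# Assumed minimal response: the quoted sample, closed by hand for illustration.
SAMPLE = """<methodResponse><params><param><value><array><data><value><struct>
<member><name>post_id</name><value><string>11809</string></value></member>
<member><name>post_title</name><value><string>Some titel</string></value></member>
</struct></value></data></array></value></param></params></methodResponse>"""


def xmlrpc_to_solr_xml(xmlrpc_text):
    """Turn an XML-RPC <methodResponse> into Solr <add><doc><field> XML."""
    root = ET.fromstring(xmlrpc_text)
    add = ET.Element("add")
    # Each <struct> holds one post as a list of name/value <member> pairs.
    for struct in root.findall("./params/param/value/array/data/value/struct"):
        doc = ET.SubElement(add, "doc")
        for member in struct.findall("member"):
            name = member.findtext("name")
            value = member.findtext("value/string")
            if name is None or value is None:
                continue  # this sketch skips anything that is not a plain string
            field = ET.SubElement(doc, "field", {"name": FIELD_NAMES.get(name, name)})
            field.text = value
    return ET.tostring(add, encoding="unicode")


solr_xml = xmlrpc_to_solr_xml(SAMPLE)
print(solr_xml)
```

The resulting XML can then be sent to the core's /update handler as text/xml, which takes
DIH and its XPath subset out of the picture.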

> On Jan 3, 2018, at 11:40 AM, Stefan Moises <moises@shoptimax.de> wrote:
> 
> Hi there,
> 
> I'm trying to index a WordPress site using DIH XPathEntityProcessor... I've read that it only supports a subset of XPath, but I couldn't find any docs on exactly what is supported.
> 
> After some painful trial and error, I've found that XPath expressions like the following don't work:
> 
>             <field column="title" name="mytitle" xpath="/methodResponse/params/param/value/array/data/value/struct/member[name='post_title']/value/string" />
> 
> I want to find elements like this ("the 'value' element inside a 'member' element whose 'name' element is 'post_title'"):
> 
> <methodResponse>
>   <params>
>     <param>
>       <value>
>         <array>
>             <data>
>                 <value>
>                     <struct>
>                         <member><name>post_id</name><value><string>11809</string></value></member>
>                         <member><name>post_title</name><value><string>Some titel</string></value></member>
> 
> Unfortunately, that is the default output structure of WordPress's XML-RPC calls.
> 
> My XPath expression works, e.g., when testing it with https://www.freeformatter.com/xpath-tester.html, but not when I try to index with Solr. Any ideas? Or do I have to pre-transform the XML myself to match XPathEntityProcessor's limited abilities?
> 
> Thanks in advance,
> 
> Stefan
> 
> -- 
> --
> ************************************
> Stefan Moises
> Manager Research & Development
> shoptimax GmbH
> Ulmenstraße 52 H
> 90443 Nürnberg
> Tel.: 0911/25566-0
> Fax: 0911/25566-29
> moises@shoptimax.de
> http://www.shoptimax.de
> 
> Geschäftsführung: Friedrich Schreieck
> Ust.-IdNr.: DE 814340642
> Amtsgericht Nürnberg HRB 21703
>  ************************************
> 

