xml-xindice-users mailing list archives

From Natalia Shilenkova <nshilenk...@gmail.com>
Subject Re: Incorrect "No result for query" for XPath expression
Date Wed, 21 Oct 2009 22:13:55 GMT
David,

The behaviour you're describing is not a bug; your XPath query is
being executed correctly.

Let's see what happens when the query
/martif/text/body/termEntry[contains(langSet/ntig/termGrp/term/text(),'bancaire')]
is executed. For each termEntry, XPath first finds all nodes matching
langSet/ntig/termGrp/term and selects their child text nodes. The
result of this step is a node-set that includes the <term> data for
every language. Then XPath evaluates the function contains(), whose
first argument is that node-set. Per the XPath specification [1],
contains() expects two arguments of type string, not a node-set, so the
first argument is converted to a string as if by calling string(). When
applied to a node-set, string() returns the string-value of the _first_
node in document order.

So instead of checking the <term> data for every language, the query
only checks whether the <term> data for the language that happens to
come first in the document contains the given string. You can easily
verify this by rearranging the order of the langSet elements in the
document. The query
/martif/text/body/termEntry[contains(langSet[starts-with(@lang,'fr')]/ntig/termGrp/term/text(),'bancaire')]
works for the same reason: the predicate on langSet leaves only one
langSet, so the first node that contains() sees is the French <term>.
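
To make the first-node rule concrete, here is a rough sketch using the
XPath 1.0 engine that ships with the JDK (javax.xml.xpath) rather than
Xindice itself. The document is a hypothetical, heavily trimmed version
of your entry: no namespace, and a single <term> per langSet. The
helper names (langSet, entry, firstTerm, matches) are made up for this
example.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class FirstNodeDemo {
    // The unrefined query from this thread.
    static final String QUERY =
        "/martif/text/body/termEntry[contains(langSet/ntig/termGrp/term/text(),'bancaire')]";
    // The node-set that contains() receives as its first argument.
    static final String TERM_TEXT =
        "/martif/text/body/termEntry/langSet/ntig/termGrp/term/text()";

    // Hypothetical helper: one trimmed-down langSet holding a single term.
    static String langSet(String lang, String term) {
        return "<langSet xml:lang=\"" + lang + "\"><ntig><termGrp><term>"
            + term + "</term></termGrp></ntig></langSet>";
    }

    // Hypothetical helper: wraps the langSets in a namespace-free termEntry.
    static String entry(String... langSets) {
        return "<martif><text><body><termEntry>" + String.join("", langSets)
            + "</termEntry></body></text></martif>";
    }

    // Converting the node-set to a string yields the string-value of its first node only.
    static String firstTerm(String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(TERM_TEXT, new InputSource(new StringReader(xml)));
    }

    // Number of termEntry elements selected by the unrefined query.
    static int matches(String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        Double n = (Double) xpath.evaluate("count(" + QUERY + ")",
            new InputSource(new StringReader(xml)), XPathConstants.NUMBER);
        return n.intValue();
    }

    public static void main(String[] args) throws Exception {
        String de = langSet("de-CH", "Bankgarantie");
        String fr = langSet("fr-CH", "garantie bancaire");

        String germanFirst = entry(de, fr);
        String frenchFirst = entry(fr, de);
        System.out.println(firstTerm(germanFirst) + " -> " + matches(germanFirst));
        System.out.println(firstTerm(frenchFirst) + " -> " + matches(frenchFirst));
    }
}
```

With the German langSet first the query selects nothing; with the
French langSet first the same query selects the termEntry. Only the
order of the langSet elements differs between the two documents.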

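If you want a query that checks every <term> to see whether it
contains some substring, you can try something like this:
/martif/text/body/termEntry[langSet/ntig/termGrp/term[contains(text(),'bancaire')]]
Here the inner predicate is evaluated separately for each <term>, so a
termEntry is selected as soon as any one of its terms contains the
substring.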
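
As a quick comparison of the two forms, here is a small sketch, again
run against the JDK's built-in XPath 1.0 engine and not against
Xindice. The XML is a hypothetical, namespace-free reduction of your
entry with the German term first; against the real TBX document the
element names would need the "tbx" prefix you mentioned. The class and
method names are invented for the example.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class PerTermQueryDemo {
    // Original form: contains() receives the whole node-set and tests only its first node.
    static final String ORIGINAL =
        "/martif/text/body/termEntry[contains(langSet/ntig/termGrp/term/text(),'bancaire')]";
    // Suggested form: contains() is evaluated once per <term>.
    static final String PER_TERM =
        "/martif/text/body/termEntry[langSet/ntig/termGrp/term[contains(text(),'bancaire')]]";

    // Hypothetical trimmed entry: German first, then English, then French.
    static final String XML =
        "<martif><text><body><termEntry>"
        + "<langSet xml:lang=\"de-CH\"><ntig><termGrp><term>Bankgarantie</term></termGrp></ntig></langSet>"
        + "<langSet xml:lang=\"en-GB\"><ntig><termGrp><term>bank guarantee</term></termGrp></ntig></langSet>"
        + "<langSet xml:lang=\"fr-CH\"><ntig><termGrp><term>garantie bancaire</term></termGrp></ntig></langSet>"
        + "</termEntry></body></text></martif>";

    // Counts the termEntry elements that the given query selects from XML.
    static int count(String query) throws Exception {
        Double n = (Double) XPathFactory.newInstance().newXPath().evaluate(
            "count(" + query + ")",
            new InputSource(new StringReader(XML)),
            XPathConstants.NUMBER);
        return n.intValue();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("original: " + count(ORIGINAL));
        System.out.println("per-term: " + count(PER_TERM));
    }
}
```

On this sample the original form selects no termEntry and the per-term
form selects one, even though the French term comes last.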

[1] http://www.w3.org/TR/1999/REC-xpath-19991116

Regards,
Natalia


On Oct 21, 2009, at 12:02 PM, David Vergnaud wrote:

> Hi,
>
> I'm reporting on a problem which I'm pretty much convinced is a bug  
> in the current 1-2.dev version of Xindice (1.2m1). I'm using Xindice  
> running on its own (no Tomcat) as a daemon on a Linux box (Suse 11)  
> with JDK 1.6.
>
> Basically, I have a DB where I've stored terminology entries that  
> contain information about various banking terms in 4 languages. I  
> want to be able to conduct two types of searches, one where the term  
> is searched for only one of the languages, and one where the search  
> is carried out in all languages. For this, I use two versions of a  
> somewhat complicated XPath expression: one where the language is  
> specified (as attribute of one of the nodes, in a predicate) and one  
> where it isn't. This is the only difference between the two  
> expressions. Surprisingly, the one where the language is fixed does  
> return results where the one without specification doesn't. Besides,  
> I've tested the XPath expression on other systems, and seen that  
> there really should be results.
>
> The first impression is that when evaluating function arguments  
> inside a predicate, only the first node of a node set is evaluated.  
> In my case, that would be confirmed by the following fact: each  
> entry contains first the German word, then either French or English.  
> When doing an "unrefined" search (no language specification) with a  
> German word, results are returned. When doing the same unrefined  
> search with French or English, no results are returned.
>
> Here's an example of an XPath we're using, first with the language  
> refinement, then without:
> /martif/text/body/termEntry[contains(langSet[starts-with(@lang,'fr')]/ntig/termGrp/term/text(),'bancaire')]
> /martif/text/body/termEntry[contains(langSet/ntig/termGrp/term/text(),'bancaire')]
>
> As you can see, the goal is to extract a termEntry element which  
> contains the word "bancaire" under the specified path. In the first  
> path, I set the langSet to have attribute lang start with "fr" (for  
> French), in the second I don't. As I said before, the first  
> expression yields a result and the second one doesn't.
>
> I'm including an example DB entry which can be used to test this --  
> I assume it should be possible to observe this behaviour with only  
> one entry in the DB as well. In order to use the xpath above with  
> it, one would need to prefix all node names in the xpath expression  
> with "tbx" (I only removed that for legibility).
>
> Should this prove to be an error on my side, I'd be grateful to  
> anyone who'd point it out. Otherwise, it might need to be taken onto  
> the Xindice bug list.
>
> Cheers,
>
> David
>
> <?xml version="1.0"?>
> <martif xmlns="http://www.lisa.org/tbx" type="TBX" xml:lang="de-CH">
>  <martifHeader>
>    <fileDesc>
>      <titleStmt>
>        <title>
>          Test-TerminologieDB        </title>
>      </titleStmt>
>      <publicationStmt>
>        <p>
>           Version 1.1        </p>
>      </publicationStmt>
>      <sourceDesc>
>        <p>
>           Version 1.1        </p>
>      </sourceDesc>
>    </fileDesc>
>  </martifHeader>
>  <text>
>    <body>
>      <termEntry>
>        <descrip type="classificationCode" />
>        <descrip type="subjectField">
>        </descrip>
>        <langSet xml:lang="de-CH">
>          <transacGrp>
>            <transac type="transactionType">
>              created            </transac>
>            <transacNote type="responsibility">
>              STEA            </transacNote>
>            <date>
>              2009-09-15T14:44:54.924+02:00            </date>
>          </transacGrp>
>          <descrip type="reliabilityCode">
>            1          </descrip>
>          <note />
>          <descripGrp>
>            <descrip type="definition">
>              Die Garantie ist eine selbstständige, vom  
> Hauptschuldverhältnis unabhängige Verpflichtung. Der Garant (die  
> Bank) kann keinerlei Einwendungen und Einreden aus dem Grundgeschäft  
> erheben. Das heisst: Der Garant zahlt auf erste schriftliche  
> Anforderung (Inanspruchnahme) des Begünstigten, gegen Einreichung  
> der im Garantietext vorgeschriebenen Bestätigung und allenfalls  
> vorgeschriebenen Dokumente.            </descrip>
>            <adminGrp>
>              <admin type="source">
>                CS Glossar              </admin>
>            </adminGrp>
>          </descripGrp>
>          <ntig>
>            <termGrp>
>              <term>
>                Bankgarantie              </term>
>              <termNote type="partOfSpeech" />
>              <termNote type="grammaticalGender" />
>              <termNote type="grammaticalNumber" />
>              <termCompList type="lemma">
>                <termComp />
>              </termCompList>
>              <termCompList type="morphologicalElement">
>                <termComp />
>              </termCompList>
>              <termNote type="termType">
>                main              </termNote>
>              <termNote type="usageNote" />
>            </termGrp>
>            <adminGrp>
>              <admin type="source">
>                CS Glossar              </admin>
>              <note />
>            </adminGrp>
>            <descripGrp>
>              <descrip type="example" />
>              <adminGrp>
>                <admin type="source" />
>              </adminGrp>
>            </descripGrp>
>            <note />
>          </ntig>
>          <ntig>
>            <termGrp>
>              <term />
>              <termNote type="termType">
>                abbr              </termNote>
>            </termGrp>
>            <adminGrp>
>              <admin type="source" />
>            </adminGrp>
>          </ntig>
>          <ntig>
>            <termGrp>
>              <term />
>              <termNote type="termType">
>                syn              </termNote>
>              <termNote type="grammaticalGender" />
>              <termCompList type="lemma">
>                <termComp />
>              </termCompList>
>              <termCompList type="morphologicalElement">
>                <termComp />
>              </termCompList>
>            </termGrp>
>            <adminGrp>
>              <admin type="source" />
>              <note />
>            </adminGrp>
>            <descrip type="example" />
>            <adminGrp>
>              <admin type="source" />
>            </adminGrp>
>            <note />
>          </ntig>
>        </langSet>
>        <langSet xml:lang="en-GB">
>          <transacGrp>
>            <transac type="transactionType">
>              created            </transac>
>            <transacNote type="responsibility">
>              STEA            </transacNote>
>            <date>
>              2009-09-15T14:44:54.924+02:00            </date>
>          </transacGrp>
>          <descrip type="reliabilityCode">
>            1          </descrip>
>          <note />
>          <descripGrp>
>            <descrip type="definition" />
>            <adminGrp>
>              <admin type="source" />
>            </adminGrp>
>          </descripGrp>
>          <ntig>
>            <termGrp>
>              <term>
>                bank guarantee              </term>
>              <termNote type="partOfSpeech" />
>              <termNote type="grammaticalGender" />
>              <termNote type="grammaticalNumber" />
>              <termCompList type="lemma">
>                <termComp />
>              </termCompList>
>              <termCompList type="morphologicalElement">
>                <termComp />
>              </termCompList>
>              <termNote type="termType">
>                main              </termNote>
>              <termNote type="usageNote" />
>            </termGrp>
>            <adminGrp>
>              <admin type="source">
>                CS Glossar              </admin>
>              <note />
>            </adminGrp>
>            <descripGrp>
>              <descrip type="example" />
>              <adminGrp>
>                <admin type="source" />
>              </adminGrp>
>            </descripGrp>
>            <note />
>          </ntig>
>          <ntig>
>            <termGrp>
>              <term />
>              <termNote type="termType">
>                abbr              </termNote>
>            </termGrp>
>            <adminGrp>
>              <admin type="source" />
>            </adminGrp>
>          </ntig>
>          <ntig>
>            <termGrp>
>              <term />
>              <termNote type="termType">
>                syn              </termNote>
>              <termNote type="grammaticalGender" />
>              <termCompList type="lemma">
>                <termComp />
>              </termCompList>
>              <termCompList type="morphologicalElement">
>                <termComp />
>              </termCompList>
>            </termGrp>
>            <adminGrp>
>              <admin type="source" />
>              <note />
>            </adminGrp>
>            <descrip type="example" />
>            <adminGrp>
>              <admin type="source" />
>            </adminGrp>
>            <note />
>          </ntig>
>        </langSet>
>        <langSet xml:lang="fr-CH">
>          <transacGrp>
>            <transac type="transactionType">
>              created            </transac>
>            <transacNote type="responsibility">
>              STEA            </transacNote>
>            <date>
>              2009-09-15T14:44:54.924+02:00            </date>
>          </transacGrp>
>          <descrip type="reliabilityCode">
>            1          </descrip>
>          <note />
>          <descripGrp>
>            <descrip type="definition" />
>            <adminGrp>
>              <admin type="source" />
>            </adminGrp>
>          </descripGrp>
>          <ntig>
>            <termGrp>
>              <term>
>                garantie bancaire              </term>
>              <termNote type="partOfSpeech" />
>              <termNote type="grammaticalGender" />
>              <termNote type="grammaticalNumber" />
>              <termCompList type="lemma">
>                <termComp />
>              </termCompList>
>              <termCompList type="morphologicalElement">
>                <termComp />
>              </termCompList>
>              <termNote type="termType">
>                main              </termNote>
>              <termNote type="usageNote" />
>            </termGrp>
>            <adminGrp>
>              <admin type="source">
>                CS Glossar              </admin>
>              <note />
>            </adminGrp>
>            <descripGrp>
>              <descrip type="example" />
>              <adminGrp>
>                <admin type="source" />
>              </adminGrp>
>            </descripGrp>
>            <note />
>          </ntig>
>          <ntig>
>            <termGrp>
>              <term />
>              <termNote type="termType">
>                abbr              </termNote>
>            </termGrp>
>            <adminGrp>
>              <admin type="source" />
>            </adminGrp>
>          </ntig>
>          <ntig>
>            <termGrp>
>              <term />
>              <termNote type="termType">
>                syn              </termNote>
>              <termNote type="grammaticalGender" />
>              <termCompList type="lemma">
>                <termComp />
>              </termCompList>
>              <termCompList type="morphologicalElement">
>                <termComp />
>              </termCompList>
>            </termGrp>
>            <adminGrp>
>              <admin type="source" />
>              <note />
>            </adminGrp>
>            <descrip type="example" />
>            <adminGrp>
>              <admin type="source" />
>            </adminGrp>
>            <note />
>          </ntig>
>        </langSet>
>        <langSet xml:lang="it-CH">
>          <transacGrp>
>            <transac type="transactionType">
>              created            </transac>
>            <transacNote type="responsibility">
>              STEA            </transacNote>
>            <date>
>              2009-09-15T14:44:54.924+02:00            </date>
>          </transacGrp>
>          <descrip type="reliabilityCode">
>            1          </descrip>
>          <note />
>          <descripGrp>
>            <descrip type="definition" />
>            <adminGrp>
>              <admin type="source" />
>            </adminGrp>
>          </descripGrp>
>          <ntig>
>            <termGrp>
>              <term>
>                garanzia bancaria              </term>
>              <termNote type="partOfSpeech" />
>              <termNote type="grammaticalGender" />
>              <termNote type="grammaticalNumber" />
>              <termCompList type="lemma">
>                <termComp />
>              </termCompList>
>              <termCompList type="morphologicalElement">
>                <termComp />
>              </termCompList>
>              <termNote type="termType">
>                main              </termNote>
>              <termNote type="usageNote" />
>            </termGrp>
>            <adminGrp>
>              <admin type="source">
>                CS Glossar              </admin>
>              <note />
>            </adminGrp>
>            <descripGrp>
>              <descrip type="example" />
>              <adminGrp>
>                <admin type="source" />
>              </adminGrp>
>            </descripGrp>
>            <note />
>          </ntig>
>          <ntig>
>            <termGrp>
>              <term />
>              <termNote type="termType">
>                abbr              </termNote>
>            </termGrp>
>            <adminGrp>
>              <admin type="source" />
>            </adminGrp>
>          </ntig>
>          <ntig>
>            <termGrp>
>              <term />
>              <termNote type="termType">
>                syn              </termNote>
>              <termNote type="grammaticalGender" />
>              <termCompList type="lemma">
>                <termComp />
>              </termCompList>
>              <termCompList type="morphologicalElement">
>                <termComp />
>              </termCompList>
>            </termGrp>
>            <adminGrp>
>              <admin type="source" />
>              <note />
>            </adminGrp>
>            <descrip type="example" />
>            <adminGrp>
>              <admin type="source" />
>            </adminGrp>
>            <note />
>          </ntig>
>        </langSet>
>      </termEntry>
>    </body>
>  </text>
> </martif>
>

