lucene-solr-user mailing list archives

From txlap786 <tx...@hotmail.com>
Subject xml indexing
Date Tue, 04 Jul 2017 15:15:16 GMT
Hello everyone o/. I'm trying to index an XML file using DIH.

It's mostly like this.

---- EXAMPLE DIH CONFIG STRUCTURE----
<entity processor="FileListEntityProcessor" ..
        <entity processor="XPathEntityProcessor" ..
forEach="/entryHeader"
                <field column=.. xpath=.. />
                <field column=.. xpath=.. />
                <field column=.. xpath=.. />
        </entity>
</entity>

---- EXAMPLE XML STRUCTURE ----
<gl-cor:entryHeader>                            
	<gl-cor:entryNumberCounter> xx </gl-cor:entryNumberCounter>
	<gl-cor:entryNumber> xx </gl-cor:entryNumber>
	<gl-cor:entryDetail>
		<gl-cor:lineNumber> xx </gl-cor:lineNumber>
		<gl-cor:detailComment> xx </gl-cor:detailComment>
	</gl-cor:entryDetail>
	<gl-cor:entryDetail>
		<gl-cor:lineNumber> xx </gl-cor:lineNumber>
		<gl-cor:detailComment> xx </gl-cor:detailComment>
	</gl-cor:entryDetail>
	<gl-cor:entryDetail>
		<gl-cor:lineNumber> xx </gl-cor:lineNumber>
		<gl-cor:detailComment> xx </gl-cor:detailComment>
	</gl-cor:entryDetail>
</gl-cor:entryHeader>
<gl-cor:entryHeader>                            
	<gl-cor:entryNumberCounter> xx </gl-cor:entryNumberCounter>
	<gl-cor:entryNumber> xx </gl-cor:entryNumber>
	<gl-cor:entryDetail>
		<gl-cor:lineNumber> xx </gl-cor:lineNumber>
		<gl-cor:detailComment> xx </gl-cor:detailComment>
	</gl-cor:entryDetail>
</gl-cor:entryHeader>
<gl-cor:entryHeader>                            
	<gl-cor:entryNumberCounter> xx </gl-cor:entryNumberCounter>
	<gl-cor:entryNumber> xx </gl-cor:entryNumber>
	<gl-cor:entryDetail>
		<gl-cor:lineNumber> xx </gl-cor:lineNumber>
	</gl-cor:entryDetail>
	<gl-cor:entryDetail>
		<gl-cor:lineNumber> xx </gl-cor:lineNumber>
	</gl-cor:entryDetail>
</gl-cor:entryHeader>

(In the last entryHeader, detailComment doesn't exist!)

---- JSON return ----

"detailComment",
[
"100.01",
"102.01",
"102.02",
"120.01",
"120.02",
"153.01",
"320.01",
null,
null
]

---- INDEXED ----

"detailComment" : [
"100.01",
"102.01",
"102.02",
"120.01",
"120.02",
"153.01",
"320.01"  
]


So,
<field name="detailComment" ... multiValued="true" default="somethingelse"/>
the default doesn't work because the field is multiValued.

How can I index those nulls as something visible, like "0", "null", "NULL" or
"empty"?

I want the indexed values to be the same as the JSON return.
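
To make the target concrete, here is a minimal Python sketch of the result I'm after. It runs outside Solr and is not a DIH configuration: it only shows one value per entryDetail, with a placeholder where detailComment is missing. The namespace URI `urn:example:gl-cor`, the wrapping `<root>` element, the sample values, and the helper name `detail_comments` are all assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI; the real file declares its own for gl-cor.
NS = {"gl-cor": "urn:example:gl-cor"}

# Made-up sample: the second entryDetail has no detailComment.
SAMPLE = """<root xmlns:gl-cor="urn:example:gl-cor">
  <gl-cor:entryHeader>
    <gl-cor:entryNumber> 1 </gl-cor:entryNumber>
    <gl-cor:entryDetail>
      <gl-cor:lineNumber> 1 </gl-cor:lineNumber>
      <gl-cor:detailComment> 100.01 </gl-cor:detailComment>
    </gl-cor:entryDetail>
    <gl-cor:entryDetail>
      <gl-cor:lineNumber> 2 </gl-cor:lineNumber>
    </gl-cor:entryDetail>
  </gl-cor:entryHeader>
</root>"""


def detail_comments(header, placeholder="empty"):
    """Return one value per entryDetail; use placeholder when detailComment is missing or blank."""
    values = []
    for detail in header.findall("gl-cor:entryDetail", NS):
        text = detail.findtext("gl-cor:detailComment", default="", namespaces=NS).strip()
        values.append(text or placeholder)
    return values


root = ET.fromstring(SAMPLE)
for header in root.findall("gl-cor:entryHeader", NS):
    print(detail_comments(header))  # prints ['100.01', 'empty']
```

The point is that the list keeps its positions, so the Nth comment still lines up with the Nth lineNumber.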

Can I use an XPathEntityProcessor inside another XPathEntityProcessor to get
those "entryDetail" elements?
Then I wouldn't have to use multivalued fields anymore. I would just set
default values for each field.
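
As a rough picture of that second idea, the sketch below flattens the XML into one single-valued record per entryDetail, so a plain default can apply to each field. It is plain Python done as preprocessing, not a nested DIH entity, and whether DIH supports this kind of nesting is exactly what I'm asking. The namespace URI, the `<root>` wrapper, the sample values, and the names `child_text` and `flatten` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, for illustration only.
NS = {"gl-cor": "urn:example:gl-cor"}

# Made-up sample: the second entryDetail has no detailComment.
SAMPLE = """<root xmlns:gl-cor="urn:example:gl-cor">
  <gl-cor:entryHeader>
    <gl-cor:entryNumber> 7 </gl-cor:entryNumber>
    <gl-cor:entryDetail>
      <gl-cor:lineNumber> 1 </gl-cor:lineNumber>
      <gl-cor:detailComment> 100.01 </gl-cor:detailComment>
    </gl-cor:entryDetail>
    <gl-cor:entryDetail>
      <gl-cor:lineNumber> 2 </gl-cor:lineNumber>
    </gl-cor:entryDetail>
  </gl-cor:entryHeader>
</root>"""


def child_text(parent, tag, default):
    """Stripped text of a child element, or default if it is missing or blank."""
    text = parent.findtext(f"gl-cor:{tag}", default="", namespaces=NS).strip()
    return text or default


def flatten(xml_text, default="empty"):
    """Build one single-valued record per entryDetail, copying the parent entryNumber."""
    root = ET.fromstring(xml_text)
    records = []
    for header in root.findall("gl-cor:entryHeader", NS):
        entry_number = child_text(header, "entryNumber", default)
        for detail in header.findall("gl-cor:entryDetail", NS):
            records.append({
                "entryNumber": entry_number,
                "lineNumber": child_text(detail, "lineNumber", default),
                "detailComment": child_text(detail, "detailComment", default),
            })
    return records


for record in flatten(SAMPLE):
    print(record)
```

Each record would then be its own document, with the header fields repeated on every line.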



