Karl,
Now that MCF is sending documents to ES and they are being scanned properly, I’m finding a couple of oddities.
I’m using the JDBC connector to feed ES, where the main ‘document’ (identified by the $(DATACOLUMN) variable) is in XML. Therefore, I set the $(CONTENTTYPE) column to ‘application/xml’. Generally, this works. But…
1) I didn’t set the “Allowed MIME Types” on the ES tab in the job to allow “application/xml”. I was expecting all of the rows to be filtered out. That didn’t happen. All rows returned were indexed by ES anyway.
2) Some of the columns (which are of type nvarchar) have embedded linefeed and/or carriage return characters in them (e.g. multi-line addresses). These are getting flagged as JSON errors by ES (as containing an ‘unescaped character’). I see that ElasticSearchIndex::jsonStringEscape() doesn’t deal with non-printable characters. Should it?
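
For reference, here is a rough sketch of the kind of escaping I would expect, assuming the goal is to follow the JSON rule that a double quote, a backslash, and every control character below U+0020 must be escaped inside a string. The class and method names (JsonEscapeSketch, escape) are made up for illustration; this is not the existing jsonStringEscape() code.

```java
// Hypothetical illustration only; not the actual ManifoldCF implementation.
final class JsonEscapeSketch {

    // Escapes a value so it can be placed between double quotes in a JSON document.
    static String escape(String value) {
        StringBuilder sb = new StringBuilder(value.length() + 16);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;  // linefeed
                case '\r': sb.append("\\r");  break;  // carriage return
                case '\t': sb.append("\\t");  break;
                case '\b': sb.append("\\b");  break;
                case '\f': sb.append("\\f");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the \\uXXXX form.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A multi-line address like the ones in my nvarchar columns.
        System.out.println(escape("18583 N. Dallas Parkway\r\nDallas, TX 75287"));
    }
}
```

With something like this, the embedded return/linefeed pair in an address would reach ES as the two-character sequences \r and \n rather than as raw control characters.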
Regards,
Rick
Richard D. Nichols
Staff Engineer
Tellabs, Inc.
18583 N. Dallas Parkway
Dallas, TX 75287
Office: (972) 588-6942
richard.nichols@tellabs.com