lucene-java-user mailing list archives

From Jordon Saardchit <>
Subject Re: Get Analyzed/Tokenized Field List
Date Fri, 24 Dec 2010 18:16:19 GMT
Heh, yes, all stuff I know.  My question was whether an index contains any metadata that reveals
whether a given indexed field had been analyzed, which I think you are saying
it does not.

Our searching and indexing are isolated into 2 completely separate packages which can be deployed
independently of each other.  The only common dependency (obviously) is the index itself.
That being said, I was trying to determine from the search runtime whether the given fieldname/input
pair should be analyzed when building the query, without having any knowledge of how
the index was created.
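
One way around this, if the indexing package can be changed, is to have it publish which fields it analyzed in a small file deployed next to the index, so the search package can read that instead of guessing. What follows is only a minimal sketch in plain Java under that assumption: the FieldManifest class, its method names, and the properties format are all hypothetical and are not part of Lucene.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Hypothetical sidecar manifest: field name -> was it analyzed at index time? */
final class FieldManifest {
    private final Map<String, Boolean> analyzedByField = new TreeMap<>();

    /** Index side: record how a field was indexed. */
    FieldManifest put(String field, boolean analyzed) {
        analyzedByField.put(field, analyzed);
        return this;
    }

    /** Index side: serialize the manifest, e.g. to a file stored beside the index. */
    void store(Writer out) throws IOException {
        Properties props = new Properties();
        for (Map.Entry<String, Boolean> e : analyzedByField.entrySet()) {
            props.setProperty(e.getKey(), e.getValue().toString());
        }
        props.store(out, "field -> analyzed at index time");
    }

    /** Search side: rebuild the manifest without knowing how the index was created. */
    static FieldManifest load(Reader in) throws IOException {
        Properties props = new Properties();
        props.load(in);
        FieldManifest manifest = new FieldManifest();
        for (String name : props.stringPropertyNames()) {
            manifest.put(name, Boolean.parseBoolean(props.getProperty(name)));
        }
        return manifest;
    }

    /** Fields missing from the manifest are treated as not analyzed (an assumption). */
    boolean isAnalyzed(String field) {
        return analyzedByField.getOrDefault(field, Boolean.FALSE);
    }

    public static void main(String[] args) throws IOException {
        // Round trip through an in-memory writer instead of a real file.
        StringWriter written = new StringWriter();
        new FieldManifest().put("text", true).put("id", false).store(written);

        FieldManifest manifest = FieldManifest.load(new StringReader(written.toString()));
        System.out.println("text analyzed? " + manifest.isAnalyzed("text"));
        System.out.println("id analyzed? " + manifest.isAnalyzed("id"));
    }
}
```

The query builder would then run the input through the analyzer only when isAnalyzed(fieldName) returns true, and build a plain term query otherwise. The manifest records a yes/no flag rather than an analyzer name, so both packages still have to agree on which analyzer that flag refers to.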


On Dec 23, 2010, at 5:59 PM, Erick Erickson wrote:

> I guess I'm missing the point. The fact that it is stored is irrelevant for
> searching. Stored fields really only govern whether
> Document.getField("fieldname") returns anything #after# the search. You can
> find out if a field is stored-only by asking IndexReader.getFieldNames
> for UNINDEXED, and you can search on anything that is INDEXED.
> So if, say, you're creating a drop-down with a selection of fields to choose
> from, you should be able to get the list by looking for INDEXED.
> But somewhere you've got to ensure that the analyzers used at index time are
> identical or compatible with those used at query time. If all you're
> concerned with is building up a string like "+text:stuff +title:nonsense"
> and handing that off to the app that knows how the index was built (so it
> can use the right analyzers for the text and title fields when parsing the
> input), looking for INDEXED should be fine.
> If you're #only# using your custom analyzer for searchable fields, it's
> fine because any INDEXED field can use your custom analyzer.
> But if you use different analyzers for different searchable fields, there's
> no way I know of to analyze an index and answer the question "what analyzer
> was this field created with"; that knowledge is built a-priori into the app
> as far as I know.
> Best
> Erick
> On Thu, Dec 23, 2010 at 6:32 PM, Jordon Saardchit <> wrote:
>> The basic use case is determination of rules with regard to building a
>> query.  I've got an application that programmatically builds queries
>> (without any pre existing knowledge of the contents of the index it is
>> searching).  We have a custom designed analyzer and filter chain.  However,
>> it is applied to certain fields at index time.  The fields it is applied to
>> are unstored.
>> On the search side, I want to be able to determine at runtime which field
>> the analyzer should be applied to, and which field not to.  I could be
>> approaching the solution incorrectly, but I figured this would be a pretty
>> common or natural use case.
>> Jordon
>> On Dec 23, 2010, at 2:51 PM, Erick Erickson wrote:
>>> Ah, you didn't mention indexed but unstored in your original message,
>>> just indexed/analyzed....
>>> I don't think you can (someone jump in here if I'm wrong, please). The
>>> problem is that Lucene doesn't require any sort of schema. So you are
>>> perfectly free to store a field in one document and NOT store it in
>>> another. All the variants specified in IndexReader.FieldOption can quickly
>>> be determined by just looking at the various index files. But you'd have
>>> to spin through all the #documents# in order to answer the question "is
>>> this field ever stored?". Sounds like a table scan in the DB world.
>>> I don't think Lucene keeps meta-data for this, and spinning through all
>>> the documents would be expensive...
>>> Why do you want to know? Perhaps there's another way to satisfy the
>>> use-case.
>>> I could be way off base here, I'm speaking from general principles not
>>> knowledge of
>>> the code...
>>> Best
>>> Erick
>>> On Thu, Dec 23, 2010 at 4:43 PM, Jordon Saardchit <> wrote:
>>>> Yes I have, and after testing each of the various options denoted in
>>>> IndexReader.FieldOption, I cannot retrieve fieldnames that are indexed
>>>> (analyzed), and unstored.  I figured this would be relatively easy to do
>>>> and I was simply overlooking something.  Is it perhaps not possible to do
>>>> this?
>>>> Jordon
>>>> On Dec 23, 2010, at 1:30 PM, Erick Erickson wrote:
>>>>> Have you looked at IndexReader.getFieldNames()?
>>>>> Best
>>>>> Erick
>>>>> On Thu, Dec 23, 2010 at 3:23 PM, Jordon Saardchit <> wrote:
>>>>>> Is there an easy way to retrieve a collection of fields (or field
>>>>>> names) that are analyzed/tokenized from any given index?
>>>>>> Jordon
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail:
>>>>>> For additional commands, e-mail: