lucene-dev mailing list archives

From "Roberto Cornacchia (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (LUCENE-7171) IndexableField changes its IndexableFieldType when the index is re-opened for reading
Date Mon, 23 May 2016 12:14:12 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-7171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296289#comment-15296289 ]

Roberto Cornacchia edited comment on LUCENE-7171 at 5/23/16 12:13 PM:
----------------------------------------------------------------------

Perhaps I can reformulate this more concisely as:

Why, in {{DocumentStoredFieldVisitor}}, is a {{StringField}} arbitrarily converted into a {{TextField}}?
What is the point of having them as different classes if they are swapped under the hood?

This looks like a quick workaround for the fact that no {{textField()}} method is present in {{StoredFieldVisitor}}.

{code}
  @Override
  public void stringField(FieldInfo fieldInfo, byte[] value) throws IOException {
    final FieldType ft = new FieldType(TextField.TYPE_STORED);
    ft.setStoreTermVectors(fieldInfo.hasVectors());
    ft.setOmitNorms(fieldInfo.omitsNorms());
    ft.setIndexOptions(fieldInfo.getIndexOptions());
    doc.add(new Field(fieldInfo.name, new String(value, StandardCharsets.UTF_8), ft));
  }
{code}
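
The effect of the method quoted above can be reproduced in isolation. The following is a minimal, Lucene-free sketch, not Lucene code: the class and method names ({{FieldTypeRoundTrip}}, {{WriterFieldType}}, {{StoredFieldMetadata}}, {{persist}}, {{rebuild}}) are hypothetical. It assumes that the per-field metadata read back from the index carries only the flags the quoted method copies from {{fieldInfo}} (norms and term vectors here) and no "tokenized" flag, so the rebuilt type inherits "tokenized" from a {{TextField}}-like template.

```java
/** Toy model of the round trip; all names are hypothetical, none are Lucene APIs. */
final class FieldTypeRoundTrip {

    /** Flags known when the document is written (loosely modelled on FieldType). */
    static final class WriterFieldType {
        final boolean tokenized;
        final boolean omitNorms;
        final boolean storeTermVectors;

        WriterFieldType(boolean tokenized, boolean omitNorms, boolean storeTermVectors) {
            this.tokenized = tokenized;
            this.omitNorms = omitNorms;
            this.storeTermVectors = storeTermVectors;
        }
    }

    /**
     * Flags assumed to be available again at read time (loosely modelled on
     * FieldInfo). Assumption: there is no "tokenized" flag here.
     */
    static final class StoredFieldMetadata {
        final boolean omitNorms;
        final boolean hasVectors;

        StoredFieldMetadata(boolean omitNorms, boolean hasVectors) {
            this.omitNorms = omitNorms;
            this.hasVectors = hasVectors;
        }
    }

    /** Writing keeps only the flags that the metadata can represent. */
    static StoredFieldMetadata persist(WriterFieldType type) {
        return new StoredFieldMetadata(type.omitNorms, type.storeTermVectors);
    }

    /**
     * Reading mirrors the quoted stringField(): start from a tokenized,
     * TextField-like template and overwrite the flags found in the metadata.
     */
    static WriterFieldType rebuild(StoredFieldMetadata metadata) {
        boolean tokenizedFromTemplate = true; // the template decides, not the original field
        return new WriterFieldType(tokenizedFromTemplate, metadata.omitNorms, metadata.hasVectors);
    }

    public static void main(String[] args) {
        // A StringField-like type: not tokenized, norms omitted, no term vectors.
        WriterFieldType original = new WriterFieldType(false, true, false);
        WriterFieldType reloaded = rebuild(persist(original));

        System.out.println("tokenized before write: " + original.tokenized);
        System.out.println("tokenized after read:   " + reloaded.tokenized);
    }
}
```

In this model the copied flags survive the round trip, while "tokenized" always comes back as true regardless of what was written, which matches the difference between the two lines of output in the issue description.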



> IndexableField changes its IndexableFieldType when the index is re-opened for reading
> -------------------------------------------------------------------------------------
>
>                 Key: LUCENE-7171
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7171
>             Project: Lucene - Core
>          Issue Type: Bug
>    Affects Versions: 5.5
>            Reporter: Roberto Cornacchia
>
> This code:
> {code}
> /* Store one document into an index */
> Directory index = new RAMDirectory();
> IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());
> IndexWriter w = new IndexWriter(index, config);
> Document d1 = new Document();
> d1.add(new StringField("isbn", "9900333X", Field.Store.YES));
> w.addDocument(d1);
> w.commit();
> w.close();
> /* inspect IndexableFieldType */
> IndexableField f1 = d1.getField("isbn");
> System.err.println("FieldType for " + f1.stringValue() + " : " + f1.fieldType());
> /* retrieve all documents and inspect IndexableFieldType */
> IndexSearcher s = new IndexSearcher(DirectoryReader.open(index));
> TopDocs td = s.search(new MatchAllDocsQuery(), 1);
> for (ScoreDoc sd : td.scoreDocs) {
>     Document d2 = s.doc(sd.doc);
>     IndexableField f2 = d2.getField("isbn");
>     System.err.println("FieldType for " + f2.stringValue() + " : " + f2.fieldType());
> }
> {code}
> Produces:
> {code}
> FieldType for 9900333X : stored,indexed,omitNorms,indexOptions=DOCS
> FieldType for 9900333X : stored,indexed,tokenized,omitNorms,indexOptions=DOCS
> {code}
> The {{StringField}} field {{isbn}} is not tokenized, as correctly reported by the first output, which happens right after closing the writer.
> However, it becomes tokenized when the index is re-opened with a new reader.



