lucene-dev mailing list archives

From "Simon Willnauer (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-2811) SegmentInfo should explicitly track whether that segment wrote term vectors
Date Mon, 13 Dec 2010 12:36:03 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12970794#action_12970794 ]

Simon Willnauer commented on LUCENE-2811:
-----------------------------------------

Mike, good that you figured it out :D. Other than that, I think this part gets messier and
messier each time we change something. Your patch is a good indicator that we need to push
this stuff into codecs and let the codec decide whether a feature is present in a segment.
Backwards-compatibility code should be handled in PreFlexCodec, and new stuff like hasVector
should be something a codec holds, or rather something segmentCodecs encodes. Accessing
filename extensions outside a codec seems very odd (I know TV and Stored fields are not yet
exposed - just sayin).

Also all the CFS and Compound Doc Store stuff should be pushed to codecs.

I looked at the patch and it looks good to me, except for the one leftover System.out.println:

{code}
    System.out.println("SI READ 2");
{code}

simon

> SegmentInfo should explicitly track whether that segment wrote term vectors
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-2811
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2811
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 3.1, 4.0
>
>         Attachments: LUCENE-2811.patch
>
>
> Today SegmentInfo doesn't know if it has vectors, which means its files() method must check if the files exist.
> This leads to subtle bugs, because Si.files() caches the files but then we fail to invalidate that later when the term vectors files are created.
> It also leads to sloppy code, eg TermVectorsReader "gracefully" handles being opened when the files do not exist.  I don't like that; it should only be opened if they exist.
> This also fixes these intermittent failures we've been seeing:
> {noformat}
> junit.framework.AssertionFailedError: IndexFileDeleter doesn't know about file _1e.tvx
>        at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:979)
>        at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:917)
>        at org.apache.lucene.index.IndexWriter.filesExist(IndexWriter.java:3633)
>        at org.apache.lucene.index.IndexWriter.startCommit(IndexWriter.java:3699)
>        at org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:2407)
>        at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2478)
>        at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2460)
>        at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2444)
>        at org.apache.lucene.index.TestIndexWriterExceptions.testRandomExceptionsThreads(TestIndexWriterExceptions.java:213)
> {noformat}
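
For readers following the issue description, the caching problem can be sketched roughly as below. This is a simplified, hypothetical model and not Lucene's actual SegmentInfo: the class names ProbingSegment and TrackingSegment, the setHasVectors method, and the use of a Set<String> as a stand-in for the directory are all assumptions made for illustration. Only the _1e.tvx file name comes from the stack trace above.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model of the problem; not Lucene's real classes.
public class SegmentFilesSketch {

    /** Old style: probe the directory for vector files and cache the answer. */
    static class ProbingSegment {
        private final String name;
        private final Set<String> directory; // stand-in for a real Directory
        private List<String> cachedFiles;

        ProbingSegment(String name, Set<String> directory) {
            this.name = name;
            this.directory = directory;
        }

        List<String> files() {
            if (cachedFiles == null) {
                List<String> result = new ArrayList<>();
                result.add(name + ".fnm");
                // Existence check: the result depends on WHEN files() is first called.
                if (directory.contains(name + ".tvx")) {
                    result.add(name + ".tvx");
                }
                cachedFiles = result;
            }
            return cachedFiles;
        }
    }

    /** Proposed style: the segment records explicitly whether vectors were written. */
    static class TrackingSegment {
        private final String name;
        private boolean hasVectors;
        private List<String> cachedFiles;

        TrackingSegment(String name) {
            this.name = name;
        }

        void setHasVectors(boolean hasVectors) {
            this.hasVectors = hasVectors;
            this.cachedFiles = null; // changing the flag invalidates the cache
        }

        List<String> files() {
            if (cachedFiles == null) {
                List<String> result = new ArrayList<>();
                result.add(name + ".fnm");
                if (hasVectors) {
                    result.add(name + ".tvx");
                }
                cachedFiles = result;
            }
            return cachedFiles;
        }
    }

    public static void main(String[] args) {
        Set<String> dir = new HashSet<>();
        dir.add("_1e.fnm");

        ProbingSegment probing = new ProbingSegment("_1e", dir);
        probing.files();     // cached before the vector file exists
        dir.add("_1e.tvx");  // vector file created later
        System.out.println("probing:  " + probing.files());

        TrackingSegment tracking = new TrackingSegment("_1e");
        tracking.files();
        tracking.setHasVectors(true);
        System.out.println("tracking: " + tracking.files());
    }
}
```

In this sketch the probing variant keeps returning the stale list without _1e.tvx, which mirrors the "IndexFileDeleter doesn't know about file _1e.tvx" failure, while the tracking variant rebuilds its list whenever the flag changes.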

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

