lucene-dev mailing list archives

From "Renaud Delbru (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-4591) Make StoredFieldsFormat more configurable
Date Thu, 06 Dec 2012 13:18:59 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13511335#comment-13511335 ]

Renaud Delbru commented on LUCENE-4591:
---------------------------------------

Hi Adrien, yes, I understand the problem. It is true that, in the extreme case, people
could configure a different StoredFieldsFormat for each field, which would lead to a large
increase in disk seeks. Here, however, we would like to use CompressingStoredFieldsFormat for
all the standard fields and a different mechanism only for specific fields.

We would like to store certain fields that require a different type of data structure than
the one currently supported, i.e., a document is not a simple list of fields but a more complex
data structure.

We could solve the problem by copying and modifying the current CompressingStoredFieldsWriter
and CompressingStoredFieldsReader so that they decide what type of encoding to use based
on the field info. However, this is rather hacky, and we would have to keep our copy in sync
with the original implementation. The only clean alternative we could find is a per-field approach.
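
To make the per-field idea concrete, here is a minimal sketch of the routing we have in mind. It deliberately does not use any Lucene classes: FieldWriter, RecordingWriter and PerFieldWriter are hypothetical stand-ins invented for this illustration, and the field name "tree" is just an example. It only shows that standard fields go to one writer while selected fields are delegated to another.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PerFieldStoredFieldsSketch {

    /** Hypothetical stand-in for a stored-fields writer (not Lucene's StoredFieldsWriter). */
    interface FieldWriter {
        void writeField(String fieldName, String value);
    }

    /** Hypothetical writer that simply records what it is asked to store. */
    static final class RecordingWriter implements FieldWriter {
        final String formatName;
        final List<String> written = new ArrayList<>();

        RecordingWriter(String formatName) {
            this.formatName = formatName;
        }

        @Override
        public void writeField(String fieldName, String value) {
            written.add(fieldName + "=" + value);
        }
    }

    /** Routes each field to its configured writer, falling back to a default writer. */
    static final class PerFieldWriter implements FieldWriter {
        private final FieldWriter defaultWriter;
        private final Map<String, FieldWriter> overrides = new HashMap<>();

        PerFieldWriter(FieldWriter defaultWriter) {
            this.defaultWriter = defaultWriter;
        }

        /** Registers a dedicated writer for one field name. */
        PerFieldWriter route(String fieldName, FieldWriter writer) {
            overrides.put(fieldName, writer);
            return this;
        }

        @Override
        public void writeField(String fieldName, String value) {
            // Fields without an override use the default (e.g. compressing) writer.
            overrides.getOrDefault(fieldName, defaultWriter).writeField(fieldName, value);
        }
    }

    public static void main(String[] args) {
        RecordingWriter standard = new RecordingWriter("compressing");
        RecordingWriter special = new RecordingWriter("custom");
        FieldWriter writer = new PerFieldWriter(standard).route("tree", special);

        writer.writeField("title", "hello");      // handled by the default writer
        writer.writeField("tree", "{a:{b:1}}");   // handled by the custom writer

        System.out.println(standard.formatName + " -> " + standard.written);
        System.out.println(special.formatName + " -> " + special.written);
    }
}
```

With only one or two overrides, most fields still go through a single format, so the extra seek cost would be limited to the fields that opt in.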
                
> Make StoredFieldsFormat more configurable
> -----------------------------------------
>
>                 Key: LUCENE-4591
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4591
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/codecs
>    Affects Versions: 4.1
>            Reporter: Renaud Delbru
>             Fix For: 4.1
>
>
> The current StoredFieldsFormat implementations assume that only one type of
> StoredFieldsFormat is used by the index.
> We would like to be able to configure a StoredFieldsFormat per field, similarly to
> PostingsFormat.
> There are a few issues that need to be solved to allow that:
> 1) allow a segment suffix to be configured for the StoredFieldsFormat
> 2) implement the SPI interface in StoredFieldsFormat
> 3) create a PerFieldStoredFieldsFormat
> We propose to start with 1) by modifying the signatures of StoredFieldsFormat#fieldsReader
> and StoredFieldsFormat#fieldsWriter so that they take SegmentReadState and SegmentWriteState
> instead of the current set of parameters.
> Let us know what you think about this idea. If this is of interest, we can contribute
> a first patch for 1).
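
As a rough illustration of step 1), the sketch below shows why a segment suffix matters once two stored-fields formats write files for the same segment: without a suffix, both would produce the same file name. WriteState and fileName are hypothetical helpers written for this example (they are not Lucene's SegmentWriteState or IndexFileNames), the suffix value "custom" is made up, and the naming scheme of segment, underscore, suffix, dot, extension is an assumption modelled on how suffixed per-field postings files are named.

```java
public class SegmentSuffixSketch {

    /** Hypothetical, simplified stand-in for the state passed to a format's writer. */
    static final class WriteState {
        final String segmentName;
        final String segmentSuffix;

        WriteState(String segmentName, String segmentSuffix) {
            this.segmentName = segmentName;
            this.segmentSuffix = segmentSuffix;
        }
    }

    /** Builds a file name, inserting the suffix only when one is configured. */
    static String fileName(WriteState state, String extension) {
        if (state.segmentSuffix.isEmpty()) {
            return state.segmentName + "." + extension;
        }
        return state.segmentName + "_" + state.segmentSuffix + "." + extension;
    }

    public static void main(String[] args) {
        WriteState defaultFormat = new WriteState("_0", "");
        WriteState customFormat = new WriteState("_0", "custom");

        // Same segment, same extension, but distinct files thanks to the suffix.
        System.out.println(fileName(defaultFormat, "fdt")); // _0.fdt
        System.out.println(fileName(customFormat, "fdt"));  // _0_custom.fdt
    }
}
```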
