tika-dev mailing list archives

From "Nick Burch (JIRA)" <j...@apache.org>
Subject [jira] Commented: (TIKA-148) The ExcelParsing should scan the cell comments
Date Mon, 11 Jan 2010 11:24:54 GMT

    [ https://issues.apache.org/jira/browse/TIKA-148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798642#action_12798642 ]

Nick Burch commented on TIKA-148:
---------------------------------

You can associate comments with cells, but it isn't all that easy. The "findCellComment" method
on HSSFCell shows how to do it:
* For every TextObjectRecord, find the CommonObjectDataSubRecord of type OBJECT_TYPE_COMMENT
that precedes it
* On that CommonObjectDataSubRecord, get the object ID
* Find a NoteRecord with a shape ID that matches the object ID just found
* The NoteRecord holds the row and column details

Normally the ordering of records in the file is CommonObjectDataSubRecord, TextObjectRecord,
NoteRecord, cells, so you'd need to grab things as they go past so that they're to hand by
the time you get to the cells.
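
As a rough sketch of the matching steps above, here is the bookkeeping in plain Java. The
classes CommentObject, TextObject, Note and Cell are hypothetical stand-ins for the POI
records, not real POI types, and the record order is assumed to be the usual one described
above. A real parser would read the equivalent fields from the POI records as it encounters
them.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CommentMatchSketch {
    // Hypothetical stand-ins for the records named above; these are NOT POI classes.
    static final class CommentObject {   // CommonObjectDataSubRecord of type OBJECT_TYPE_COMMENT
        final int objectId;
        CommentObject(int objectId) { this.objectId = objectId; }
    }
    static final class TextObject {      // TextObjectRecord holding the comment text
        final String text;
        TextObject(String text) { this.text = text; }
    }
    static final class Note {            // NoteRecord: shape ID plus row and column
        final int shapeId, row, column;
        Note(int shapeId, int row, int column) {
            this.shapeId = shapeId; this.row = row; this.column = column;
        }
    }
    static final class Cell {
        final int row, column;
        final String value;
        Cell(int row, int column, String value) {
            this.row = row; this.column = column; this.value = value;
        }
    }

    // Single pass: remember comment data as it goes past, then use it at the cells.
    static List<String> extract(List<Object> records) {
        Map<Integer, String> textByObjectId = new HashMap<>();
        Map<String, String> commentByCell = new HashMap<>();
        Integer pendingObjectId = null;  // object ID of the comment object seen most recently
        List<String> out = new ArrayList<>();
        for (Object r : records) {
            if (r instanceof CommentObject) {
                pendingObjectId = ((CommentObject) r).objectId;
            } else if (r instanceof TextObject) {
                // The text belongs to the comment object that precedes it
                if (pendingObjectId != null) {
                    textByObjectId.put(pendingObjectId, ((TextObject) r).text);
                    pendingObjectId = null;
                }
            } else if (r instanceof Note) {
                // Match shape ID to object ID; the note supplies row and column
                Note n = (Note) r;
                String text = textByObjectId.get(n.shapeId);
                if (text != null) {
                    commentByCell.put(n.row + "," + n.column, text);
                }
            } else if (r instanceof Cell) {
                Cell c = (Cell) r;
                String comment = commentByCell.get(c.row + "," + c.column);
                out.add(comment == null ? c.value : c.value + " [comment: " + comment + "]");
            }
        }
        return out;
    }
}
```

If the records turned up in a different order (for example a NoteRecord after its cell), this
single pass would miss the comment, so a real implementation might need to buffer cells or
make a second pass.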

> The ExcelParsing should scan the cell comments
> ----------------------------------------------
>
>                 Key: TIKA-148
>                 URL: https://issues.apache.org/jira/browse/TIKA-148
>             Project: Tika
>          Issue Type: Improvement
>          Components: metadata, parser
>         Environment: All
>            Reporter: Karl Heinz Marbaise
>            Priority: Minor
>         Attachments: comment.patch
>
>
> During the scanning of Excel documents it might be helpful to scan or analyze the cell
> comments of an Excel worksheet as well.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

