tika-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-2630) Wrong height and width metadata for JPEG images
Date Tue, 30 Oct 2018 15:27:00 GMT

    [ https://issues.apache.org/jira/browse/TIKA-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668872#comment-16668872
] 

ASF GitHub Bot commented on TIKA-2630:
--------------------------------------

tballison commented on issue #255: TIKA-2630: Wrong height and width metadata for JPEG images
URL: https://github.com/apache/tika/pull/255#issuecomment-434346662
 
 
   Sorry, @dameikle, I should have addressed the change in behavior...the actual reason you
asked for a second pair of eyes. 😆 I agree that changes in behavior are bad.  IIUC, though,
we'd be fixing what we're currently doing, which is over-writing info, right?
   
   If we did something like this in branch_1x:
   `    if (directory instanceof ExifDirectoryBase) {
            metadata.set(directory.getName() + ":" + tag.getTagName(), tag.getDescription());
        } else {
            metadata.set(EXIF_ROOT + ":" + tag.getTagName(), tag.getDescription());
        }
        // legacy un-prefixed key, set in both cases exactly as today
        metadata.set(tag.getTagName(), tag.getDescription());`
   
   That would maintain the current (wrong) over-writing behavior for the existing tag names,
and introduce new prefixed tag names alongside them.  I have no idea what an appropriate prefix
would be for EXIF_ROOT...something static and documented and appropriate.
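
   A minimal, self-contained sketch of the effect described above, using plain Java collections rather than the real Tika or metadata-extractor classes. `PrefixedKeysSketch`, `Tag`, and `collect` are hypothetical names for illustration only, and the sketch assumes the JPEG directory is processed before Exif IFD0:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; these are not Tika or metadata-extractor types.
class PrefixedKeysSketch {

    // Stand-in for one extracted tag: owning directory, tag name, description.
    static final class Tag {
        final String directory;
        final String name;
        final String description;

        Tag(String directory, String name, String description) {
            this.directory = directory;
            this.name = name;
            this.description = description;
        }
    }

    // Stores every tag twice: under "<directory>:<name>" (new, collision-free)
    // and under the bare name (legacy, overwritten by later directories).
    static Map<String, String> collect(Iterable<Tag> tags) {
        Map<String, String> metadata = new LinkedHashMap<>();
        for (Tag tag : tags) {
            metadata.put(tag.directory + ":" + tag.name, tag.description);
            metadata.put(tag.name, tag.description);
        }
        return metadata;
    }
}
```

   Fed the "Image Width" values from the issue (JPEG = 1535 pixels, then Exif IFD0 = 2173 pixels), the bare "Image Width" key ends up holding the Exif IFD0 value, while "JPEG:Image Width" and "Exif IFD0:Image Width" keep both values side by side.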

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> Wrong height and width metadata for JPEG images
> -----------------------------------------------
>
>                 Key: TIKA-2630
>                 URL: https://issues.apache.org/jira/browse/TIKA-2630
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.17
>            Reporter: Ancuta Morarasu
>            Assignee: Dave Meikle
>            Priority: Major
>         Attachments: Tika-metadata.txt, metadata-exctractor-metadata.txt, sizesampleissue.jpg
>
>
> According to [Exif specs|http://www.exif.org/Exif2-2.PDF#page=73&zoom=auto,-176,103],
for compressed images the values for width and height should come from the tags:
> * *PixelXDimension* mapped in metadata-extractor to {{com.drew.metadata.exif.ExifDirectoryBase.TAG_EXIF_IMAGE_WIDTH}}
and
> * *PixelYDimension* mapped to {{ExifDirectoryBase.TAG_EXIF_IMAGE_HEIGHT}}.
> {{ImageMetadataExtractor$ExifHandler.[handlePhotoTags(...)|https://github.com/apache/tika/blob/master/tika-parsers/src/main/java/org/apache/tika/parser/image/ImageMetadataExtractor.java#L487]}}
should extract and set these in the metadata:
> {code:java}
> if (directory.containsTag(ExifSubIFDDirectory.TAG_EXIF_IMAGE_WIDTH)) {
>     metadata.set(Metadata.IMAGE_WIDTH,
>                  trimPixels(directory.getDescription(ExifSubIFDDirectory.TAG_EXIF_IMAGE_WIDTH)));
> }
> if (directory.containsTag(ExifSubIFDDirectory.TAG_EXIF_IMAGE_HEIGHT)) {
>     metadata.set(Metadata.IMAGE_LENGTH,
>                  trimPixels(directory.getDescription(ExifSubIFDDirectory.TAG_EXIF_IMAGE_HEIGHT)));
> }
> {code}
> Also the {{CopyUnknownFieldsHandler}} overrides the values for "Image Width" ({{JpegDirectory.TAG_IMAGE_WIDTH}})
and "Image Height" ({{JpegDirectory.TAG_IMAGE_HEIGHT}}) with the values from {{ExifIFD0Descriptor.TAG_IMAGE_WIDTH}}
and {{ExifIFD0Descriptor.TAG_IMAGE_HEIGHT}} because they have the same tag name.
> I attached a sample image; these are the metadata values:
> * extracted by metadata-extractor:
> [JPEG] Image Height = 367 pixels
> [JPEG] Image Width = 1535 pixels
> [Exif IFD0] Image Width = 2173 pixels
> [Exif IFD0] Image Height = 520 pixels
> [Exif SubIFD] Exif Image Width = 1535 pixels
> [Exif SubIFD] Exif Image Height = 367 pixels
> * Tika metadata:
> Image Height: 520 pixels
> Image Width: 2173 pixels
> tiff:ImageLength: 520
> tiff:ImageWidth: 2173
> Exif Image Height: 367 pixels
> Exif Image Width: 1535 pixels
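
The selection rule the issue asks for can be sketched in isolation with plain Java, using the sample values listed above. This is only an illustration under stated assumptions: `DimensionPickSketch`, `firstPresent`, and the "[directory] tag" map keys are hypothetical, `trimPixels` here is a stand-in and not Tika's helper of the same name, and the fallback to the JPEG directory when the Exif SubIFD tag is absent is an assumption rather than something the issue specifies.

```java
import java.util.Map;

// Hypothetical illustration only; not the Tika implementation.
class DimensionPickSketch {

    // Stand-in helper: strips a trailing " pixels" unit from a description.
    static String trimPixels(String description) {
        String suffix = " pixels";
        return description.endsWith(suffix)
                ? description.substring(0, description.length() - suffix.length())
                : description;
    }

    // Returns the trimmed value of the first key that is present, or null.
    static String firstPresent(Map<String, String> tags, String... keys) {
        for (String key : keys) {
            String value = tags.get(key);
            if (value != null) {
                return trimPixels(value);
            }
        }
        return null;
    }

    // Prefer PixelXDimension (Exif SubIFD); the JPEG fallback is an assumption.
    static String width(Map<String, String> tags) {
        return firstPresent(tags, "[Exif SubIFD] Exif Image Width", "[JPEG] Image Width");
    }

    // Prefer PixelYDimension (Exif SubIFD); the JPEG fallback is an assumption.
    static String height(Map<String, String> tags) {
        return firstPresent(tags, "[Exif SubIFD] Exif Image Height", "[JPEG] Image Height");
    }
}
```

With the six metadata-extractor values from the sample image, this picks 1535 for the width and 367 for the height, and the Exif IFD0 values (2173 and 520) are never consulted.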



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
