lucene-solr-user mailing list archives

From Tim Allison <talli...@apache.org>
Subject Re: problem indexing GPS metadata for video upload
Date Thu, 02 May 2019 16:01:32 GMT
I just pushed a fix for TIKA-2861.  If you can either build locally or
wait a few hours for Jenkins to build #182, let me know if that works
with straight tika-app.jar.
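
For anyone running that check outside Solr, a minimal Python sketch of the straight tika-app.jar test might look like the following. It assumes `java` is on the PATH and that a tika-app jar has been built or downloaded locally. The helper names `tika_metadata` and `gps_entries` and the paths in the usage comment are illustrative, and matching on "gps" or a "geo:" prefix is only a heuristic for spotting location keys.

```python
import json
import subprocess


def tika_metadata(jar_path, file_path):
    """Run tika-app in JSON metadata mode and return the metadata as a dict.

    Assumes `java` is on the PATH and jar_path points at a tika-app jar.
    """
    result = subprocess.run(
        ["java", "-jar", jar_path, "--json", file_path],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


def gps_entries(metadata):
    """Keep only entries whose key looks location-related (heuristic match)."""
    return {
        key: value
        for key, value in metadata.items()
        if "gps" in key.lower() or key.lower().startswith("geo:")
    }


# Hypothetical usage; both paths are placeholders:
#   meta = tika_metadata("tika-app.jar", "example/exampledocs/1.mp4")
#   print(gps_entries(meta))
```

If `gps_entries` comes back empty for the MP4 but not for a JPEG from the same device, the gap is on the Tika side rather than in the Solr configuration.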

On Thu, May 2, 2019 at 5:00 AM Where is Where <whisere@gmail.com> wrote:
>
> Thank you Alex and Tim.
> I have looked at the solrconfig.xml file (I am trying the techproducts demo
> config); the only related place I can find is the extract handler:
>
> <requestHandler name="/update/extract"
>                   startup="lazy"
>                   class="solr.extraction.ExtractingRequestHandler" >
>     <lst name="defaults">
>       <str name="lowernames">true</str>
>       <!--<str name="uprefix">ignored_</str>-->
>
>       <!-- capture link hrefs but ignore div attributes -->
>       <str name="captureAttr">true</str>
>       <str name="fmap.a">links</str>
>       <str name="fmap.div">ignored_</str>
>     </lst>
>   </requestHandler>
>
> I am using this command:
>
> bin/post -c techproducts example/exampledocs/1.mp4 -params "literal.id=mp4_1&uprefix=attr_"
>
> I have tried commenting out <str name="uprefix">ignored_</str> and changing
> fmap.div to <str name="fmap.div">div</str>, but it is still not working. I
> don't understand why images get GPS and other metadata while video behaves
> differently: both use the same solrconfig, and the GPS metadata are in the
> same fields. The solrconfig settings do not differentiate between image and
> video.
>
> Tim, yes, this is related to the TIKA link. Thank you!
>
> Here is the output in Solr for the MP4:
>
> {
>         "attr_meta":["stream_size",
>           "5721559",
>           "date",
>           "2019-03-29T04:36:39Z",
>           "X-Parsed-By",
>           "org.apache.tika.parser.DefaultParser",
>           "X-Parsed-By",
>           "org.apache.tika.parser.mp4.MP4Parser",
>           "stream_content_type",
>           "application/octet-stream",
>           "meta:creation-date",
>           "2019-03-29T04:36:39Z",
>           "Creation-Date",
>           "2019-03-29T04:36:39Z",
>           "tiff:ImageLength",
>           "1080",
>           "resourceName",
>           "/Volumes/Data/inData/App/solr/example/exampledocs/1.mp4",
>           "dcterms:created",
>           "2019-03-29T04:36:39Z",
>           "dcterms:modified",
>           "2019-03-29T04:36:39Z",
>           "Last-Modified",
>           "2019-03-29T04:36:39Z",
>           "Last-Save-Date",
>           "2019-03-29T04:36:39Z",
>           "xmpDM:audioSampleRate",
>           "1000",
>           "meta:save-date",
>           "2019-03-29T04:36:39Z",
>           "modified",
>           "2019-03-29T04:36:39Z",
>           "tiff:ImageWidth",
>           "1920",
>           "xmpDM:duration",
>           "2.64",
>           "Content-Type",
>           "video/mp4"],
>         "id":"mp4_4",
>         "attr_stream_size":["5721559"],
>         "attr_date":["2019-03-29T04:36:39Z"],
>         "attr_x_parsed_by":["org.apache.tika.parser.DefaultParser",
>           "org.apache.tika.parser.mp4.MP4Parser"],
>         "attr_stream_content_type":["application/octet-stream"],
>         "attr_meta_creation_date":["2019-03-29T04:36:39Z"],
>         "attr_creation_date":["2019-03-29T04:36:39Z"],
>         "attr_tiff_imagelength":["1080"],
>         "resourcename":"/Volumes/Data/inData/App/solr/example/exampledocs/1.mp4",
>         "attr_dcterms_created":["2019-03-29T04:36:39Z"],
>         "attr_dcterms_modified":["2019-03-29T04:36:39Z"],
>         "last_modified":"2019-03-29T04:36:39Z",
>         "attr_last_save_date":["2019-03-29T04:36:39Z"],
>         "attr_xmpdm_audiosamplerate":["1000"],
>         "attr_meta_save_date":["2019-03-29T04:36:39Z"],
>         "attr_modified":["2019-03-29T04:36:39Z"],
>         "attr_tiff_imagewidth":["1920"],
>         "attr_xmpdm_duration":["2.64"],
>         "content_type":["video/mp4"],
>         "content":[" \n \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n
>  \n  \n  \n  \n  \n  \n  \n  \n  \n \n   "],
>         "_version_":1632383499325407232}]
>   }}
>
> JPEG is getting these:
> "attr_meta":[....
> "GPS Latitude",
>           "37° 47' 41.99\"",
> ....
> "attr_gps_latitude":["37° 47' 41.99\""],
>
>
> On Wed, May 1, 2019 at 2:57 PM Where is Where <whisere@gmail.com> wrote:
>
> > I am uploading video to Solr via Tika:
> > https://lucene.apache.org/solr/guide/7_7/uploading-data-with-solr-cell-using-apache-tika.html
> > The index has no GPS metadata for video, although it is extracted and
> > indexed for images such as JPEG. I have checked both MP4 and MOV files;
> > all of them have GPS Exif data embedded in the same fields as the images.
> > Any ideas? Thanks!
> >
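
The `attr_meta` field in the output quoted above appears to be a flat list that alternates metadata names and values. A minimal Python sketch for pairing them up and checking for GPS entries could look like this. The function names `pair_meta` and `gps_pairs` are illustrative, the sample lists are short excerpts of the values quoted in the thread, and the "gps" substring match is a heuristic.

```python
def pair_meta(flat):
    """Turn a flat [name, value, name, value, ...] list into (name, value) tuples.

    A list of tuples is used instead of a dict because names can repeat
    (for example "X-Parsed-By" occurs twice in the MP4 output).
    """
    return list(zip(flat[0::2], flat[1::2]))


def gps_pairs(flat):
    """Return the (name, value) pairs whose name mentions GPS."""
    return [(name, value) for name, value in pair_meta(flat) if "gps" in name.lower()]


# Excerpts from the attr_meta values quoted in the thread.
mp4_meta = ["stream_size", "5721559", "Content-Type", "video/mp4"]
jpeg_meta = ["GPS Latitude", "37° 47' 41.99\""]

print(gps_pairs(mp4_meta))
print(gps_pairs(jpeg_meta))
```

Running the same check on the full `attr_meta` list of each indexed document shows quickly which files reached Solr without any GPS keys.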
