lucene-solr-user mailing list archives

From Tim Allison <talli...@apache.org>
Subject Re: problem indexing GPS metadata for video upload
Date Thu, 02 May 2019 16:02:11 GMT
Sorry, that is build #182: https://builds.apache.org/job/tika-branch-1x/

On Thu, May 2, 2019 at 12:01 PM Tim Allison <tallison@apache.org> wrote:
>
> I just pushed a fix for TIKA-2861.  If you can either build locally or
> wait a few hours for Jenkins to build #182, let me know if that works
> with straight tika-app.jar.
>
> On Thu, May 2, 2019 at 5:00 AM Where is Where <whisere@gmail.com> wrote:
> >
> > Thank you Alex and Tim.
> > I have looked at the solrconfig.xml file (I am trying the techproducts demo
> > config); the only related place I can find is the extract handler:
> >
> > <requestHandler name="/update/extract"
> >                   startup="lazy"
> >                   class="solr.extraction.ExtractingRequestHandler" >
> >     <lst name="defaults">
> >       <str name="lowernames">true</str>
> >       <!--<str name="uprefix">ignored_</str>-->
> >
> >       <!-- capture link hrefs but ignore div attributes -->
> >       <str name="captureAttr">true</str>
> >       <str name="fmap.a">links</str>
> >       <str name="fmap.div">ignored_</str>
> >     </lst>
> >   </requestHandler>
> >
> > I am using this command:
> > bin/post -c techproducts example/exampledocs/1.mp4 -params "literal.id=mp4_1&uprefix=attr_"
> >
> > I have tried commenting out <str name="uprefix">ignored_</str> and changing
> > fmap.div to <str name="fmap.div">div</str>,
> > but it is still not working. I don't quite get why images get GPS and other
> > metadata while video behaves differently, even though both use the same
> > solrconfig and the GPS metadata is in the same fields. The solrconfig
> > settings do not differentiate between image and video.
> >
> > Tim, yes, this is related to the TIKA link. Thank you!
> >
> > Here is the output in Solr for the MP4:
> >
> > {
> >         "attr_meta":["stream_size",
> >           "5721559",
> >           "date",
> >           "2019-03-29T04:36:39Z",
> >           "X-Parsed-By",
> >           "org.apache.tika.parser.DefaultParser",
> >           "X-Parsed-By",
> >           "org.apache.tika.parser.mp4.MP4Parser",
> >           "stream_content_type",
> >           "application/octet-stream",
> >           "meta:creation-date",
> >           "2019-03-29T04:36:39Z",
> >           "Creation-Date",
> >           "2019-03-29T04:36:39Z",
> >           "tiff:ImageLength",
> >           "1080",
> >           "resourceName",
> >           "/Volumes/Data/inData/App/solr/example/exampledocs/1.mp4",
> >           "dcterms:created",
> >           "2019-03-29T04:36:39Z",
> >           "dcterms:modified",
> >           "2019-03-29T04:36:39Z",
> >           "Last-Modified",
> >           "2019-03-29T04:36:39Z",
> >           "Last-Save-Date",
> >           "2019-03-29T04:36:39Z",
> >           "xmpDM:audioSampleRate",
> >           "1000",
> >           "meta:save-date",
> >           "2019-03-29T04:36:39Z",
> >           "modified",
> >           "2019-03-29T04:36:39Z",
> >           "tiff:ImageWidth",
> >           "1920",
> >           "xmpDM:duration",
> >           "2.64",
> >           "Content-Type",
> >           "video/mp4"],
> >         "id":"mp4_4",
> >         "attr_stream_size":["5721559"],
> >         "attr_date":["2019-03-29T04:36:39Z"],
> >         "attr_x_parsed_by":["org.apache.tika.parser.DefaultParser",
> >           "org.apache.tika.parser.mp4.MP4Parser"],
> >         "attr_stream_content_type":["application/octet-stream"],
> >         "attr_meta_creation_date":["2019-03-29T04:36:39Z"],
> >         "attr_creation_date":["2019-03-29T04:36:39Z"],
> >         "attr_tiff_imagelength":["1080"],
> >         "resourcename":"/Volumes/Data/inData/App/solr/example/exampledocs/1.mp4",
> >         "attr_dcterms_created":["2019-03-29T04:36:39Z"],
> >         "attr_dcterms_modified":["2019-03-29T04:36:39Z"],
> >         "last_modified":"2019-03-29T04:36:39Z",
> >         "attr_last_save_date":["2019-03-29T04:36:39Z"],
> >         "attr_xmpdm_audiosamplerate":["1000"],
> >         "attr_meta_save_date":["2019-03-29T04:36:39Z"],
> >         "attr_modified":["2019-03-29T04:36:39Z"],
> >         "attr_tiff_imagewidth":["1920"],
> >         "attr_xmpdm_duration":["2.64"],
> >         "content_type":["video/mp4"],
> >         "content":[" \n \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n
> >  \n  \n  \n  \n  \n  \n  \n  \n  \n \n   "],
> >         "_version_":1632383499325407232}]
> >   }}
> >
> > JPEG is getting these:
> > "attr_meta":[....
> > "GPS Latitude",
> >           "37° 47' 41.99\"",
> > ....
> > "attr_gps_latitude":["37° 47' 41.99\""],
> >
> >
> > On Wed, May 1, 2019 at 2:57 PM Where is Where <whisere@gmail.com> wrote:
> >
> > > I am uploading video to Solr via Tika:
> > > https://lucene.apache.org/solr/guide/7_7/uploading-data-with-solr-cell-using-apache-tika.html
> > > The index has no GPS metadata for video, although it is extracted and
> > > indexed for images such as JPEG. I have checked both MP4 and MOV files, and
> > > all of them have GPS Exif data embedded in the same fields as the images.
> > > Any ideas? Thanks!
> > >
