tika-dev mailing list archives

From Tyler Palsulich <tpalsul...@gmail.com>
Subject Re: 1.7 release?
Date Thu, 18 Dec 2014 20:54:55 GMT
Hi All,

It's been a few months, so I just want to follow up on this thread. We've
resolved/closed 51 issues for v1.7 [0]. There are two on JIRA marked as 1.7
(TIKA-1465 and TIKA-894). Do we still want to aim for 1.7 with TIKA-1445?
Has anyone tried their hand at the suggested (significant) fix?

Are there any other issues someone would like to fit in?

Cheers,
Tyler

[0] -
https://issues.apache.org/jira/browse/TIKA/fixforversion/12327096/?selectedTab=com.atlassian.jira.jira-projects-plugin:version-issues-panel

On Tue, Oct 28, 2014 at 1:46 AM, Mattmann, Chris A (3980) <
chris.a.mattmann@jpl.nasa.gov> wrote:
>
> Thanks Tim saw your patch and am looking now.
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Chief Architect
> Instrument Software and Science Data Systems Section (398)
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 168-519, Mailstop: 168-527
> Email: chris.a.mattmann@nasa.gov
> WWW:  http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Associate Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>
>
>
>
>
> -----Original Message-----
> From: <Allison>, "Timothy B." <tallison@mitre.org>
> Reply-To: "dev@tika.apache.org" <dev@tika.apache.org>
> Date: Monday, October 27, 2014 at 12:30 PM
> To: "dev@tika.apache.org" <dev@tika.apache.org>
> Subject: RE: 1.7 release?
>
> >Sounds good.  As long as the default behavior remains the same, I'm
> >happy.  I'm going to play with a combination of your patch and Tyler's
> >and see what the ramifications are for embedded docs.
> >
> >To confirm, the OCR integration is fantastic.  Thank you and Tyler!
> >
> >
> >Best,
> >
> >           Tim
> >
> >-----Original Message-----
> >From: Mattmann, Chris A (3980) [mailto:chris.a.mattmann@jpl.nasa.gov]
> >Sent: Friday, October 24, 2014 5:36 PM
> >To: dev@tika.apache.org
> >Subject: Re: 1.7 release?
> >
> >Hey Tim,
> >
> >What do you think about my existing patch for 1445? For example to
> >just call all the parsers? I thought I was seeing behavior that was
> >slow because of that, but it turned out to be Tesseract and my machine
> >at the time?
> >
> >I think my patch for 1445 may be enough, and we should get the metadata
> >I think? Thoughts?
> >
> >I honestly think we need to deliver Tesseract in 1.7. We're close. I'll
> >even take it upon myself to try and experiment with the idea of multiple
> >parsers being called. I think a simple solution to the metadata key
> >conflict issue is simply to have a policy to add values (by default) and
> >replace if a property is set in ParseContext. Some simple updates to
> >CompositeParser would allow this.
> >
> >Thoughts?
> >
> >Cheers,
> >Chris
> >
> >
> >-----Original Message-----
> >From: <Allison>, "Timothy B." <tallison@mitre.org>
> >Reply-To: "dev@tika.apache.org" <dev@tika.apache.org>
> >Date: Friday, October 24, 2014 at 2:24 PM
> >To: "dev@tika.apache.org" <dev@tika.apache.org>
> >Subject: RE: 1.7 release?
> >
> >>Sorry for coming late to the game on the implications of TIKA-1445.  I
> >>don't want to hold up the release of 1.7.
> >>
> >>However, would it be possible to return to the legacy default behavior of
> >>extracting metadata from images?
> >>
> >>We can then document on the OCR parser page on the wiki that you need to
> >>install Tesseract _and_ make a change in the parser/mime config file. If
> >>you want this new capability, it will take a small bit of work until we
> >>solve TIKA-1445.
> >>
> >>I worry that the current behavior of 1.7 would be surprising to most
> >>non-dev users (well, even to at least one dev :) ).
> >>
> >>Cheers,
> >>
> >>          Tim
> >>
> >>________________________________________
> >>From: Oleg Tikhonov [olegtikhonov@gmail.com]
> >>Sent: Friday, October 24, 2014 2:24 PM
> >>To: dev@tika.apache.org
> >>Subject: Re: 1.7 release?
> >>
> >>Hi Tyler,
> >>don't mention it.
> >>
> >>Cheers,
> >>Oleg
> >>On Oct 24, 2014 8:02 PM, "Tyler Palsulich" <tpalsulich@gmail.com> wrote:
> >>
> >>> Thank you for the help, Oleg! I just resolved TIKA-1422. So, are there any
> >>> other issues anyone would like to resolve before a new release?
> >>>
> >>> Thanks,
> >>> Tyler
> >>>
> >>> On Tue, Oct 21, 2014 at 2:42 AM, Oleg Tikhonov <olegtikhonov@gmail.com>
> >>> wrote:
> >>>
> >>> > Sorry!!!
> >>> >
> >>> > On Tue, Oct 21, 2014 at 9:37 AM, Mattmann, Chris A (3980) <
> >>> > chris.a.mattmann@jpl.nasa.gov> wrote:
> >>> >
> >>> > > Thanks Oleg, will try tomorrow for me, Los Angeles time!
> >>> > >
> >>> > > -----Original Message-----
> >>> > > From: Oleg Tikhonov <oleg@apache.org>
> >>> > > Reply-To: "dev@tika.apache.org" <dev@tika.apache.org>
> >>> > > Date: Monday, October 20, 2014 at 11:20 PM
> >>> > > To: "dev@tika.apache.org" <dev@tika.apache.org>
> >>> > > Subject: Re: 1.7 release?
> >>> > >
> >>> > > >Please take a try with newest patch.
> >>> > > >Cheers,
> >>> > > >Oleg
> >>> > > >
> >>> > > >On Tue, Oct 21, 2014 at 9:08 AM, Oleg Tikhonov <
> >>> > > >olegtikhonov@gmail.com> wrote:
> >>> > > >
> >>> > > >> Taken. Thanks. in progress ...
> >>> > > >>
> >>> > > >> On Tue, Oct 21, 2014 at 8:54 AM, Mattmann, Chris A (3980) <
> >>> > > >> chris.a.mattmann@jpl.nasa.gov> wrote:
> >>> > > >>
> >>> > > >>> Trunk is the current checkout/branch:
> >>> > > >>>
> >>> > > >>> http://svn.apache.org/repos/asf/tika/trunk
> >>> > > >>>
> >>> > > >>>
> >>> > > >>>
> >>> > > >>> -----Original Message-----
> >>> > > >>> From: Oleg Tikhonov <olegtikhonov@gmail.com>
> >>> > > >>> Reply-To: "dev@tika.apache.org" <dev@tika.apache.org>
> >>> > > >>> Date: Monday, October 20, 2014 at 10:16 PM
> >>> > > >>> To: "dev@tika.apache.org" <dev@tika.apache.org>
> >>> > > >>> Subject: Re: 1.7 release?
> >>> > > >>>
> >>> > > >>> >Hi, I can try this on.
> >>> > > >>> >What is a trunk?
> >>> > > >>> >
> >>> > > >>> >
> >>> > > >>> >Thanks,
> >>> > > >>> >Oleg
> >>> > > >>> >
> >>> > > >>> >On Tue, Oct 21, 2014 at 6:21 AM, Mattmann, Chris A (3980) <
> >>> > > >>> >chris.a.mattmann@jpl.nasa.gov> wrote:
> >>> > > >>> >
> >>> > > >>> >> Hmm any idea why this is failing on Windows? Tyler P. and
> >>> > > >>> >> I were talking the other day - maybe we shouldn't run the
> >>> > > >>> >> tests from TIKA-1422 unless Tesseract is installed? Thoughts?
> >>> > > >>> >>
> >>> > > >>> >>
> >>> > > >>> >> -----Original Message-----
> >>> > > >>> >> From: Hong-Thai Nguyen <thaichat04@gmail.com>
> >>> > > >>> >> Reply-To: "dev@tika.apache.org" <dev@tika.apache.org>
> >>> > > >>> >> Date: Thursday, October 16, 2014 at 2:03 AM
> >>> > > >>> >> To: "dev@tika.apache.org" <dev@tika.apache.org>
> >>> > > >>> >> Subject: Re: 1.7 release?
> >>> > > >>> >>
> >>> > > >>> >> >Hi Andrzej,
> >>> > > >>> >> >
> >>> > > >>> >> >We are impatient for 1.7 release too.
> >>> > > >>> >> >I'm having compiling problem of TIKA-1422 on me. If anyone can build
> >>> > > >>> >> >successfully on Windows, I have no objection to release 1.7
> >>> > > >>> >> >
> >>> > > >>> >> >Thanks,
> >>> > > >>> >> >
> >>> > > >>> >> >On Thu, Oct 16, 2014 at 10:51 AM, Andrzej Białecki <
> >>> > > >>> >> >ab@getopt.org> wrote:
> >>> > > >>> >> >
> >>> > > >>> >> >> Hi,
> >>> > > >>> >> >>
> >>> > > >>> >> >> Any news on the 1.7 release? or at least a 1.6.1 release that includes the
> >>> > > >>> >> >> fix for broken ODF parsing...
> >>> > > >>> >> >>
> >>> > > >>> >> >> ---
> >>> > > >>> >> >> Best regards,
> >>> > > >>> >> >>
> >>> > > >>> >> >> Andrzej Bialecki
> >>> > > >>> >> >>
> >>> > > >>> >> >>
> >>> > > >>> >> >
> >>> > > >>> >> >
> >>> > > >>> >> >--
> >>> > > >>> >> >--------------
> >>> > > >>> >> >Hong-Thai
> >>> > > >>> >>
> >>> > > >>> >>
> >>> > > >>>
> >>> > > >>>
> >>> > > >>
> >>> > >
> >>> > >
> >>> >
> >>>
> >
>
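
Note on the metadata key conflict discussed in the thread: Chris's suggestion (add values by default, replace when a property is set in ParseContext) can be illustrated with a minimal sketch. This is not Tika code. The class MetadataMergeSketch, the Policy enum, and the merge method are hypothetical names, and a plain Map of String to List of String stands in for Tika's metadata object. How the policy would actually be read from ParseContext or wired into CompositeParser is not shown in the thread and is not assumed here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for parser metadata: each key maps to one or more values.
// This does NOT use Tika's Metadata, ParseContext, or CompositeParser classes.
final class MetadataMergeSketch {

    /** How to treat a key that an earlier parser has already populated. */
    enum Policy { ADD, REPLACE }

    /**
     * Merges the values produced by one parser into the accumulated metadata.
     * ADD appends to any existing values; REPLACE discards them first.
     */
    static void merge(Map<String, List<String>> target,
                      Map<String, List<String>> fromParser,
                      Policy policy) {
        for (Map.Entry<String, List<String>> entry : fromParser.entrySet()) {
            List<String> values =
                    target.computeIfAbsent(entry.getKey(), k -> new ArrayList<>());
            if (policy == Policy.REPLACE) {
                values.clear();
            }
            values.addAll(entry.getValue());
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> merged = new LinkedHashMap<>();
        // Two parsers report the same key; the default ADD policy keeps both values.
        merge(merged,
              Collections.singletonMap("Content-Type", Collections.singletonList("image/png")),
              Policy.ADD);
        merge(merged,
              Collections.singletonMap("Content-Type", Collections.singletonList("text/plain")),
              Policy.ADD);
        System.out.println(merged); // {Content-Type=[image/png, text/plain]}
    }
}
```

With ADD as the default, values from an image metadata parser and an OCR parser would both survive under the same key. A caller who wants last-writer-wins behavior would pass REPLACE instead.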
>
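
Note on skipping the TIKA-1422 tests when Tesseract is not installed: one possible way to detect an external tool is to try launching it and treat a launch failure as "not available". The sketch below is an assumption about how such a check might look, not the approach taken in the Tika code base. ExternalToolCheck and isAvailable are hypothetical names, and the check assumes the tool can be found on the PATH.

```java
import java.io.IOException;

// Hypothetical helper, not part of Tika: reports whether an external
// command can be launched from the current environment.
final class ExternalToolCheck {

    /**
     * Returns true if the operating system could start the given command.
     * ProcessBuilder.start() throws IOException when the executable cannot be found.
     */
    static boolean isAvailable(String... command) {
        if (command.length == 0) {
            return false; // nothing to launch
        }
        try {
            Process process = new ProcessBuilder(command)
                    .redirectErrorStream(true)
                    .start();
            // Only the launch matters here, so stop the process right away.
            process.destroy();
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumes the binary is named "tesseract" and is on the PATH.
        System.out.println("tesseract available: " + isAvailable("tesseract"));
    }
}
```

In a JUnit 4 test, the result could be passed to org.junit.Assume.assumeTrue so that OCR tests are reported as skipped rather than failed on machines without Tesseract.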
