tika-dev mailing list archives

From "Allison, Timothy B." <talli...@mitre.org>
Subject RE: [DISCUSS] Release Tika 1.11?
Date Tue, 22 Sep 2015 12:40:51 GMT
Thank _you_ for all of your work in modernizing us.  With your efforts, we'll be able to deprecate
TikaInputStream#get(PunchCard pc) soon. :)

>>Regarding FilenameUtils.getName() - I believe that its functionality can be replaced
by Path.getFileName() - and in a platform-aware manner, as each JVM distribution comes with
a specific provider implementation for the OS it's for.

I agree that we should use that anytime we're interacting with the file system.  

However, that's actually the problem for paths that are stored within the document (say, for an
embedded resource).  If a user creates a file on Windows, the path information
for the embedded file (depending on the parser and the file format) may be in Windows-ese,
which is a problem if you try to use Path.getFileName() on a Linux machine (I think... I haven't
actually tested this).  I have actually tested this with the old File getName(), and it
did not work cross-platform IIRC.

In short, Tika needs to have the ability to extract the file name from a path that was created
on any platform (including old Mac and its ":" separator) while Tika is running on any platform.
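A minimal sketch of that requirement, assuming it is enough to cut at the last "/", "\" or ":"
in the stored path string. The class and method names are hypothetical and this is not Tika's
actual FilenameUtils.getName() code; it only illustrates parsing the path as a plain string
instead of handing it to the file system provider of the JVM that Tika happens to run on.

```java
// Hypothetical helper, for illustration only; not part of Tika.
final class EmbeddedNameSketch {

    private EmbeddedNameSketch() {
    }

    /**
     * Returns the text after the last '/', '\' or ':' in the given path,
     * regardless of the platform this code runs on. Assumption: all three
     * characters are treated as separators on every input.
     */
    static String lastSegment(String path) {
        if (path == null) {
            return "";
        }
        // Position of the last Unix or Windows separator, or -1 if none.
        int cut = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        // Old Mac OS separator; this also skips a Windows drive prefix such as "C:".
        cut = Math.max(cut, path.lastIndexOf(':'));
        return path.substring(cut + 1);
    }

    public static void main(String[] args) {
        System.out.println(lastSegment("C:\\Users\\tim\\report.doc"));
        System.out.println(lastSegment("/home/tim/report.doc"));
        System.out.println(lastSegment("Macintosh HD:Documents:report.doc"));
    }
}
```

The trade-off is that "\" and ":" are legal characters in a Unix file name, so a name that
really contains one of them would be truncated by this approach.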

-----Original Message-----
From: Yaniv Kunda [mailto:yaniv.kunda@answers.com] 
Sent: Monday, September 21, 2015 11:31 AM
To: dev@tika.apache.org
Subject: RE: [DISCUSS] Release Tika 1.11?

Thanks for the positive spirit!

Regarding FilenameUtils.getName() - I believe that its functionality can be replaced by Path.getFileName()
- and in a platform-aware manner, as each JVM distribution comes with a specific provider
implementation for the OS it's for.

-----Original Message-----
From: Allison, Timothy B. [mailto:tallison@mitre.org]
Sent: Monday, September 21, 2015 14:27
To: dev@tika.apache.org
Subject: RE: [DISCUSS] Release Tika 1.11?

+1, it would be great to move a bit more into EOL'd Java 7 asap.

I'll take TIKA-1734 by tomorrow EDT.

As for the other 2, I'm personally ok waiting for 1.12, but I defer to the dev community.

Chris, Nick, Ray, Ken, Konstantin, if you have a chance to chime in on TIKA-1726, that might
help move things forward.

On TIKA-1706, I share Nick's and Jukka's caution, and I also share Yaniv's point about duplication
of code, bloat within Tika and missing out on updates.  Aside from one small bit of code I'd like
to keep or perhaps try to move into commons-io (?)[1], I think I'm now +1 to going forward with
TIKA-1706 in core...unless there is a -1 from the community.

Best,

             Tim


[1] I added some customizations for old Mac OS behavior (treat ":" as file
separator) in FilenameUtils.getName() that I don't want to lose.


-----Original Message-----
From: Yaniv Kunda [mailto:yaniv.kunda@answers.com]
Sent: Sunday, September 20, 2015 7:15 AM
To: dev@tika.apache.org
Subject: RE: [DISCUSS] Release Tika 1.11?

I would really like to push the following:

https://issues.apache.org/jira/browse/TIKA-1706 - Bring back commons-io to tika-core
This requires a decision to re-include commons-io as a dependency of tika-core.
All the pros and cons have already been debated, but no decision has been made.

https://issues.apache.org/jira/browse/TIKA-1726 - Augment public methods that use a java.io.File
with methods that use a java.nio.file.Path
Since this adds new methods to the public API, I asked the group to make a decision about the
new names - but have not received anything definite.
However, I did create a subtask -
https://issues.apache.org/jira/browse/TIKA-1734 Use java.nio.file.Path in TemporaryResources
- using [~tallison]'s suggestion, which has not been committed yet.
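
For illustration, one common way to add a Path variant next to an existing File method is to
let the File overload delegate through File.toPath(). The sketch below assumes that pattern;
the class and method names are hypothetical and are not the names proposed in TIKA-1726.

```java
import java.io.File;
import java.nio.file.Path;

// Hypothetical example of a File method augmented with a Path overload; not Tika code.
final class PathOverloadSketch {

    private PathOverloadSketch() {
    }

    /** New Path-based variant that holds the actual logic. */
    static String resourceName(Path path) {
        // getFileName() returns null for a root path such as "/".
        Path name = path.getFileName();
        return name == null ? "" : name.toString();
    }

    /** Existing File-based variant, kept for compatibility and delegating to the Path one. */
    static String resourceName(File file) {
        return resourceName(file.toPath());
    }
}
```

Keeping the logic in the Path method means the two variants cannot drift apart, and the File
method can be deprecated later without touching callers of the Path one.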

If decisions are made on the above issues, I can quickly create patches for them.

-----Original Message-----
From: Mattmann, Chris A (3980) [mailto:chris.a.mattmann@jpl.nasa.gov]
Sent: Saturday, September 19, 2015 08:10
To: dev@tika.apache.org
Subject: [DISCUSS] Release Tika 1.11?

Hey Guys and Gals,

I’d like to roll a 1.11 release. There is TIKA-1716 which in particular allows some neat
functionality in tika-python:
https://github.com/chrismattmann/tika-python/pull/67


Anything else to try and get into the release?

If not, I’ll produce an RC #1 by end of weekend.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory
Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-- 


This email communication (including any attachments) contains information from Answers Corporation
or its affiliates that is confidential and may be privileged. The information contained herein
is intended only for the use of the addressee(s) named above. If you are not the intended
recipient (or the agent responsible to deliver it to the intended recipient), you are hereby
notified that any dissemination, distribution, use, or copying of this communication is strictly
prohibited. If you have received this email in error, please immediately reply to sender,
delete the message and destroy all copies of it. If you have questions, please email legal@answers.com.

If you wish to unsubscribe to commercial emails from Answers and its affiliates, please go
to the Answers Subscription Center http://campaigns.answers.com/subscriptions to opt out.
 Thank you.
