tika-dev mailing list archives

From Sergey Beryozkin <sberyoz...@gmail.com>
Subject Re: Integrating Tika with Apache Beam
Date Thu, 25 May 2017 16:47:02 GMT
Hi Guys

The link to the initial code is available in JIRA. At this stage the 
focus is on preparing a solid initial PR, and then we can all improve 
the Tika-related code :-)

Cheers, Sergey
On 24/05/17 11:41, Sergey Beryozkin wrote:
> Hi Tim, All,
> 
> I thought I'd start a dedicated thread.
> 
> I added some initial comments to [1], and I'm now quite close to 
> creating the initial PR.
> 
> Thanks, Sergey
> 
> [1] https://issues.apache.org/jira/browse/BEAM-2328
> On 23/05/17 17:42, Allison, Timothy B. wrote:
>> Another idea...if you have any interest, it would be great to get 
>> Apache Beam set up on our Rackspace VM (with Spark?) and use it for 
>> our regression tests?
>>
>> -----Original Message-----
>> From: Sergey Beryozkin [mailto:sberyozkin@gmail.com]
>> Sent: Friday, May 19, 2017 4:21 PM
>> To: user@tika.apache.org
>> Subject: Re: Extracting Text from embedded images in PDF docs
>>
>> Hi Tim
>>
>> Sure, once I get an initial PR ready I'll send an update and I'll 
>> explain what I did for a start and we will discuss it further
>>


-- 
Sergey Beryozkin

Talend Community Coders
http://coders.talend.com/
