tika-dev mailing list archives

From Chris Mattmann <chris.mattm...@jpl.nasa.gov>
Subject Re: svn commit: r594376 - in /incubator/tika/trunk: CHANGES.txt src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java src/main/java/org/apache/tika/parser/pdf/PDFParser.java
Date Sun, 18 Nov 2007 22:01:57 GMT
I've verified this behavior as well while trying to apply and commit the
patch for TIKA-101. I think that the trunk is broken. I'll go ahead and fix
it. 
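
For reference, one Java 5-compatible way to keep the cause is to construct
the IOException from the message alone and then attach the cause with
Throwable.initCause(), which has been available since Java 1.4. The following
is a minimal sketch under that assumption, not necessarily the fix that gets
committed; the class name Java5IOExceptions and the helper wrap() are
hypothetical and not part of Tika.

```java
import java.io.IOException;
import org.xml.sax.SAXException;

// Hypothetical illustration class; not part of the Tika code base.
class Java5IOExceptions {

    // Builds an IOException carrying a cause without relying on the
    // IOException(String, Throwable) constructor added in Java 6.
    static IOException wrap(String message, Throwable cause) {
        IOException ioe = new IOException(message);
        // initCause() may be called at most once per exception instance.
        ioe.initCause(cause);
        return ioe;
    }

    public static void main(String[] args) {
        try {
            // Stand-in for a failing handler.endDocument() call.
            throw new SAXException("simulated SAX failure");
        } catch (SAXException e) {
            IOException ioe = wrap("Unable to end a document", e);
            System.out.println(ioe.getMessage());
            System.out.println(ioe.getCause().getClass().getName());
        }
    }
}
```

In PDF2XHTML.endDocument() the catch block would then throw the wrapped
exception instead of calling the two-argument constructor.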

In the future, we should probably have nightly builds to catch stuff like
this. Also, please try to be more vigilant about making sure that your
environment is set to JDK 5 before committing an update.

Thanks!

Cheers,
  Chris



On 11/18/07 9:54 AM, "Jeremias Maerki" <dev@jeremias-maerki.ch> wrote:

> The constructor IOException(String, Throwable) has only existed since Java 6.
> I don't think that was intended, was it?
> 
> Jeremias Maerki
> 
> 
> 
> On 13.11.2007 02:04:31 jukka wrote:
>> Author: jukka
>> Date: Mon Nov 12 17:04:30 2007
>> New Revision: 594376
>> 
>> URL: http://svn.apache.org/viewvc?rev=594376&view=rev
>> Log:
>> TIKA-100 - Structured PDF parsing
>>     - Customized the PdfTextStripper class to produce XHTML SAX events
>>       (there's a somewhat similar PdfText2HTML class in PDFBox, but
>>       that class produces a character stream instead of SAX events)
>> 
>> Added:
>>     incubator/tika/trunk/src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java (with props)
>> Modified:
>>     incubator/tika/trunk/CHANGES.txt
>>     incubator/tika/trunk/src/main/java/org/apache/tika/parser/pdf/PDFParser.java
>> 
> <snip/>
>> Added: incubator/tika/trunk/src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java
>> URL: http://svn.apache.org/viewvc/incubator/tika/trunk/src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java?rev=594376&view=auto
>> ==============================================================================
>> --- incubator/tika/trunk/src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java (added)
>> +++ incubator/tika/trunk/src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java Mon Nov 12 17:04:30 2007
>> +    protected void endDocument(PDDocument pdf) throws IOException {
>> +        try {
>> +            handler.endDocument();
>> +        } catch (SAXException e) {
>> +            throw new IOException("Unable to end a document", e);
>> +        }
>> +    }
> 

______________________________________________
Chris Mattmann, Ph.D.
Chris.Mattmann@jpl.nasa.gov
Cognizant Development Engineer
Early Detection Research Network Project
_________________________________________________
Jet Propulsion Laboratory            Pasadena, CA
Office: 171-266B                     Mailstop:  171-246
_______________________________________________________

Disclaimer:  The opinions presented within are my own and do not reflect
those of either NASA, JPL, or the California Institute of Technology.


