xmlgraphics-fop-users mailing list archives

From Dola Woolfe <dolac...@yahoo.com>
Subject Re: Newbie question
Date Fri, 04 Sep 2009 13:22:12 GMT


Thank you.

(Sounds like more than the 1 hour I was allocating for it.)



----- Original Message ----
From: Jean-François El Fouly <jean-francois@elfouly.fr>
To: fop-users@xmlgraphics.apache.org
Sent: Friday, September 4, 2009 3:44:55 AM
Subject: Re: Newbie question


On 4 Sept. 2009, at 03:34, Dola Woolfe wrote:

> I'm trying to put together several elements to build a PDF translator.
> 
> 1. Load a PDF in a foreign language (???)
> 2. Translate the content (Google Translate)
> 3. Output the translated PDF (FOP)
> 
> So I'm guessing step 1 is not part of FOP. Can you perhaps recommend what I can use for
> step 1?
> 
> Thanks again!

I think you should try iText. You will find an explanation of what you need near the end of
"iText in Action", the authoritative book by Bruno Lowagie, who designed iText in
the first place. Before proceeding with your project, you *should* read the caveats in his
book: extracting text content from an existing PDF may not be as straightforward as you think,
and in certain situations it may be almost meaningless. A PDF API will give you the text content
in the order in which it was technically generated, which may not be the "textual" order (the
order in which you would read the elements in a book).
On top of this, my own experience is that it is very difficult to extract text content from
non-European or large fonts (roughly speaking, the CID-keyed fonts, those that have more
characters than WinAnsi or ISO-8859-1).
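
To illustrate the ordering problem, here is a minimal sketch and not an iText example. It assumes
that some extraction step has already produced text chunks with page coordinates (the `Chunk`
tuple, the `reading_order` function and the sample data are all hypothetical), and it assumes a
simple single-column page. Multi-column layouts, rotated text and tables would need far more work.

```python
from typing import List, Tuple

# Hypothetical chunk shape: (x, y, text). PDF coordinates start at the
# bottom-left corner, so a larger y means higher up on the page.
Chunk = Tuple[float, float, str]


def reading_order(chunks: List[Chunk], line_tolerance: float = 2.0) -> str:
    """Rebuild an approximate reading order for a single-column page."""
    # Top of the page first, then left to right.
    ordered = sorted(chunks, key=lambda c: (-c[1], c[0]))

    lines: List[List[Chunk]] = []
    for chunk in ordered:
        # Chunks whose baselines are within the tolerance share a line.
        if lines and abs(lines[-1][0][1] - chunk[1]) <= line_tolerance:
            lines[-1].append(chunk)
        else:
            lines.append([chunk])

    # Within each line, order the chunks by their x position.
    return "\n".join(
        " ".join(text for _, _, text in sorted(line, key=lambda c: c[0]))
        for line in lines
    )


# Chunks in the order the (made-up) PDF happened to emit them.
emitted = [
    (50.0, 50.0, "Footer"),
    (100.0, 700.0, "world"),
    (50.0, 680.0, "Second line"),
    (50.0, 700.0, "Hello"),
]

raw_text = " ".join(text for _, _, text in emitted)
fixed_text = reading_order(emitted)
```

Here `raw_text` follows the emission order and starts with the footer, while `fixed_text` places
the top line first. Even this only works because the sample page is trivially simple.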

HTH,

Jean-François
---------------------------------------------------------------------
To unsubscribe, e-mail: fop-users-unsubscribe@xmlgraphics.apache.org
For additional commands, e-mail: fop-users-help@xmlgraphics.apache.org


      


