xmlgraphics-fop-users mailing list archives

From "dan.mccabe" <mccabe.danie...@gmail.com>
Subject Re: large image embedding problems
Date Wed, 16 Sep 2009 21:00:26 GMT

The solution we're using definitely isn't ideal, but at the point we're at,
we just needed something that would work reliably.  I may explore an
fo:external-graphic extension at some point if it's a piece we're planning
on leaving in place, but that will all be based on when I get time to
revisit it.  If we come up with a more permanent, non-hackish solution, I
will post it here and most likely submit a patch.  At the very least, I
wanted to post how we got around the problem so that if other people run
into the same issue, it might give them some ideas.

I would agree that the JP2 implementation is probably not in high demand
(certainly didn't find much information about it searching through the user
list), and with the extra dependencies that are involved, it would probably
make a lot more sense to have it as an external download to FOP.  I'll have
to get back to you on how we might make it available to others though. 
There would still be additional work to be done anyways, because I didn't
build in the switch for the PDF version number yet, since I was just trying
to prototype a solution (wouldn't have made sense to put that in if I didn't
end up needing it).

Dan



Jeremias Maerki-2 wrote:
> 
> Thanks for the extensive feedback, Dan. It's good to have that in the
> archives.
> 
> A BSD dependency is generally not a problem. However, JJ2000 has a
> peculiar license which would probably have to be run past the legal
> team of the ASF. Anyway, in the past, we were quite cautious about adding
> new dependencies. In this case it's not exactly mainstream functionality
> so only a few people would actually use this.
> 
> Did you also do the switch for the PDF version I mentioned? If not, that
> means some additional work before it could be included. I guess if there
> are voices who want that in FOP we can certainly take a closer look at
> the licensing situation. I have only very limited time for FOP at the
> moment so I don't promise anything. At any rate, you could simply put
> the code somewhere on your company's website (and we link to it) or into
> a Bugzilla issue. You could even start a Google Code project for the
> plug-in, for example. After all, the plug-in parts can live outside of
> FOP in a separate JAR that can be added if someone wants JPEG2000
> support.
> 
> As for the memory situation, I went to great lengths when creating the
> image loading framework to ensure that memory consumption is kept to a
> minimum and that images are only held in memory as long as needed.
> Of course, with special decoders (bypassing BufferedImage), we could
> bring the memory consumption down for PNG, TIFF & Co. Image data could
> be decoded in stripes or line by line and immediately brought to the
> target format. But this would be quite some work as you'd essentially
> write new decoders from scratch in some cases.
> 
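The stripe-wise decoding idea above can be sketched with the standard javax.imageio API alone. This is only an illustration, not how FOP's image loading framework works: the class name StripeReader and the StripeSink callback are made up for the example, and whether a given ImageIO plug-in really avoids decoding the whole image for each source region depends on that plug-in.

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.io.InputStream;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReadParam;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;

final class StripeReader {

    /** Hypothetical callback that receives one decoded stripe at a time. */
    interface StripeSink {
        void accept(int y, BufferedImage stripe) throws IOException;
    }

    /** Decodes the first image in the stream in horizontal stripes; returns the stripe count. */
    static int readInStripes(InputStream in, int stripeHeight, StripeSink sink) throws IOException {
        try (ImageInputStream iis = ImageIO.createImageInputStream(in)) {
            if (iis == null) {
                throw new IOException("Cannot create an image input stream");
            }
            Iterator<ImageReader> readers = ImageIO.getImageReaders(iis);
            if (!readers.hasNext()) {
                throw new IOException("No ImageIO reader for this format");
            }
            ImageReader reader = readers.next();
            try {
                reader.setInput(iis);
                int width = reader.getWidth(0);
                int height = reader.getHeight(0);
                int stripes = 0;
                for (int y = 0; y < height; y += stripeHeight) {
                    int h = Math.min(stripeHeight, height - y);
                    // Ask the reader for just this band of rows.
                    ImageReadParam param = reader.getDefaultReadParam();
                    param.setSourceRegion(new Rectangle(0, y, width, h));
                    sink.accept(y, reader.read(0, param));
                    stripes++;
                }
                return stripes;
            } finally {
                reader.dispose();
            }
        }
    }
}
```

Only one stripe-sized BufferedImage is handed to the sink at a time, so the sink could convert each stripe to the target format and discard it. Some readers may re-scan the file from the start for every region, which trades CPU time for memory.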
> Your solution with JPEG is interesting but probably not ideal since you
> essentially have to fork FOP (and keep it synchronized in the future) as
> it's a proprietary hack. But it could be possible to define an extension
> attribute for fo:external-graphic with an additional URL to an 8-bit
> image with the soft mask. That could give you a chance to soften the
> rough edges and turn your changes into a patch to FOP. Just an idea.
> Still a very special use case.
> 
> On 16.09.2009 20:59:54 dan.mccabe wrote:
>> 
>> Hey Jeremias,
>> 
>> First and foremost, I want to thank you for all your help.  I was able to
>> follow along with your instructions and get an implementation that could
>> embed raw JP2 images into a PDF in a pretty short amount of time, which I
>> definitely couldn't have done without your help.  If you have an interest
>> in the source code for this, I would be more than happy to share it.  The
>> one caveat with it currently is that it relies on the JAI ImageIO project
>> (https://jai-imageio-core.dev.java.net/) to be able to parse the header
>> for the JP2 file in my PreloadJP2 class, so I'm not sure if it would be
>> possible or not to distribute it with the graphics commons library (it's
>> under the BSD license).  I took a look under the hood and it appears that
>> it uses JJ2000 (http://jj2000.epfl.ch/; http://jpeg2000.epfl.ch/) to do
>> some of the heavy lifting, so if licensing is a problem, there may be some
>> alternatives that can be explored.
>> 
>> However, after getting familiar with the FOP source code, I think I've
>> found the root cause of the problem we were experiencing, which has caused
>> us to go with a slightly different implementation.  I came across the
>> issue after I had implemented the JP2 support and I was still not seeing
>> the transparency we needed in the resulting PDF.  I compared the resulting
>> PDF with one we had generated using PNGs and noticed that all of the PNGs
>> had a soft-mask associated with them while none of the JP2s did.  After
>> taking a look at the implementation, I found that I needed to add code to
>> my ImageRawJP2Adapter to return a soft-mask reference when there was
>> transparency in the image.
>> 
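The "only when there was transparency" decision can be illustrated on a decoded image. The thread does not show how ImageRawJP2Adapter made this decision for raw JP2 data, so the following is a sketch under the assumption that a BufferedImage is available; the class and method names are invented for the example.

```java
import java.awt.image.BufferedImage;

final class TransparencyCheck {

    /** Returns true if at least one pixel is not fully opaque, i.e. a soft mask would be needed. */
    static boolean needsSoftMask(BufferedImage image) {
        if (!image.getColorModel().hasAlpha()) {
            return false; // no alpha channel at all
        }
        int w = image.getWidth();
        int[] row = new int[w];
        for (int y = 0; y < image.getHeight(); y++) {
            image.getRGB(0, y, w, 1, row, 0, w); // one row of ARGB pixels
            for (int argb : row) {
                if ((argb >>> 24) != 0xFF) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

Scanning the pixels avoids emitting a mask for images that carry an alpha channel but are fully opaque.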
>> This all worked fine, but it dawned on me that because the transparency
>> was controlled by the soft-mask and not by the type of image itself, there
>> was no reason we couldn't use JPG files as long as we found a way to
>> specify an accompanying mask for them.  Because of some issues with
>> rendering JP2 files (we were using im4java as an interface to ImageMagick
>> to generate the images, but there are some hoops you have to jump through
>> to get ImageMagick to run on some machines), it was definitely preferable
>> to use JPGs if possible.  The solution we settled on lives inside our
>> custom image handler for generating the images in the SVG: we took the
>> BufferedImage that needed to be saved and wrote out two JPG files for it,
>> one for the image and one for the mask.  In the setup method in
>> ImageRawJPEGAdapter, I put in some custom code to check for this
>> accompanying mask file and add a soft-mask using it if it was available.
>> This is definitely a bit of a hack, but it'll work for now, so we should
>> be good.
>> 
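The "two JPG files" step described above might look roughly like the following. It is a sketch using only java.awt.image and javax.imageio, not the code actually used in the thread: the class name and the choice of an 8-bit grayscale JPEG for the mask are assumptions, and the FOP side (the custom code in ImageRawJPEGAdapter) is not shown.

```java
import java.awt.image.BufferedImage;
import java.awt.image.WritableRaster;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

final class JpegWithMaskWriter {

    /** Writes the colour channels to imageFile and the alpha channel to maskFile, both as JPEG. */
    static void write(BufferedImage argb, File imageFile, File maskFile) throws IOException {
        int w = argb.getWidth();
        int h = argb.getHeight();
        BufferedImage rgb = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        BufferedImage mask = new BufferedImage(w, h, BufferedImage.TYPE_BYTE_GRAY);
        WritableRaster maskRaster = mask.getRaster();
        int[] row = new int[w];
        int[] alpha = new int[w];
        for (int y = 0; y < h; y++) {
            argb.getRGB(0, y, w, 1, row, 0, w);
            rgb.setRGB(0, y, w, 1, row, 0, w);       // alpha is dropped by the RGB image
            for (int x = 0; x < w; x++) {
                alpha[x] = row[x] >>> 24;            // 0 = transparent, 255 = opaque
            }
            maskRaster.setSamples(0, y, w, 1, 0, alpha);
        }
        if (!ImageIO.write(rgb, "jpg", imageFile) || !ImageIO.write(mask, "jpg", maskFile)) {
            throw new IOException("No JPEG writer available");
        }
    }
}
```

Because JPEG is lossy, the mask written this way will not reproduce hard alpha edges exactly; whether that matters depends on the images involved.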
>> I spent some time looking into the relationship between RenderedImage,
>> ImageRendered, ImageRenderedAdapter, PDFDocument, and PDFImageXObject, and
>> it appears that the images should get garbage collected properly under
>> normal circumstances (thanks to the overloaded output( OutputStream )
>> method in PDFImageXObject).  I also spent some time tracing through the
>> code for rendering SVGs to PDF, and it looked like the images got cleaned
>> up correctly there too.  However, what is clear is that whenever
>> ImageRenderedAdapter gets used with our application, OutOfMemoryErrors
>> will ensue.  The images shouldn't be too large to fit in memory altogether
>> though, so I'm not entirely sure what was causing the issue.  When it does
>> go down this path, the program usually gets through a couple pages before
>> it errors out, which originally made me think that it was maybe holding
>> images in memory for longer than they needed to be, but now I'm not so
>> sure.
>> 
>> This may not be any news to you, but I figured as long as I had done some
>> research into figuring out what the problem was, I would share it in case
>> it was helpful.  Thanks again for all the help!
>> 
>> Dan
>> 
>> 
>> Jeremias Maerki-2 wrote:
>> > 
>> > Hi Dan,
>> > 
>> > I'm afraid I don't see any other possibility than to implement this
>> > properly. At least the good news is that with JPEG you've got a full
>> > example of how to embed that format uncompressed into a PDF. Here are
>> > some pointers on what needs to be done:
>> > 
>> > XML Graphics Commons:
>> > http://xmlgraphics.apache.org/commons/image-loader.html
>> > 
>> > [1] http://svn.apache.org/viewvc/xmlgraphics/commons/trunk/src/java/org/apache/xmlgraphics/image/loader/impl/ImageRawJPEG.java?view=markup
>> > [2] http://svn.apache.org/viewvc/xmlgraphics/commons/trunk/src/java/org/apache/xmlgraphics/image/loader/impl/ImageLoaderRawJPEG.java?view=markup
>> > [3] http://svn.apache.org/viewvc/xmlgraphics/commons/trunk/src/java/org/apache/xmlgraphics/image/loader/impl/PreloaderJPEG.java?view=markup
>> > [4] http://svn.apache.org/viewvc/xmlgraphics/commons/trunk/src/java/org/apache/xmlgraphics/image/loader/impl/ImageLoaderFactoryRaw.java?view=markup
>> > [5] http://svn.apache.org/viewvc/xmlgraphics/commons/trunk/src/resources/META-INF/services/org.apache.xmlgraphics.image.loader.spi.ImagePreloader?view=markup
>> > [6] http://svn.apache.org/viewvc/xmlgraphics/commons/trunk/src/resources/META-INF/services/org.apache.xmlgraphics.image.loader.spi.ImageLoaderFactory?view=markup
>> > 
>> > FOP:
>> > 
>> > [7] http://svn.apache.org/viewvc/xmlgraphics/fop/trunk/src/java/org/apache/fop/render/pdf/PDFImageHandlerRawJPEG.java?view=markup
>> > [8] http://svn.apache.org/viewvc/xmlgraphics/fop/trunk/src/java/org/apache/fop/render/pdf/ImageRawJPEGAdapter.java?view=markup
>> > [9] http://svn.apache.org/viewvc/xmlgraphics/fop/trunk/src/java/META-INF/services/org.apache.fop.render.ImageHandler?view=markup
>> > [10] http://svn.apache.org/viewvc/xmlgraphics/fop/trunk/src/java/org/apache/fop/pdf/
>> > [11] http://svn.apache.org/viewvc/xmlgraphics/fop/trunk/src/java/org/apache/fop/pdf/PDFDocument.java?view=markup
>> > 
>> > 
>> > First of all, you need a plug-in for the image loading framework in XML
>> > Graphics Commons. The "preloader" [3] is responsible for detecting the
>> > file format and extracting some basic information about the image
>> > (hopefully without loading the full image already). That way the layout
>> > engine doesn't have to load the full image into memory. Only the
>> > renderer needs access to the full image. In the case of JPEG, it
>> > doesn't even have to be loaded in memory. Hopefully, the same will be
>> > possible for JPEG 2000. The preloader needs to be registered in [5].
>> > 
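The format-detection part of a preloader can be illustrated without touching the image loading framework's API. The sketch below only checks the fixed 12-byte signature box that begins every JP2 file; the class name is invented, and a real preloader would also have to reset the stream and read the image size from the header, which is not shown here.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

final class Jp2Signature {

    // JP2 signature box: length 12, type "jP  ", content 0D 0A 87 0A.
    private static final byte[] SIGNATURE = {
        0x00, 0x00, 0x00, 0x0C, 0x6A, 0x50, 0x20, 0x20, 0x0D, 0x0A, (byte) 0x87, 0x0A
    };

    /** Returns true if the stream starts with the JP2 signature box. Consumes up to 12 bytes. */
    static boolean isJp2(InputStream in) throws IOException {
        byte[] head = new byte[SIGNATURE.length];
        try {
            new DataInputStream(in).readFully(head);
        } catch (EOFException e) {
            return false; // shorter than the signature, so not a JP2 file
        }
        return Arrays.equals(head, SIGNATURE);
    }
}
```

Checking only these 12 bytes keeps detection cheap, which matches the goal of not loading the full image during layout.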
>> > The second step is providing an image class representing the undecoded
>> > JPEG 2000 image [1]. Then you need a loader that builds that
>> > representation [2] and a factory with metadata for the loader [4].
>> > 
>> > Once you have that, FOP will be able to provide a JPEG 2000 image in
>> > its raw format. At this point, you'll have to teach FOP how to make use
>> > of that. A PDF-specific image handler [7] (which is also a plug-in [9])
>> > needs to be built. Its presence will tell the image loading framework
>> > that it can provide JPEG 2000 images in raw format. Otherwise, it will
>> > simply check if ImageIO has a codec for JPEG 2000 (but this means the
>> > image gets decoded). The image handler then uses an image adapter [8]
>> > to finally embed the image into the PDF. I assume you will also need a
>> > few modifications in FOP's PDF library to support the JPXDecode filter
>> > [10].
>> > 
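Whether ImageIO has a codec for a format can be checked with the standard javax.imageio lookup. This is a small sketch with an invented class name; the format name "jpeg2000" used in the note below is an assumption about what a JPEG 2000 plug-in such as JAI ImageIO registers.

```java
import javax.imageio.ImageIO;

final class CodecCheck {

    /** Returns true if some installed ImageIO plug-in can read the named format. */
    static boolean hasReader(String formatName) {
        return ImageIO.getImageReadersByFormatName(formatName).hasNext();
    }
}
```

On a plain JDK, CodecCheck.hasReader("png") is true, while CodecCheck.hasReader("jpeg2000") would only be true with a suitable plug-in on the classpath.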
>> > Since JPXDecode is a PDF 1.5 feature, you will also need to introduce a
>> > switch [11] between PDF 1.4 and 1.5. That is necessary because of PDF/A
>> > and PDF/X functionality which require keeping PDF on version 1.4. So
>> > JPEG 2000 should only be available when PDF 1.5 is enabled.
>> > 
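The version switch described above amounts to two checks, sketched here with hypothetical names (PdfVersionGate, PdfVersion and both methods are invented for illustration and are not FOP classes): JPXDecode is only allowed from PDF 1.5 on, and PDF/A or PDF/X output pins the version to 1.4.

```java
final class PdfVersionGate {

    enum PdfVersion { V1_4, V1_5 }

    /** JPXDecode is a PDF 1.5 feature, so raw JPEG 2000 needs at least 1.5. */
    static boolean canEmbedRawJpeg2000(PdfVersion version) {
        return version.compareTo(PdfVersion.V1_5) >= 0;
    }

    /** Rejects a version above 1.4 when PDF/A or PDF/X output is active. */
    static PdfVersion resolveVersion(PdfVersion requested, boolean pdfAOrPdfXActive) {
        if (pdfAOrPdfXActive && requested != PdfVersion.V1_4) {
            throw new IllegalArgumentException("PDF/A and PDF/X output must stay on PDF 1.4");
        }
        return requested;
    }
}
```

Failing early on a conflicting configuration is one possible design; silently falling back to 1.4 and decoding the image would be another.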
>> > I guess one of the first steps should also be studying the JPEG 2000
>> > specification and the PDF specification so you can decide whether the
>> > direct embedding of JPEG 2000 images is possible in the first place.
>> > Otherwise, you might spend a lot of time on something that may not work
>> > in the end. I don't know the JPEG 2000 format so I can't tell if it's
>> > possible without diving into this myself.
>> > 
>> > HTH and good luck!
>> > 
>> > On 12.09.2009 00:24:41 dan.mccabe wrote:
>> >> 
>> >> Hey Jeremias,
>> >> 
>> >> I'm working on this problem with Bill, and it looks like we may be
>> >> reaching a point where we need to try to tackle embedding JPEG 2000
>> >> images.  Assuming we do need to go down that path, do you have any
>> >> recommendations for where we should start?
>> >> 
>> >> However, this is assuming that we can't find another way to do what we
>> >> need to do.  Based on your description, it certainly doesn't sound like
>> >> an easy task to get this implemented, so we really only want to do this
>> >> as a last resort.  Based on the description of what we are trying to
>> >> do, do you have any suggestions for an alternative approach that might
>> >> help us reach our goal?
>> >> 
>> >> Thanks.
>> >> 
>> >> 
>> >> Jeremias Maerki-2 wrote:
>> >> > 
>> >> > FOP currently produces PDF 1.4 so there's no support for JPEG 2000,
>> >> > yet. One could (probably) add support for embedding undecoded JPEG
>> >> > 2000 images (JPXDecode) to FOP and add an option with which to
>> >> > control the PDF version produced by FOP. Of course, that means
>> >> > digging into the source code of FOP and XML Graphics Commons. I can
>> >> > give you pointers if you decide to do that.
>> >> > 
>> >> > However, I haven't investigated if it's as simple as with JPEG to
>> >> > also embed JPEG 2000 images. I mention that since I've once tried to
>> >> > get undecoded PNG graphics directly into PDF. After all, the
>> >> > FlateDecode filter supports about the same predictors as PNG but I
>> >> > couldn't make this work in reasonable time. This just as a caveat.
>> >> > 
>> >> > On 11.09.2009 04:58:09 Bill Gamble wrote:
>> >> >> Hello Everyone,
>> >> >> We are generating PDFs which are very graphic intensive. A typical
>> >> >> PDF has 50 pages with four 4000x4000 images on each page, and the
>> >> >> images can have transparency.
>> >> >> 
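To get a feel for these numbers, the decoded size of the images can be estimated. The sketch assumes 4 bytes per pixel (8-bit ARGB), which the thread does not state, and says nothing about how many images are actually held in memory at once.

```java
final class ImageMemoryEstimate {

    /** Size in bytes of a fully decoded image with the given dimensions. */
    static long decodedBytes(int width, int height, int bytesPerPixel) {
        return (long) width * height * bytesPerPixel;
    }

    public static void main(String[] args) {
        long perImage = decodedBytes(4000, 4000, 4); // assumption: 8-bit ARGB
        long perPage = 4 * perImage;                 // four images per page
        long perDocument = 50 * perPage;             // 50 pages
        System.out.println(perImage);    // 64000000
        System.out.println(perPage);     // 256000000
        System.out.println(perDocument); // 12800000000
    }
}
```

Under that assumption a single decoded image takes about 64 MB and one page about 256 MB, so any code path that decodes images rather than passing them through raw puts heavy pressure on the heap.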
>> >> >> We are using Batik for generating each page as an SVG file, and
>> >> >> then referencing the SVG using the <fox:external-document> element
>> >> >> when converting to PDF.
>> >> >> 
>> >> >> We run into performance problems when the images embedded in the
>> >> >> SVG file are anything but JPEGs. JPEGs are lightning fast and have a
>> >> >> resulting PDF file size 10X smaller than any other format.
>> >> >> Unfortunately the embedded images can have transparency, so standard
>> >> >> JPEG format cannot be used, and all other file formats run into
>> >> >> memory problems and generate enormous PDF files (300MB+).
>> >> >> 
>> >> >> After finding that PDF has had support for JPXDecode (for JPEG
>> >> >> 2000) since 1.5, I was hoping to find that JPEG 2000 could be
>> >> >> injected into the PDF without the need to decode the image, but that
>> >> >> does not appear to be the case (we run into the same performance
>> >> >> problems with JPEG 2000).
>> >> >> 
>> >> >> Can anyone comment on:
>> >> >> 
>> >> >> 1) Is this a limitation of the PDF format, or how FOP is rendering
>> >> >> the PDF?
>> >> >> 2) Any suggestions or other approaches to solving our problem?
>> >> >> 
>> >> >> Thanks in advance!
>> >> > 
>> >> > 
>> >> > 
>> >> > 
>> >> > Jeremias Maerki
>> >> > 
>> >> > 
>> >> > ---------------------------------------------------------------------
>> >> > To unsubscribe, e-mail: fop-users-unsubscribe@xmlgraphics.apache.org
>> >> > For additional commands, e-mail: fop-users-help@xmlgraphics.apache.org
>> >> > 
>> >> > 
>> >> > 
>> >> 
>> >> -- 
>> >> View this message in context:
>> >> http://www.nabble.com/large-image-embedding-problems-tp25394304p25409319.html
>> >> Sent from the FOP - Users mailing list archive at Nabble.com.
>> >> 
>> > 
>> > 
>> > 
>> > Jeremias Maerki
>> > 
>> > 
>> > 
>> > 
>> > 
>> 
>> -- 
>> View this message in context:
>> http://www.nabble.com/large-image-embedding-problems-tp25394304p25478390.html
>> Sent from the FOP - Users mailing list archive at Nabble.com.
>> 
>> 
> 
> 
> 
> 
> Jeremias Maerki
> 
> 
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/large-image-embedding-problems-tp25394304p25480579.html
Sent from the FOP - Users mailing list archive at Nabble.com.



