synapse-dev mailing list archives

From Andreas Veithen <andreas.veit...@gmail.com>
Subject Re: VFS - Synapse Memory Leak
Date Thu, 19 Mar 2009 06:48:24 GMT
Kim,

Can you post your current synapse.xml as well as the stack trace you get now?

Andreas

On Thu, Mar 19, 2009 at 07:20, kimhorn <kim.horn@icsglobal.net> wrote:
>
> Using the last stable build from 15 March 2009, I still get exactly the same
> behaviour as originally described with the above script. VFS still just dies.
> Would your fixes be in this build?
>
> Andreas Veithen-2 wrote:
>>
>> I committed the code and it will be available in the next WS-Commons
>> transport build. The methods are located in
>> org.apache.axis2.format.ElementHelper in the axis2-transport-base
>> module.
>>
>> Andreas
>>
>> On Thu, Mar 12, 2009 at 00:06, Kim Horn <kim.horn@icsglobal.net> wrote:
>>> Hello Andreas,
>>> This is great and really helps. I have not had time to try it out but will
>>> soon.
>>>
>>> Contributing the java.io.Reader would be a great help, but it will take me
>>> a while to get up to speed to do the Synapse iterator.
>>>
>>> In the short term I am going to use a brute force approach that is now
>>> feasible given the memory issue is resolved. Just thought of this one
>>> today: use a VFS proxy to FTP the file locally, so streaming helps here. A
>>> POJOCommand on <out> splits the file into another directory, streaming in
>>> and out. Another independent VFS proxy watches that directory and submits
>>> each file to the Web service. Hopefully memory will be fine. Overloading
>>> the destination may still be an issue?
>>>
>>> Kim
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
>>> Sent: Monday, 9 March 2009 10:55 PM
>>> To: dev@synapse.apache.org
>>> Subject: Re: VFS - Synapse Memory Leak
>>>
>>> The changes I did in the VFS transport and the message builders for
>>> text/plain and application/octet-stream certainly don't provide an
>>> out-of-the-box solution for your use case, but they are the
>>> prerequisite.
>>>
>>> Concerning your first proposed solution (let the VFS write the content
>>> to a temporary file), I don't like this because it would create a
>>> tight coupling between the VFS transport and the mediator. A design
>>> goal should be that the solution will still work if the file comes
>>> from another source, e.g. an attachment in an MTOM or SwA message.
>>>
>>> I think that an all-Synapse solution (2 or 3) should be possible, but
>>> this will require development of a custom mediator. This mediator
>>> would read the content, split it up (and store the chunks in memory or
>>> on disk) and execute a sub-sequence for each chunk. The execution of
>>> the sub-sequence would happen synchronously to limit the memory/disk
>>> space consumption (to the maximum chunk size) and to avoid flooding
>>> the destination service.
>>>
>>> Note that it is probably not possible to implement the mediator
>>> using a script because of the problematic String handling. Also,
>>> Spring, POJO and class mediators don't support sub-sequences (I
>>> think). Therefore it should be implemented as a full-featured Java
>>> mediator, probably taking the existing iterate mediator as a template.
>>> I can contribute the required code to get the text content in the form
>>> of a java.io.Reader.
>>>
>>> Regards,
>>>
>>> Andreas
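
[Illustrative sketch of the chunking step described above; it is not code from
the thread or from Synapse. It assumes the content is available as a
java.io.Reader with one record per line. The names LineChunker, split and
maxChunkChars are hypothetical. The handler is called synchronously, so only
one chunk is held in memory at a time; in a real mediator the handler would be
where the sub-sequence runs.]

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.function.Consumer;

// Hypothetical helper, not part of Synapse or Axis2.
final class LineChunker {

    /**
     * Reads line-oriented records from the reader and passes them to the
     * handler in chunks of at most maxChunkChars characters. A single record
     * longer than the limit is passed on as one oversized chunk.
     * Returns the number of chunks handed to the handler.
     */
    static int split(Reader in, int maxChunkChars, Consumer<String> handler)
            throws IOException {
        BufferedReader reader = new BufferedReader(in);
        StringBuilder chunk = new StringBuilder();
        int chunks = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            // Flush first if adding this record (plus CR LF) would exceed the limit.
            if (chunk.length() > 0
                    && chunk.length() + line.length() + 2 > maxChunkChars) {
                handler.accept(chunk.toString()); // synchronous: blocks until done
                chunks++;
                chunk.setLength(0);
            }
            chunk.append(line).append("\r\n");
        }
        // Hand over whatever is left after the last record.
        if (chunk.length() > 0) {
            handler.accept(chunk.toString());
            chunks++;
        }
        return chunks;
    }
}
```

[Usage, under the same assumptions: LineChunker.split(reader, 14 * 1024 * 1024,
chunk -> sendToService(chunk)), where sendToService stands for whatever
delivers one chunk to the destination.]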
>>>
>>> On Mon, Mar 9, 2009 at 03:05, kimhorn <kim.horn@icsglobal.net> wrote:
>>>>
>>>> Although this is a good feature, it may not solve the actual problem.
>>>> The first issue on my list was the memory leak.
>>>> However, the real problem is that once I get these massive files I have to
>>>> send them to a Web service that can only take them in small chunks (about
>>>> 14MB). Streaming a file straight out would just kill the destination Web
>>>> service: it would get the memory error. The text document can be split
>>>> apart easily, as it has independent records on each line separated by
>>>> <CR> <LF>.
>>>>
>>>> In an earlier post, which was not responded to, I mentioned:
>>>>
>>>> "Otherwise; for large EDI files a VFS iterator Mediator that streams
>>>> through input file and outputs smaller chunks for processing, in Synapse,
>>>> may be a solution ? "
>>>>
>>>> I had mentioned a few solutions in prior posts; the solutions now are:
>>>>
>>>> 1) VFS writes straight to a temporary file, then a Java mediator can
>>>> process the file by splitting it into many smaller files. These files then
>>>> trigger another VFS proxy that submits them to the final Web service.
>>>> The problem is that it uses the file system (not so bad).
>>>> 2) A Java mediator takes the <text> package and splits it up by wrapping it
>>>> into many XML <data> elements that can then be acted on by a Synapse
>>>> Iterator. So replace the text message with many smaller XML elements.
>>>> The problem is that this loads the whole message into memory.
>>>> 3) Create another Iterator in Synapse that works on a regular expression
>>>> (to split the text data) or actually uses a for-loop approach to chop the
>>>> file into chunks based on the loop index value. E.g. Index = 23 means a
>>>> 14K chunk 23 chunks into the data.
>>>> 4) Using the approach proposed now, just submit the file straight (stream
>>>> it) to another Web service that chops it up. It may return an XML document
>>>> with many sub-elements that allows the standard Iterator to work. Similar
>>>> to (2) but using another service rather than Java to split the document.
>>>> 5) Using the approach proposed now, just submit the file straight (stream
>>>> it) to another Web service that chops it up but calls a Synapse proxy with
>>>> each small packet of data, which then forwards it to the final Web
>>>> service. So the Web service iterates across the data, and not Synapse.
>>>>
>>>> Then other solutions replace Synapse with a standalone Java program at
>>>> the front end.
>>>>
>>>> Another issue here is throttling: splitting the file is one issue, but
>>>> submitting hundreds of calls in parallel to the destination service would
>>>> result in timeouts. So we need to work in throttling.
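
[Illustrative sketch of the throttling concern raised above; it is not code
from the thread or from Synapse. It shows one common way to cap the number of
calls in flight, using java.util.concurrent.Semaphore. The names
ThrottledSubmitter, submitAll and maxInFlight are hypothetical, and "send"
stands for whatever call delivers one chunk to the destination service.]

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical helper, not part of Synapse or Axis2.
final class ThrottledSubmitter {

    /**
     * Sends every chunk on a worker thread while keeping at most maxInFlight
     * sends running at once. Returns the peak concurrency observed.
     */
    static int submitAll(List<String> chunks, int maxInFlight,
                         Consumer<String> send) throws InterruptedException {
        Semaphore permits = new Semaphore(maxInFlight);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            for (String chunk : chunks) {
                // Blocks the producer while maxInFlight sends are running.
                permits.acquire();
                pool.execute(() -> {
                    try {
                        int now = inFlight.incrementAndGet();
                        peak.accumulateAndGet(now, Math::max);
                        send.accept(chunk);
                    } finally {
                        // Decrement before releasing so the count never exceeds the limit.
                        inFlight.decrementAndGet();
                        permits.release();
                    }
                });
            }
        } finally {
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
        }
        return peak.get();
    }
}
```

[With maxInFlight set to 1 this degenerates to strictly sequential delivery,
which is the simplest way to avoid flooding the destination.]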
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Ruwan Linton wrote:
>>>>>
>>>>> I agree and can understand the time factor, and also +1 for reusing
>>>>> stuff rather than trying to reinvent the wheel :-)
>>>>>
>>>>> Thanks,
>>>>> Ruwan
>>>>>
>>>>> On Sun, Mar 8, 2009 at 4:08 PM, Andreas Veithen
>>>>> <andreas.veithen@gmail.com> wrote:
>>>>>
>>>>>> Ruwan,
>>>>>>
>>>>>> It's not a question of possibility, it is a question of available
>>>>>> time :-)
>>>>>>
>>>>>> Also note that some of the features that we might want to implement
>>>>>> have some similarities with what is done for attachments in Axiom
>>>>>> (except that an attachment is only available once, while a file over
>>>>>> VFS can be read several times). I think there is also some existing
>>>>>> code in Axis2 that might be useful. We should not reimplement these
>>>>>> things but try to make the existing code reusable. This however is
>>>>>> only realistic for the next release after 1.3.
>>>>>>
>>>>>> Andreas
>>>>>>
>>>>>> On Sun, Mar 8, 2009 at 03:47, Ruwan Linton <ruwan.linton@gmail.com>
>>>>>> wrote:
>>>>>> > Andreas,
>>>>>> >
>>>>>> > Can we have caching at the file system as a property, to support
>>>>>> > multiple layers touching the full message, and is it possible to
>>>>>> > specify a threshold for streaming? For example, if the message is
>>>>>> > touched several times we might still need streaming, but not for files
>>>>>> > of 100KB or less.
>>>>>> >
>>>>>> > Thanks,
>>>>>> > Ruwan
>>>>>> >
>>>>>> > On Sun, Mar 8, 2009 at 1:12 AM, Andreas Veithen <andreas.veithen@gmail.com>
>>>>>> > wrote:
>>>>>> >>
>>>>>> >> I've done an initial implementation of this feature. It is available
>>>>>> >> in trunk and should be included in the next nightly build. In order to
>>>>>> >> enable this in your configuration, you need to add the following
>>>>>> >> property to the proxy:
>>>>>> >>
>>>>>> >> <parameter name="transport.vfs.Streaming">true</parameter>
>>>>>> >>
>>>>>> >> You also need to add the following mediators just before the <send>
>>>>>> >> mediator:
>>>>>> >>
>>>>>> >> <property action="remove" name="transportNonBlocking" scope="axis2"/>
>>>>>> >> <property action="set" name="OUT_ONLY" value="true"/>
>>>>>> >>
>>>>>> >> With this configuration Synapse will stream the data directly from the
>>>>>> >> incoming to the outgoing transport without storing it in memory or in
>>>>>> >> a temporary file. Note that this has two other side effects:
>>>>>> >> * The incoming file (or connection in case of a remote file) will only
>>>>>> >> be opened on demand. In this case this happens during execution of the
>>>>>> >> <send> mediator.
>>>>>> >> * If during the mediation the content of the file is needed several
>>>>>> >> times (which is not the case in your example), it will be read several
>>>>>> >> times. The reason is of course that the content is not cached.
>>>>>> >>
>>>>>> >> I tested the solution with a 2GB file and it worked fine. The
>>>>>> >> performance of the implementation is not yet optimal, but at least the
>>>>>> >> memory consumption is constant.
>>>>>> >>
>>>>>> >> Some additional comments:
>>>>>> >> * The transport.vfs.Streaming property has no impact on XML and SOAP
>>>>>> >> processing: this type of content is processed exactly as before.
>>>>>> >> * With the changes described here, we have now two different policies
>>>>>> >> for plain text and binary content processing: in-memory caching + no
>>>>>> >> streaming (transport.vfs.Streaming=false) and no caching + deferred
>>>>>> >> connection + streaming (transport.vfs.Streaming=true). Probably we
>>>>>> >> should define a wider range of policies in the future, including file
>>>>>> >> system caching + streaming.
>>>>>> >> * It is necessary to remove the transportNonBlocking property
>>>>>> >> (MessageContext.TRANSPORT_NON_BLOCKING) to prevent the <send> mediator
>>>>>> >> (more precisely the OperationClient) from executing the outgoing
>>>>>> >> transport in a separate thread. This property is set by the incoming
>>>>>> >> transport. I think this is a bug since I don't see any valid reason
>>>>>> >> why the transport that handles the incoming request should determine
>>>>>> >> the threading behavior of the transport that sends the outgoing
>>>>>> >> request to the target service. Maybe Asankha can comment on this?
>>>>>> >>
>>>>>> >> Andreas
>>>>>> >>
>>>>>> >> On Thu, Mar 5, 2009 at 07:21, kimhorn <kim.horn@icsglobal.net> wrote:
>>>>>> >> >
>>>>>> >> > That's good, as this stops us using Synapse.
>>>>>> >> >
>>>>>> >> >
>>>>>> >> >
>>>>>> >> > Asankha C. Perera wrote:
>>>>>> >> >>
>>>>>> >> >>
>>>>>> >> >>> Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError: Java heap space
>>>>>> >> >>>         at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
>>>>>> >> >>>         at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
>>>>>> >> >>>         at java.lang.StringBuffer.append(StringBuffer.java:307)
>>>>>> >> >>>         at java.io.StringWriter.write(StringWriter.java:72)
>>>>>> >> >>>         at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
>>>>>> >> >>>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
>>>>>> >> >>>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
>>>>>> >> >>>         at org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
>>>>>> >> >>>         at org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
>>>>>> >> >>>
>>>>>> >> >> Since the content type is text, the plain text formatter is trying to
>>>>>> >> >> use a String to parse as I see.. which is a problem for large content..
>>>>>> >> >>
>>>>>> >> >> A definite bug we need to fix ..
>>>>>> >> >>
>>>>>> >> >> cheers
>>>>>> >> >> asankha
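
[Illustrative sketch related to the stack trace above; it is not the fix that
was committed to Synapse. The trace shows the whole file being accumulated
into a StringWriter via IOUtils.toString, so heap use grows with file size.
Copying through a fixed-size buffer instead keeps memory use constant. The
names StreamCopy, copy and bufferChars are hypothetical.]

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;

// Hypothetical helper, not part of Synapse or Commons IO.
final class StreamCopy {

    /**
     * Copies all characters from in to out through a buffer of bufferChars
     * characters, so memory use does not depend on the content size.
     * Returns the number of characters copied.
     */
    static long copy(Reader in, Writer out, int bufferChars) throws IOException {
        char[] buffer = new char[bufferChars];
        long total = 0;
        int n;
        // read() returns -1 at end of stream.
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }
}
```

[This only helps if the destination Writer is itself a stream such as a file
or socket; copying into a StringWriter would rebuild the original problem.]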
>>>>>> >> >>
>>>>>> >> >> --
>>>>>> >> >> Asankha C. Perera
>>>>>> >> >> AdroitLogic, http://adroitlogic.org
>>>>>> >> >>
>>>>>> >> >> http://esbmagic.blogspot.com
>>>>>> >> >>
>>>>>> >> >>
>>>>>> >> >>
>>>>>> >> >>
>>>>>> >> >>
>>>>>> >> >>
>>>>>> ---------------------------------------------------------------------
>>>>>> >> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>> >> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>> >> >>
>>>>>> >> >>
>>>>>> >> >>
>>>>>> >> >
>>>>>> >> > --
>>>>>> >> > View this message in context:
>>>>>> >> > http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22345904.html
>>>>>> >> > Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>>> >> >
>>>>>> >> >
>>>>>> >> >
>>>>>> >> >
>>>>>> >> >
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > --
>>>>>> > Ruwan Linton
>>>>>> > http://wso2.org - "Oxygenating the Web Services Platform"
>>>>>> > http://ruwansblog.blogspot.com/
>>>>>> >
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Ruwan Linton
>>>>> http://wso2.org - "Oxygenating the Web Services Platform"
>>>>> http://ruwansblog.blogspot.com/
>>>>>
>>>>>
>>>>
>>>> --
>>>> View this message in context:
>>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22405973.html
>>>> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>
> --
> View this message in context: http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22594321.html
> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>
>
>
>


