synapse-dev mailing list archives

From "Kim Horn" <kim.h...@icsglobal.net>
Subject RE: VFS - Synapse Memory Leak
Date Wed, 25 Mar 2009 04:02:00 GMT
Hello Andreas,

This all works really well. Streaming uses no memory at all.
I also have a Java mediator streaming payloads, and even with massive files
Synapse never uses much more than 40K.
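
For anyone wanting to try the same thing, the core of a streaming mediator is little more than a bounded copy loop. The sketch below is a minimal illustration, not the mediator described in this message: the class name StreamingCopy and its copy method are hypothetical, and it uses only java.io rather than any Synapse or Axis2 API. Memory use is fixed by the buffer size, whatever the payload size.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;

// Hypothetical helper for illustration only; not part of Synapse or Axis2.
final class StreamingCopy {
    private StreamingCopy() {}

    /**
     * Copies every character from in to out through one fixed-size buffer,
     * so heap usage does not depend on the size of the payload.
     * Returns the number of characters copied.
     */
    static long copy(Reader in, Writer out, int bufferSize) throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        char[] buffer = new char[bufferSize];
        long total = 0;
        int n;
        // read() returns -1 at end of stream
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }
}
```

In a real mediator the Reader and Writer would presumably come from the message payload and the outgoing side; how they are obtained is outside this sketch.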

With just the OUT_ONLY property set, it uses much more memory, but usage stabilises
and does not grow.
Thanks.

-----Original Message-----
From: Andreas Veithen [mailto:andreas.veithen@gmail.com] 
Sent: Friday, 20 March 2009 8:05 PM
To: dev@synapse.apache.org
Subject: Re: VFS - Synapse Memory Leak

Of course the memory allocated to a message will be freed once the
message has been processed. That is why it's important to set the
OUT_ONLY property: if it is not set correctly, Synapse will keep the
message context (with the payload) in a callback table to correlate it
with a future response (which in your case never comes in). Probably
there is something to improve here in Synapse:
- The VFS transport should trigger an error if there is a mismatch
between the message exchange pattern and the transport configuration
of the service (the transport.vfs.* parameters).
- Synapse should start issuing warnings when the number of entries in
the callback table reaches a certain threshold.
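
The second suggestion (warning once the callback table passes a threshold) could look roughly like the following. This is only a sketch under assumptions: CallbackTable, register, complete and the warnThreshold parameter are hypothetical names, and Synapse's real callback store is not shown in this thread.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.logging.Logger;

// Hypothetical illustration; not Synapse's actual callback store.
final class CallbackTable<V> {
    private static final Logger LOG = Logger.getLogger(CallbackTable.class.getName());

    private final Map<String, V> pending = new ConcurrentHashMap<>();
    private final int warnThreshold;

    CallbackTable(int warnThreshold) {
        this.warnThreshold = warnThreshold;
    }

    /**
     * Stores a context awaiting a response. Returns true (and logs a warning)
     * when the table holds at least warnThreshold entries after the insert.
     */
    boolean register(String messageId, V context) {
        pending.put(messageId, context);
        int size = pending.size();
        boolean over = size >= warnThreshold;
        if (over) {
            LOG.warning("Callback table holds " + size
                    + " entries; is OUT_ONLY missing on a one-way flow?");
        }
        return over;
    }

    /** Removes and returns the context for a response, or null if unknown. */
    V complete(String messageId) {
        return pending.remove(messageId);
    }

    int size() {
        return pending.size();
    }
}
```

Entries that are never completed, as in the one-way VFS case discussed here, keep the size growing, which is exactly what the warning is meant to surface.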

Andreas

On Fri, Mar 20, 2009 at 01:41, Kim Horn <kim.horn@icsglobal.net> wrote:
> Not really; I cannot see why memory should permanently grow when I pass the same file
> repeatedly through VFS. In theory this means VFS will always consume all the available memory
> given enough time and file iterations. Therefore VFS cannot be used in a production system.
> This is the definition of a memory leak. I would expect SOME overhead on top of the file size, but
> I would assume the memory no longer required would be reclaimed. I would also assume
> the overhead would not be 10 times the file size; that seems excessive.
>
> Yes, I understand the streaming approach should in theory use a fixed and much smaller amount of memory,
> but I haven't tested that yet either. Given the memory leak above, there is no reason it should not
> also grow permanently, just at a smaller rate.
>
> Thanks
> Kim
>
> -----Original Message-----
> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
> Sent: Friday, 20 March 2009 10:52 AM
> To: dev@synapse.apache.org
> Subject: Re: VFS - Synapse Memory Leak
>
> If N is the size of the file, the memory consumption caused by the
> transport is O(N) with transport.vfs.Streaming=false and O(1) with
> transport.vfs.Streaming=true. The getTextAsStream and writeTextTo
> methods in org.apache.axis2.format.ElementHelper are there to allow
> you to implement your mediator with O(1) memory usage, so that the
> overall memory consumption remains O(1). Does that answer your
> question?
>
> Andreas
>
> On Thu, Mar 19, 2009 at 23:33, Kim Horn <kim.horn@icsglobal.net> wrote:
>> It's the same synapse.xml as specified originally and the same trace. If you are using Nabble
>> you can see this; in case you lost the prior emails I can post them again.
>>
>> I must admit I did not set those extra parameters you mentioned, but I don't see why you
>> should have to set a parameter to stop a memory leak. I guessed these parameters would just
>> reduce, via streaming, the large amount of memory it appears to be using, e.g. 10 times the file size.
>> Why are there 10 copies of the data floating around? Lots of buffering? This issue suggests
>> to me that any use of VFS will eventually kill the server. Even with smaller files it will
>> eventually use all available memory. I guess I did not understand the actual reason for this
>> issue from the prior discussion.
>>
>> I will try your extra parameters today though.
>>
>> Thanks
>> Kim
>>
>>
>> -----Original Message-----
>> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
>> Sent: Thursday, 19 March 2009 5:48 PM
>> To: dev@synapse.apache.org
>> Subject: Re: VFS - Synapse Memory Leak
>>
>> Kim,
>>
>> Can you post your current synapse.xml as well as the stack trace you get now?
>>
>> Andreas
>>
>> On Thu, Mar 19, 2009 at 07:20, kimhorn <kim.horn@icsglobal.net> wrote:
>>>
>>> Using the last stable build from 15 March 2009 I still get exactly the same
>>> behaviour as originally described with the above script. VFS still just dies.
>>> Would your fixes be in this build?
>>>
>>> Andreas Veithen-2 wrote:
>>>>
>>>> I committed the code and it will be available in the next WS-Commons
>>>> transport build. The methods are located in
>>>> org.apache.axis2.format.ElementHelper in the axis2-transport-base
>>>> module.
>>>>
>>>> Andreas
>>>>
>>>> On Thu, Mar 12, 2009 at 00:06, Kim Horn <kim.horn@icsglobal.net> wrote:
>>>>> Hello Andreas,
>>>>> This is great and really helps; I have not had time to try it out but will
>>>>> soon.
>>>>>
>>>>> Contributing the java.io.Reader would be a great help, but it will take me
>>>>> a while to get up to speed to do the Synapse iterator.
>>>>>
>>>>> In the short term I am going to use a brute-force approach that is now
>>>>> feasible given the memory issue is resolved. I just thought of this one
>>>>> today: use a VFS proxy to FTP the file locally, so streaming helps here. A
>>>>> POJOCommand on <out> splits the file into another directory, streaming in and
>>>>> out. Another independent VFS proxy watches that directory and submits
>>>>> each file to the web service. Hopefully memory will be fine. Overloading the
>>>>> destination may still be an issue?
>>>>>
>>>>> Kim
>>>>>
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
>>>>> Sent: Monday, 9 March 2009 10:55 PM
>>>>> To: dev@synapse.apache.org
>>>>> Subject: Re: VFS - Synapse Memory Leak
>>>>>
>>>>> The changes I did in the VFS transport and the message builders for
>>>>> text/plain and application/octet-stream certainly don't provide an
>>>>> out-of-the-box solution for your use case, but they are the
>>>>> prerequisite.
>>>>>
>>>>> Concerning your first proposed solution (let the VFS write the content
>>>>> to a temporary file), I don't like this because it would create a
>>>>> tight coupling between the VFS transport and the mediator. A design
>>>>> goal should be that the solution will still work if the file comes
>>>>> from another source, e.g. an attachment in an MTOM or SwA message.
>>>>>
>>>>> I think that an all-Synapse solution (2 or 3) should be possible, but
>>>>> this will require development of a custom mediator. This mediator
>>>>> would read the content, split it up (storing the chunks in memory or
>>>>> on disk) and execute a sub-sequence for each chunk. The execution of
>>>>> the sub-sequence would happen synchronously to limit the memory/disk
>>>>> space consumption (to the maximum chunk size) and to avoid flooding
>>>>> the destination service.
>>>>>
>>>>> Note that it is probably not possible to implement the mediator
>>>>> using a script because of the problematic String handling. Also,
>>>>> Spring, POJO and class mediators don't support sub-sequences (I
>>>>> think). Therefore it should be implemented as a full-featured Java
>>>>> mediator, probably taking the existing iterate mediator as a template.
>>>>> I can contribute the required code to get the text content in the form
>>>>> of a java.io.Reader.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Andreas
>>>>>
>>>>> On Mon, Mar 9, 2009 at 03:05, kimhorn <kim.horn@icsglobal.net> wrote:
>>>>>>
>>>>>> Although this is a good feature, it may not solve the actual problem.
>>>>>> The first issue on my list was the memory leak.
>>>>>> However, the real problem is that once I get these massive files I have to
>>>>>> send them to a web service that can only take them in small chunks (about 14MB).
>>>>>> Streaming a file straight out would just kill the destination web service;
>>>>>> it would get the memory error. The text document can be split apart easily,
>>>>>> as it has independent records on each line separated by <CR><LF>.
>>>>>>
>>>>>> In an earlier post, which was not responded to, I mentioned:
>>>>>>
>>>>>> "Otherwise; for large EDI files a VFS iterator Mediator that streams through
>>>>>> input file and outputs smaller chunks for processing, in Synapse, may be a solution ? "
>>>>>>
>>>>>> I had mentioned a few solutions in prior posts; the solutions now are:
>>>>>>
>>>>>> 1) VFS writes straight to a temporary file, then a Java mediator
>>>>>> processes the file by splitting it into many smaller files. These files then
>>>>>> trigger another VFS proxy that submits them to the final web service.
>>>>>> The problem is that it uses the file system (not so bad).
>>>>>> 2) A Java mediator takes the <text> package and splits it up by wrapping it
>>>>>> into many XML <data> elements that can then be acted on by a Synapse
>>>>>> Iterator. So replace the text message with many smaller XML elements.
>>>>>> The problem is that this loads the whole message into memory.
>>>>>> 3) Create another Iterator in Synapse that works on a regular expression
>>>>>> (to split the text data) or actually uses a for-loop approach to chop the
>>>>>> file into chunks based on the loop index value. E.g. index = 23 means a 14K
>>>>>> chunk 23 chunks into the data.
>>>>>> 4) Using the approach proposed now, just submit the file straight
>>>>>> (stream it) to another web service that chops it up. It may return an XML
>>>>>> document with many sub-elements that allows the standard Iterator to work.
>>>>>> Similar to (2) but using another service rather than Java to split the document.
>>>>>> 5) Using the approach proposed now, just submit the file straight
>>>>>> (stream it) to another web service that chops it up but calls a Synapse proxy
>>>>>> with each small packet of data, which then forwards it to the final web
>>>>>> service. So the web service iterates across the data, and not Synapse.
>>>>>>
>>>>>> Other solutions replace Synapse with a stand-alone Java program at
>>>>>> the front end.
>>>>>>
>>>>>> Another issue here is throttling: splitting the file is one issue, but
>>>>>> submitting hundreds of calls in parallel to the destination service would
>>>>>> result in timeouts, so throttling needs to be worked in.
>>>>>>
>>>>>>
>>>>>> Ruwan Linton wrote:
>>>>>>>
>>>>>>> I agree and can understand the time factor, and also +1 for reusing
>>>>>>> stuff rather than trying to reinvent the wheel :-)
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Ruwan
>>>>>>>
>>>>>>> On Sun, Mar 8, 2009 at 4:08 PM, Andreas Veithen <andreas.veithen@gmail.com> wrote:
>>>>>>>
>>>>>>>> Ruwan,
>>>>>>>>
>>>>>>>> It's not a question of possibility, it is a question of available time :-)
>>>>>>>>
>>>>>>>> Also note that some of the features that we might want to implement
>>>>>>>> have some similarities with what is done for attachments in Axiom
>>>>>>>> (except that an attachment is only available once, while a file over
>>>>>>>> VFS can be read several times). I think there is also some existing
>>>>>>>> code in Axis2 that might be useful. We should not reimplement these
>>>>>>>> things but try to make the existing code reusable. This, however, is
>>>>>>>> only realistic for the next release after 1.3.
>>>>>>>>
>>>>>>>> Andreas
>>>>>>>>
>>>>>>>> On Sun, Mar 8, 2009 at 03:47, Ruwan Linton <ruwan.linton@gmail.com> wrote:
>>>>>>>> > Andreas,
>>>>>>>> >
>>>>>>>> > Can we have caching at the file system as a property, to support
>>>>>>>> > multiple layers touching the full message, and is it possible to
>>>>>>>> > specify a threshold for streaming? For example, if the message is touched
>>>>>>>> > several times we might still need streaming, but not for files of 100KB
>>>>>>>> > or less.
>>>>>>>> >
>>>>>>>> > Thanks,
>>>>>>>> > Ruwan
>>>>>>>> >
>>>>>>>> > On Sun, Mar 8, 2009 at 1:12 AM, Andreas Veithen <andreas.veithen@gmail.com> wrote:
>>>>>>>> >>
>>>>>>>> >> I've done an initial implementation of this feature. It is available
>>>>>>>> >> in trunk and should be included in the next nightly build. In order to
>>>>>>>> >> enable this in your configuration, you need to add the following
>>>>>>>> >> property to the proxy:
>>>>>>>> >>
>>>>>>>> >> <parameter name="transport.vfs.Streaming">true</parameter>
>>>>>>>> >>
>>>>>>>> >> You also need to add the following mediators just before the <send>
>>>>>>>> >> mediator:
>>>>>>>> >>
>>>>>>>> >> <property action="remove" name="transportNonBlocking" scope="axis2"/>
>>>>>>>> >> <property action="set" name="OUT_ONLY" value="true"/>
>>>>>>>> >>
>>>>>>>> >> With this configuration Synapse will stream the data directly from the
>>>>>>>> >> incoming to the outgoing transport without storing it in memory or in
>>>>>>>> >> a temporary file. Note that this has two other side effects:
>>>>>>>> >> * The incoming file (or connection in case of a remote file) will only
>>>>>>>> >> be opened on demand. In this case this happens during execution of the
>>>>>>>> >> <send> mediator.
>>>>>>>> >> * If during the mediation the content of the file is needed several
>>>>>>>> >> times (which is not the case in your example), it will be read several
>>>>>>>> >> times. The reason is of course that the content is not cached.
>>>>>>>> >>
>>>>>>>> >> I tested the solution with a 2GB file and it worked fine. The
>>>>>>>> >> performance of the implementation is not yet optimal, but at least the
>>>>>>>> >> memory consumption is constant.
>>>>>>>> >>
>>>>>>>> >> Some additional comments:
>>>>>>>> >> * The transport.vfs.Streaming property has no impact on XML and SOAP
>>>>>>>> >> processing: this type of content is processed exactly as before.
>>>>>>>> >> * With the changes described here, we have now two different policies
>>>>>>>> >> for plain text and binary content processing: in-memory caching + no
>>>>>>>> >> streaming (transport.vfs.Streaming=false) and no caching + deferred
>>>>>>>> >> connection + streaming (transport.vfs.Streaming=true). Probably we
>>>>>>>> >> should define a wider range of policies in the future, including file
>>>>>>>> >> system caching + streaming.
>>>>>>>> >> * It is necessary to remove the transportNonBlocking property
>>>>>>>> >> (MessageContext.TRANSPORT_NON_BLOCKING) to prevent the <send> mediator
>>>>>>>> >> (more precisely the OperationClient) from executing the outgoing
>>>>>>>> >> transport in a separate thread. This property is set by the incoming
>>>>>>>> >> transport. I think this is a bug since I don't see any valid reason
>>>>>>>> >> why the transport that handles the incoming request should determine
>>>>>>>> >> the threading behavior of the transport that sends the outgoing
>>>>>>>> >> request to the target service. Maybe Asankha can comment on this?
>>>>>>>> >>
>>>>>>>> >> Andreas
>>>>>>>> >>
>>>>>>>> >> On Thu, Mar 5, 2009 at 07:21, kimhorn <kim.horn@icsglobal.net> wrote:
>>>>>>>> >> >
>>>>>>>> >> > That's good, as this stops us using Synapse.
>>>>>>>> >> >
>>>>>>>> >> >
>>>>>>>> >> >
>>>>>>>> >> > Asankha C. Perera wrote:
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >>> Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError: Java heap space
>>>>>>>> >> >>>         at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
>>>>>>>> >> >>>         at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
>>>>>>>> >> >>>         at java.lang.StringBuffer.append(StringBuffer.java:307)
>>>>>>>> >> >>>         at java.io.StringWriter.write(StringWriter.java:72)
>>>>>>>> >> >>>         at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
>>>>>>>> >> >>>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
>>>>>>>> >> >>>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
>>>>>>>> >> >>>         at org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
>>>>>>>> >> >>>         at org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
>>>>>>>> >> >>>
>>>>>>>> >> >> Since the content type is text, the plain text formatter is trying to
>>>>>>>> >> >> use a String to parse, as I see it, which is a problem for large content.
>>>>>>>> >> >>
>>>>>>>> >> >> A definite bug we need to fix ..
>>>>>>>> >> >>
>>>>>>>> >> >> cheers
>>>>>>>> >> >> asankha
>>>>>>>> >> >>
>>>>>>>> >> >> --
>>>>>>>> >> >> Asankha C. Perera
>>>>>>>> >> >> AdroitLogic, http://adroitlogic.org
>>>>>>>> >> >>
>>>>>>>> >> >> http://esbmagic.blogspot.com
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>> >> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>>> >> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >
>>>>>>>> >> > --
>>>>>>>> >> > View this message in context:
>>>>>>>> >> > http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22345904.html
>>>>>>>> >> > Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>>>>> >> >
>>>>>>>> >>
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > --
>>>>>>>> > Ruwan Linton
>>>>>>>> > http://wso2.org - "Oxygenating the Web Services Platform"
>>>>>>>> > http://ruwansblog.blogspot.com/
>>>>>>>> >
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Ruwan Linton
>>>>>>> http://wso2.org - "Oxygenating the Web Services Platform"
>>>>>>> http://ruwansblog.blogspot.com/
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> View this message in context:
>>>>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22405973.html
>>>>>> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>>>
>>>>>
>>>>
>>>
>>> --
>>> View this message in context: http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22594321.html
>>> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>
>>
>
