synapse-dev mailing list archives

From kimhorn <kim.h...@icsglobal.net>
Subject Re: VFS - Synapse Memory Leak
Date Thu, 19 Mar 2009 06:20:28 GMT

Using the last stable build, from 15 March 2009, I still get exactly the same
behaviour as originally described with the above script: VFS still just dies.
Would your fixes be in this build?

Andreas Veithen-2 wrote:
> 
> I committed the code and it will be available in the next WS-Commons
> transport build. The methods are located in
> org.apache.axis2.format.ElementHelper in the axis2-transport-base
> module.
> 
> Andreas
> 
> On Thu, Mar 12, 2009 at 00:06, Kim Horn <kim.horn@icsglobal.net> wrote:
>> Hello Andreas,
>> This is great and really helps. I have not had time to try it out yet but
>> will soon.
>>
>> Contributing the java.io.Reader would be a great help but it will take me
>> a while to get up to speed to do the Synapse iterator.
>>
>> In the short term I am going to use a brute-force approach that is now
>> feasible given that the memory issue is resolved; I only thought of this
>> today. Use a VFS proxy to FTP the file to a local directory, so streaming
>> helps here. A POJOCommand on <out> then splits the file into another
>> directory, streaming in and out. Another, independent VFS proxy watches
>> that directory and submits each file to the Web service. Hopefully memory
>> will be fine. Overloading the destination may still be an issue?
>>
>> Kim
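
The splitting step described above could look roughly like the following
minimal sketch. It assumes one record per line and a byte budget per output
file. The class name FileSplitter, the ".partN" file naming and the size limit
are illustrative assumptions and are not part of Synapse or of the POJOCommand
API.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: splits a line-oriented text file into smaller files. */
class FileSplitter {

    /**
     * Streams source line by line into files named <name>.part0, <name>.part1, ...
     * inside targetDir. A new part is started before a line would push the current
     * part past maxBytes; a single line larger than maxBytes gets a part of its own.
     * Returns the number of parts written.
     */
    static int split(Path source, Path targetDir, long maxBytes) throws IOException {
        Files.createDirectories(targetDir);
        int parts = 0;
        long written = 0;
        BufferedWriter out = null;
        try (BufferedReader in = Files.newBufferedReader(source, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                // Size of the record plus the CR LF terminator re-added on output.
                long size = line.getBytes(StandardCharsets.UTF_8).length + 2;
                if (out == null || (written > 0 && written + size > maxBytes)) {
                    if (out != null) {
                        out.close();
                    }
                    Path part = targetDir.resolve(source.getFileName() + ".part" + parts++);
                    out = Files.newBufferedWriter(part, StandardCharsets.UTF_8);
                    written = 0;
                }
                out.write(line);
                out.write("\r\n");
                written += size;
            }
        } finally {
            if (out != null) {
                out.close();
            }
        }
        return parts;
    }
}
```

Only one line is held in memory at a time, so the heap usage does not depend
on the size of the input file. If a second VFS proxy polls the target
directory, it may be prudent to write each part under a temporary name and
rename it once it is closed, so that a half-written part is never picked up.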
>>
>>
>>
>> -----Original Message-----
>> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
>> Sent: Monday, 9 March 2009 10:55 PM
>> To: dev@synapse.apache.org
>> Subject: Re: VFS - Synapse Memory Leak
>>
>> The changes I did in the VFS transport and the message builders for
>> text/plain and application/octet-stream certainly don't provide an
>> out-of-the-box solution for your use case, but they are the
>> prerequisite.
>>
>> Concerning your first proposed solution (let the VFS write the content
>> to a temporary file), I don't like this because it would create a
>> tight coupling between the VFS transport and the mediator. A design
>> goal should be that the solution will still work if the file comes
>> from another source, e.g. an attachment in an MTOM or SwA message.
>>
>> I think that an all-Synapse solution (2 or 3) should be possible, but
>> this will require development of a custom mediator. This mediator
>> would read the content, split it up (and store the chunks in memory or
>> on disk) and execute a sub-sequence for each chunk. The execution of
>> the sub-sequence would happen synchronously to limit the memory/disk
>> space consumption (to the maximum chunk size) and to avoid flooding
>> the destination service.
>>
>> Note that it is probably not possible to implement the mediator
>> using a script because of the problematic String handling. Also,
>> Spring, POJO and class mediators don't support sub-sequences (I
>> think). Therefore it should be implemented as a full-featured Java
>> mediator, probably taking the existing iterate mediator as a template.
>> I can contribute the required code to get the text content in the form
>> of a java.io.Reader.
>>
>> Regards,
>>
>> Andreas
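
The chunking loop of such a mediator might be sketched as follows. This is
only an illustration under assumptions: ChunkingReader and forEachChunk are
hypothetical names, the Consumer stands in for the sub-sequence, and no
Synapse mediator API is used. It assumes Java 8 or later and line-oriented
text delivered through a java.io.Reader.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.function.Consumer;

/** Hypothetical sketch of the chunking loop a custom mediator might run. */
class ChunkingReader {

    /**
     * Reads line-oriented text from content and passes chunks of at most
     * maxChars characters to handler (a stand-in for the sub-sequence).
     * A single line longer than maxChars becomes a chunk of its own.
     * The handler is called synchronously, so only one chunk is buffered
     * at a time. Returns the number of chunks handed over.
     */
    static int forEachChunk(Reader content, int maxChars, Consumer<String> handler)
            throws IOException {
        BufferedReader in = new BufferedReader(content);
        StringBuilder chunk = new StringBuilder();
        int chunks = 0;
        String line;
        while ((line = in.readLine()) != null) {
            // Flush the current chunk before this record would exceed the limit.
            if (chunk.length() > 0 && chunk.length() + line.length() + 2 > maxChars) {
                handler.accept(chunk.toString());
                chunks++;
                chunk.setLength(0);
            }
            chunk.append(line).append("\r\n");
        }
        if (chunk.length() > 0) {
            handler.accept(chunk.toString());
            chunks++;
        }
        return chunks;
    }
}
```

Because the next chunk is not read until the handler returns, memory use is
bounded by roughly one chunk and the destination never sees more than one
request at a time.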
>>
>> On Mon, Mar 9, 2009 at 03:05, kimhorn <kim.horn@icsglobal.net> wrote:
>>>
>>> Although this is a good feature, it may not solve the actual problem.
>>> The first issue on my list was the memory leak.
>>> However, the real problem is that once I get these massive files I have to
>>> send them to a web service that can only take them in small chunks (about
>>> 14MB). Streaming a file straight out would just kill the destination web
>>> service: it would get the memory error. The text document can be split
>>> apart easily, as it has independent records on each line separated by
>>> <CR> <LF>.
>>>
>>> In an earlier post, which was not responded to, I mentioned:
>>>
>>> "Otherwise; for large EDI files a VFS iterator Mediator that streams
>>> through input file and outputs smaller chunks for processing, in Synapse,
>>> may be a solution ? "
>>>
>>> So I had mentioned a few solutions in prior posts. The solutions now are:
>>>
>>> 1) VFS writes straight to a temporary file, then a Java mediator processes
>>> the file by splitting it into many smaller files. These files then trigger
>>> another VFS proxy that submits them to the final web service.
>>> The problem is that it uses the file system (not so bad).
>>> 2) A Java mediator takes the <text> package and splits it up by wrapping it
>>> into many XML <data> elements that can then be acted on by a Synapse
>>> Iterator. So replace the text message with many smaller XML elements.
>>> The problem is that this loads the whole message into memory.
>>> 3) Create another Iterator in Synapse that works on a regular expression
>>> (to split the text data) or actually uses a for-loop approach to chop the
>>> file into chunks based on the loop index value. E.g. index = 23 means a 14K
>>> chunk 23 chunks into the data.
>>> 4) Using the approach proposed now: just submit the file straight (stream
>>> it) to another web service that chops it up. It may return an XML document
>>> with many sub-elements that allows the standard Iterator to work. Similar
>>> to (2) but using another service rather than Java to split the document.
>>> 5) Using the approach proposed now: just submit the file straight (stream
>>> it) to another web service that chops it up but calls a Synapse proxy with
>>> each small packet of data, which then forwards it to the final web
>>> service. So the web service iterates across the data, and not Synapse.
>>>
>>> Then other solutions replace Synapse with a stand-alone Java program at
>>> the front end.
>>>
>>> Another issue here is throttling: splitting the file is one issue, but
>>> submitting hundreds of calls in parallel to the destination service would
>>> result in timeouts. So we need to work in throttling.
>>>
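
One way to picture the throttling requirement is a cap on the number of
requests in flight. The sketch below is an assumption-laden illustration, not
a Synapse feature: ThrottledSubmitter, submitAll and maxInFlight are
hypothetical names, and the Consumer stands in for the call to the destination
service. It assumes Java 8 or later.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical throttle: at most maxInFlight chunks are being sent at any moment. */
class ThrottledSubmitter {

    static void submitAll(List<String> chunks, int maxInFlight, Consumer<String> sender)
            throws InterruptedException {
        Semaphore permits = new Semaphore(maxInFlight);
        ExecutorService pool = Executors.newFixedThreadPool(maxInFlight);
        try {
            for (String chunk : chunks) {
                // Blocks the producer while maxInFlight sends are already running,
                // so pending work does not pile up in the executor queue.
                permits.acquire();
                pool.execute(() -> {
                    try {
                        sender.accept(chunk); // stand-in for the call to the destination
                    } finally {
                        permits.release();
                    }
                });
            }
        } finally {
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.HOURS);
        }
    }
}
```

With maxInFlight set to 1 this degenerates to sending the chunks one after
another, which is the most conservative setting for a fragile destination.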
>>> Ruwan Linton wrote:
>>>>
>>>> I agree and can understand the time factor, and also +1 for reusing
>>>> existing code rather than trying to reinvent the wheel :-)
>>>>
>>>> Thanks,
>>>> Ruwan
>>>>
>>>> On Sun, Mar 8, 2009 at 4:08 PM, Andreas Veithen
>>>> <andreas.veithen@gmail.com> wrote:
>>>>
>>>>> Ruwan,
>>>>>
>>>>> It's not a question of possibility, it is a question of available time
>>>>> :-)
>>>>>
>>>>> Also note that some of the features that we might want to implement
>>>>> have some similarities with what is done for attachments in Axiom
>>>>> (except that an attachment is only available once, while a file over
>>>>> VFS can be read several times). I think there is also some existing
>>>>> code in Axis2 that might be useful. We should not reimplement these
>>>>> things but try to make the existing code reusable. This however is
>>>>> only realistic for the next release after 1.3.
>>>>>
>>>>> Andreas
>>>>>
>>>>> On Sun, Mar 8, 2009 at 03:47, Ruwan Linton <ruwan.linton@gmail.com>
>>>>> wrote:
>>>>> > Andreas,
>>>>> >
>>>>> > Can we have caching on the file system as a property, to support
>>>>> > multiple layers touching the full message, and is it possible to
>>>>> > specify a threshold for streaming? For example, if the message is
>>>>> > touched several times we might still need streaming, but not for
>>>>> > files of 100KB or less.
>>>>> >
>>>>> > Thanks,
>>>>> > Ruwan
>>>>> >
>>>>> > On Sun, Mar 8, 2009 at 1:12 AM, Andreas Veithen
>>>>> > <andreas.veithen@gmail.com> wrote:
>>>>> >>
>>>>> >> I've done an initial implementation of this feature. It is available
>>>>> >> in trunk and should be included in the next nightly build. In order to
>>>>> >> enable this in your configuration, you need to add the following
>>>>> >> property to the proxy:
>>>>> >>
>>>>> >> <parameter name="transport.vfs.Streaming">true</parameter>
>>>>> >>
>>>>> >> You also need to add the following mediators just before the <send>
>>>>> >> mediator:
>>>>> >>
>>>>> >> <property action="remove" name="transportNonBlocking" scope="axis2"/>
>>>>> >> <property action="set" name="OUT_ONLY" value="true"/>
>>>>> >>
>>>>> >> With this configuration Synapse will stream the data directly from the
>>>>> >> incoming to the outgoing transport without storing it in memory or in
>>>>> >> a temporary file. Note that this has two other side effects:
>>>>> >> * The incoming file (or connection in case of a remote file) will only
>>>>> >> be opened on demand. In this case this happens during execution of the
>>>>> >> <send> mediator.
>>>>> >> * If during the mediation the content of the file is needed several
>>>>> >> times (which is not the case in your example), it will be read several
>>>>> >> times. The reason is of course that the content is not cached.
>>>>> >>
>>>>> >> I tested the solution with a 2GB file and it worked fine. The
>>>>> >> performance of the implementation is not yet optimal, but at least the
>>>>> >> memory consumption is constant.
>>>>> >>
>>>>> >> Some additional comments:
>>>>> >> * The transport.vfs.Streaming property has no impact on XML and SOAP
>>>>> >> processing: this type of content is processed exactly as before.
>>>>> >> * With the changes described here, we have now two different policies
>>>>> >> for plain text and binary content processing: in-memory caching + no
>>>>> >> streaming (transport.vfs.Streaming=false) and no caching + deferred
>>>>> >> connection + streaming (transport.vfs.Streaming=true). Probably we
>>>>> >> should define a wider range of policies in the future, including file
>>>>> >> system caching + streaming.
>>>>> >> * It is necessary to remove the transportNonBlocking property
>>>>> >> (MessageContext.TRANSPORT_NON_BLOCKING) to prevent the <send> mediator
>>>>> >> (more precisely the OperationClient) from executing the outgoing
>>>>> >> transport in a separate thread. This property is set by the incoming
>>>>> >> transport. I think this is a bug since I don't see any valid reason
>>>>> >> why the transport that handles the incoming request should determine
>>>>> >> the threading behavior of the transport that sends the outgoing
>>>>> >> request to the target service. Maybe Asankha can comment on this?
>>>>> >>
>>>>> >> Andreas
>>>>> >>
>>>>> >> On Thu, Mar 5, 2009 at 07:21, kimhorn <kim.horn@icsglobal.net>
>>>>> wrote:
>>>>> >> >
>>>>> >> > That's good, as this currently stops us from using Synapse.
>>>>> >> >
>>>>> >> >
>>>>> >> >
>>>>> >> > Asankha C. Perera wrote:
>>>>> >> >>
>>>>> >> >>
>>>>> >> >>> Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError: Java heap space
>>>>> >> >>>         at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
>>>>> >> >>>         at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
>>>>> >> >>>         at java.lang.StringBuffer.append(StringBuffer.java:307)
>>>>> >> >>>         at java.io.StringWriter.write(StringWriter.java:72)
>>>>> >> >>>         at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
>>>>> >> >>>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
>>>>> >> >>>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
>>>>> >> >>>         at org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
>>>>> >> >>>         at org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
>>>>> >> >>>
>>>>> >> >> Since the content type is text, the plain text formatter is trying to
>>>>> >> >> use a String to parse as I see.. which is a problem for large content..
>>>>> >> >>
>>>>> >> >> A definite bug we need to fix ..
>>>>> >> >>
>>>>> >> >> cheers
>>>>> >> >> asankha
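
The stack trace above shows the whole payload being accumulated in a
StringWriter via IOUtils.toString, so the heap needed grows with the file
size. For contrast, here is a minimal sketch of a copy loop whose memory use
stays constant. StreamCopy is a hypothetical name used for illustration only;
this is not the fix that went into Synapse.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper showing a constant-memory copy between two streams. */
class StreamCopy {

    /** Copies in to out through one fixed-size buffer; returns the bytes copied. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192]; // the only buffer, whatever the input size
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }
}
```

Commons IO's IOUtils.copy(InputStream, OutputStream) works along the same
lines; the problem in the trace is the String target, not the copy loop.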
>>>>> >> >>
>>>>> >> >> --
>>>>> >> >> Asankha C. Perera
>>>>> >> >> AdroitLogic, http://adroitlogic.org
>>>>> >> >>
>>>>> >> >> http://esbmagic.blogspot.com
>>>>> >> >>
>>>>> >> >>
>>>>> >> >>
>>>>> >> >>
>>>>> >> >>
>>>>> >> >>
>>>>> ---------------------------------------------------------------------
>>>>> >> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>> >> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>> >> >>
>>>>> >> >>
>>>>> >> >>
>>>>> >> >
>>>>> >> > --
>>>>> >> > View this message in context:
>>>>> >> >
>>>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22345904.html
>>>>> >> > Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>> >> >
>>>>> >> >
>>>>> >> >
>>>>> >> >
>>>>> >> >
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> > Ruwan Linton
>>>>> > http://wso2.org - "Oxygenating the Web Services Platform"
>>>>> > http://ruwansblog.blogspot.com/
>>>>> >
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Ruwan Linton
>>>> http://wso2.org - "Oxygenating the Web Services Platform"
>>>> http://ruwansblog.blogspot.com/
>>>>
>>>>
>>>
>>> --
>>> View this message in context:
>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22405973.html
>>> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>
>>>
>>>
>>>
>>
>>
>>
> 
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22594321.html
Sent from the Synapse - Dev mailing list archive at Nabble.com.


