logging-log4j-user mailing list archives

From Tim Watts <...@cliftonfarm.org>
Subject [OT] Re: logging parallel threads
Date Tue, 16 Oct 2012 16:41:28 GMT
First, remember this is the Log4J Users List, not a forum for working
out design issues related to storing data as XML.

But let's back up for a second.  I'm assuming that:
     1. you have a set of activities/tasks (possibly nested?) and you're
        interested in recording the state of these in an XML file (e.g.
        where is it in its life cycle? was it successful? details about
        the task parameters etc.)  
     2. you're NOT really interested in recording the detailed events
        produced by the activity while it executes.

Or maybe you need both?  Log4J WOULD be a good choice for #2 but not for
#1.  And if you need to store the logging output with the state data,
that's doable, but storing all of it in a single XML file is probably
not the way to go.

If my original assumptions are correct, I can offer some general advice.
But I think you should then do further exploration on other forums.

I would say "XML => DOM => {domEdits} => XML" would be wildly
inefficient for a system that has to support frequent, parallel updates.
(Yes, DOM generally keeps the whole document in memory.)  Depending on
the scale of the system you could do "XML => DOM" at initialization,
keeping the model in memory, and then "{domEdits} => XML" on each
update.  But DOM itself is pretty inefficient for that scenario too.
You might want to consider using JAXB instead:

http://docs.oracle.com/javase/7/docs/technotes/guides/xml/jaxb/index.html

This assumes that your app is the only system that does updates to the
data.
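
To make the "XML => DOM at initialization, {domEdits} => XML on each
update" variant concrete, here is a minimal sketch that uses only the DOM
and transformer classes bundled with the JDK (not JAXB).  The class name
ActivityStore, the <activities>/<task> element names and the id/status
attributes are all made up for illustration; they are assumptions, not
anything taken from your application.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical holder for the in-memory model; all names are illustrative.
class ActivityStore {
    private final File xmlFile;
    private final Document doc;

    // "XML => DOM" happens once, at initialization.
    ActivityStore(File xmlFile) throws Exception {
        this.xmlFile = xmlFile;
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        if (xmlFile.length() > 0) {
            doc = builder.parse(xmlFile);
        } else {
            // Missing or empty file: start with an empty <activities> root.
            doc = builder.newDocument();
            doc.appendChild(doc.createElement("activities"));
        }
    }

    // "{domEdits} => XML" on each update.  synchronized makes parallel
    // tasks take turns, so the file is always written from a consistent DOM.
    synchronized void setStatus(String taskId, String status) throws Exception {
        Element task = findTask(taskId);
        if (task == null) {
            task = doc.createElement("task");
            task.setAttribute("id", taskId);
            doc.getDocumentElement().appendChild(task);
        }
        task.setAttribute("status", status);
        write();
    }

    // Returns the recorded status, or null if the task is unknown.
    synchronized String getStatus(String taskId) {
        Element task = findTask(taskId);
        return task == null ? null : task.getAttribute("status");
    }

    private Element findTask(String taskId) {
        NodeList tasks = doc.getElementsByTagName("task");
        for (int i = 0; i < tasks.getLength(); i++) {
            Element e = (Element) tasks.item(i);
            if (taskId.equals(e.getAttribute("id"))) {
                return e;
            }
        }
        return null;
    }

    // Serializes the whole document; this is the expensive part.
    private void write() throws Exception {
        try (OutputStream out = new FileOutputStream(xmlFile)) {
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(doc), new StreamResult(out));
        }
    }

    public static void main(String[] args) throws Exception {
        File f = File.createTempFile("activities", ".xml");
        ActivityStore store = new ActivityStore(f);
        store.setStatus("task-1", "begin");
        store.setStatus("task-1", "end");
        System.out.println(store.getStatus("task-1"));
    }
}
```

Note that every update still rewrites the entire file and all tasks queue
up on one lock, which is exactly why this only suits small documents.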

If you're talking about a very large system where holding everything in
memory would be prohibitive (e.g. the "activities" have to be kept
around long after they've completed) or where multiple systems need to
update the data, then you really need to consider using a database for
storage.  In that case, something like JPA might be appropriate.  See
http://openjpa.apache.org/ .  JPA doesn't provide a database per se but
provides object storage backed by a database.

Alternatively, you could use the filesystem as your database and persist
each "activity" to its own XML file using JAXB or DOM.
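
A rough sketch of the one-file-per-activity idea, again using the JDK's
DOM classes.  ActivityFiles, save() and the <activity id=... status=...>
layout are hypothetical names chosen for the example.  Writing to a
temporary file and then moving it into place is one common way to keep a
reader from seeing a half-written file; treat it as an assumption about
what your consumers need, not a requirement.

```java
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: one small XML file per activity.
class ActivityFiles {

    // Writes <dir>/<activityId>.xml and returns its path.
    static Path save(Path dir, String activityId, String status) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element activity = doc.createElement("activity");
        activity.setAttribute("id", activityId);
        activity.setAttribute("status", status);
        doc.appendChild(activity);

        Path target = dir.resolve(activityId + ".xml");
        // Write the new content next to the target first ...
        Path tmp = Files.createTempFile(dir, activityId, ".tmp");
        try (OutputStream out = Files.newOutputStream(tmp)) {
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(doc), new StreamResult(out));
        }
        // ... then swap it in with a single move.  Whether an existing
        // target is replaced by ATOMIC_MOVE is platform-specific; it is
        // replaced on typical POSIX filesystems.
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        return target;
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("activities");
        save(dir, "a1", "begin");
        Path file = save(dir, "a1", "end");
        System.out.println(new String(Files.readAllBytes(file), "UTF-8"));
    }
}
```

Because each task touches only its own file, parallel tasks no longer
contend for a single document.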

It's not clear to me why you're wedded to a single XML file as the
storage medium.  What consumes the XML?  What kind of operations do they
perform on it?  What would happen when a consumer starts reading the
file mid-update?  Maybe you've already thought that through but it seems
to me that a single XML file is not the best storage choice for a system
that has high concurrency requirements.

As for storing logging output with the state data, here are a couple of
approaches to consider:

Create a distinct Logger for each task configured with a WriterAppender
connected to a java.io.StringWriter.  Persist the (in memory) logging
output with the state data whenever the state object (Activity) is
updated.  Expensive operation!

Write the logging output to a unique file for each task.  Then either
reference the file in the state data or suck the whole log into the
state object once the task is complete.

If all of this is getting too complicated you could always just use a
simple log file and build the XML file by reading the log.  For example,
if the XML is just for reporting then whenever a report is requested:
        if logFile is newer than xmlFile
            rebuildXmlFile
        generateReport
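
In Java the "is newer" test above could look something like the sketch
below.  ReportCache and needsRebuild() are hypothetical names, and
treating a missing XML file as stale is my own assumption.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for the "rebuild only when stale" check.
class ReportCache {

    // True if xmlFile is missing or older than logFile.
    static boolean needsRebuild(Path logFile, Path xmlFile) throws IOException {
        if (!Files.exists(xmlFile)) {
            return true;
        }
        return Files.getLastModifiedTime(logFile)
                .compareTo(Files.getLastModifiedTime(xmlFile)) > 0;
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("tasks", ".log");
        Path xml = log.resolveSibling(log.getFileName() + ".xml");
        // The XML file does not exist yet, so a rebuild is needed.
        System.out.println(needsRebuild(log, xml));
    }
}
```

Timestamp resolution varies by filesystem, so a log write and a rebuild
that happen very close together may compare as equal.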

Hope this helps.  Good Luck!


On Tue, 2012-10-16 at 05:43 -0700, pa7751 wrote:
> Fine, so if we go with the assumption that log4j is an overkill, then what
> could be the best way to do this. We have so many reads and writes happening
> to the same xml file, so is it ok to have a DOM parser like xerces, getting
> the whole xml DOM in memory, then make updates at every task and then
> persist at every task, then again read and update and so on
> 
> 
> 
> Tim Watts-3 wrote:
> > 
> > Again, this isn't a logging problem you're describing.  You should be
> > looking to some sort of database for solutions.  You could probably
> > torture log4j into doing some of this but there's no advantage to it.
> > There are better suited tools for the problem you describe.
> > 
> > 
> > On Mon, 2012-10-15 at 00:43 -0700, pa7751 wrote:
> >> Hi
> >> 
> >> Basically I have an xml document that contains many tasks. As each of
> >> these tasks gets executed, I log in a file (which is again an xml)
> >> whether the task got successfully completed. So the first question is
> >> that whether log4j can be used for this?
> >> 
> >> Next since these tasks can also be executed in parallel by the engine, I
> >> want to know how logging can be done to a single log file.
> >> 
> >> Third, since for each of the tasks we will be recording a status like
> >> 'begin', 'end' to signify that a thread started or a thread completed
> >> execution or a thread that started but did not complete execution maybe
> >> due to a sudden power failure etc. Since this will be done for every
> >> task, there will be many edits happening to the log file. Then how can
> >> we ensure that the structure of the xml remains correct
> >> 
> >> Fourth, considering the many edits/inserts happening in the xml, how can
> >> we get a good performance?
> >> 
> >> 
> >> Tim Watts-3 wrote:
> >> > 
> >> > On Sat, 2012-10-13 at 05:55 -0700, pa7751 wrote:
> >> >> Hi
> >> >> 
> >> >> I need some help to know if log4j can be used in the following
> >> >> scenarios and how
> >> >> 
> >> >> 1. In my application there are many small tasks that I need to do,
> >> >> say for e.g., 100. At every task, I need to log to an xml file, the
> >> >> structure of which I have to define. Is this possible using log4j?
> >> >> 
> >> >> 2. Since every task has to write to the same log file, and many of
> >> >> these tasks could be executing in parallel, how can I ensure the
> >> >> integrity of the structure of the xml log file i.e. if I have an
> >> >> <activity> tag and activity can have more activities or tasks, then
> >> >> can I ensure that tasks get added/deleted only to the activity that
> >> >> I suggest for parallel running tasks?
> >> >> 
> >> >> 3. Considering the multiple updates happening to the xml file i.e. at
> >> >> every task, the xml file will be edited so does that mean that the
> >> >> DOM is created in memory for every task. So how will performance be
> >> >> impacted? Is there any way we can get good performance?
> >> > 
> >> > Presumably you're targeting log4j 1.2.x.
> >> > 
> >> > Are you thinking of using o.a.log4j.xml.XMLLayout?  Doesn't sound like
> >> > it.
> >> > 
> >> > You write about adding & deleting activities, and editing the xml
> >> > output.  This doesn't sound like a logging problem.  It sounds like
> >> > you're trying to capture "current state" not "a history of events".
> >> > The latter fits a logging problem, the former does not.  So I would say
> >> > log4j is probably not the right tool given your problem description.
> >> > Maybe if you restructure the problem it could simplify things for you?
> >> > 
> >> > That said, log4j can support multi-threaded /logging/ efficiently.
> >> > 
> >> > 
> >> > 
> >> >  
> >> > 
> >> 
> > 
> > 
> >  
> > 
> 

