jena-dev mailing list archives

From Paolo Castagna
Subject Inference with MapReduce (a la RIOT infer command)
Date Tue, 01 Nov 2011 15:08:32 GMT
I just want to share an approach to doing inference a la the RIOT infer
command-line tool, but faster (i.e. using MapReduce).

I've done only limited testing, but it should work. It's quite simple: it
is just a map-only job.

The driver is [1] and the map function is
[2]. Now, I am interested in what parts of OWL can be done in a similar way.
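
To make the "map-only" point concrete, here is a minimal sketch of the idea.
It is not the code at [1] or [2]. It assumes the RDFS subset that, as far as
I understand it, RIOT infer covers (rdfs:subClassOf, rdfs:subPropertyOf,
rdfs:domain, rdfs:range), and it assumes the vocabulary is small enough to
be loaded in memory on every mapper. The class RdfsMapSketch and its methods
are hypothetical names, triples are plain strings rather than Jena nodes,
and there is no check that a range is not applied to a literal. Because each
output depends only on one input triple plus the vocabulary, no reduce phase
is needed.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: the per-triple step a Hadoop Mapper could call. */
class RdfsMapSketch {
    static final String TYPE = "rdf:type";

    // Vocabulary, assumed to fit in memory on every mapper.
    private final Map<String, Set<String>> subClassOf;    // class -> direct superclasses
    private final Map<String, Set<String>> subPropertyOf; // property -> direct superproperties
    private final Map<String, Set<String>> domain;        // property -> domain classes
    private final Map<String, Set<String>> range;         // property -> range classes

    RdfsMapSketch(Map<String, Set<String>> subClassOf,
                  Map<String, Set<String>> subPropertyOf,
                  Map<String, Set<String>> domain,
                  Map<String, Set<String>> range) {
        this.subClassOf = subClassOf;
        this.subPropertyOf = subPropertyOf;
        this.domain = domain;
        this.range = range;
    }

    /** Returns x plus everything reachable from x through the direct edges. */
    static Set<String> closure(String x, Map<String, Set<String>> edges) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.push(x);
        while (!todo.isEmpty()) {
            String n = todo.pop();
            if (seen.add(n)) {
                for (String m : edges.getOrDefault(n, Set.of())) {
                    todo.push(m);
                }
            }
        }
        return seen;
    }

    /** Adds "node rdf:type c" for cls and all of its superclasses. */
    private void emitTypes(String node, String cls, Set<String> out) {
        for (String c : closure(cls, subClassOf)) {
            out.add(node + " " + TYPE + " " + c);
        }
    }

    /** Maps one input triple to itself plus its RDFS consequences. */
    Set<String> map(String s, String p, String o) {
        Set<String> out = new LinkedHashSet<>();
        out.add(s + " " + p + " " + o);
        // p itself and every superproperty of p
        for (String q : closure(p, subPropertyOf)) {
            out.add(s + " " + q + " " + o);
            for (String c : domain.getOrDefault(q, Set.of())) {
                emitTypes(s, c, out);
            }
            for (String c : range.getOrDefault(q, Set.of())) {
                emitTypes(o, c, out);
            }
        }
        // "s rdf:type C" also implies every superclass of C
        if (p.equals(TYPE)) {
            emitTypes(s, o, out);
        }
        return out;
    }

    public static void main(String[] args) {
        RdfsMapSketch m = new RdfsMapSketch(
            Map.of("ex:Dog", Set.of("ex:Animal")),
            Map.of("ex:hasPet", Set.of("ex:knows")),
            Map.of("ex:hasPet", Set.of("ex:Person")),
            Map.of("ex:hasPet", Set.of("ex:Dog")));
        // Print the input triple followed by everything inferred from it.
        m.map("ex:alice", "ex:hasPet", "ex:rex").forEach(System.out::println);
    }
}
```

In a real job the map() method above would be called from the Hadoop
Mapper for each parsed input triple, with the number of reduce tasks set to
zero. Rules that need to join two instance triples cannot be handled this
way, which is one reason the OWL question is harder.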

Compared to other (very interesting) approaches (for example: [3]), this
is extremely simple, and that simplicity is a very big plus in practice.
It also satisfies a lot of use cases.

The next step is: how do I do the same when I receive a (typically small)
update? How do I intercept the update?
What if the update deletes stuff, where "stuff" is either 1) vocabulary data
or 2) instance data? I think 2) is more likely to happen in practice.


I've been using Apache Whirr to test this and it works perfectly with small
Hadoop clusters (i.e. < 10 nodes). Unfortunately, I am having issues with
larger clusters (i.e. > 20 nodes) [4]. Apache Whirr just came out of
incubation and it's a really great project. I really recommend you look at
it if you ever need to have a Hadoop cluster running on EC2. Whirr is also
not limited to Hadoop.

