mahout-user mailing list archives

From Dmitriy Lyubimov <dlie...@gmail.com>
Subject Re: DBSCAN implementation in Mahout
Date Tue, 02 Dec 2014 18:08:53 GMT
Correction: MR.SCAN is the Univ. of Wisconsin's paper. Google Beijing had
another paper on the subject, but I found MR.SCAN to have a bit more elegant
simplicity to it.
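
For what it's worth, the "partitioning overlap as a function of epsilon
neighborhood" point from the quoted message below can be sketched roughly as
follows. This is only an illustration, not MR.SCAN's actual scheme: it
assumes 2-D points, equal-width vertical strips along x, and made-up names
(partition_with_halo, "owned", "halo").

```python
import math

def partition_with_halo(points, num_strips, eps):
    """Split 2-D points into vertical strips along x (hypothetical helper).

    Each strip "owns" the points inside its x-range and additionally
    receives a "halo": copies of points owned by other strips that lie
    within eps of its x-range. That halo is the partition overlap.
    """
    xs = [p[0] for p in points]
    x_min, x_max = min(xs), max(xs)
    # Fall back to an arbitrary width if all x values coincide.
    width = (x_max - x_min) / num_strips or 1.0
    strips = [{"owned": [], "halo": []} for _ in range(num_strips)]
    for p in points:
        # The strip whose x-range contains p (last strip includes x_max).
        owner = min(int((p[0] - x_min) / width), num_strips - 1)
        strips[owner]["owned"].append(p)
        # Every strip whose x-range comes within eps of p also needs a copy.
        first = max(math.floor((p[0] - eps - x_min) / width), 0)
        last = min(math.floor((p[0] + eps - x_min) / width), num_strips - 1)
        for i in range(first, last + 1):
            if i != owner:
                strips[i]["halo"].append(p)
    return strips
```

With this layout each subtask can answer epsilon-neighborhood queries for the
points it owns without talking to other subtasks. Clusters that touch halo
points would still have to be reconciled across strips afterwards, which is
the part that keeps the problem from being embarrassingly parallel.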

On Mon, Dec 1, 2014 at 12:41 PM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:

> If memory serves me, DeLiClu (density-link) is the current best density-based
> method, since it does not require parameter searches.
>
> What parallelization strategy are you proposing?
>
> I know there were a bunch of attempts to parallelize/partition the DBSCAN
> problem. One of the more interesting is perhaps Google's MR.SCAN paper, but
> even that one is not quite embarrassingly parallel (it requires partitioning
> overlap between subtasks, which is a function of the epsilon neighborhood).
> Nevertheless, it seemed to yield significantly interesting performance.
>
> Also, the MR version of Mahout has (or used to have) mean shift, which is just
> as good, if not better, for irregularly-shaped density clustering. I am not
> sure of its performance, though. Translating these into Spark would perhaps be
> interesting enough.
>
>
>
> On Sat, Nov 29, 2014 at 12:31 PM, 3316 Chirag Nagpal <
> chiragnagpal_12102@aitpune.edu.in> wrote:
>
>> Hi Dimitry,
>>
>> Thanks for the reply....
>>
>> Since density-based clustering algorithms are used extensively,
>> especially by GIS research groups, it is a bit sad that there isn't a
>> MapReduce implementation available.
>>
>> I think I will propose to write MapReduce code for DBSCAN and OPTICS for
>> GSoC '15.
>>
>> I would like your input on how significant this would be to the
>> community in general.
>>
>> Thanks,
>>
>> Chirag Nagpal
>> University of Pune, India
>> www.chiragnagpal.com
>> ________________________________________
>> From: Dmitriy Lyubimov <dlieu.7@gmail.com>
>> Sent: Saturday, November 29, 2014 11:29 PM
>> To: user@mahout.apache.org
>> Subject: Re: DBSCAN implementation in Mahout
>>
>> No, there is no DBSCAN, OPTICS, or any other density flavor, afaik.
>>
>> Sent from my phone.
>> On Nov 28, 2014 11:41 AM, "3316 Chirag Nagpal" <
>> chiragnagpal_12102@aitpune.edu.in> wrote:
>>
>> >
>> > Hello
>> > I am Chirag Nagpal, a third year student of Computer Engineering at the
>> > University of Pune, India and currently interning at SERC, Indian
>> > Institute of Science, Bangalore
>> >
>> > My work involves using density based clustering algorithms like DBSCAN
>> > on geo-referenced data like Tweets. Typically the dataset consists of
>> > millions of points. I would like to know if there is any Map Reduce
>> > implementation of DBSCAN available.
>> >
>> > thank you
>> > Chirag
>> >
>>
>
>
