nutch-dev mailing list archives

From Markus Jelsma <markus.jel...@openindex.io>
Subject RE: [ANNOUNCE] New Nutch committer and PMC - Thamme Gowda N.
Date Tue, 24 May 2016 09:48:47 GMT
Welcome Thamme Gowda!

Cheers,
Markus

 
 
-----Original message-----
> From: Thamme Gowda <tgowdan@gmail.com>
> Sent: Monday 23rd May 2016 0:56
> To: dev@nutch.apache.org; user@nutch.apache.org
> Subject: Re: [ANNOUNCE] New Nutch committer and PMC - Thamme Gowda N.
> 
> Hi Sebastian,
> Thanks for the invitation and for setting this up.
> 
> Hello everybody, 
> 
> I am so glad to be on board. 
> 
> About me:
> I'm currently a grad student (master's) at the Univ. of Southern California
> (USC), Los Angeles. I'm fortunate to have met Professor Chris Mattmann at USC.
> Prior to my grad studies, I worked as a full-stack developer at a few startups
> in Bangalore, India. I am also a tech co-founder of a text analysis platform,
> http://datoin.com. I found my interest in A.I., so here I am at USC grad
> school. I am on my way to an internship at NASA JPL this summer.
> 
> How I met Nutch:
> In 2014, my team at Datoin.com and I integrated a crawler/input component into
> our platform. We picked Nutch because we had the rest of the platform on
> Hadoop. Boom! That was when I first put my hands on the Nutch code.
> Last fall I took a graduate-level Information Retrieval (IR) course at USC
> taught by Prof. Mattmann. I then joined hands with his team at NASA JPL to
> work on IR-related projects. We use and improve Nutch.
> 
> Some of my recent work related to Nutch:
> Added an extension point and an extension to pass certain external URLs when
> db.ignore.external is set. Fixed bugs and improved the Common Crawl dumper.
> Built a clustering toolkit for clustering Nutch output based on CSS styles
> and DOM structures [2]...
> 
> More coming soon this summer! 
> 
> I am interested in after-crawl analysis and in bringing the results back to
> Nutch as extensions.
> I also presented "Clustering the output of Nutch ...." at the recent
> ApacheCon NA [1].
> 
> I would also love to work on these:
> - Reusable JVM containers to make it fast and efficient. I am thinking of a
>   Spark execution backend (a step ahead: a switchable execution backend to
>   support MR and Spark, just like what Gora did for storage backends).
> - Stats and analytics of crawl jobs in real time.
> I am excited to be involved with the community to improve Nutch.
> 
> - 
> Thanks and Regards, 
> Thamme 
> 
> [1] http://www.slideshare.net/thammegowda/clustering-output-of-apache-nutch-using-apache-spark
> [2] https://github.com/uscdataScience/autoextractor/wiki/Clustering-Tutorial
> 
> -- 
> Thamme Gowda  
> Grad Student at USC <http://usc.edu>  
> @thammegowda <https://twitter.com/thammegowda> | 213-536-3552 
> http://scf.usc.edu/~tnarayan/
> 
> On Sun, May 22, 2016 at 1:02 PM, Sebastian Nagel <wastl.nagel@googlemail.com> wrote:
> Dear all,
>
> it is my pleasure to announce that Thamme Gowda N. has joined us
> as committer and member of the Nutch PMC.  Congratulations on your
> new role within the Apache Nutch community!
>
> Thamme, would you mind telling us about yourself, your relation
> to Nutch, what you've done so far, etc.?
>
> Cheers and welcome on board!
>
> Sebastian (on behalf of the Nutch PMC)
>
