nutch-dev mailing list archives

From Lewis John Mcgibbney <lewis.mcgibb...@gmail.com>
Subject Re: Nutch-1741 in GSOC 2015
Date Mon, 18 May 2015 22:22:30 GMT
Hi Cihad,
Thank you for introducing yourself.
You now have write access to the Nutch wiki so you can augment the wiki
page and begin working on some documentation and issues from within Jira.
Really looking forward to working alongside all you guys on your projects.
Best
Lewis

On Mon, May 18, 2015 at 3:19 PM, Cihad Guzel <cguzelg@gmail.com> wrote:

> Hi all,
>
> I want to introduce myself.
>
> I am a computer engineer and I am currently doing my master's degree. I
> like coding. I have been following some open source projects for about 3
> years. I am aiming to contribute to the open source community through GSoC.
>
> I have also worked on frontend, middleware, and backend development with
> enterprise Java technologies. Furthermore, I have experience with "Web
> Technologies", "Search Technologies", "Cloud Computing", "Distributed
> Systems" and "Big Data". I took part in a search engine project that used
> Apache technologies such as Solr, HBase, Hadoop, Nutch, and Gora, and I used
> Nutch actively in this project. You can find more information about me on my
> LinkedIn profile [1].
>
> Here is some information about my project. My subject is "Nutch-1741 -
> Support of Sitemaps in Nutch 2.x" [2]. As you know, the Nutch crawler can
> obtain URLs only from pages that it has already scanned. Also, the
> importance and change frequency of these URLs are not known, only guessed.
> However, it is possible to find all of a site's URLs in an up-to-date
> sitemap file. For this reason, the sitemap files of a website should be
> crawled.
>
> I have explained the features of this project in my proposal. I'll add it
> to the wiki, and you will be able to see the details there once I share it.
> You can see the Nutch sitemap lifecycle in the drawing [3].
>
> [1] https://tr.linkedin.com/in/cihadguzel
>
> [2] https://issues.apache.org/jira/browse/NUTCH-1741
>
> [3]
> https://issues.apache.org/jira/secure/attachment/12707721/SitemapCrawlerLifeCycle.pdf
>
> Kind Regards
>
>
> 2015-05-19 1:16 GMT+03:00 Cihad Guzel <cguzelg@gmail.com>:
>
>> Ok Lewis,
>> I signed up to the wiki. My wiki username is cihadguzel.
>>
>> Thanks
>>
>> 2015-05-18 23:44 GMT+03:00 Lewis John Mcgibbney <
>> lewis.mcgibbney@gmail.com>:
>>
>>> Fantastic Cihad,
>>> Thank you for introducing yourself.
>>> As you are in the community bonding period right now, please feel free
>>> to provide your wiki username to me and I will grant you access to the wiki.
>>> Please also feel free to pick up some lingering issues for Nutch 2.3.1
>>>
>>> https://issues.apache.org/jira/browse/NUTCH-1945?jql=project%20%3D%20NUTCH%20AND%20resolution%20%3D%20Unresolved%20AND%20fixVersion%20%3D%202.3.1%20ORDER%20BY%20priority%20DESC
>>> Thanks
>>> Lewis
>>>
>>>
>>> On Mon, May 18, 2015 at 1:26 PM, Cihad Guzel <cguzelg@gmail.com> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I applied to GSoC 2015 for the Apache Nutch project and my application
>>>> has been accepted. The main reason I chose the Nutch project for GSoC
>>>> is my close familiarity with Nutch. My subject is "Nutch-1741 - Support of
>>>> Sitemaps in Nutch 2.x" [1]. Thanks to Lewis John McGibbney and Talat Uyarer
>>>> for being my mentors in this process. I hope I can contribute to this
>>>> project.
>>>>
>>>> [1] https://issues.apache.org/jira/browse/NUTCH-1741
>>>>
>>>> Kind Regards
>>>>
>>>
>>>
>>>
>>> --
>>> *Lewis*
>>>
>>
>>
>


-- 
*Lewis*
