nutch-dev mailing list archives

From Cihad Guzel <cguz...@gmail.com>
Subject Re: Nutch-1741 in GSOC 2015
Date Mon, 18 May 2015 23:14:31 GMT
Hi Lewis.
I can't edit the wiki to add my proposal. Could you grant me edit permission?

2015-05-19 1:22 GMT+03:00 Lewis John Mcgibbney <lewis.mcgibbney@gmail.com>:

> Hi Cihad,
> Thank you for introducing yourself.
> You now have write access to the Nutch wiki so you can augment the wiki
> page and begin working on some documentation and issues from within Jira.
> Really looking forward to working alongside all you guys on your projects.
> Best
> Lewis
>
> On Mon, May 18, 2015 at 3:19 PM, Cihad Guzel <cguzelg@gmail.com> wrote:
>
>> Hi all,
>>
>> I want to introduce myself.
>>
>> I am a computer engineer and I am currently pursuing a master's degree. I
>> like coding. I have been following some open source projects for about 3
>> years. My goal is to contribute to the open source community through GSoC.
>>
>> I have also worked on frontend, middleware, and backend development with
>> enterprise Java technologies. Furthermore, I have experience with "Web
>> Technologies", "Search Technologies", "Cloud Computing", "Distributed
>> Systems" and "Big Data". I took part in a search engine project that used
>> Apache technologies such as Solr, HBase, Hadoop, Nutch, and Gora, and I used
>> Nutch actively in that project. You can find more information about me on my
>> LinkedIn profile [1].
>>
>> Here is some information about my project. My subject is "Nutch-1741 -
>> Support of Sitemaps in Nutch 2.x" [2]. As you know, the Nutch crawler can
>> only obtain URLs from pages that it has already crawled. Also, the
>> importance and change frequency of these URLs are not known, only guessed.
>> However, it is possible to find all of a site's URLs in an up-to-date
>> sitemap file. For this reason, the sitemap files of websites should be
>> crawled.
>>
>> I have explained the features of this project in my proposal. I'll add
>> it to the wiki, and you can see the details there once I share it. You can
>> see the Nutch sitemap lifecycle in the drawing [3].
>>
>> [1] https://tr.linkedin.com/in/cihadguzel
>>
>> [2] https://issues.apache.org/jira/browse/NUTCH-1741
>>
>> [3]
>> https://issues.apache.org/jira/secure/attachment/12707721/SitemapCrawlerLifeCycle.pdf
>>
>> Kind Regards
>>
>>
>> 2015-05-19 1:16 GMT+03:00 Cihad Guzel <cguzelg@gmail.com>:
>>
>>> Ok Lewis,
>>> I signed up for the wiki. My wiki username is cihadguzel.
>>>
>>> Thanks
>>>
>>> 2015-05-18 23:44 GMT+03:00 Lewis John Mcgibbney <
>>> lewis.mcgibbney@gmail.com>:
>>>
>>>> Fantastic Cihad,
>>>> Thank you for introducing yourself.
>>>> As you are in the community bonding period right now, please feel free
>>>> to provide your wiki username to me and I will grant you access to the wiki.
>>>> Please also feel free to pick up some lingering issues for Nutch 2.3.1
>>>>
>>>> https://issues.apache.org/jira/browse/NUTCH-1945?jql=project%20%3D%20NUTCH%20AND%20resolution%20%3D%20Unresolved%20AND%20fixVersion%20%3D%202.3.1%20ORDER%20BY%20priority%20DESC
>>>> Thanks
>>>> Lewis
>>>>
>>>>
>>>> On Mon, May 18, 2015 at 1:26 PM, Cihad Guzel <cguzelg@gmail.com> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I applied to GSoC 2015 for the Apache Nutch project and my
>>>>> application has been accepted. The main reason I chose the Nutch
>>>>> project for GSoC is my close familiarity with Nutch. My subject is
>>>>> "Nutch-1741 - Support of Sitemaps in Nutch 2.x" [1]. Thanks to Lewis John
>>>>> McGibbney and Talat Uyarer for being my mentors in this process. I hope I
>>>>> can contribute to this project.
>>>>>
>>>>> [1] https://issues.apache.org/jira/browse/NUTCH-1741
>>>>>
>>>>> Kind Regards
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> *Lewis*
>>>>
>>>
>>>
>>
>
>
> --
> *Lewis*
>
