tajo-dev mailing list archives

From Hyunsik Choi <hyun...@apache.org>
Subject Re: [GSoc2013] - Outer Join
Date Tue, 16 Jul 2013 10:32:35 GMT
In addition, I would like to recommend that you create subtasks in Jira that can each be
completed on their own. This may reduce merge conflicts on your working branch, and we can
review your work incrementally.
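
As a rough sketch of how subtasks could map onto Git, each subtask can get its own
short-lived branch cut from master, so each one stays small and can be reviewed separately.
The issue keys TAJO-A and TAJO-B, the file names, and the throwaway repository below are
placeholders for illustration only, not real Jira issues or Tajo files:

```shell
#!/bin/sh
# Demo in a throwaway repository; all names below are hypothetical.
set -eu
demo=$(mktemp -d)
cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.email "demo@example.com"
git config user.name "Demo User"
echo "base" > plan.txt
git add plan.txt
git commit -q -m "Initial commit"

# Subtask 1: one branch, one focused change.
git checkout -q -b TAJO-A-validate master
echo "validation" > validate.txt
git add validate.txt
git commit -q -m "TAJO-A: validate outer join"

# Subtask 2: also cut from master, independent of subtask 1.
git checkout -q -b TAJO-B-rewrite master
echo "rewrite" > rewrite.txt
git add rewrite.txt
git commit -q -m "TAJO-B: rewrite outer join"

# Each branch differs from master by its own change only.
git diff --stat master...TAJO-A-validate
git diff --stat master...TAJO-B-rewrite
```

Because the two branches touch different files and share the same base, either one can be
reviewed and merged first without forcing a conflict on the other.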

Best wishes,
Hyunsik Choi

On Jul 16, 2013, at 5:43 AM, camelia c <camelie_1985@yahoo.com> wrote:

> Hello,
> 
> Thank You very much for Your reply and the reference book.
> I managed to follow the steps and created a GitHub account with a mirror of Apache Tajo
> at https://github.com/camelia-c/incubator-tajo .
> I uploaded a diagram at https://sites.google.com/site/gsoc2013tajo34/github
> 
> In the diagram, the outerjoin_1 branch is intended for development, whereas the master
> branch is for stable code. For the moment I have not included any of my outer join code,
> because I first want to set up the repository correctly.
> 
> Still, I am not completely sure of the following aspects:
> 
> 1) I mention that immediately after the git clone command, I issued the following commands:
> 
> $ git remote add --track master upstream git://github.com/apache/incubator-tajo.git
> 
> $ git fetch upstream
> From git://github.com/apache/incubator-tajo
>  * [new branch]      master     -> upstream/master
> 
> $ git rebase upstream/master
> Current branch master is up to date.
> 
> But I am not sure whether they are enough to perform automatic synchronization, or
> whether I should still synchronize manually from time to time.
> 
> 2) After this setup, is it still necessary to periodically run the command You suggested
> last time (git pull origin master)?
> In this configuration, is the command $ git pull origin master
> going to synchronize the local repository on my computer with git://github.com/apache/incubator-tajo.git
> or with git://github.com/camelia-c/incubator-tajo.git?
> 
> 
> 3) If I run this command from the outerjoin_1 branch, i.e.
> 
> git checkout outerjoin_1
> git pull origin master
> 
> is it going to affect only the outerjoin_1 branch?
> 
> 4) And the last question, regarding execution: is the interactive shell now launched
> with $TAJO_HOME/bin/tsql instead of $TAJO_HOME/bin/tajo cli?
> 
> 
> Thank You very much!
> 
> Yours sincerely,
> Camelia
> 
> 
> 
> 
> ________________________________
> From: Hyunsik Choi <hyunsik@apache.org>
> To: camelia c <camelie_1985@yahoo.com> 
> Cc: tajo-dev <dev@tajo.incubator.apache.org> 
> Sent: Saturday, July 13, 2013 7:28 AM
> Subject: Re: [GSoc2013] - Outer Join
> 
> 
> Hi Camelia,
> 
> I have left inline comments on your questions.
> 
> 
> On Fri, Jul 12, 2013 at 9:13 PM, camelia c <camelie_1985@yahoo.com> wrote:
> 
>> Hello,
>> 
>> Thank You very much for Your feedback!
>> 
>> I completed the outer joins to inner joins rewriting part and I plan to
>> follow Your advice and move the rewriting methods to LogicalOptimizer.
>> The new processing is described in
>> https://sites.google.com/site/gsoc2013tajo34/home/validation , where I
>> also uploaded the source code as files.
>> 
>> 
> Your work looks good. However, first of all, I would like to encourage you
> to learn an SCM tool such as Git.
> 
> Actually, your source code cannot be merged into the current Tajo source
> code because the Tajo source code has been changed by multiple developers. It
> is very hard to manage individual source code files against a changing source
> tree.
> 
> The main objective of Google Summer of Code is to encourage open source
> participation. So, you need to learn the overall process of open source
> development. Above all, you should learn an SCM tool such as Git.
> 
> 
>> 1)  I think that the allTables data structure as well as the
>> validateOuterJoin and recursiveWhere methods should remain in class
>> QueryAnalyzer, as they belong to the stage where the query is analyzed and
>> validated.
>> In my opinion, only methods rewriteOuterJoin,
>> recursiveRewriteMultiNullSupplier, recursiveRewriteNullRestricted should be
>> moved to class LogicalOptimizer as they perform optimizations on the
>> logical plan.
>> What do You think about this?
>> 
>> 
> Sounds great. Let's go ahead with that :)
> 
> 
>> 2) I would like to kindly ask You how can I continually rebase my work on
>> the latest Tajo version, "rebase continually your work on updated source
>> code"?
>> Usually I issue this command:
>> 
>> mvn package -DskipTests -Pdist -Dtar
>> 
>> What should I do before this?
>> 
> 
> If you download the source code via git, just type as follows:
> 
> $ git pull origin master
> 
> You will probably run into some conflicts. If you don't know Git, you should learn
> it in order to resolve them. You can refer to some of the manuals available
> online. I would like to recommend this one (http://git-scm.com/book).
> 
> 
>> 
>> 3) I read on the mailing lists that the Tajo Cli changed and was improved.
>> But besides the query acceptance, does this affect in any way the stages of
>> the query processing, after its parsing?
>> 
>> 
> The Tajo CLI change was only on the client side. It does not affect the part
> on which you have been working.
> 
> 
>> 4) Also, I read some posts on the mailing list related to integration
>> tests.
>> Where can I find these and how should I use them in order to verify that
>> my work integrates well with the rest of the source code?
>> 
>> 
> The following command runs the unit tests and integration tests. It
> verifies most parts of Tajo.
> 
> $ mvn clean install
> 
> 
>> My work so far only affects queries containing at least one outer join, so
>> for queries consisting only of inner joins no modification is made.
> 
> 
>> As a final remark, it was easier to manage the recursion without
>> EvalTreeUtil. Hope it's ok.
>> 
> 
> That's great.
> 
> 
>> 
>> 
>> Thank You in advance!
>> 
>> Yours  sincerely,
>> Camelia
>> 
>> 
>> 
>> 
>> 
> 
> Best regards,
> Hyunsik
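
The Git questions in this thread (remotes, manual synchronization, and which branch a pull
affects) can be tried out safely with a sketch like the following. It uses throwaway local
repositories as stand-ins for the Apache repository and the GitHub fork. The directory names
are hypothetical; only the remote names origin and upstream and the branch name outerjoin_1
come from the thread.

```shell
#!/bin/sh
# Demo with throwaway repositories: "apache" stands in for the Apache repo,
# "fork.git" for the GitHub fork, "local" for the working clone (all hypothetical).
set -eu
work=$(mktemp -d)
cd "$work"

# Stand-in for the Apache repository.
git init -q apache
git -C apache symbolic-ref HEAD refs/heads/master
git -C apache config user.email "demo@example.com"
git -C apache config user.name "Demo User"
echo "v1" > apache/README
git -C apache add README
git -C apache commit -q -m "Initial commit"

# Stand-in for the fork, and a local clone of it ("origin" = the fork).
git clone -q --bare apache fork.git
git clone -q fork.git local
cd local
git config user.email "demo@example.com"
git config user.name "Demo User"
git remote add upstream "$work/apache"

# Development branch with one local commit.
git checkout -q -b outerjoin_1
echo "outer join work" > outerjoin.txt
git add outerjoin.txt
git commit -q -m "Work on outer join"

# Meanwhile, upstream moves ahead.
echo "v2" >> "$work/apache/README"
git -C "$work/apache" commit -q -am "Upstream change"

# Nothing is synchronized automatically: fetch must be run by hand.
git fetch -q upstream

# Bring local master and the fork up to date with upstream.
git checkout -q master
git merge -q --ff-only upstream/master
git push -q origin master

# Replay the development branch on top of the updated upstream master.
git checkout -q outerjoin_1
git rebase -q upstream/master
git log --oneline --graph --all
```

Adding a remote only records its URL, so the fetch, merge, and rebase steps have to be
repeated whenever upstream changes. `git pull origin master` fetches from whatever URL
origin points to (the repository that was cloned) and merges into the branch that is
currently checked out, so running it while on outerjoin_1 changes only outerjoin_1.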

