tajo-dev mailing list archives

From camelia c <camelie_1...@yahoo.com>
Subject Re: [GSoc2013] - Outer Join
Date Mon, 15 Jul 2013 20:43:14 GMT

Thank You very much for Your reply and the reference book.
I managed to follow the steps and created a GitHub account with a mirror of Apache Tajo at
https://github.com/camelia-c/incubator-tajo .
I uploaded a diagram at https://sites.google.com/site/gsoc2013tajo34/github

In the diagram, the outerjoin_1 branch is intended for development, whereas the master branch
is for stable code. For the moment I have not included any of my outer join code, because
I first want to set up the repository correctly.

Still, I am not completely sure of the following aspects:

1) Immediately after the git clone command, I issued the following commands:

$ git remote add --track master upstream git://github.com/apache/incubator-tajo.git

$ git fetch upstream
From git://github.com/apache/incubator-tajo
 * [new branch]      master     -> upstream/master

$ git rebase upstream/master
Current branch master is up to date.

But I am not sure whether they are enough to perform automatic synchronization, or
should I still perform manual synchronization periodically?
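
On question 1: git remote add --track only records the remote and the branch to follow. It does not schedule anything, so new commits arrive only when fetch or pull is run. A minimal sketch of one manual synchronization round follows, using throwaway local repositories as stand-ins for the Apache repository and the GitHub fork. The directory names apache, fork.git and local and the demo identity are assumptions for illustration only.

```shell
set -e
# Demo identity so the commits below work on a machine without git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Stand-in for the Apache repository, with one commit on master.
git init -q apache
git -C apache symbolic-ref HEAD refs/heads/master
git -C apache commit -q --allow-empty -m "initial"

# Stand-in for the personal fork, and the working clone made from it.
git clone -q --bare apache fork.git
git clone -q fork.git local
cd local
git remote add --track master upstream "$work/apache"

# A new commit lands upstream; the clone does not see it by itself.
git -C "$work/apache" commit -q --allow-empty -m "upstream change"

# One manual synchronization round.
git fetch -q upstream          # download the new upstream commits
git rebase -q upstream/master  # move local master on top of them
git push -q origin master      # bring the fork up to date as well
```

Repeating the last three commands from time to time keeps both the local clone and the fork in step with upstream.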

2) After this setup, is it still necessary to periodically run the command You suggested last
time (git pull origin master)?
In this configuration, is the command
$ git pull origin master
going to synchronize the local repository on my computer with git://github.com/apache/incubator-tajo.git
or with git://github.com/camelia-c/incubator-tajo.git?
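
On question 2: the name origin is simply whatever URL the clone was made from, so the recorded remote URLs settle it. A small sketch, assuming the clone was made from the personal fork; it only records the two URLs in a scratch repository and contacts no server.

```shell
set -e
work=$(mktemp -d)
git init -q "$work/local"
cd "$work/local"

# Assumption: the clone came from the fork, so "origin" is the fork and
# "upstream" is the Apache repository added afterwards.
git remote add origin git://github.com/camelia-c/incubator-tajo.git
git remote add upstream git://github.com/apache/incubator-tajo.git

# List every remote with its URL, then print the one "origin" resolves to.
git remote -v
git config --get remote.origin.url
```

Under that assumption, git pull origin master reads from the fork, and git pull upstream master would be the command that reads from the Apache repository.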

3) If I run this command from the outerjoin_1 branch, i.e.

git checkout outerjoin_1
git pull origin master

is it going to affect only the outerjoin_1 branch?
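
On question 3: git pull merges into the branch that is currently checked out. A minimal sketch that checks this with a throwaway repository follows. The directory names remote-repo and local and the demo identity are assumptions for illustration only.

```shell
set -e
# Demo identity so the commits below work on a machine without git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Stand-in for the remote repository, with one commit on master.
git init -q remote-repo
git -C remote-repo symbolic-ref HEAD refs/heads/master
git -C remote-repo commit -q --allow-empty -m "initial"

git clone -q remote-repo local

# New work appears on the remote master after the clone was made.
git -C remote-repo commit -q --allow-empty -m "new work on master"

cd local
git checkout -q -b outerjoin_1
git pull -q --no-rebase origin master

# Compare the two local branches: only outerjoin_1 has moved.
git rev-parse master outerjoin_1
```

The local master branch stays where it was until it is checked out and updated separately.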

4) And the last question, regarding execution: is the interactive shell now launched with
$TAJO_HOME/bin/tsql instead of $TAJO_HOME/bin/tajo cli?

Thank You very much!

Yours sincerely,

 From: Hyunsik Choi <hyunsik@apache.org>
To: camelia c <camelie_1985@yahoo.com> 
Cc: tajo-dev <dev@tajo.incubator.apache.org> 
Sent: Saturday, July 13, 2013 7:28 AM
Subject: Re: [GSoc2013] - Outer Join

Hi Camelia,

I have left inline comments on your questions.

On Fri, Jul 12, 2013 at 9:13 PM, camelia c <camelie_1985@yahoo.com> wrote:

> Hello,
> Thank You very much for Your feedback!
> I completed the outer joins to inner joins rewriting part and I plan to
> follow Your advice and move the rewriting methods to LogicalOptimizer.
> The new processing is described in
> https://sites.google.com/site/gsoc2013tajo34/home/validation , where I
> also uploaded the source code as files.
Your work looks good. However, first of all, I would like to encourage you
to learn an SCM tool such as Git.

Actually, your source code cannot be merged into the current Tajo source
code because the Tajo source code has been changed by multiple developers. It
is very hard to manage individual source code files against updating source
code.
The main objective of Google Summer of Code is to encourage open source
participation. So, you need to learn the overall workflow of open source
development. Above all, you should learn an SCM tool such as Git.

> 1)  I think that the allTables data structure as well as the
> validateOuterJoin and recursiveWhere methods should remain in class
> QueryAnalyzer, as they belong to the stage where the query is analyzed and
> validated.
> In my opinion, only methods rewriteOuterJoin,
> recursiveRewriteMultiNullSupplier, recursiveRewriteNullRestricted should be
> moved to class LogicalOptimizer as they perform optimizations on the
> logical plan.
> What do You think about this?
Sounds great. Let's go ahead with that :)

> 2) I would like to kindly ask You how can I continually rebase my work on
> the latest Tajo version, "rebase continually your work on updated source
> code"?
> Usually I issue this command:
> mvn package -DskipTests -Pdist -Dtar
> What should I do before this?

If you downloaded the source code via git, just type the following:

$ git pull origin master

You will probably run into some conflicts. If you don't know git, you should learn
it in order to resolve them. You can refer to the manuals available
online. I would like to recommend this one (http://git-scm.com/book).
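
To make the conflict case concrete, here is a minimal sketch of a pull that stops on a conflict and is then resolved by hand. It uses throwaway repositories, and the names shared, mine and notes.txt as well as the demo identity are assumptions for illustration only. It shows the merge form of pull; a rebase-based pull is resolved in a similar way but concluded with git rebase --continue.

```shell
set -e
# Demo identity so the commits below work on a machine without git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Stand-in for the shared repository, with one tracked file.
git init -q shared
git -C shared symbolic-ref HEAD refs/heads/master
echo "initial text" > shared/notes.txt
git -C shared add notes.txt
git -C shared commit -q -m "initial"

git clone -q shared mine

# Both sides now change the same line in different ways.
echo "upstream edit" > shared/notes.txt
git -C shared commit -q -am "upstream edit"
cd mine
echo "my edit" > notes.txt
git commit -q -am "my edit"

if git pull -q --no-rebase origin master; then
  echo "merged cleanly"
else
  git status --short                            # lists notes.txt as unmerged
  printf 'upstream edit\nmy edit\n' > notes.txt # edit the file to the wanted result
  git add notes.txt                             # mark the conflict as resolved
  git commit -q --no-edit                       # conclude the merge
fi
```

In real work the resolution step means opening the file, choosing between the text marked by <<<<<<< and >>>>>>>, and removing the markers before running git add.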

> 3) I read on the mailing lists that the Tajo Cli changed and was improved.
> But besides the query acceptance, does this affect in any way the stages of
> the query processing, after its parsing?
The Tajo CLI change was only on the client side. It does not affect the part
on which you have worked.

> 4) Also, I read some posts on the mailing list related to integration
> tests.
> Where can I find these and how should I use them in order to verify that
> my work integrates well with the rest of the source code?
The following command runs the unit tests and integration tests, which
cover most parts of Tajo.

$ mvn clean install

> My work so far only affects queries containing at least one outer join, so
> for queries consisting only of inner joins no modification is made.

> As a final remark, it was easier to manage the recursion without
> EvalTreeUtil. Hope it's ok.

That's great.

> Thank You in advance!
> Yours  sincerely,
> Camelia

Best regards,