tajo-dev mailing list archives

From Hyunsik Choi <hyun...@apache.org>
Subject Fwd: [GSoc2013] - Outer Join
Date Tue, 16 Jul 2013 09:17:50 GMT
Hello,

Great job! I have left inline comments on your questions.

Best regards,
Hyunsik


On Jul 16, 2013, at 5:43 AM, camelia c <camelie_1985@yahoo.com> wrote:

> Hello,
>
> Thank You very much for Your reply and the reference book.
> I managed to follow the steps and created a GitHub account with a mirror
> of Apache TAJO at https://github.com/camelia-c/incubator-tajo .
> I uploaded a diagram at
> https://sites.google.com/site/gsoc2013tajo34/github
>
> In the diagram, the outerjoin_1 branch is intended for development,
> whereas the master branch is for stable code. For the moment I haven't
> included any of my outer join code, because I first want to set up the
> repository correctly.

Your setup looks correct. You can work on your repository. In general, the
master branch is used as the base for new branches, and most work is done
in those other branches.

>
> Still, I am not completely sure of the following aspects:
>
> 1) Immediately after the git clone command, I issued the following
> commands:
>
> $ git remote add --track master upstream git://github.com/apache/incubator-tajo.git
>
> $ git fetch upstream
> From git://github.com/apache/incubator-tajo
>  * [new branch]      master     -> upstream/master
>

Use 'git pull' to synchronise your branch with the Apache repository. In
your case, where the Apache remote is named 'upstream', check out the
branch that you want to update and pull into it as follows:

$ git checkout [working_branch]
$ git pull upstream master

It will fetch the updated source code and try to merge it into your
working branch.
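
As a rough illustration of what the pull does, the sketch below rebuilds the same layout with throwaway local repositories, so it runs without network access. The directory names (apache, fork.git, work) and the file names are made up for the demo; only the remote name 'upstream' and the branch name 'outerjoin_1' come from this thread. The --no-rebase and --no-edit flags are there only so the demo runs unattended.

```shell
#!/bin/sh
set -e
# Demo identity so commits work on a machine without git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
demo=$(mktemp -d)

# Stand-in for the Apache repository, with one commit on master.
git init -q "$demo/apache"
git -C "$demo/apache" symbolic-ref HEAD refs/heads/master
echo "v1" > "$demo/apache/a.txt"
git -C "$demo/apache" add a.txt
git -C "$demo/apache" commit -q -m "initial commit"

# Stand-in for the GitHub mirror, and a local clone of that mirror.
git clone -q --bare "$demo/apache" "$demo/fork.git"
git clone -q "$demo/fork.git" "$demo/work"
git -C "$demo/work" remote add upstream "$demo/apache"

# Local work happens on a separate branch.
git -C "$demo/work" checkout -q -b outerjoin_1
echo "outer join work" > "$demo/work/b.txt"
git -C "$demo/work" add b.txt
git -C "$demo/work" commit -q -m "work on outer join"

# Meanwhile the Apache stand-in moves ahead.
echo "upstream change" > "$demo/apache/c.txt"
git -C "$demo/apache" add c.txt
git -C "$demo/apache" commit -q -m "upstream change"

# Pull upstream master into the checked-out branch (fetch, then merge).
master_before=$(git -C "$demo/work" rev-parse master)
git -C "$demo/work" pull -q --no-rebase --no-edit upstream master
master_after=$(git -C "$demo/work" rev-parse master)

# Show the resulting history of the working branch.
git -C "$demo/work" log --oneline --graph outerjoin_1
```

After the pull, outerjoin_1 contains both the local commit and the upstream commit, while the local master branch still points at the commit it had before.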

> $ git rebase upstream/master
> Current branch master is up to date.
>
> But I am not sure whether they are enough to perform automatic
> synchronization, or whether I should still synchronize manually from
> time to time.
>

Most users rebase their source code manually, because in many cases a
merge needs conflicts to be resolved by hand.
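
To show what that hand work can look like, here is a minimal sketch of a manual rebase that hits a conflict. It again uses throwaway local repositories; the paths, file contents, and the chosen resolution are invented for the demo, and GIT_EDITOR=true is set only so the continue step does not open an editor.

```shell
#!/bin/sh
set -e
# Demo identity so commits work on a machine without git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
demo=$(mktemp -d)

# Stand-in for the Apache repository.
git init -q "$demo/apache"
git -C "$demo/apache" symbolic-ref HEAD refs/heads/master
echo "original line" > "$demo/apache/a.txt"
git -C "$demo/apache" add a.txt
git -C "$demo/apache" commit -q -m "initial commit"

# Clone it, naming the remote 'upstream' instead of the default 'origin'.
git clone -q -o upstream "$demo/apache" "$demo/work"
git -C "$demo/work" checkout -q -b outerjoin_1
echo "local line" > "$demo/work/a.txt"
git -C "$demo/work" commit -q -a -m "local edit"

# Upstream edits the same line, so the rebase below will conflict.
echo "upstream line" > "$demo/apache/a.txt"
git -C "$demo/apache" commit -q -a -m "upstream edit"

git -C "$demo/work" fetch -q upstream
if ! git -C "$demo/work" rebase upstream/master >/dev/null 2>&1; then
    # Manual step: write the content you actually want, mark the file
    # as resolved, then let the rebase continue.
    printf 'upstream line\nlocal line\n' > "$demo/work/a.txt"
    git -C "$demo/work" add a.txt
    GIT_EDITOR=true git -C "$demo/work" rebase --continue >/dev/null
fi

# The local commit now sits directly on top of upstream/master.
git -C "$demo/work" log --oneline outerjoin_1
```

In real work the resolution is of course an edit of the conflicted file rather than a fixed printf; 'git status' lists the files that still need attention.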

> 2) After this setup, is it still necessary to periodically run the
> command You suggested last time (git pull origin master)?
> In this configuration, is the command      $ git pull origin master
> going to synchronize the local repository on my computer with
> git://github.com/apache/incubator-tajo.git or with
> git://github.com/camelia-c/incubator-tajo.git?
>

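Yes, occasionally you should pull the updated source code from the Apache
git repository. Note that 'origin' is the repository you cloned from, so if
you cloned your GitHub mirror, 'git pull origin master' synchronises with
your mirror; to pull from Apache, use the 'upstream' remote you added.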
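
If it is unclear which repository a remote name refers to, git can list them. The sketch below sets up a throwaway mirror-plus-upstream layout (all paths are made up for the demo) and prints where 'origin' and 'upstream' point; in a real clone only the 'git remote -v' line is needed.

```shell
#!/bin/sh
set -e
# Demo identity so commits work on a machine without git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
demo=$(mktemp -d)

# Stand-in for the Apache repository.
git init -q "$demo/apache"
git -C "$demo/apache" symbolic-ref HEAD refs/heads/master
echo "v1" > "$demo/apache/a.txt"
git -C "$demo/apache" add a.txt
git -C "$demo/apache" commit -q -m "initial commit"

# Stand-in for the GitHub mirror; the local clone is made from the mirror,
# so the mirror becomes 'origin'. The Apache stand-in is added as 'upstream'.
git clone -q --bare "$demo/apache" "$demo/fork.git"
git clone -q "$demo/fork.git" "$demo/work"
git -C "$demo/work" remote add upstream "$demo/apache"

# List every remote with its fetch and push URL.
git -C "$demo/work" remote -v

# The same information, one remote at a time.
origin_url=$(git -C "$demo/work" config --get remote.origin.url)
upstream_url=$(git -C "$demo/work" config --get remote.upstream.url)
echo "origin   -> $origin_url"
echo "upstream -> $upstream_url"
```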


>
> 3) If I run this command from the outerjoin_1 branch, i.e.
>
> git checkout outerjoin_1
> git pull origin master
>
> is it going to affect only the outerjoin_1 branch?

Yes, the 'pull' only affects your current branch (i.e., outerjoin_1 in
your example).

>
> 4) And the last question, regarding execution: is the interactive shell
> now launched with $TAJO_HOME/bin/tsql instead of $TAJO_HOME/bin/tajo cli?
>

Yes, tsql was added recently for convenience. It is equivalent to
bin/tajo cli.

>
> Thank You very much!
>
> Yours sincerely,
> Camelia
>
>
>
>
> ________________________________
> From: Hyunsik Choi <hyunsik@apache.org>
> To: camelia c <camelie_1985@yahoo.com>
> Cc: tajo-dev <dev@tajo.incubator.apache.org>
> Sent: Saturday, July 13, 2013 7:28 AM
> Subject: Re: [GSoc2013] - Outer Join
>
>
> Hi Camelia,
>
> I have left inline comments on your questions.
>
>
> On Fri, Jul 12, 2013 at 9:13 PM, camelia c <camelie_1985@yahoo.com> wrote:
>
>> Hello,
>>
>> Thank You very much for Your feedback!
>>
>> I completed the outer joins to inner joins rewriting part and I plan to
>> follow Your advice and move the rewriting methods to LogicalOptimizer.
>> The new processing is described in
>> https://sites.google.com/site/gsoc2013tajo34/home/validation , where I
>> also uploaded the source code as files.
>>
>>
> Your work looks good. However, first of all, I would like to encourage you
> to learn an SCM tool such as Git.
>
> Actually, your source code cannot be merged into the current Tajo source
> code, because the Tajo source code has been changed by multiple developers.
> It is very hard to manage individual source code files against a source
> tree that keeps being updated.
>
> The main objective of Google Summer of Code is to encourage open source
> participation. So, you need to learn how open source development works as
> a whole. Above all, you should learn an SCM tool such as Git.
>
>
>> 1) I think that the allTables data structure as well as the
>> validateOuterJoin and recursiveWhere methods should remain in class
>> QueryAnalyzer, as they belong to the stage where the query is analyzed
>> and validated.
>> In my opinion, only the methods rewriteOuterJoin,
>> recursiveRewriteMultiNullSupplier and recursiveRewriteNullRestricted
>> should be moved to class LogicalOptimizer, as they perform optimizations
>> on the logical plan.
>> What do You think about this?
>>
>>
> Sounds great. Let's go ahead with that :)
>
>
>> 2) I would like to kindly ask You how I can continually rebase my work on
>> the latest Tajo version ("rebase continually your work on updated source
>> code")?
>> Usually I issue this command:
>>
>> mvn package -DskipTests -Pdist -Dtar
>>
>> What should I do before this?
>>
>
> If you download the source code via git, just type as follows:
>
> $ git pull origin master
>
> You will probably run into some conflicts. If you don't know git, you
> should learn it in order to resolve them. You can refer to the manuals
> available online. I would recommend this one (http://git-scm.com/book).
>
>
>>
>> 3) I read on the mailing lists that the Tajo Cli changed and was
>> improved. But besides the query acceptance, does this affect in any way
>> the stages of the query processing, after its parsing?
>>
>>
> The Tajo Cli change was only on the client side. It does not affect the
> part on which you have worked.
>
>
>> 4) Also, I read some posts on the mailing list related to integration
>> tests.
>> Where can I find these and how should I use them in order to verify that
>> my work integrates well with the rest of the source code?
>>
>>
> The following command runs the unit tests and the integration tests,
> which verify most parts of Tajo.
>
> $ mvn clean install
>
>
>> My work so far only affects queries containing at least one outer join,
>> so for queries consisting only of inner joins no modification is made.
>
>
>> As a final remark, it was easier to manage the recursion without
>> EvalTreeUtil. Hope it's ok.
>>
>
> That's great.
>
>
>>
>>
>> Thank You in advance!
>>
>> Yours  sincerely,
>> Camelia
>>
>>
>>
>>
>>
>
> Best regards,
> Hyunsik
