spark-dev mailing list archives

From "Kazuaki Ishizaki" <ISHIZ...@jp.ibm.com>
Subject Re: SPIP: SPARK-25728 Structured Intermediate Representation (Tungsten IR) for generating Java code
Date Tue, 13 Nov 2018 18:55:23 GMT
Hi all,
I spent some time considering these great points. Sorry for the delay.
I put my comments in green in the Google Doc:
https://docs.google.com/document/d/1Jzf56bxpMpSwsGV_hSzl9wQG22hyI731McQcjognqxY/edit

Here is a summary of the comments:
1) For simplicity and expressiveness, introduce nodes that represent 
control structures (e.g. for, while)
2) For simplicity, measure some statistics (e.g. node / Java bytecode, 
memory consumption)
3) For ease of understanding, use simple APIs that resemble the original 
statements (op2, for, while, ...)
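
To make points 1) and 2) concrete, here is a minimal sketch in Python of a 
loop represented as a node, together with a node-count statistic. It is only 
an illustration under stated assumptions: the names Stmt, While, emit, and 
count_nodes are hypothetical and are not the API proposed in the SPIP.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Stmt:
    """Leaf node (hypothetical): one Java statement kept as raw text."""
    code: str


@dataclass
class While:
    """Structured node (hypothetical): a while loop with a list of child nodes."""
    cond: str
    body: List[object] = field(default_factory=list)


def emit(node, indent: int = 0) -> str:
    """Render a node tree as Java source text."""
    pad = "  " * indent
    if isinstance(node, Stmt):
        return f"{pad}{node.code};"
    if isinstance(node, While):
        inner = "\n".join(emit(child, indent + 1) for child in node.body)
        return f"{pad}while ({node.cond}) {{\n{inner}\n{pad}}}"
    raise TypeError(f"unknown node type: {type(node).__name__}")


def count_nodes(node) -> int:
    """Count the nodes in a tree, as one example of a measurable statistic."""
    if isinstance(node, While):
        return 1 + sum(count_nodes(child) for child in node.body)
    return 1


loop = While("i < n", [Stmt("sum += values[i]"), Stmt("i++")])
java_src = emit(loop)
node_count = count_nodes(loop)
print(java_src)
print("nodes:", node_count)
```

Because the loop is a node rather than a string, a pass such as count_nodes 
can walk it without parsing the generated text.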

We would appreciate any comments or suggestions on the Google Doc or the 
dev mailing list so that we can move forward.

Kazuaki Ishizaki



From:   "Kazuaki Ishizaki" <ISHIZAKI@jp.ibm.com>
To:     Reynold Xin <rxin@databricks.com>
Cc:     dev <dev@spark.apache.org>, Takeshi Yamamuro 
<linguin.m.s@gmail.com>, Xiao Li <lixiao@databricks.com>
Date:   2018/10/31 00:56
Subject:        Re: SPIP: SPARK-25728 Structured Intermediate 
Representation (Tungsten IR) for generating Java code



Hi Reynold,
Thank you for your comments. They are great points.

1) Yes, it is not easy to design an IR that is both expressive enough and 
simple enough. We can learn concepts from good examples like HyPer, Weld, 
and others. They are expressive and not complicated. The details cannot be 
captured yet.
2) Introducing another layer means that contributors need time to learn new 
things. This SPIP tries to reduce that learning time by preparing clean APIs 
for constructing generated code. I will try to add some examples of APIs that 
are equivalent to the current string concatenations (e.g. "a" + " * " + "b" + 
" / " + "c").
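
As a rough illustration of the difference, here is a minimal sketch in 
Python that builds the same expression once by string concatenation and once 
as a small tree of nodes. The names Var, Op2, and emit are hypothetical and 
chosen only for this example; they are not the API proposed in the SPIP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """Leaf node (hypothetical): a reference to a Java variable."""
    name: str


@dataclass(frozen=True)
class Op2:
    """Binary operation node (hypothetical): operator plus two operands."""
    op: str
    left: object
    right: object


def emit(node) -> str:
    """Render a node tree as a fully parenthesized Java expression."""
    if isinstance(node, Var):
        return node.name
    if isinstance(node, Op2):
        return f"({emit(node.left)} {node.op} {emit(node.right)})"
    raise TypeError(f"unknown node type: {type(node).__name__}")


# String-based construction: the structure exists only in the author's head.
string_based = "a" + " * " + "b" + " / " + "c"

# Structured construction: the same expression as an explicit tree.
expr = Op2("/", Op2("*", Var("a"), Var("b")), Var("c"))
java_expr = emit(expr)

print(string_based)
print(java_expr)
```

The tree form keeps operator grouping explicit, so the emitter can add 
parentheses itself instead of relying on each caller to concatenate them 
correctly.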

It is more important for us to learn from failures than from successes. 
We would appreciate it if you could list the failures that you have seen.

Best Regards,
Kazuaki Ishizaki



From:        Reynold Xin <rxin@databricks.com>
To:        Kazuaki Ishizaki <ISHIZAKI@jp.ibm.com>
Cc:        Xiao Li <lixiao@databricks.com>, dev <dev@spark.apache.org>, 
Takeshi Yamamuro <linguin.m.s@gmail.com>
Date:        2018/10/26 03:46
Subject:        Re: SPIP: SPARK-25728 Structured Intermediate 
Representation (Tungsten IR) for generating Java code



I have some pretty serious concerns over this proposal. I agree that there 
are many things that can be improved, but at the same time I also think 
the cost of introducing a new IR in the middle is extremely high. Having 
participated in designing some of the IRs in other systems, I've seen more 
failures than successes. The failures typically come from two sources: (1) 
in general it is extremely difficult to design IRs that are both 
expressive enough and simple enough; (2) typically another layer of 
indirection increases the complexity a lot more, beyond the level of 
understanding and expertise that most contributors can obtain without 
spending years in the code base and learning about all the gotchas.

In either case, I'm not saying "no please don't do this". This is one of 
those cases in which the devil is in the details, which cannot be captured 
by a high level document, and I want to explicitly express my concern 
here.




On Thu, Oct 25, 2018 at 12:10 AM Kazuaki Ishizaki <ISHIZAKI@jp.ibm.com> 
wrote:
Hi Xiao,
Thank you very much for becoming a shepherd.
Once you feel the discussion has settled, we would appreciate it if you 
would start a vote.

Regards,
Kazuaki Ishizaki



From:        Xiao Li <lixiao@databricks.com>
To:        Kazuaki Ishizaki <ISHIZAKI@jp.ibm.com>
Cc:        dev <dev@spark.apache.org>, Takeshi Yamamuro 
<linguin.m.s@gmail.com>
Date:        2018/10/22 16:31
Subject:        Re: SPIP: SPARK-25728 Structured Intermediate 
Representation (Tungsten IR) for generating Java code



Hi, Kazuaki, 

Thanks for your great SPIP! I am willing to be the shepherd of this SPIP. 

Cheers,

Xiao


On Mon, Oct 22, 2018 at 12:05 AM Kazuaki Ishizaki <ISHIZAKI@jp.ibm.com> 
wrote:
Hi Yamamuro-san,
Thank you for your comments. This SPIP has received several valuable 
comments and feedback on the Google Doc: 
https://docs.google.com/document/d/1Jzf56bxpMpSwsGV_hSzl9wQG22hyI731McQcjognqxY/edit?usp=sharing
I hope that this SPIP can go forward based on this feedback.

Following the SPIP procedure at 
http://spark.apache.org/improvement-proposals.html, may I ask one or more 
PMC members to become a shepherd of this SPIP?
I would appreciate your kindness and cooperation. 

Best Regards,
Kazuaki Ishizaki



From:        Takeshi Yamamuro <linguin.m.s@gmail.com>
To:        Spark dev list <dev@spark.apache.org>
Cc:        ISHIZAKI@jp.ibm.com
Date:        2018/10/15 12:12
Subject:        Re: SPIP: SPARK-25728 Structured Intermediate 
Representation (Tungsten IR) for generating Java code



Hi, ishizaki-san,

Nice work. I left some comments on the doc.

best,
takeshi


On Mon, Oct 15, 2018 at 12:05 AM Kazuaki Ishizaki <ISHIZAKI@jp.ibm.com> 
wrote:
Hello community,

I am writing this e-mail to start a discussion about adding a structured 
intermediate representation for generating Java code from a program that 
uses the DataFrame or Dataset API, in addition to the current String-based 
representation.
This proposal is based on the discussion in the thread at 
https://github.com/apache/spark/pull/21537#issuecomment-413268196

Please feel free to comment on the JIRA ticket or Google Doc.

JIRA ticket: https://issues.apache.org/jira/browse/SPARK-25728
Google Doc: 
https://docs.google.com/document/d/1Jzf56bxpMpSwsGV_hSzl9wQG22hyI731McQcjognqxY/edit?usp=sharing


Looking forward to hearing your feedback.

Best Regards,
Kazuaki Ishizaki


-- 
---
Takeshi Yamamuro


