spark-dev mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: python converter in HBaseConverter.scala(spark/examples)
Date Mon, 05 Jan 2015 17:06:53 GMT
HBaseConverter is in the Spark source tree. Therefore I think it makes sense
for this improvement to be accepted so that the example is more useful.

Cheers

On Mon, Jan 5, 2015 at 7:54 AM, Nick Pentreath <nick.pentreath@gmail.com>
wrote:

> Hey
>
> These converters are actually just intended to be examples of how to set
> up a custom converter for a specific input format. The converter interface
> is there to provide flexibility where needed, although with the new
> SparkSQL data store interface the intention is that most common use cases
> can be handled using that approach rather than custom converters.
>
> The intention is not to have specific converters living in Spark core,
> which is why these are in the examples project.
>
> Having said that, if you wish to expand the example converter for others'
> reference, do feel free to submit a PR.
>
> Ideally though, I would think that various custom converters would be part
> of external projects that can be listed on http://spark-packages.org/. I
> see your project is already listed there.
>
>
> On Mon, Jan 5, 2015 at 5:37 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>
>> In my opinion this would be useful - there was another thread where
>> returning only the value of the first column in the result was mentioned.
>>
>> Please create a SPARK JIRA and a pull request.
>>
>> Cheers
>>
>> On Mon, Jan 5, 2015 at 6:42 AM, tgbaggio <gen.tang86@gmail.com> wrote:
>>
>> > Hi,
>> >
>> > In HBaseConverters.scala
>> > <https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/pythonconverters/HBaseConverters.scala>,
>> > the Python converter HBaseResultToStringConverter returns only the value
>> > of the first column in the result. In my opinion, this limits the utility
>> > of the converter, because it returns only one value per row and, moreover,
>> > it loses the other information in the record, such as column:cell and
>> > timestamp.
>> >
>> > Therefore, I would like to propose some modifications to
>> > HBaseResultToStringConverter so that it can return all records in
>> > HBase with more complete information. I have already written some code in
>> > pythonConverters.scala
>> > <https://github.com/GenTang/spark_hbase/blob/master/src/main/scala/examples/pythonConverters.scala>
>> > and it works.
>> >
>> > Is it OK to modify the code in HBaseConverters.scala, please?
>> > Thanks a lot in advance.
>> >
>> > Cheers
>> > Gen
>> >
>>
>
>
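
To make the difference discussed in this thread concrete, here is a minimal sketch in plain Python. It does not call the HBase or Spark APIs. `Cell`, `first_value_only`, `all_cells_as_json` and `parse_cells` are hypothetical names introduced only for illustration, and the one-JSON-object-per-line encoding is an assumption, not necessarily what the linked pythonConverters.scala does. The sketch contrasts returning only the first column's value with returning every cell together with its column and timestamp.

```python
import json
from typing import List, NamedTuple


class Cell(NamedTuple):
    """Hypothetical stand-in for one HBase cell (not the HBase client API)."""
    row: str
    family: str
    qualifier: str
    timestamp: int
    value: str


def first_value_only(cells: List[Cell]) -> str:
    """Behaviour described in the thread: only the first column's value survives."""
    return cells[0].value if cells else ""


def all_cells_as_json(cells: List[Cell]) -> str:
    """Assumed alternative: one JSON object per cell, joined by newlines."""
    return "\n".join(json.dumps(c._asdict(), sort_keys=True) for c in cells)


def parse_cells(text: str) -> List[dict]:
    """Python-side decoding of the string produced by all_cells_as_json."""
    return [json.loads(line) for line in text.splitlines() if line]


if __name__ == "__main__":
    row = [
        Cell("row1", "f1", "a", 1420472213, "value1"),
        Cell("row1", "f1", "b", 1420472214, "value2"),
    ]
    print(first_value_only(row))
    for cell in parse_cells(all_cells_as_json(row)):
        print(cell["family"], cell["qualifier"], cell["timestamp"], cell["value"])
```

Because the converter hands a single string per row to Python, some encoding such as the one assumed above is needed so that multiple cells and their metadata can be recovered on the Python side.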
