spark-dev mailing list archives

From "Nick Pentreath" <nick.pentre...@gmail.com>
Subject Re: python converter in HBaseConverter.scala(spark/examples)
Date Mon, 05 Jan 2015 18:03:58 GMT
Absolutely; as I mentioned, by all means submit a PR. I just wanted to point out that no specific
converter is "officially" supported, although the converter interface itself is, of course.


I'm happy to review a PR; just ping me when it's ready.


—
Sent from Mailbox

On Mon, Jan 5, 2015 at 7:06 PM, Ted Yu <yuzhihong@gmail.com> wrote:

> HBaseConverter is in Spark source tree. Therefore I think it makes sense
> for this improvement to be accepted so that the example is more useful.
> Cheers
> On Mon, Jan 5, 2015 at 7:54 AM, Nick Pentreath <nick.pentreath@gmail.com>
> wrote:
>> Hey
>>
>> These converters are actually just intended to be examples of how to set
>> up a custom converter for a specific input format. The converter interface
>> is there to provide flexibility where needed, although with the new
>> SparkSQL data store interface the intention is that most common use cases
>> can be handled using that approach rather than custom converters.
>>
>> The intention is not to have specific converters living in Spark core,
>> which is why these are in the examples project.
>>
>> Having said that, if you wish to expand the example converter for others'
>> reference, do feel free to submit a PR.
>>
>> Ideally though, I would think that various custom converters would be part
>> of external projects that can be listed with http://spark-packages.org/. I
>> see your project is already listed there.
>>
>> —
>> Sent from Mailbox <https://www.dropbox.com/mailbox>
>>
>>
>> On Mon, Jan 5, 2015 at 5:37 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>>
>>> In my opinion this would be useful - there was another thread where
>>> returning only the value of the first column in the result was mentioned.
>>>
>>> Please create a SPARK JIRA and a pull request.
>>>
>>> Cheers
>>>
>>> On Mon, Jan 5, 2015 at 6:42 AM, tgbaggio <gen.tang86@gmail.com> wrote:
>>>
>>> > Hi,
>>> >
>>> > In HBaseConverters.scala
>>> > <https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/pythonconverters/HBaseConverters.scala>,
>>> > the Python converter HBaseResultToStringConverter returns only the value
>>> > of the first column in the result. In my opinion, this limits the utility
>>> > of the converter, because it returns only one value per row and loses the
>>> > other information in the record, such as column:cell and timestamp.
>>> >
>>> > Therefore, I would like to propose some modifications to
>>> > HBaseResultToStringConverter so that it can return all records in
>>> > HBase with more complete information. I have already written some code in
>>> > pythonConverters.scala
>>> > <https://github.com/GenTang/spark_hbase/blob/master/src/main/scala/examples/pythonConverters.scala>
>>> > and it works.
>>> >
>>> > Is it OK to modify the code in HBaseConverters.scala, please?
>>> > Thanks a lot in advance.
>>> >
>>> > Cheers
>>> > Gen
>>> >
>>> >
>>> >
>>> >
>>> > --
>>> > View this message in context:
>>> >
>>> http://apache-spark-developers-list.1001551.n3.nabble.com/python-converter-in-HBaseConverter-scala-spark-examples-tp10001.html
>>> > Sent from the Apache Spark Developers List mailing list archive at
>>> > Nabble.com.
>>> >
>>> >
>>> >
>>>
>>
>>
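
For readers following the thread, here is a minimal Python sketch of the
limitation described in the original message and of the kind of change
proposed. It uses no Spark or HBase APIs. The Cell type, the function names,
the sample data, and the newline-separated JSON encoding are all assumptions
made for illustration, not the code in HBaseConverters.scala or in the linked
spark_hbase project.

```python
import json
from typing import Dict, List, NamedTuple


class Cell(NamedTuple):
    """Hypothetical stand-in for one HBase cell; not an HBase or Spark class."""
    row: str
    family: str
    qualifier: str
    timestamp: int
    value: str


def first_value_only(cells: List[Cell]) -> str:
    """Mimics the limitation: keep only the value of the first cell in the row."""
    return cells[0].value if cells else ""


def all_cells_as_json_lines(cells: List[Cell]) -> str:
    """Illustrative alternative: one JSON object per cell, newline-separated."""
    return "\n".join(json.dumps(cell._asdict(), sort_keys=True) for cell in cells)


def parse_json_lines(text: str) -> List[Dict[str, object]]:
    """What the Python side could do to recover every cell from the string."""
    return [json.loads(line) for line in text.splitlines() if line]


# Made-up sample row with two columns in the same column family.
row = [
    Cell("row1", "f1", "a", 1420466400000, "value1"),
    Cell("row1", "f1", "b", 1420466400001, "value2"),
]

# The first approach drops column "b" and all metadata.
print(first_value_only(row))

# The second approach keeps every cell along with its column and timestamp.
for record in parse_json_lines(all_cells_as_json_lines(row)):
    print(record["family"], record["qualifier"], record["timestamp"], record["value"])
```

Encoding each cell as a string keeps the converter's output a plain string,
which is the simplest thing to hand across to Python; whether a real converter
should use this encoding is a design choice this sketch does not settle.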