lens-user mailing list archives

From <Sayantan.R...@cognizant.com>
Subject RE: Need help and guidance on how to get data to Lens
Date Fri, 23 Sep 2016 07:07:49 GMT
Thanks a lot, Puneet, for the quick turnaround.

Regards
Sayantan

From: Puneet Gupta [mailto:puneet.gupta@inmobi.com]
Sent: Friday, September 23, 2016 12:36 PM
To: user@lens.apache.org
Subject: Re: Need help and guidance on how to get data to Lens

Hi Sayantan

1. Custom file formats like ORC should be configured in the fact definition. You'll
find a storage_table definition inside the fact XML; it should specify the file format.
Example (this one uses RCFile):
<storage_table>
    <update_periods>
        <update_period>HOURLY</update_period>
        <update_period>DAILY</update_period>
    </update_periods>
    <storage_name>org_hdfs</storage_name>
    <table_desc compressed="false" external="true"
        input_format="org.apache.hadoop.hive.ql.io.RCFileInputFormat"
        num_buckets="0"
        output_format="org.apache.hadoop.hive.ql.io.RCFileOutputFormat"
        serde_class_name="org.apache.hadoop.hive.serde2.columnar.LazyBinaryColumnarSerDe"
        storage_handler_name="">
        <part_cols>
            <column comment="date partition" name="dt" _type="string"/>
        </part_cols>
        <table_parameters>
            <property name="cube.storagetable.start.times" value="2016-01-01"/>
        </table_parameters>
        <time_part_cols>dt</time_part_cols>
    </table_desc>
</storage_table>
Note: Fact tables are allowed to evolve over time and can have multiple storages. Each storage
can have a different file format and start time. If required, the Lens server will stitch the
query across fact storages based on the time range.
Further, different data partitions of a storage can also have an evolving schema (new fields
can be added and will be valid from each field's start time).
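
For the ORC tables asked about in the original question, the same storage_table shape should presumably work with Hive's ORC classes swapped in. The following is a sketch only: the class names are Hive's standard ORC input format, output format, and SerDe, while the storage name orc_hdfs and the start date are hypothetical placeholders, and the fragment has not been tested against a Lens server.

```xml
<!-- Sketch: an additional storage for the same fact, holding ORC data. -->
<storage_table>
    <update_periods>
        <update_period>DAILY</update_period>
    </update_periods>
    <!-- hypothetical storage name; it should refer to a storage already defined in Lens -->
    <storage_name>orc_hdfs</storage_name>
    <table_desc external="true"
        input_format="org.apache.hadoop.hive.ql.io.orc.OrcInputFormat"
        output_format="org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat"
        serde_class_name="org.apache.hadoop.hive.ql.io.orc.OrcSerde">
        <part_cols>
            <column comment="date partition" name="dt" _type="string"/>
        </part_cols>
        <table_parameters>
            <!-- hypothetical start date for this storage -->
            <property name="cube.storagetable.start.times" value="2016-07-01"/>
        </table_parameters>
        <time_part_cols>dt</time_part_cols>
    </table_desc>
</storage_table>
```

Listing this next to the RCFile storage_table in the same fact would give two storages with different file formats and start times, which is the case the note above describes.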

2. Lens already supports the following drivers: Hive, JDBC, Elasticsearch, and Druid (via
the JDBC driver and plyql). The doc is available here: https://cwiki.apache.org/confluence/display/LENS/Lens+Driver
For Oracle you should be able to use the JDBC driver (it has been tested with HSQLDB, Vertica,
InfoBright, and MySQL).
For Teradata, either you should be able to use the JDBC driver (not sure) or you can implement
your own Teradata driver for Lens.
To add a new driver type you will have to configure the driver implementation in lens-site.xml:
<property>
<name>lens.server.drivers</name>
<value>hive:org.apache.lens.driver.hive.HiveDriver,jdbc:org.apache.lens.driver.jdbc.JDBCDriver</value>
<description>Drivers enabled for this lens server instance</description>
</property>
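
For Oracle, the connection settings would go in the JDBC driver's own site file. A sketch, with assumptions flagged: the property names below are recalled from the Lens JDBC driver defaults and should be checked against the driver doc linked above; the host, service name, and user are hypothetical; the driver class and URL shape are the standard ones for Oracle's thin JDBC driver, whose jar would also need to be on the Lens server classpath.

```xml
<!-- Sketch of JDBC driver settings for Oracle (e.g. in jdbcdriver-site.xml).
     Property names are assumptions to verify against the Lens driver doc. -->
<configuration>
    <property>
        <name>lens.driver.jdbc.driver.class</name>
        <value>oracle.jdbc.OracleDriver</value>
    </property>
    <property>
        <name>lens.driver.jdbc.db.uri</name>
        <!-- hypothetical host and service name -->
        <value>jdbc:oracle:thin:@//oracle-host.example.com:1521/ORCLPDB1</value>
    </property>
    <property>
        <name>lens.driver.jdbc.db.user</name>
        <!-- hypothetical user -->
        <value>lens_reader</value>
    </property>
</configuration>
```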


Thanks,
Puneet Gupta

On Fri, Sep 23, 2016 at 11:08 AM, <Sayantan.Raha@cognizant.com> wrote:
Hi,

I need some guidance on how to bring data from Hive tables into Lens. The tables store
data in ORC format.
What I have been able to do is point to the HDFS data file locations and use them in Lens storage
specs. This works for delimited files but not for ORC or other formats.

Can you please provide some reference/guidance on the following:

1. How to integrate Hive tables with custom SerDes/formats
   a. What customizations to the Lens config XMLs are needed, if any?
   b. Can I do this only via the Java APIs? Please provide the API link if available.
   c. Or can this also be done via the Lens CLI and XMLs? Please provide the API link if available.

2. How to integrate third-party DBs like Oracle/Teradata into Lens
   a. What customizations to the Lens config XMLs are needed, if any?
   b. Can I do this only via the Java APIs?
   c. Or can this also be done via the Lens CLI and XMLs?

Even if you only have relevant links pointing to sources, please provide them. Thanks for your
help as always.

Regards
Sayantan Raha
This e-mail and any files transmitted with it are for the sole use of the intended recipient(s)
and may contain confidential and privileged information. If you are not the intended recipient(s),
please reply to the sender and destroy all copies of the original message. Any unauthorized
review, use, disclosure, dissemination, forwarding, printing or copying of this email, and/or
any action taken in reliance on the contents of this e-mail is strictly prohibited and may
be unlawful. Where permitted by applicable law, this e-mail and other e-mail communications
sent to and from Cognizant e-mail addresses may be monitored.


_____________________________________________________________
The information contained in this communication is intended solely for the use of the individual
or entity to whom it is addressed and others authorized to receive it. It may contain confidential
or legally privileged information. If you are not the intended recipient you are hereby notified
that any disclosure, copying, distribution or taking any action in reliance on the contents
of this information is strictly prohibited and may be unlawful. If you have received this
communication in error, please notify us immediately by responding to this email and then
delete it from your system. The firm is neither liable for the proper and complete transmission
of the information contained in this communication nor for any delay in its receipt.