Hi Lens Developers,
How does 'update_period' work in storage tables? For example, I defined
the following storage table:
<storage_tables>
<storage_table>
<update_periods>
<update_period>DAILY</update_period>
</update_periods>
<storage_name>holdem</storage_name>
<table_desc external="true" field_delimiter=","
collection_delimiter=":"
table_location="hdfs://*********/dimension1/division2/20160825">
<part_cols>
<column comment="Time column" name="dt" _type="STRING"/>
</part_cols>
<time_part_cols>dt</time_part_cols>
</table_desc>
</storage_table>
</storage_tables>
I then added a partition for the data at the same location:
<x_partition fact_or_dimension_table_name="dimension1_division2"
location="hdfs://*********/dimension1/division2/20160825"
update_period="DAILY"
xmlns="uri:lens:cube:0.1"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="uri:lens:cube:0.1 cube-0.1.xsd">
<time_partition_spec>
<part_spec_element key="dt" value="2016-08-25T00:00:00"/>
</time_partition_spec>
</x_partition>
Then, if my ETL job generates the next day's data in this location:
hdfs://*********/dimension1/division2/20160826
will Lens automatically add the partition to the storage table with
dt=2016-08-26T00:00:00? If yes, is the folder name 20160826 configurable?
If not, how does Lens handle the update periods?
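To make the question concrete: if Lens does not register it automatically, the
partition for the next day would presumably have to be added by hand, mirroring
the one above. This is only a sketch, assuming the same attributes apply and
that only the folder name and the dt value change:

```xml
<!-- Hypothetical next-day partition, copied from the 2016-08-25 one;
     only the location folder and the dt value differ. -->
<x_partition fact_or_dimension_table_name="dimension1_division2"
location="hdfs://*********/dimension1/division2/20160826"
update_period="DAILY"
xmlns="uri:lens:cube:0.1"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="uri:lens:cube:0.1 cube-0.1.xsd">
<time_partition_spec>
<part_spec_element key="dt" value="2016-08-26T00:00:00"/>
</time_partition_spec>
</x_partition>
```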
Thanks,
--
*Tao Yan*
Software Engineer
Data Analytics Infrastructure Tools and Services
206.250.5345
tyan@linkedin.com
https://www.linkedin.com/in/taousc