"value" is required in a "property" tag. A property tag is name and value. This special property defines that whenever queried on order_time time dimension, try to answer with facts that have "ot" as partition column. If not, it falls back to delivery_time time dimension. There again, the mapping is checked, which is "dt", so facts that have "dt" as partition column are preferred. This can happen in a chain until a valid combination is found which can answer the query. 


On Wed, Aug 10, 2016 at 12:13 AM Tao Yan <tyan@linkedin.com> wrote:
So, the field 'value' can be used to define both alias and real values?

On Tue, Aug 9, 2016 at 11:40 AM, Rajat Khandelwal <prongs@apache.org> wrote:

Yes, that is correct.


On Wed, Aug 10, 2016, 00:09 Tao Yan <tyan@linkedin.com> wrote:
Thanks for the explanation. I got the idea of tentative partitions.

Is the property 
<property name="cube.timedim.partition.order_time" value="ot" />
 referring to the dim_attribute
<dim_attribute name="order_time" _type="TIMESTAMP" /> ?

Is the value 'ot' an alias of the partition column name?


On Tue, Aug 9, 2016 at 12:47 AM, Rajat Khandelwal <prongs@apache.org> wrote:
The idea of partitions is that there is a stream of data that keeps flowing in, each line of which has delivery time and order time both. The data is partitioned and stored in partitions later and facts are created on top of that partitioned data. So if data is partitioned in order time and the time range queried is also order time, then we can pick the partitions directly. But if the data is partitioned on delivery time, as is the case for this fact, and the query is on order time, then you pick some tentative partitions of delivery time, but you still have to filter over order time. Here the purpose of picking partitions is to reduce the amount of data read. When you pick delivery time partitions, it doesn't guarantee what order times data has, it only provides a tentative range. Hence, the partitions are picked based on the relation of time dimensions, but the extra filter still has to be supplied. 

Hope it's clear.  

On Tue, Aug 9, 2016 at 10:30 AM Rajat Khandelwal <prongs@apache.org> wrote:

That's the original time range asked in the cube query.


On Tue, Aug 9, 2016, 01:46 Tao Yan <tyan@linkedin.com> wrote:
Hi Rajat,

Thanks for the explain. I have a clear understanding about dim relation now, however, how does the following condition generated:

and ((sales.order_time) >= '2015-04-13 03:00:00') and ((sales.order_time) < '2015-04-13 04:00:00'))) 
order_time is a dim attributes defined in sales cube, and it does not have any relationships with properties or other dim attributes (The property cube.timedim.partition.order_time does, but is it the same thing?), I wonder how the query could pick up this column.

Thanks,

On Mon, Aug 8, 2016 at 4:12 AM, Rajat Khandelwal <prongs@apache.org> wrote:
I've created a jira for this: https://issues.apache.org/jira/browse/LENS-1269

You should be able to make change in your data locally and see the non-empty results: 

cat /Users/rajat.khandelwal/Git/lens/lens-dist/target/apache-lens-2.7.0-SNAPSHOT-bin/apache-lens-2.7.0-SNAPSHOT-bin/client/examples/data/sales_aggr_fact2_local3/sales-aggr-fact2-local3.data
2015-04-13 03:00:00,2015-04-13 04:00:00,1,1,1,1,1,1,5,0
2015-04-13 03:00:00,2015-04-13 04:00:00,2,1,2,2,2,1,8,2


On Mon, Aug 8, 2016 at 2:46 PM Rajat Khandelwal <prongs@apache.org> wrote:
More details on how it converts to the fallback range: 

Range asked: 
(order_time, '2015-04-13-03', '2015-04-13-04')

Timedim Relation : 
<property name="cube.timedim.relation.order_time" value="delivery_time+[-20 days,-1 hour]" />
<property name="cube.timedim.relation.order_time" value="delivery_time+[-2 hours,-1hour]" />
The cube defines the first relation, the fact defines the second relation. A fact can optionally override the timedim relation defined by cube. So the second one is picked : 
Derivation:

2015-04-13-03 <=ot < 2015-04-13-04
dt - 2 hours <= ot <= dt - 1 hour => ot + 1 hour <= dt <= ot + 2 hour
=> 2015-04-13-04 <= dt < 2015-04-13-06. 

Hope it's clearer now. 


On Mon, Aug 8, 2016 at 1:29 PM Rajat Khandelwal <prongs@apache.org> wrote:
Hi 

So it seems right now the query is designed to return 0 rows. The query is there just to demonstrate the fallback feature. 

Cube query is 

cube select customer_city_name, store_cost from sales where time_range_in(order_time, '2015-04-13-03', '2015-04-13-04')
Driver query comes out to be
SELECT (customer_city.name), sum((sales.store_cost)) FROM a.local_sales_aggr_fact2 sales join a.local_city_table customer_city on sales.customer_city_id = customer_city.id and (customer_city.dt = 'latest') WHERE (((((sales.dt) = '2015-04-13-04') or ((sales.dt) = '2015-04-13-05')) and ((sales.order_time) >= '2015-04-13 03:00:00') and ((sales.order_time) < '2015-04-13 04:00:00'))) GROUP BY (customer_city.name)
Then I modified the driver query and ran: 
SELECT (customer_city.name), sales.order_time, sales.store_cost FROM a.local_sales_aggr_fact2 sales join a.local_city_table customer_city on sales.customer_city_id = customer_city.id and (customer_city.dt = 'latest') WHERE (((((sales.dt) = '2015-04-13-04') or ((sales.dt) = '2015-04-13-05'))))
Found the following results:

Bangalore^A2015-04-13 00:00:00^A0.0
Hyderabad^A2015-04-13 00:00:00^A2.0

It's apparent that both the rows will be filtered out by the translated query. 

I'll look deeper and replace the example query with something that demonstrates the fallback feature and also returns non-empty results. 

Let us know if you have any confusion understanding the feature itself. I think the video should have covered it. 


Regards



--

Tao Yan
Software Engineer
Data Analytics Infrastructure Tools and Services



https://www.linkedin.com/in/taousc



--

Tao Yan
Software Engineer
Data Analytics Infrastructure Tools and Services



https://www.linkedin.com/in/taousc



--

Tao Yan
Software Engineer
Data Analytics Infrastructure Tools and Services



206.250.5345
https://www.linkedin.com/in/taousc