drill-dev mailing list archives

From Dmintriy Gavrilovich <dhavrilov...@cybervisiontech.com>
Subject OpenTSDB plugin development for Drill
Date Wed, 01 Feb 2017 10:01:10 GMT
Hi everyone. 

TL;DR

I have started to develop an OpenTSDB Plugin for Drill available here:
https://github.com/mapr-demos/drill/tree/openTSDB-plugin/contrib/storage-opentsdb

This is a work in progress, and I have some ideas and questions; see below.

DETAILS


I am developing a storage plugin for the OpenTSDB time series database, and I have run into
some problems because the API that Drill expects and the API that OpenTSDB exposes are
completely different.

OpenTSDB does not have a Java client or JDBC driver, only a REST API.
Here is a sample JSON call to TSDB:
{
    "start": 1356998400,
    "end": 1356998460,
    "queries": [
        {
            "aggregator": "sum",
            "metric": "sys.cpu.0",
            "rate": "true",
            "tags": {
                "host": "*",
                "dc": "lga"
            }
        },
        {
            "aggregator": "sum",
            "tsuids": [
                "000001000002000042",
                "000001000002000043"
            ]
        }
    ]
}
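
A body like the one above is sent as an HTTP POST to /api/query. The following is a minimal sketch of wrapping it in a request, assuming Python's standard library and a hypothetical TSD address (TSDB_URL); the helper name build_query_request is made up for illustration and is not part of the plugin.

```python
import json
import urllib.request

# Hypothetical TSD address (assumption); 4242 is OpenTSDB's usual default port.
TSDB_URL = "http://localhost:4242"


def build_query_request(body, base_url=TSDB_URL):
    """Wrap a query body in a POST request to /api/query without sending it."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url=base_url + "/api/query",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


body = {
    "start": 1356998400,
    "end": 1356998460,
    "queries": [
        {
            "aggregator": "sum",
            "metric": "sys.cpu.0",
            "rate": "true",
            "tags": {"host": "*", "dc": "lga"},
        }
    ],
}
request = build_query_request(body)

# To actually send it (needs a reachable TSD):
# with urllib.request.urlopen(request) as response:
#     series = json.load(response)
```

Building the request separately from sending it keeps the translation step testable without a running OpenTSDB instance.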

Sample query with filters:
{
    "start": 1356998400,
    "end": 1356998460,
    "queries": [
        {
            "aggregator": "sum",
            "metric": "sys.cpu.0",
            "rate": "true",
            "filters": [
                {
                   "type":"wildcard",
                   "tagk":"host",
                   "filter":"*",
                   "groupBy":true
                },
                {
                   "type":"literal_or",
                   "tagk":"dc",
                   "filter":"lga|lga1|lga2",
                   "groupBy":false
                }
            ]
        },
        {
            "aggregator": "sum",
            "tsuids": [
                "000001000002000042",
                "000001000002000043"
            ]
        }
    ]
}

Sample response: 
[
    {
        "metric": "tsd.hbase.puts",
        "tags": {},
        "aggregatedTags": [
            "host"
        ],
        "annotations": [
            {
                "tsuid": "00001C0000FB0000FB",
                "description": "Testing Annotations",
                "notes": "These would be details about the event, the description is just a summary",
                "custom": {
                    "owner": "jdoe",
                    "dept": "ops"
                },
                "endTime": 0,
                "startTime": 1365966062
            }
        ],
        "globalAnnotations": [
            {
                "description": "Notice",
                "notes": "DAL was down during this period",
                "custom": null,
                "endTime": 1365966164,
                "startTime": 1365966064
            }
        ],
        "tsuids": [
            "0023E3000002000008000006000001"
        ],
        "dps": {
            "1365966001": 25595461080,
            "1365966061": 25595542522,
            "1365966062": 25595543979,
...
            "1365973801": 25717417859
        }
    }
]
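
Since a SQL engine works on rows, one possible way to look at this response is one row per entry in "dps". The sketch below illustrates that idea only; the row field names (metric, tags, aggregated_tags, timestamp, value) are assumptions chosen for this example and are not the plugin's actual schema.

```python
def flatten_series(series_list):
    """Turn an /api/query response into one row per data point."""
    rows = []
    for series in series_list:
        for timestamp, value in series.get("dps", {}).items():
            rows.append({
                "metric": series["metric"],
                "tags": series.get("tags", {}),
                "aggregated_tags": series.get("aggregatedTags", []),
                # "dps" keys are epoch timestamps encoded as JSON strings.
                "timestamp": int(timestamp),
                "value": value,
            })
    # Sort so the row order does not depend on JSON object ordering.
    return sorted(rows, key=lambda row: (row["metric"], row["timestamp"]))


response = [
    {
        "metric": "tsd.hbase.puts",
        "tags": {},
        "aggregatedTags": ["host"],
        "dps": {"1365966001": 25595461080, "1365966061": 25595542522},
    }
]
rows = flatten_series(response)
```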

So the main problem is converting values from SQL syntax into OpenTSDB values and pushing them
to the API. We also do not have fixed columns: the tag column holds a map, and each tag can
be a search filter. This causes problems when we try to search using a WHERE clause.

A WHERE clause such as where host = * and dc = lga should be transformed like this:
"tags": {
    "host": "*",
    "dc": "lga"
}
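
The transformation above can be sketched for the simplest case, where the clause is only a chain of equality predicates joined by AND. This is an illustration on a plain string; inside Drill the plugin would more likely walk the parsed filter expression, and the function name where_to_tags is hypothetical.

```python
import re

# Matches one `name = value` predicate; the value may be bare, quoted, or `*`.
_PREDICATE = re.compile(r"^\s*([\w.\-/]+)\s*=\s*'?([^']+?)'?\s*$")


def where_to_tags(where_clause):
    """Convert `a = x and b = y` into an OpenTSDB "tags" map."""
    tags = {}
    parts = re.split(r"\s+and\s+", where_clause.strip(), flags=re.IGNORECASE)
    for part in parts:
        match = _PREDICATE.match(part)
        if match is None:
            # Anything other than simple equality is out of scope here.
            raise ValueError("unsupported predicate: " + part)
        tags[match.group(1)] = match.group(2)
    return tags


tags = where_to_tags("host = * and dc = lga")
```

Predicates other than equality (OR, IN, ranges) raise an error here rather than being silently dropped.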

I already have a working prototype available here:
https://github.com/mapr-demos/drill/tree/openTSDB-plugin/contrib/storage-opentsdb

It supports the following SQL statement:

SELECT * FROM <table_name:aggregation_function>;


Now I would like to go further and implement more time-series-related features, for example:

1- select avg|sum|min|max(speedmetric.value)
2- from openTSDB(metric=sensor.speed, downsample='1m', interpolate='avg') speedmetric 
3- where speedmetric.tags.id in (001, 002) 
4- and speedmetric.timestamp >='value' and speedmetric.timestamp <= 'value' 
5- group by speedmetric.tags.hostname


Where:

1 - The aggregation function should be pushed down to the OpenTSDB REST call.
-> How can I override the aggregation function for my plugin?

2 - I am currently working on converting the string from this clause into a map to use in the
TSDB query.
3 - The tags we are searching for.
4 - The time period for the search. In fact, it is two timestamp values, “from” and “to”.
These values are required.
5 - I don't know exactly how to transform this into a TSDB API call.
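
One possible mapping of the five parts onto a POST body is sketched below, reusing the filter fields from the sample query with filters: an IN list becomes a "literal_or" filter, and GROUP BY on a tag sets "groupBy" to true on that tag's filter. The helper names, their parameters, and the choice to map downsample='1m', interpolate='avg' to the OpenTSDB downsample string "1m-avg" are all assumptions made for illustration.

```python
def build_subquery(metric, aggregator, downsample=None, tag_in=None, group_by=()):
    """Assemble one entry of the "queries" array from parsed SQL parts."""
    filters = {}
    # WHERE tags.<key> IN (...) becomes a literal_or filter.
    for tagk, values in (tag_in or {}).items():
        filters[tagk] = {
            "type": "literal_or",
            "tagk": tagk,
            "filter": "|".join(values),
            "groupBy": False,
        }
    # GROUP BY tags.<key> sets groupBy on that key's filter, adding a
    # wildcard filter when the key has no WHERE restriction.
    for tagk in group_by:
        entry = filters.setdefault(
            tagk, {"type": "wildcard", "tagk": tagk, "filter": "*", "groupBy": False}
        )
        entry["groupBy"] = True
    subquery = {
        "aggregator": aggregator,
        "metric": metric,
        "filters": list(filters.values()),
    }
    if downsample:
        subquery["downsample"] = downsample
    return subquery


def build_query(start, end, subquery):
    """Wrap a sub-query with the required "start" and "end" timestamps."""
    return {"start": start, "end": end, "queries": [subquery]}


query = build_query(
    1356998400,
    1356998460,
    build_subquery(
        metric="sensor.speed",
        aggregator="avg",
        downsample="1m-avg",  # assumed mapping of downsample='1m', interpolate='avg'
        tag_in={"id": ["001", "002"]},
        group_by=["hostname"],
    ),
)
```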


For now, we use the following syntax to specify the aggregation function in a SELECT query:

SELECT * FROM <table_name:aggregation_function>;

The plugin transforms it into an API request such as
`/api/query?start=5y-ago&m=sum:warp.speed`, sent as a GET request. More complicated queries
should use POST requests.
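
As a rough sketch of this translation, the table name can be split at its last colon and reordered into the m parameter, which puts the aggregator first. The function name table_to_get_path is hypothetical, and treating "5y-ago" as a default start value is an assumption based on the example request.

```python
from urllib.parse import urlencode


def table_to_get_path(table, start="5y-ago"):
    """Map `<table_name:aggregation_function>` to an /api/query GET path."""
    metric, sep, aggregator = table.rpartition(":")
    if not sep or not metric or not aggregator:
        raise ValueError("expected <table_name:aggregation_function>, got " + table)
    # The m parameter has the form <aggregator>:<metric>; keep ':' unescaped.
    params = {"start": start, "m": aggregator + ":" + metric}
    return "/api/query?" + urlencode(params, safe=":")


path = table_to_get_path("warp.speed:sum")
```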

Many thanks, Dmitriy Gavrilovich
dhavrilovich@cybervisiontech.com