Hi everyone.
TL;DR
I have started developing an OpenTSDB storage plugin for Drill, available here:
https://github.com/mapr-demos/drill/tree/openTSDB-plugin/contrib/storage-opentsdb
This is a work in progress, and I have some ideas and questions; see below.
DETAILS
I am developing a storage plugin for the OpenTSDB time series database, and I have run into
some problems because the API that Drill expects and the API that OpenTSDB exposes are
completely different. OpenTSDB does not have a Java client or a JDBC driver, only a REST API.
Here is a sample JSON call to OpenTSDB:
{
"start": 1356998400,
"end": 1356998460,
"queries": [
{
"aggregator": "sum",
"metric": "sys.cpu.0",
"rate": "true",
"tags": {
"host": "*",
"dc": "lga"
}
},
{
"aggregator": "sum",
"tsuids": [
"000001000002000042",
"000001000002000043"
]
}
]
}
Sample query with filters:
{
"start": 1356998400,
"end": 1356998460,
"queries": [
{
"aggregator": "sum",
"metric": "sys.cpu.0",
"rate": "true",
"filters": [
{
"type":"wildcard",
"tagk":"host",
"filter":"*",
"groupBy":true
},
{
"type":"literal_or",
"tagk":"dc",
"filter":"lga|lga1|lga2",
"groupBy":false
}
]
},
{
"aggregator": "sum",
"tsuids": [
"000001000002000042",
"000001000002000043"
]
}
]
}
Sample response:
[
{
"metric": "tsd.hbase.puts",
"tags": {},
"aggregatedTags": [
"host"
],
"annotations": [
{
"tsuid": "00001C0000FB0000FB",
"description": "Testing Annotations",
"notes": "These would be details about the event, the description is just a summary",
"custom": {
"owner": "jdoe",
"dept": "ops"
},
"endTime": 0,
"startTime": 1365966062
}
],
"globalAnnotations": [
{
"description": "Notice",
"notes": "DAL was down during this period",
"custom": null,
"endTime": 1365966164,
"startTime": 1365966064
}
],
"tsuids": [
"0023E3000002000008000006000001"
],
"dps": {
"1365966001": 25595461080,
"1365966061": 25595542522,
"1365966062": 25595543979,
...
"1365973801": 25717417859
}
}
]
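To show what the plugin has to do with such a response, here is a minimal Python sketch that flattens it into one row per data point. The function name flatten_response and the column names (metric, tags, timestamp, value) are my own assumptions for illustration, not something the plugin or OpenTSDB defines.

```python
def flatten_response(series_list):
    """Flatten an OpenTSDB /api/query response into one row per data point.

    Assumption: each series is a dict with "metric", "tags" and "dps" keys,
    as in the sample response above. Column names are hypothetical.
    """
    rows = []
    for series in series_list:
        # "dps" maps a timestamp (as a string key) to a numeric value.
        for timestamp, value in series.get("dps", {}).items():
            rows.append({
                "metric": series["metric"],
                "tags": series.get("tags", {}),
                "timestamp": int(timestamp),
                "value": value,
            })
    return rows
```

Annotations and tsuids are ignored here; they would need their own columns if they are to be queryable.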
So the main problem is converting values from SQL syntax into OpenTSDB values and pushing them
to the API. Also, we do not have fixed columns: the tag column holds a map, and each tag can
be a search filter. This causes problems when we try to search using a WHERE clause.
A query string like where host = * and dc = lga should be transformed into this:
"tags": {
"host": "*",
"dc": "lga"
}
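The transformation above could look roughly like this in Python. This is only a sketch: the function name where_to_tags is hypothetical, it works on the raw clause text instead of a parsed expression tree, and it assumes the clause contains only equality conditions joined by AND.

```python
import re


def where_to_tags(where_clause):
    """Turn 'host = * and dc = lga' into {'host': '*', 'dc': 'lga'}.

    Assumption: only `key = value` conditions joined by AND are supported;
    OR, IN and range comparisons are out of scope for this sketch.
    """
    tags = {}
    # Split on the AND keyword, case-insensitively.
    conditions = re.split(r"\s+and\s+", where_clause.strip(), flags=re.IGNORECASE)
    for condition in conditions:
        key, sep, value = condition.partition("=")
        if not sep or not key.strip():
            raise ValueError("unsupported condition: %r" % condition)
        # Drop surrounding whitespace and optional SQL quotes around the value.
        tags[key.strip()] = value.strip().strip("'\"")
    return tags
```

For example, where_to_tags("host = * and dc = lga") gives the "tags" map shown above.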
I already have a working prototype available here:
https://github.com/mapr-demos/drill/tree/openTSDB-plugin/contrib/storage-opentsdb
It supports the following SQL statement:
SELECT * FROM <table_name:aggregation_function>;
Now I would like to go further and implement more time-series-related features, for example:
1- select avg|sum|min|max(speedmetric.value)
2- from openTSDB(metric=sensor.speed, downsample='1m', interpolate='avg') speedmetric
3- where speedmetric.tags.id in (001, 002)
4- and speedmetric.timestamp >='value' and speedmetric.timestamp <= 'value'
5- group by speedmetric.tags.hostname
Where:
1 - The aggregation function, which should be pushed down to the OpenTSDB REST call.
-> How can I override the aggregation function for my plugin?
2 - I am currently working on converting the string from this clause into a map to use in
the TSDB query.
3 - The tags we are searching for.
4 - The time period for the search. In fact, it is two timestamp values, “from” and “to”.
These values are required.
5 - I don’t know exactly how to transform this for the TSDB API.
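One possible mapping for points 1, 3, 4 and 5, sketched in Python: the IN list becomes a literal_or filter, the GROUP BY tag becomes a filter with "groupBy": true, and the timestamps become "start" and "end", following the filter sample earlier in this mail. The function build_query_body and its parameters are hypothetical, and the downsample and interpolate options from point 2 are not covered.

```python
def build_query_body(metric, aggregator, start, end, tag_in=None, group_by=()):
    """Build a POST body for /api/query from already-parsed SQL pieces.

    Assumptions (hypothetical inputs, not Drill API):
      tag_in   -- dict of tag name -> list of values from `tags.x IN (...)`
      group_by -- tag names from the GROUP BY clause
    """
    tag_in = tag_in or {}
    filters = []
    for tagk, values in tag_in.items():
        # IN (a, b) maps to a literal_or filter "a|b".
        filters.append({
            "type": "literal_or",
            "tagk": tagk,
            "filter": "|".join(values),
            "groupBy": tagk in group_by,
        })
    for tagk in group_by:
        if tagk not in tag_in:
            # Grouping on an unfiltered tag: match everything, group by it.
            filters.append({
                "type": "wildcard",
                "tagk": tagk,
                "filter": "*",
                "groupBy": True,
            })
    sub_query = {"aggregator": aggregator, "metric": metric, "filters": filters}
    return {"start": start, "end": end, "queries": [sub_query]}
```

For the example query, this would be called as build_query_body("sensor.speed", "avg", 1356998400, 1356998460, tag_in={"id": ["001", "002"]}, group_by=["hostname"]), and the result serialized with json.dumps before being posted.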
Currently, we use this syntax for a SELECT query with an aggregation function:
SELECT * FROM <table_name:aggregation_function>;
It is transformed into an API request such as
`/api/query?start=5y-ago&m=sum:warp.speed`, sent as a GET request. More complicated requests
should use POST requests.
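As an illustration of this transformation, here is a short Python sketch. The function name table_to_query_path is hypothetical, and the default start of 5y-ago is simply taken from the example request above.

```python
from urllib.parse import urlencode


def table_to_query_path(table, start="5y-ago"):
    """Map '<table_name:aggregation_function>' to a GET path for /api/query.

    Assumption: the table name is '<metric>:<aggregator>', e.g. 'warp.speed:sum'.
    """
    metric, sep, aggregator = table.partition(":")
    if not sep or not metric or not aggregator:
        raise ValueError("expected <table_name:aggregation_function>, got %r" % table)
    # OpenTSDB expects m=<aggregator>:<metric>; keep ':' unescaped for readability.
    query = urlencode({"start": start, "m": aggregator + ":" + metric}, safe=":")
    return "/api/query?" + query
```

Calling table_to_query_path("warp.speed:sum") reproduces the request path shown above.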
Many thanks, Dmitriy Gavrilovich
dhavrilovich@cybervisiontech.com