It would be excellent if I could declare variables somehow. I don't know how this would be implemented, but I use Presto so much for repeated queries that it would be nice to be able to just change variables at the top instead of manually changing values (for instance, dates that are used many times throughout the query) by hand.
10 votes
We are reviewing technical feasibility & options for supporting variables / parameters in Presto.
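Until variables are supported in the engine, one possible workaround is to substitute values on the client before submitting the query. Below is a minimal sketch using Python's standard `string.Template`. The table name `www_access` and the placeholder names are hypothetical, and plain substitution does no escaping, so it should only be given trusted values.

```python
from string import Template

# Hypothetical query: the table name and placeholders are illustrative only.
QUERY = Template("""
SELECT COUNT(1) AS cnt
FROM www_access
WHERE TD_TIME_RANGE(time, '$start_date', '$end_date', '$tz')
""")


def render(start_date, end_date, tz="UTC"):
    """Fill in the placeholders; substitute() raises KeyError if one is missing."""
    return QUERY.substitute(start_date=start_date, end_date=end_date, tz=tz)


# Change the values once here instead of editing every occurrence in the query.
sql = render("2015-08-01", "2015-09-01")
print(sql)
```

The rendered string can then be submitted as an ordinary Presto query.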
This should be pretty self-explanatory. Currently TD supports only one timestamp column, while many tables have more than one timestamp field. It's clumsy to have to store timestamps as strings and then convert them each time the field is used.
10 votes
Currently exploring feasibility of multiple timestamp columns.
If a Hive or Presto query runs for an unexpectedly long time, we would like to kill it automatically.
So it would be nice if we could set a timeout as a query option.
For example, with a 30-minute timeout, a Hive/Presto query would be killed automatically once it had been running for more than 30 minutes.
8 votes
It would be nice to support UDFs in other languages, like plpythonu, not only SQL-based UDFs.
Of course, I don't mind if it doesn't support file I/O or network access.
7 votes
Thanks for the suggestion! This is something we are actively discussing internally.
Right now, the two location UDFs report slightly different names for some countries. Ones we've found are:
'United States of America' vs 'United States'
'South Korea' vs 'Republic of Korea'
'Hong Kong S.A.R.' vs 'Hong Kong'
Ideally, there should be a single standard, at least within TD, for full country names.
5 votes
Reviewing feasibility based on the underlying data sources.
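In the meantime, the differences listed above could be smoothed over on the client side. This is a minimal sketch, not a TD feature: it covers only the three pairs reported in this idea, and which spelling counts as canonical is an arbitrary choice made for illustration.

```python
# Alias -> canonical name. Only the three pairs reported above are covered;
# the choice of which side is "canonical" is an assumption.
CANONICAL_COUNTRY = {
    "United States of America": "United States",
    "Republic of Korea": "South Korea",
    "Hong Kong S.A.R.": "Hong Kong",
}


def normalize_country(name):
    """Return the canonical spelling, or the name unchanged if it is not a known alias."""
    return CANONICAL_COUNTRY.get(name, name)


for raw in ("United States of America", "South Korea", "Hong Kong S.A.R."):
    print(raw, "->", normalize_country(raw))
```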
Since Hive supports the `with` statement, it might be easy to support views by emulating them internally.
I need views because they would help me write shorter queries.
4 votes
Thanks for the feedback! This will be part of our new Console.
In Japan, we currently have to add 'JST' as the last argument every time we use TD time UDFs such as TD_TIME_ADD, TD_TIME_RANGE, and TD_TIME_FORMAT. It would be useful to be able to set the default time zone to JST.
4 votes
It would be nice to support clustering algorithms, such as k-means, as Hivemall UDFs.
4 votes
Oracle has a pivot function which makes it easy to take values in a particular column and turn those values into columns. Here is a description of that function: http://www.techonthenet.com/oracle/pivot.php
It would be amazing if we could have this functionality on TD too.
4 votes
Let's say we have a table AAA without a column aaa. Then
'select aaa from AAA' fails with an error like the following:
Query 20150821_071516_06355_77wcr failed: Column 'aaa' cannot be resolved
When we issue the SQL with retry 1, it returns immediately, but with retry 5, for example, we have to wait a long time.
The retry feature should understand which types of errors are worth retrying.
A syntax error should be returned immediately regardless of the retry setting.
3 votes
Investigating to determine feasibility & effort.
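The requested behaviour can be sketched as a retry loop that gives up immediately on errors that would fail the same way every time. This is an illustration only, not how TD's retry is implemented: `QueryError`, `run_with_retry`, and the marker strings are hypothetical, with the first marker taken from the error message quoted in this idea.

```python
class QueryError(Exception):
    """Hypothetical error type carrying the engine's failure message."""


# Assumption: messages containing these markers fail identically on every attempt.
NON_RETRYABLE_MARKERS = ("cannot be resolved", "syntax error")


def is_retryable(message):
    """Treat an error as retryable unless it matches a known deterministic failure."""
    lowered = message.lower()
    return not any(marker in lowered for marker in NON_RETRYABLE_MARKERS)


def run_with_retry(run_query, retry_limit):
    """Call run_query() up to retry_limit + 1 times, stopping early on deterministic errors."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return run_query()
        except QueryError as err:
            # Re-raise at once for deterministic errors or when retries are used up.
            if not is_retryable(str(err)) or attempts > retry_limit:
                raise
```

With this shape, a query that fails with "Column 'aaa' cannot be resolved" is attempted once even when the retry limit is 5, while other failures are still retried.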
Currently, Treasure Data can delete data only by the time column,
and doing so requires the td table:partial_delete command.
We would like to delete data using Hive/Presto instead,
and to use arbitrary conditions to select the data to delete.
3 votes
We can get the country from an IP address using the TD_IP_TO_COUNTRY_CODE function. Getting the city name from an IP address would also be very useful for analyzing data in detail.
3 votes
When using Presto's create/insert queries, it is hard to find the number of inserted records.
We would like an easy way to find it.
3 votes
Hive jobs currently don't have a "#records" field available.
2 votes
Get a notification when throughput is zero records rather than when there is no communication between TD and the TD-Agent server. Our Replica DB failed and I spent a while staring at our TD configuration when the problem was obviously the DB.
2 votes
Customers with regulatory requirements might need certain data fields to be encrypted on disk. The question is how to provide encryption with minimal impact on the performance and usability of the system.
2 votes
This is scheduled to be delivered in February.
Support TD_TIME_ADD(time, '-1M') to get last month's data in Hive/Presto.
Currently, we need to write the following:
), -- 1st day of this month
), -- last day of last month
), -- 1st day of last month
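For comparison, the three dates named in the comments above can be computed on the client and passed into the query as literals. This is a minimal sketch using only Python's standard library; the function name `month_boundaries` is hypothetical, and it works in plain calendar dates without any time zone handling.

```python
from datetime import date, timedelta


def month_boundaries(today):
    """Return (1st day of this month, last day of last month, 1st day of last month)."""
    first_of_this_month = today.replace(day=1)
    # Stepping back one day from the 1st always lands on the previous month's last day.
    last_of_last_month = first_of_this_month - timedelta(days=1)
    first_of_last_month = last_of_last_month.replace(day=1)
    return first_of_this_month, last_of_last_month, first_of_last_month


this_first, last_end, last_first = month_boundaries(date(2015, 8, 21))
print(this_first, last_end, last_first)
```

Stepping back from the first of the month avoids hard-coding month lengths and handles the January-to-December rollover.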