Saturday, May 31, 2014


I'm working with Hive and I have a table structured as follows:


CREATE TABLE t1 (
id INT,
created TIMESTAMP,
some_value BIGINT
);

I need to find every row in t1 that is less than 180 days old. The following query yields no rows even though there is data present in the table that matches the search predicate.


select * 
from t1
where created > date_sub(from_unixtime(unix_timestamp()), 180);

What is the appropriate way to perform a date comparison in Hive?




How about:


where unix_timestamp() - created < 180 * 24 * 60 * 60

Date math is usually simplest if you can just do it with the actual timestamp values.
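To make the arithmetic concrete, here is a minimal Python sketch of the same comparison done outside Hive. It assumes both values are plain epoch seconds; the helper name is_recent and the sample clock value are made up for illustration and are not part of Hive.

```python
WINDOW_SECONDS = 180 * 24 * 60 * 60  # 180 days expressed in seconds

def is_recent(created_epoch: int, now_epoch: int) -> bool:
    """Return True when the row is strictly less than 180 days old."""
    return now_epoch - created_epoch < WINDOW_SECONDS

# Hypothetical "current time" standing in for unix_timestamp().
now = 1356742063

print(WINDOW_SECONDS)                       # 15552000
print(is_recent(now - 179 * 86400, now))    # True: 179 days old
print(is_recent(now - 180 * 86400, now))    # False: exactly 180 days old
```

The comparison only makes sense when both sides are in the same unit, which is why the column has to be available as epoch seconds for this approach to work.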


Or do you want it to only cut off on whole days? Then I think the problem is with how you are converting back and forth between ints and strings. Try:


where created > unix_timestamp(date_sub(from_unixtime(unix_timestamp(),'yyyy-MM-dd'),180),'yyyy-MM-dd')

Walking through each UDF:



  1. unix_timestamp() returns a BIGINT: the current time in seconds since the epoch

  2. from_unixtime(,'yyyy-MM-dd') converts it to a string in the given format, e.g. '2012-12-28'

  3. date_sub(,180) subtracts 180 days from that string and returns a new string in the same format

  4. unix_timestamp(,'yyyy-MM-dd') converts that string back to a BIGINT


If that's all getting too hairy, you can always write a UDF to do it yourself.
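The four steps above can be mimicked in Python to see what value the nested expression should produce. This is only a sketch of the logic, not Hive itself: the function name whole_day_cutoff is hypothetical, and it assumes the clock is on UTC, whereas Hive uses its own configured time zone.

```python
from datetime import datetime, timedelta, timezone

def whole_day_cutoff(now_epoch: int, days: int = 180) -> int:
    """Epoch seconds for midnight (UTC assumed) `days` days before today."""
    # Steps 1-2: take the current epoch seconds and format them as a
    # 'yyyy-MM-dd' string, which drops the time of day.
    day_str = datetime.fromtimestamp(now_epoch, tz=timezone.utc).strftime("%Y-%m-%d")
    # Step 3: subtract the days and format the result the same way.
    earlier = datetime.strptime(day_str, "%Y-%m-%d") - timedelta(days=days)
    earlier_str = earlier.strftime("%Y-%m-%d")
    # Step 4: parse the string back into epoch seconds (midnight of that day).
    midnight = datetime.strptime(earlier_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(midnight.timestamp())

cutoff = whole_day_cutoff(1356742063)  # 2012-12-29 00:47:43 UTC
print(cutoff)                                                               # 1341187200
print(datetime.fromtimestamp(cutoff, tz=timezone.utc).strftime("%Y-%m-%d"))  # 2012-07-02
```

Because the time of day is discarded in step 2, every moment on 2012-12-29 yields the same cutoff, which is the "whole days" behaviour asked about.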




I think maybe it's a Hive bug dealing with the timestamp type. I've been trying to use it recently and getting incorrect results. If I change your schema to use a string instead of timestamp, and supply values in the yyyy-MM-dd HH:mm:ss format, then the select query works for me.


According to the documentation, Hive should be able to convert a BIGINT representing epoch seconds to a timestamp, and all existing datetime UDFs should work with the timestamp data type.


With this simple query:



select from_unixtime(unix_timestamp()), cast(unix_timestamp() as timestamp) from test_tt limit 1;



I would expect both fields to be the same, but I get:



2012-12-29 00:47:43 1970-01-16 16:52:22.063



I'm seeing other weirdness as well.




TIMESTAMP is in milliseconds, while unix_timestamp() is in seconds.
You need to multiply the right-hand side by 1000.


where created > 1000 * date_sub(from_unixtime(unix_timestamp()), 180);
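The odd 1970 value quoted earlier in the thread is consistent with a seconds count being read as milliseconds. The Python sketch below checks the arithmetic; the epoch value 1356742063 is my own inference from the printed output, and UTC is assumed, so treat both as assumptions rather than facts about that cluster.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Assumed value returned by unix_timestamp() when the query ran.
now_seconds = 1356742063

# Correct reading: the number is a count of seconds.
as_seconds = EPOCH + timedelta(seconds=now_seconds)
# Mistaken reading: the same number treated as a count of milliseconds.
as_millis = EPOCH + timedelta(milliseconds=now_seconds)

print(as_seconds.strftime("%Y-%m-%d %H:%M:%S"))         # 2012-12-29 00:47:43
print(as_millis.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3])  # 1970-01-16 16:52:22.063
```

Both printed values match the two columns in the reported result, which supports the unit-mismatch explanation for that particular cast.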



After reviewing this and referring to "Date Difference less than 15 minutes in Hive", I came up with a solution. While I'm not sure why Hive doesn't perform the comparison correctly on dates as strings (they should sort and compare lexicographically), the following solution works:


FROM (
SELECT id, value,
unix_timestamp(created) c_ts,
unix_timestamp(date_sub(from_unixtime(unix_timestamp()), 180), 'yyyy-MM-dd') c180_ts
FROM t1
) x
JOIN t1 t ON x.id = t.id
SELECT to_date(t.Created),
x.id, AVG(COALESCE(x.HighestPrice, 0)), AVG(COALESCE(x.LowestPrice, 0))
WHERE unix_timestamp(t.Created) > x.c180_ts
GROUP BY to_date(t.Created), x.id ;



Alternatively, you can use datediff. For a string timestamp (JDBC format), the where clause would be:


datediff(from_unixtime(unix_timestamp()), created) < 180;

For Unix epoch time:


datediff(from_unixtime(unix_timestamp()), from_unixtime(created)) < 180;
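One thing to keep in mind is that a day-count comparison is not the same as comparing elapsed seconds. The Python sketch below models that difference, assuming datediff counts calendar days and ignores the time of day, as the Hive documentation describes; datediff_days is a hypothetical stand-in, not the Hive function.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

def datediff_days(end: str, start: str) -> int:
    """Calendar days from start to end, ignoring the time-of-day parts."""
    end_day = datetime.strptime(end, FMT).date()
    start_day = datetime.strptime(start, FMT).date()
    return (end_day - start_day).days

now = "2012-12-29 00:47:43"

print(datediff_days(now, "2012-07-02 23:59:59"))  # 180
print(datediff_days(now, "2012-07-03 00:00:00"))  # 179
```

The first row is only about 179 days and 48 minutes old in elapsed time, yet its day count is 180, so a "< 180" filter on the day count would drop it while a seconds-based filter would keep it.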

