Tuesday, 27 May 2014

Java - Custom UserDefinedFunction in Hive - Stack Overflow


Problem Statement


I created the UserDefinedFunction below to get yesterday's date in whatever format I need; the format is passed to the method from the query.


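import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import org.apache.hadoop.hive.ql.exec.UDF;

public final class YesterdayDate extends UDF {

    // Returns yesterday's date formatted with the given SimpleDateFormat pattern.
    public String evaluate(final String format) {
        DateFormat dateFormat = new SimpleDateFormat(format);
        Calendar cal = Calendar.getInstance();
        cal.add(Calendar.DATE, -1);
        return dateFormat.format(cal.getTime());
    }
}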
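One way to rule out the date logic itself is to run it outside Hive with a fixed reference day. The following is only a sketch: the class YesterdayDateCheck and the method dayBefore are hypothetical names that are not part of the original UDF, and the reference calendar is passed in as a parameter purely so that the result can be checked.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;

// Hypothetical helper (not part of the original UDF): same logic as evaluate(),
// but the reference day is supplied by the caller instead of Calendar.getInstance().
final class YesterdayDateCheck {

    static String dayBefore(Calendar reference, String format) {
        Calendar cal = (Calendar) reference.clone(); // leave the caller's calendar untouched
        cal.add(Calendar.DATE, -1);                  // step back one calendar day
        SimpleDateFormat fmt = new SimpleDateFormat(format);
        fmt.setTimeZone(cal.getTimeZone());          // format in the calendar's own time zone
        return fmt.format(cal.getTime());
    }

    public static void main(String[] args) {
        // Aug 6th 2012 at noon, the example day used in the question.
        Calendar aug6 = new GregorianCalendar(2012, Calendar.AUGUST, 6, 12, 0);
        System.out.println(dayBefore(aug6, "yyyyMMdd"));
        // The value the UDF would produce if it were called right now.
        System.out.println(dayBefore(Calendar.getInstance(), "yyyyMMdd"));
    }
}
```

If the first line printed matches the dt value stored in the partition (20120805 in the example), the formatting is probably not the cause, and the way Hive evaluates the function against the partition column is the next thing to look at.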

Whenever I run the query below, after adding the jar to the classpath and creating the temporary function yesterdaydate, I always get zero results back:


hive> create temporary function yesterdaydate as 'com.example.hive.udf.YesterdayDate';
OK
Time taken: 0.512 seconds

This is the query I am running:


hive> SELECT * FROM REALTIME where dt= yesterdaydate('yyyyMMdd') LIMIT 10;
OK

I always get zero results back, even though the table contains data for Aug 5th.


What am I doing wrong? Any suggestions would be appreciated.


If today's date is Aug 6th, the query using the user-defined function above should be equivalent to this one:


SELECT * FROM REALTIME where dt= '20120805' LIMIT 10;

NOTE: I am working with Hive 0.6, which does not support variable substitution, so I cannot use hiveconf here. The table above is partitioned on the dt (date) column.




SELECT FROM_UNIXTIME(UNIX_TIMESTAMP()-1*24*60*60,'yyyyMMdd');
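
The expression above subtracts one day's worth of seconds (1*24*60*60) from the current Unix timestamp and formats the result. The same arithmetic can be sketched in Java to see what string it yields. The class EpochYesterday and the method dayBefore are hypothetical names used only for illustration, the pattern is written with SimpleDateFormat letters (yyyyMMdd), and the time zone is an explicit parameter here, which is an assumption made so that the result is reproducible.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical illustration of "timestamp minus 24 hours, then format".
final class EpochYesterday {

    static final long SECONDS_PER_DAY = 24L * 60 * 60;

    static String dayBefore(long epochSeconds, String pattern, TimeZone zone) {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern);
        fmt.setTimeZone(zone);
        // java.util.Date expects milliseconds, so convert after subtracting one day.
        return fmt.format(new Date((epochSeconds - SECONDS_PER_DAY) * 1000L));
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis() / 1000L; // current Unix time in seconds
        System.out.println(dayBefore(now, "yyyyMMdd", TimeZone.getDefault()));
    }
}
```

Note that subtracting a fixed 86400 seconds is not quite the same as stepping back one calendar day: around a daylight-saving change the two can differ for timestamps close to midnight.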




