I've loaded tab-separated files into S3, organized under the bucket like this: bucket --> se --> y=2013 --> m=07 --> d=14 --> h=00.
Each subfolder contains one file representing one hour of my traffic.
I then created an EMR workflow to run Hive in interactive mode.
When I log in to the master node and start Hive, I run this command:
CREATE EXTERNAL TABLE se (
id bigint,
oc_date timestamp)
partitioned by (y string, m string, d string, h string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION 's3://bi_data';
I get this error message:
FAILED: Error in metadata: java.lang.IllegalArgumentException: The bucket name parameter must be specified when listing objects in a bucket
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
Can anybody help?
UPDATE: Even if I use string fields only, I get the same error. Create table with strings:
CREATE EXTERNAL TABLE se (
id string,
oc_date string)
partitioned by (y string, m string, d string, h string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION 's3://bi_data';
Hive version: 0.8.1.8
So, the solution is that I had made two mistakes:
1. When specifying only the bucket name, the S3 path must end with a trailing slash (reference here).
2. The underscore is also a problem: the bucket name must be DNS-compliant.
Hope this helps someone.
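Putting both fixes together, the CREATE TABLE would look something like this. This is only a sketch: `bi-data` is a hypothetical DNS-compliant rename of the original `bi_data` bucket, and the location points at the `se` prefix where the hourly partition folders live:

```sql
-- Assumes the bucket was renamed to a DNS-compliant name (no underscore)
-- and that the LOCATION ends with a trailing slash.
CREATE EXTERNAL TABLE se (
  id bigint,
  oc_date string)
PARTITIONED BY (y string, m string, d string, h string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION 's3://bi-data/se/';
```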
DATE, DATETIME, and TIMESTAMP types aren't supported yet; please use STRING instead, or kindly provide your Hive version. Thanks.
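One more thing worth noting for this y=/m=/d=/h= layout: Hive does not automatically discover the hourly folders of a partitioned external table, so each partition still has to be registered after the table is created. A sketch, again using the hypothetical DNS-compliant bucket name `bi-data`:

```sql
-- Register the hourly folder from the question as a partition;
-- Hive resolves its location under the table's LOCATION as
-- s3://bi-data/se/y=2013/m=07/d=14/h=00
ALTER TABLE se ADD PARTITION (y='2013', m='07', d='14', h='00');
```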