Tuesday, May 13, 2014

Google Apps Script - compress JSON payload to GZIP for loading into BigQuery with UrlFetchApp? (Utilities.zip not gzip compatible) - Stack Overflow


I'm at my wits' end here, so any pointers are much appreciated.


I'm querying the Google Analytics API, converting the response to the appropriate JSON format, and loading it into BigQuery via multipart requests with UrlFetchApp. But this runs me into the UrlFetchApp 100MB-per-day quota very quickly, so I'm looking at compressing the JSON to GZIP and loading that into BigQuery instead. (I considered Google Cloud Storage, but saving the data to GCS first requires UrlFetchApp as well, so I'd have the same problem; that's why this is a Google Apps Script issue.)


I've converted the data to a blob, zipped it using Utilities.zip, and sent the bytes, but after much debugging it turns out the resulting format is .zip, not gzip.
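
Roughly what that step looks like (a sketch reconstructed from the description above; the function wrapper and variable names are my assumptions, not the original code):

// Hedged sketch of the step described above: Utilities.zip wraps a blob in a
// ZIP archive (a PK container), not a raw gzip stream.
function zipJsonPayload(jsonString) {
  var jsonBlob = Utilities.newBlob(jsonString, 'application/json', 'data.json');
  var zipBlob  = Utilities.zip([jsonBlob], 'data.zip');   // produces .zip, not .gz
  return zipBlob.getBytes();                               // these are the zipBytes sent in the PUT below
}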


Here is the JSON string created in my Apps Script (NEWLINE_DELIMITED_JSON):


{"ga_accountname":"photome","ga_querycode":"493h3v63078","ga_startdate":"2013-10-23 00:00:00","ga_enddate":"2013-10-23 00:00:00","ga_segmentname":"#_all_visits","ga_segmentexp":"ga:hostname=~dd.com","ga_landingPagePath":"/","ga_pagePath":"/","ga_secondPagePath":"(not set)","ga_source":"(direct)","ga_city":"Boden","ga_keyword":"(not set)","ga_country":"Sweden","ga_pageviews":"1","ga_bounces":"0","ga_visits":"1"}

I've got the rest of the API requests worked out (using uploadType=resumable): the job configuration sends okay and the zipped blob bytes get uploaded okay, but BigQuery says "Input contained no data". Here are my UrlFetchApp parameters:


// Sending job configuration first
var url = 'https://www.googleapis.com/upload/bigquery/v2/projects/' + bqProjectId + '/jobs?uploadType=resumable';
var options = {
  'contentType': 'application/json; charset=UTF-8',
  'contentLength': newJobSize,
  'headers': {
    'Accept-Encoding': 'gzip, deflate',
    'Accept': 'application/json',
    'X-Upload-Content-Length': zipSize,
    'X-Upload-Content-Type': 'application/octet-stream'
  },
  'method': 'post',
  'payload': jobData,
  'oAuthServiceName': 'bigQuery',
  'oAuthUseToken': 'always'
};

// Sending job data
var url = jobReq.getHeaders().Location;

var options = {
  'contentType': 'application/octet-stream',
  'contentLength': zipSize,
  'contentRange': '0-' + zipSize,
  'method': 'put',
  'payload': zipBytes,
  'oAuthServiceName': 'bigQuery',
  'oAuthUseToken': 'always'
};

What options have I got? I'm fairly new to APIs, but can I get UrlFetchApp to compress the payload to GZIP for me?


