Core Python Programming: Google Drive

Thursday, June 1, 2017

Managing Shared (formerly Team) Drives with Python and the Google Drive API

2023 UPDATE: We are working to put updated versions of all the code into GitHub... stay tuned. The link will be provided in all posts once the code samples are available.

2019 UPDATE: "G Suite" is now called "Google Workspace", "Team Drives" is now known as "Shared Drives", and the corresponding supportsTeamDrives flag has been renamed to supportsAllDrives. Please take note of these changes regarding the post below.

NOTE 1: Team Drives is only available for G Suite Business Standard users or higher. If you're developing an application for Team Drives, you'll need similar access.
NOTE 2: The code featured here is also available as a video + overview post as part of this series.

Introduction

Team Drives is a relatively new feature from the Google Drive team, created to solve some of the issues of a user-centric system in larger organizations. A Team Drive is owned by an organization rather than an individual user, so the location of files and folders won't be a mystery any more. While your users do have to be G Suite Business (or higher) customers to use Team Drives, the good news for developers is that you won't have to write new apps from scratch or learn a completely different API.

Instead, Team Drives features are accessible through the same Google Drive API you've come to know so well with Python. In this post, we'll demonstrate a sample Python app that performs core features that all developers should be familiar with. By the time you've finished reading this post and the sample app, you should know how to:

Create Team Drives
Add members to Team Drives
Create a folder in Team Drives
Import/upload files to Team Drives folders

Using the Google Drive API

The demo script requires creating files and folders, so you do need full read-write access to Google Drive. The scope you need for that is:

'https://www.googleapis.com/auth/drive' — Full (read-write) access to Google Drive

If you're new to using Google APIs, we recommend reviewing earlier posts & videos covering project setup and the authorization boilerplate so that we can focus on the main app here. Once the app is authorized, assume you have a service endpoint to the API and have assigned it to the DRIVE variable.

Create Team Drives

New Team Drives can be created with DRIVE.teamdrives().create(). Two things are required to create a Team Drive: 1) a name for your Team Drive, and 2) a unique request ID. The request ID makes the create process idempotent, so that any number of identical calls will still only result in a single Team Drive being created. It's recommended that developers use a language-specific UUID library. For Python developers, that's the uuid module. From the API response, we return the new Team Drive's ID. Check it out:

def create_td(td_name):
    request_id = str(uuid.uuid4())
    body = {'name': td_name}
    return DRIVE.teamdrives().create(body=body,
            requestId=request_id, fields='id').execute().get('id')
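The idempotency guarantee only helps if a retry reuses the same request ID. Here is a minimal sketch of that pattern that never touches the network: create_with_retry() and the flaky_send() stand-in are hypothetical helpers (not part of the Drive API client), and treating IOError as the retryable failure is an assumption made purely for the demo.

```python
import uuid

def create_with_retry(send, td_name, attempts=3):
    """Call send(body, request_id) up to `attempts` times with ONE request ID."""
    request_id = str(uuid.uuid4())  # generated once, outside the retry loop
    body = {'name': td_name}
    last_error = None
    for _ in range(attempts):
        try:
            return send(body, request_id)
        except IOError as err:  # assumption: stands in for a transient failure
            last_error = err
    raise last_error

# Hypothetical stand-in for the real API call; it fails twice, then succeeds.
seen_ids = []
def flaky_send(body, request_id):
    seen_ids.append(request_id)
    if len(seen_ids) < 3:
        raise IOError('simulated network failure')
    return 'fake-team-drive-id'

result = create_with_retry(flaky_send, 'Corporate shared TD')
print(result, len(set(seen_ids)))  # all attempts carried the same request ID
```

With the real service, send could be something like lambda body, rid: DRIVE.teamdrives().create(body=body, requestId=rid, fields='id').execute().get('id').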

Add members to Team Drives

To add members/users to Team Drives, you only need to create a new permission, which can be done with DRIVE.permissions().create(), similar to how you would share a file in regular Drive with another user. The pieces of information you need for this request are the ID of the Team Drive, the new member's email address as well as the desired role... choose from: "organizer", "owner", "writer", "commenter", "reader". Here's the code:

def add_user(td_id, user, role='commenter'):
    body = {'type': 'user', 'role': role, 'emailAddress': user}
    return DRIVE.permissions().create(body=body, fileId=td_id,
            supportsTeamDrives=True, fields='id').execute().get('id')
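Because the role is just a string, a typo only surfaces once the API rejects the request. One option is to check it locally first. The sketch below is an illustration, not part of the sample app: permission_body() and VALID_ROLES are hypothetical names, and the role list is simply the one given above.

```python
# Roles as listed in the text above.
VALID_ROLES = ('organizer', 'owner', 'writer', 'commenter', 'reader')

def permission_body(user, role='commenter'):
    """Build the permissions().create() body, rejecting unknown roles early."""
    if role not in VALID_ROLES:
        raise ValueError('unknown role %r; expected one of: %s'
                         % (role, ', '.join(VALID_ROLES)))
    return {'type': 'user', 'role': role, 'emailAddress': user}

body = permission_body('email@example.com', 'writer')
print(body)
```

The returned dictionary has the same shape as the body built inside add_user(), so it could be passed straight to DRIVE.permissions().create().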

Some additional notes on permissions: the user can only be bestowed permissions equal to or less than the person/admin running the script... IOW, they cannot grant someone else greater permission than what they have. Also, if a user has a certain role in a Team Drive, they can be granted greater access to individual elements in the Team Drive. Users who are not members of a Team Drive can still be granted access to Team Drive contents on a per-file basis.

Create a folder in Team Drives

Nothing to see here! Yep, creating a folder in Team Drives is identical to creating a folder in regular Drive, with DRIVE.files().create(). The only difference is that you pass in a Team Drive ID rather than a regular Drive folder ID as the parent. Of course, you also need a folder name. Here's the code:

def create_td_folder(td_id, folder):
    body = {'name': folder, 'mimeType': FOLDER_MIME, 'parents': [td_id]}
    return DRIVE.files().create(body=body,
            supportsTeamDrives=True, fields='id').execute().get('id')

Import/upload files to Team Drives folders

Uploading files to a Team Drives folder is also identical to uploading to a normal Drive folder, and is also done with DRIVE.files().create(). Importing is slightly different from uploading because you're uploading a file and converting it to a G Suite/Google Apps document format, e.g., uploading CSV as a Google Sheet, or a plain text or Microsoft Word® file as Google Docs. In the sample app, we tackle the former:

def import_csv_to_td_folder(folder_id, fn, mimeType):
    body = {'name': fn, 'mimeType': mimeType, 'parents': [folder_id]}
    return DRIVE.files().create(body=body, media_body=fn+'.csv',
            supportsTeamDrives=True, fields='id').execute().get('id')

The secret to importing is the MIMEtype. That tells Drive whether you want conversion to a G Suite/Google Apps format (or not). The same is true for exporting. The import and export MIMEtypes supported by the Google Drive API can be found in my SO answer here.
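To make the "MIMEtype decides the conversion" idea concrete, here is a small sketch that picks the target MIME type from a file extension. It is an assumption-laden illustration: import_mime_for() and the IMPORT_TARGETS table are hypothetical, the table covers only the cases named above (using .docx for Word), and returning None simply means "upload as-is".

```python
import os

# Google-format MIME types that appear in this series of posts.
SHEETS_MIME = 'application/vnd.google-apps.spreadsheet'
DOCS_MIME = 'application/vnd.google-apps.document'

# Assumption: a deliberately tiny, illustrative extension table.
IMPORT_TARGETS = {
    '.csv': SHEETS_MIME,
    '.txt': DOCS_MIME,
    '.docx': DOCS_MIME,
}

def import_mime_for(path):
    """Return the Google MIME type to request for `path`, or None for no conversion."""
    ext = os.path.splitext(path)[1].lower()
    return IMPORT_TARGETS.get(ext)

for path in ('inventory.csv', 'hello.txt', 'photo.png'):
    print(path, '->', import_mime_for(path))
```

The result could be passed as the mimeType argument of import_csv_to_td_folder(), or left out of the metadata entirely when it is None.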

Driver app

All these functions are great but kind-of useless without being called by a main application, so here we are:

FOLDER_MIME = 'application/vnd.google-apps.folder'
SOURCE_FILE = 'inventory' # on disk as 'inventory.csv'
SHEETS_MIME = 'application/vnd.google-apps.spreadsheet'

td_id = create_td('Corporate shared TD')
print('** Team Drive created')
perm_id = add_user(td_id, 'email@example.com')
print('** User added to Team Drive')
folder_id = create_td_folder(td_id, 'Manufacturing data')
print('** Folder created in Team Drive')
file_id = import_csv_to_td_folder(folder_id, SOURCE_FILE, SHEETS_MIME)
print('** CSV file imported as Google Sheets in Team Drives folder')

The first set of variables represents the MIMEtypes we need as well as the CSV file we're uploading to Drive and asking to have converted to Google Sheets format. Below those definitions are calls to all four functions described above.

Conclusion

If you run the script, you should get output that looks something like this, with each print() representing each API call:

$ python3 td_demo.py
** Team Drive created
** User added to Team Drive
** Folder created in Team Drive
** CSV file imported as Google Sheets in Team Drives folder

When the script has completed, you should have a new Team Drive called "Corporate shared TD", and within it, a folder named "Manufacturing data" which contains a Google Sheets file called "inventory".

Below is the entire script for your convenience which runs on both Python 2 and Python 3 (unmodified!)—by using, copying, and/or modifying this code or any other piece of source from this blog, you implicitly agree to its Apache2 license:

from __future__ import print_function
import uuid

from apiclient import discovery
from httplib2 import Http
from oauth2client import file, client, tools

SCOPES = 'https://www.googleapis.com/auth/drive'
store = file.Storage('storage.json')
creds = store.get()
if not creds or creds.invalid:
    flow = client.flow_from_clientsecrets('client_secret.json', SCOPES)
    creds = tools.run_flow(flow, store)
DRIVE = discovery.build('drive', 'v3', http=creds.authorize(Http()))

def create_td(td_name):
    request_id = str(uuid.uuid4()) # random unique UUID string
    body = {'name': td_name}
    return DRIVE.teamdrives().create(body=body,
            requestId=request_id, fields='id').execute().get('id')

def add_user(td_id, user, role='commenter'):
    body = {'type': 'user', 'role': role, 'emailAddress': user}
    return DRIVE.permissions().create(body=body, fileId=td_id,
            supportsTeamDrives=True, fields='id').execute().get('id')

def create_td_folder(td_id, folder):
    body = {'name': folder, 'mimeType': FOLDER_MIME, 'parents': [td_id]}
    return DRIVE.files().create(body=body,
            supportsTeamDrives=True, fields='id').execute().get('id')

def import_csv_to_td_folder(folder_id, fn, mimeType):
    body = {'name': fn, 'mimeType': mimeType, 'parents': [folder_id]}
    return DRIVE.files().create(body=body, media_body=fn+'.csv',
            supportsTeamDrives=True, fields='id').execute().get('id')

FOLDER_MIME = 'application/vnd.google-apps.folder'
SOURCE_FILE = 'inventory' # on disk as 'inventory.csv'... CHANGE!
SHEETS_MIME = 'application/vnd.google-apps.spreadsheet'

td_id = create_td('Corporate shared TD')
print('** Team Drive created')
perm_id = add_user(td_id, 'email@example.com') # CHANGE!
print('** User added to Team Drive')
folder_id = create_td_folder(td_id, 'Manufacturing data')
print('** Folder created in Team Drive')
file_id = import_csv_to_td_folder(folder_id, SOURCE_FILE, SHEETS_MIME)
print('** CSV file imported as Google Sheets in Team Drives folder')

As with our other code samples, you can now customize it to learn more about the API, integrate into other apps for your own needs, for a mobile frontend, sysadmin script, or a server-side backend!

Code challenge

Write a simple application that moves folders (and their files or subfolders) in regular Drive to Team Drives. Each folder you move should become a corresponding folder in Team Drives. Remember that files in Team Drives can only have one parent, and the same goes for folders.

Monday, July 11, 2016

Exporting a Google Sheet spreadsheet as CSV

Introduction

Today, we'll follow up on my earlier post on the Google Sheets API and multiple posts (first, second, third) on the Google Drive API by answering one common question: How do you download a Google Sheets spreadsheet as a CSV file? The "FAQ"ness of the question itself as well as the various versions of Google APIs have led to many similar StackOverflow questions: one, two, three, four, five, just to list a few. Let's answer this question definitively and walk through a Python code sample that does exactly that. The main assumption is that you have a Google Sheets file in your Google Drive named "inventory".

Choosing the right API

Upon first glance, developers may think the Google Sheets API is the one to use. Unfortunately that isn't the case. The Sheets API is the one to use for spreadsheet-oriented operations, such as inserting data, reading spreadsheet rows, managing individual tabs/sheets within a spreadsheet, cell formatting, creating charts, adding pivot tables, etc. It isn't meant to perform file-based requests like exporting a Sheet in CSV (comma-separated values) format. For file-oriented operations with a Google Sheet, you would use the Google Drive API.

Using the Google Drive API

As mentioned earlier, Google Drive features numerous API scopes of authorization. As usual, we always recommend you use the most restrictive scope possible that allows your app to do its work. You'll request fewer permissions from your users (which makes them happier), and it also makes your app more secure, possibly preventing modifying, destroying, or corrupting data, or perhaps inadvertently going over quotas. Since we're only exporting a Google Sheets file from Google Drive, the only scope we need is:

'https://www.googleapis.com/auth/drive.readonly' — Read-only access to file content or metadata

The earlier post I wrote on the Google Drive API featured sample code that exported an uploaded Google Docs file as PDF and downloaded it from Drive. This post will not only feature a change to exporting a Google Sheets file in CSV format, but also demonstrate one additional feature of the Drive API: querying.

Since we've fully covered the authorization boilerplate in earlier posts and videos, we're going to skip that here and jump right to the action: creating a service endpoint to Drive. The API name is (of course) 'drive', and the current version of the API is 3, so use the string 'v3' in this call to the apiclient.discovery.build() function:

DRIVE = discovery.build('drive', 'v3', http=creds.authorize(Http()))

Query and export files from Google Drive

While unnecessary, we'll create a few string constants representing the filename, source and destination file MIME types to make the code easier to understand:

FILENAME = 'inventory'
SRC_MIMETYPE = 'application/vnd.google-apps.spreadsheet'
DST_MIMETYPE = 'text/csv'

In this simple example, we're only going to export one Google Sheets file as CSV, arbitrarily choosing a file named, "inventory." So to perform the query, you need both the filename and its MIME type, "application/vnd.google-apps.spreadsheet". Query components are conjoined with the "and" keyword, so your query string will look like this: q='name="%s" and mimeType="%s"' % (FILENAME, SRC_MIMETYPE).

Since there may be more than one Google Sheets file named 'inventory', we opt for the newest one and thus need to sort all matching files in descending order of last modification time, then by name if the "mtime"s are identical, via an "order by" clause: orderBy='modifiedTime desc,name'. Here is the complete call to DRIVE.files().list() to issue the query:

files = DRIVE.files().list(
    q='name="%s" and mimeType="%s"' % (FILENAME, SRC_MIMETYPE),
    orderBy='modifiedTime desc,name').execute().get('files', [])
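The query above works because 'inventory' contains no quote characters. If the filename could come from user input, the value should be escaped before it is dropped into the q string. The sketch below is one way to do that, assuming the single-quoted form used in the Drive search documentation, where a quote inside a value is written as \'. Both quote_value() and build_query() are hypothetical helpers, and escaping backslashes the same way is my assumption rather than something this post's code relies on.

```python
def quote_value(value):
    """Wrap `value` in single quotes, escaping backslashes and single quotes."""
    return "'%s'" % value.replace('\\', '\\\\').replace("'", "\\'")

def build_query(name, mime_type):
    """Return a Drive `q` string matching an exact name and MIME type."""
    return 'name = %s and mimeType = %s' % (
        quote_value(name), quote_value(mime_type))

print(build_query("Bob's inventory",
                  'application/vnd.google-apps.spreadsheet'))
```

The resulting string would be passed as the q argument of DRIVE.files().list() in place of the %-formatted one.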

If any files match, the payload will contain a 'files' key, else we default to an empty list and display to the user on the last line that no files were found. Otherwise, grab the first match, the most recently-modified 'inventory' file, create a suitable CSV filename from it, and change all spaces to underscores:

fn = '%s.csv' % os.path.splitext(files[0]['name'].replace(' ', '_'))[0]
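To see what that one-liner produces for different Drive names, here it is wrapped in a function (csv_name_for() is a hypothetical name used only for this illustration). One thing worth knowing: os.path.splitext() drops whatever follows the last dot, so a name like 'inventory v1.2' loses its '.2'.

```python
import os

def csv_name_for(drive_name):
    """Mirror the expression above: spaces to underscores, last extension dropped."""
    return '%s.csv' % os.path.splitext(drive_name.replace(' ', '_'))[0]

for name in ('inventory', 'Q3 inventory', 'inventory.xlsx', 'inventory v1.2'):
    print('%r -> %r' % (name, csv_name_for(name)))
```

If names with dots matter in your Drive, appending '.csv' without calling splitext() at all may be the safer choice.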

The final Drive API call requests an export of 'inventory' as a CSV file, and if successful, the downloaded data is written with the filename above. In either case, the user is notified of success or failure of the export:

data = DRIVE.files().export(fileId=files[0]['id'], mimeType=DST_MIMETYPE).execute()
if data:
    with open(fn, 'wb') as f:
        f.write(data)
    print('DONE')
else:
    print('ERROR (could not download file)')

Note that when downloading as CSV, the Drive API only exports the first sheet in a Sheets file... you won't get any others. However, it does support 3 other download formats that will get you all the sheets.

If you create a Sheets file named 'inventory', run the script, and grant the script access to your Google Drive (via the OAuth2 prompt that pops up in the browser), you should get output that looks like this:

$ python drive_sheets_csv_export.py # or python3
Exporting "inventory" as "inventory.csv"... DONE

Conclusion

Below is the entire script for your convenience which runs on both Python 2 and Python 3 (unmodified!). If I were to divide the script into 4 major sections, they would be:

Get creds & build Google Drive service endpoint
Source and destination file info
Query Google Drive for matching files
Export most recent matching Sheets file as CSV

Here's the code itself:

from __future__ import print_function
import os

from apiclient import discovery
from httplib2 import Http
from oauth2client import file, client, tools

SCOPES = 'https://www.googleapis.com/auth/drive.readonly'
store = file.Storage('storage.json')
creds = store.get()
if not creds or creds.invalid:
    flow = client.flow_from_clientsecrets('client_secret.json', SCOPES)
    creds = tools.run_flow(flow, store)
DRIVE = discovery.build('drive', 'v3', http=creds.authorize(Http()))

FILENAME = 'inventory'
SRC_MIMETYPE = 'application/vnd.google-apps.spreadsheet'
DST_MIMETYPE = 'text/csv'

files = DRIVE.files().list(
    q='name="%s" and mimeType="%s"' % (FILENAME, SRC_MIMETYPE),
    orderBy='modifiedTime desc,name').execute().get('files', [])

if files:
    fn = '%s.csv' % os.path.splitext(files[0]['name'].replace(' ', '_'))[0]
    print('Exporting "%s" as "%s"... ' % (files[0]['name'], fn), end='')
    data = DRIVE.files().export(fileId=files[0]['id'], mimeType=DST_MIMETYPE).execute()
    if data:
        with open(fn, 'wb') as f:
            f.write(data)
        print('DONE')
    else:
        print('ERROR (could not download file)')
else:
    print('!!! ERROR: File not found')

As with our other code samples, you can now customize for your own needs, for a mobile frontend, sysadmin script, or a server-side backend, perhaps accessing other Google APIs. Hope this helps answer yet another frequently asked question!

Wednesday, December 23, 2015

Migrating to Google Drive API v3

NOTE: The code covered in this and the previous post are also available in a video walkthrough. Mar 2018 UPDATE: Modernized the code a bit, shortening it, and changed to R/W scope because drive.file doesn't work if the file hasn't been created yet. The same fixes were made to the Drive API v2 sample in the preceding blog post.

Introduction

In a blog post last week, we introduced readers to uploading and downloading files to/from Google Drive from a simple Python command-line script. In an official Google blog post later that same day, the Google Drive API team announced a new version of the API. Great timing, huh? Well, good thing I knew it was coming, so that I could prepare this post for you, which is a primer on how to migrate from the current version of the API (v2) to the new one (v3).

As stated by the Drive team, v2 isn't being deprecated, and there are no new features in v3, thus migration isn't required. The new version is mainly for new apps/integrations as well as developers with v2 apps who wish to take advantage of the improvements. This post is intended for those in the latter group, covering porting existing apps to v3. Ready? Let's go straight to the action.

Migrating from Google Drive API v2 to v3

Most of this post will be just examining all the "diffs" between the v2 code sample from the previous post (renamed from drive_updown.py to drive_updown2.py) and its v3 equivalent (drive_updown3.py). We'll take things step-by-step to provide more details, but let's start with all the diffs first:

--- drive_updown2.py   2018-03-11 21:42:33.000000000 -0700
+++ drive_updown3.py   2018-03-11 21:44:57.000000000 -0700
@@ -11,23 +11,24 @@
 if not creds or creds.invalid:
     flow = client.flow_from_clientsecrets('client_secret.json', SCOPES)
     creds = tools.run_flow(flow, store)
-DRIVE = discovery.build('drive', 'v2', http=creds.authorize(Http()))
+DRIVE = discovery.build('drive', 'v3', http=creds.authorize(Http()))
 
 FILES = (
-    ('hello.txt', False),
-    ('hello.txt', True),
+    ('hello.txt', None),
+    ('hello.txt', 'application/vnd.google-apps.document'),
 )
 
-for filename, convert in FILES:
-    metadata = {'title': filename}
-    res = DRIVE.files().insert(convert=convert, body=metadata,
-            media_body=filename, fields='mimeType,exportLinks').execute()
+for filename, mimeType in FILES:
+    metadata = {'name': filename}
+    if mimeType:
+        metadata['mimeType'] = mimeType
+    res = DRIVE.files().create(body=metadata, media_body=filename).execute()
     if res:
         print('Uploaded "%s" (%s)' % (filename, res['mimeType']))
 
 if res:
     MIMETYPE = 'application/pdf'
-    res, data = DRIVE._http.request(res['exportLinks'][MIMETYPE])
+    data = DRIVE.files().export(fileId=res['id'], mimeType=MIMETYPE).execute()
     if data:
         fn = '%s.pdf' % os.path.splitext(filename)[0]
         with open(fn, 'wb') as fh:

We'll start with the building of the service endpoint, with the trivial change of the API version string from 'v2' to 'v3':

-DRIVE = discovery.build('drive', 'v2', http=creds.authorize(Http()))
+DRIVE = discovery.build('drive', 'v3', http=creds.authorize(Http()))

The next change is the deprecation of the conversion flag. The problem with a Boolean variable is that it limits the possible types of file formats supported. By changing it to a file mimeType instead, the horizons are broadened:

 FILES = (
-    ('hello.txt', False),
-    ('hello.txt', True),
+    ('hello.txt', None),
+    ('hello.txt', 'application/vnd.google-apps.document'),
 )

Your next question will be: "What are the mimeTypes for the supported Google Apps document formats?" The answers can be found at this page in the official docs. This changes the datatype in our array of 2-tuples, so we need to change the loop variable to reflect this... we'll use the mimeType instead of a conversion flag:

-for filename, convert in FILES:
+for filename, mimeType in FILES:

Another change related to deprecating the convert flag is that the mimeType isn't a parameter to the API call. Instead, it's another piece of metadata, so we need to add mimeType to the metadata object.

Related to this is a name change: since a file's name is its name and not its title, it makes more sense to use "name" as the metadata value:

-    metadata = {'title': filename}
+    metadata = {'name': filename}
+    if mimeType:
+        metadata['mimeType'] = mimeType

Why the if statement? Not only did v3 see a change to using mimeTypes, but rather than being a parameter like the conversion flag in v2, the mimeType has been moved into the file's metadata, so if we're doing any conversion, we need to add it to our metadata field (then remove the convert parameter down below).
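If you have many v2-style metadata dictionaries lying around, the two changes above (title becomes name, and the convert flag becomes an optional mimeType entry) can be applied mechanically. The helper below is a hypothetical sketch of that idea, not something provided by the client library, and it only handles the two fields discussed in this post.

```python
def to_v3_metadata(v2_metadata, mime_type=None):
    """Return a v3-style metadata dict built from a v2-style one."""
    v3 = dict(v2_metadata)            # copy so the caller's dict is untouched
    if 'title' in v3:
        v3['name'] = v3.pop('title')  # v2 'title' is called 'name' in v3
    if mime_type:                     # only request conversion when asked to
        v3['mimeType'] = mime_type
    return v3

print(to_v3_metadata({'title': 'hello.txt'}))
print(to_v3_metadata({'title': 'hello.txt'},
                     'application/vnd.google-apps.document'))
```

The first call corresponds to the old convert=False upload, the second to convert=True for a plain text file.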

Next is yet another name change: when creating files on Google Drive, "create()" makes more sense as a method name than "insert()". Reducing the size of payload is another key ingredient of v3. We mentioned in the previous post that insert() returns more than 30 fields in the response payload unless you use the fields parameter to specify exactly which you wish returned. In v3, the default response payload only returns four fields, including all the ones we need in this script, so use of the fields parameter isn't required any more:

-    res = DRIVE.files().insert(convert=convert, body=metadata,
-            media_body=filename, fields='mimeType,exportLinks').execute()
+    res = DRIVE.files().create(body=metadata, media_body=filename).execute()

The final improvement we can demonstrate: users no longer have to make an authorized HTTP GET request with a link to export and download a file in an alternate format like PDF®. Instead, it's now a "normal" API call (to the new "export()" method) with the mimeType as a parameter. The only other parameter you need is the file ID, which comes back as part of the (default) response payload when the create() call was made:

-    res, data = DRIVE._http.request(res['exportLinks'][MIMETYPE])
+    data = DRIVE.files().export(fileId=res['id'], mimeType=MIMETYPE).execute()

That's it! If you run the script and grant it access to your Google Drive (via the OAuth2 prompt that pops up in the browser), you should get output that looks like this:

$ python drive_updown3.py # or python3
Uploaded "hello.txt" (text/plain)
Uploaded "hello.txt" (application/vnd.google-apps.document)
Downloaded "hello.pdf" (application/pdf)

Conclusion

The entire v2 script (drive_updown2.py) was spelled out in full in the previous post, and it hasn't changed since then. Below is the v3 script (drive_updown3.py) for your convenience which runs on both Python 2 and Python 3 (unmodified!):

#!/usr/bin/env python

from __future__ import print_function
import os

from apiclient import discovery
from httplib2 import Http
from oauth2client import file, client, tools

SCOPES = 'https://www.googleapis.com/auth/drive'
store = file.Storage('storage.json')
creds = store.get()
if not creds or creds.invalid:
    flow = client.flow_from_clientsecrets('client_secret.json', SCOPES)
    creds = tools.run_flow(flow, store)
DRIVE = discovery.build('drive', 'v3', http=creds.authorize(Http()))

FILES = (
    ('hello.txt', None),
    ('hello.txt', 'application/vnd.google-apps.document'),
)

for filename, mimeType in FILES:
    metadata = {'name': filename}
    if mimeType:
        metadata['mimeType'] = mimeType
    res = DRIVE.files().create(body=metadata, media_body=filename).execute()
    if res:
        print('Uploaded "%s" (%s)' % (filename, res['mimeType']))

if res:
    MIMETYPE = 'application/pdf'
    data = DRIVE.files().export(fileId=res['id'], mimeType=MIMETYPE).execute()
    if data:
        fn = '%s.pdf' % os.path.splitext(filename)[0]
        with open(fn, 'wb') as fh:
            fh.write(data)
        print('Downloaded "%s" (%s)' % (fn, MIMETYPE))

Just as in the previous post(s), you can now customize this code for your own needs, for a mobile frontend, sysadmin script, or a server-side backend, perhaps accessing other Google APIs. Hope we accomplished our goal by pointing out some of the shortcomings that are in v2 and how they were improved in v3! All of the content in this and the previous post is spelled out visually in this video that I created for you.

Monday, December 14, 2015

Google Drive: Uploading & Downloading files with Python

UPDATE: Since this post was published, the Google Drive team released a newer version of their API. After reading this one, go to the next post to learn about migrating your app from v2 to v3; it also links to my video, which walks through the code samples in both posts.

Introduction

So far in this series of blogposts covering authorized Google APIs, we've used Python to access Google Drive, Gmail, and Google Calendar. Today, we're revisiting Google Drive with a small snippet that uploads plain text files to Drive, with & without conversion to a Google Apps format (Google Docs), then exports & downloads the converted one as PDF®.

Earlier posts demonstrated the structure and "how-to" use Google APIs in general, so more recent posts, including this one, focus on solutions and apps, and use of specific APIs. Once you review the earlier material, you're ready to start with authorization scopes then see how to use the API itself.

Google Drive API Scopes

Google Drive features numerous API scopes of authorization. As usual, we always recommend you use the most restrictive scope possible that allows your app to do its work. You'll request fewer permissions from your users (which makes them happier), and it also makes your app more secure, possibly preventing modifying, destroying, or corrupting data, or perhaps inadvertently going over quotas. Since we need to upload/create files in Google Drive, the minimum scope we need is:

'https://www.googleapis.com/auth/drive' — Read/write access to Drive

Using the Google Drive API

Let's get going with our example today that uploads and downloads a simple plain text file to Drive. The file will be uploaded twice, once as-is, and the second time, converted to a Google Docs document. The last part of the script will request an export of the (uploaded) Google Doc as PDF and download that from Drive.

Since we've fully covered the authorization boilerplate in earlier posts and videos, we're going to skip that here and jump right to the action: creating a service endpoint to Drive. The API name is (of course) 'drive', and the current version of the API is 2, so use the string 'v2' in this call to the apiclient.discovery.build() function:

DRIVE = discovery.build('drive', 'v2', http=creds.authorize(Http()))

Let's also create a FILES array object (tuple, list, etc.) which holds 2-tuples of the files to upload. These pairs are made up of a filename and a flag indicating whether or not you wish the file to be converted to a Google Apps format:

FILES = (
    ('hello.txt', False),
    ('hello.txt', True),
)

Since we're uploading a plain text file, a conversion to Apps format means Google Docs. (You can imagine that if it was a CSV file, the target format would be Google Sheets instead.) With the setup complete, let's move on to the code that performs the file uploads.

We'll loop through FILES, cycling through each file-convert flag pair and call the files.insert() method to perform the upload. The four parameters needed are: 1) the conversion flag, 2) the file metadata, which is only the filename (see below), 3) the media_body, which is also the filename but has a different purpose — it specifies where the file content will come from, meaning the file will be opened and its data transferred to the API, and 4), a set of fields you want returned.

for filename, convert in FILES:
    metadata = {'title': filename}
    res = DRIVE.files().insert(convert=convert, body=metadata,
            media_body=filename, fields='mimeType,exportLinks').execute()
    if res:
        print('Uploaded "%s" (%s)' % (filename, res['mimeType']))

It's important to give the fields parameter because if you don't, more than 30(!) fields are returned by default from the API. There's no need to waste all that network traffic if all you need are just a couple. In our case, we only want the mimeType, to confirm what the file was saved as, and exportLinks, which we'll explore in a moment. If files are uploaded successfully, the print() lets the user know, and then we move on to the final section of the script.

Before we dig into the last bit of code, it's important to realize that the res variable still contains the result from the second upload, the one where the file is converted to Google Docs. This is important because this is where we need to extract the download link for the format you want (res['exportLinks'][MIMETYPE]). The way to download the file is to make an authorized HTTP GET call, passing in that link. In our case, it's the PDF version. If the download is successful, the data variable will have the payload to write to disk. If all's good, let the user know:

if res:
    MIMETYPE = 'application/pdf'
    res, data = DRIVE._http.request(res['exportLinks'][MIMETYPE])
    if data:
        fn = '%s.pdf' % os.path.splitext(filename)[0]
        with open(fn, 'wb') as fh:
            fh.write(data)
        print('Downloaded "%s" (%s)' % (fn, MIMETYPE))
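The lookup res['exportLinks'][MIMETYPE] raises KeyError if the payload has no exportLinks entry or no link for the requested format. A defensive variant is sketched below. Everything in it is illustrative: export_link() is a hypothetical helper, both payloads are made up, the URL is a placeholder, and the idea that an unconverted upload comes back without exportLinks is my assumption.

```python
def export_link(res, mime_type):
    """Return the export URL for `mime_type`, or None if `res` has none."""
    return res.get('exportLinks', {}).get(mime_type)

# Hypothetical response payloads; the URL is a placeholder, not a real link.
plain_res = {'mimeType': 'text/plain'}
docs_res = {
    'mimeType': 'application/vnd.google-apps.document',
    'exportLinks': {'application/pdf': 'https://example.com/export?format=pdf'},
}

print(export_link(plain_res, 'application/pdf'))
print(export_link(docs_res, 'application/pdf'))
```

A None result can then be reported to the user instead of letting the script end with a traceback.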

Final note: this code sample is slightly different from previous posts in two big ways: 1) now that the Google APIs Client Library runs on Python 3, I'll try to produce only code samples for this blog that run unmodified under both 2.x and 3.x interpreters — the primary one-line difference being the import of the print() function, and 2) we're going to incorporate the use of the run_flow() function from oauth2client.tools and only fallback to the deprecated run() function if necessary — more info on this change available in this earlier post.

If you run the script and grant it access to your Google Drive (via the OAuth2 prompt that pops up in the browser), you should get output that looks like this:

$ python drive_updown2.py # or python3
Uploaded "hello.txt" (text/plain)
Uploaded "hello.txt" (application/vnd.google-apps.document)
Downloaded "hello.pdf" (application/pdf)

Conclusion

Below is the entire script for your convenience which runs on both Python 2 and Python 3 (unmodified!):

#!/usr/bin/env python

from __future__ import print_function
import os

from apiclient import discovery
from httplib2 import Http
from oauth2client import file, client, tools

SCOPES = 'https://www.googleapis.com/auth/drive'
store = file.Storage('storage.json')
creds = store.get()
if not creds or creds.invalid:
    flow = client.flow_from_clientsecrets('client_secret.json', SCOPES)
    creds = tools.run_flow(flow, store)
DRIVE = discovery.build('drive', 'v2', http=creds.authorize(Http()))

FILES = (
    ('hello.txt', False),
    ('hello.txt', True),
)

for filename, convert in FILES:
    metadata = {'title': filename}
    res = DRIVE.files().insert(convert=convert, body=metadata,
            media_body=filename, fields='mimeType,exportLinks').execute()
    if res:
        print('Uploaded "%s" (%s)' % (filename, res['mimeType']))

if res:
    MIMETYPE = 'application/pdf'
    res, data = DRIVE._http.request(res['exportLinks'][MIMETYPE])
    if data:
        fn = '%s.pdf' % os.path.splitext(filename)[0]
        with open(fn, 'wb') as fh:
            fh.write(data)
        print('Downloaded "%s" (%s)' % (fn, MIMETYPE))

You can now customize this code for your own needs, for a mobile frontend, sysadmin script, or a server-side backend, perhaps accessing other Google APIs. If you want to see another example of using the Drive API, check out this earlier post listing the files in Google Drive and its accompanying video as well as a similar example in the official docs or its equivalent in Java (server-side, Android), iOS (Objective-C, Swift), C#/.NET, PHP, Ruby, JavaScript (client-side, Node.js, Google Apps Script), or Go. That's it... hope you find these code samples useful in helping you get started with the Drive API!

UPDATE: Since this post was published, the Google Drive team released a newer version of their API. Go to the next post to learn about migrating your app from v2 to v3 as well as link to my video which walks through the code samples in both posts.

EXTRA CREDIT: Feel free to experiment and try something else to test your skills and challenge yourself as there's a lot more to Drive than just uploading and downloading files. Experiment with creating folders and manipulate files there, work with a folder of photos and organize them using the image metadata available to you, implement a search engine for your Drive files, etc. There are so many things you can do!

Thursday, November 6, 2014

Authorized Google API access from Python (part 2 of 2)

Listing your files with the Google Drive API

NOTE: You can also watch a video walkthrough of the common code covered in this blogpost here.

UPDATE (Mar 2020): You can build this application line-by-line with our codelab (self-paced, hands-on tutorial) introducing developers to G Suite APIs. The deprecated auth library comment from the previous update below is spelled out in more detail in the green sidebar towards the bottom of step/module 5 (Install the Google APIs Client Library for Python). Also, the code sample is now maintained in a GitHub repo which includes a port to the newer auth libraries so you have both versions to refer to.

UPDATE (Apr 2019): In order to have a closer relationship between the GCP and G Suite worlds of Google Cloud, all G Suite Python code samples have been updated, replacing some of the older G Suite API client libraries with their equivalents from GCP. NOTE: using the newer libraries requires more initial code/effort from the developer thus will seem "less Pythonic." However, we will leave the code sample here with the original client libraries (deprecated but not shutdown yet) to be consistent with the video.

UPDATE (Aug 2016): The code has been modernized to use oauth2client.tools.run_flow() instead of the deprecated oauth2client.tools.run(). You can read more about that change here.

UPDATE (Jun 2016): Updated to Python 2.7 & 3.3+ and Drive API v3.

Introduction

In this final installment of a (currently) two-part series introducing Python developers to building on Google APIs, we'll build on the simple API example from the first post (part 1), published just over a month ago. Those first snippets showed some skeleton code and a short working sample that demonstrates accessing a public Google API with an API key (it queried public Google+ posts). An API key, however, does not grant applications access to authorized data.

Authorized data, including user information such as personal files on Google Drive and YouTube playlists, require additional security steps before access is granted. Sharing of and hardcoding credentials such as usernames and passwords is not only insecure, it's also a thing of the past. A more modern approach leverages token exchange, authenticated API calls, and standards such as OAuth2.

In this post, we'll demonstrate how to use Python to access authorized Google APIs using OAuth2, specifically listing the files (and folders) in your Google Drive. In order to better understand the example, we strongly recommend you check out the OAuth2 guides (general OAuth2 info, OAuth2 as it relates to Python and its client library) in the documentation to get started.

The docs describe the OAuth2 flow: making a request for authorized access, having the user grant access to your app, and obtaining an access token with which to sign and make authorized API calls. The steps you need to take to get started begin nearly the same way as for simple API access. The process diverges when you arrive on the Credentials page when following the steps below.

Google API access

In order to get authorized Google API access, follow these instructions (the first three of which are roughly the same for simple API access):

Go to the Google Developers Console and login.

Use your Gmail or Google credentials; create an account if needed

Click "Create a Project" from pulldown under your username (at top)

Enter a Project Name (mutable, human-friendly string only used in the console)
Enter a Project ID (immutable, must be unique and not already taken)

Once project has been created, enable APIs you wish to use

You can toggle on any API(s) that support(s) simple or authorized API access.
For the code example below, we use the Google Drive API.
Other ideas: YouTube Data API, Google Sheets API, etc.
Find more APIs (and version#s which you need) at the OAuth Playground.

Select "Credentials" in left-nav

Click "Create credentials" and select OAuth client ID
In the new dialog, select your application type — we're building a command-line script which is an "Installed application"
In the bottom part of that same dialog, specify the type of installed application; choose "Other" (cmd-line scripts are not web nor mobile)
Click "Create Client ID" to generate your credentials

Finally, click "Download JSON" to save the new credentials to your computer... perhaps choose a shorter name like "client_secret.json" or "client_id.json"

NOTEs: Instructions from the previous blogpost were to get an API key. This time, in the steps above, we're creating and downloading OAuth2 credentials. You can also watch a video walkthrough of this app setup process of getting simple or authorized access credentials in the "DevConsole" here.

Accessing Google APIs from Python

In order to access authorized Google APIs from Python, you still need the Google APIs Client Library for Python, so in this case, do follow those installation instructions from part 1.

We will again use googleapiclient.discovery.build(), which is required to create a service endpoint for interacting with an API, authorized or otherwise. However, for authorized data access, we need additional resources, namely the httplib2 and oauth2client packages. Here are the first five lines of the new boilerplate code for authorized access:

from __future__ import print_function

from googleapiclient import discovery
from httplib2 import Http
from oauth2client import file, client, tools

SCOPES = # one or more scopes (strings)

SCOPES is a critical variable: it represents the set of scopes of authorization an app wants to obtain (then access) on behalf of user(s). What does a scope look like?

Each scope is a single character string, specifically a URL. Here are some examples:

'https://www.googleapis.com/auth/plus.me': access your personal Google+ settings
'https://www.googleapis.com/auth/drive.metadata.readonly': read-only access to your Google Drive file or folder metadata
'https://www.googleapis.com/auth/youtube': access your YouTube playlists and other personal information

You can request one or more scopes, given as a single space-delimited string of scopes or an iterable (list, generator expression, etc.) of strings. If you were writing an app that accesses both your YouTube playlists as well as your Google+ profile information, your SCOPES variable could be either of the following:

SCOPES = 'https://www.googleapis.com/auth/plus.me https://www.googleapis.com/auth/youtube'

That is the space-delimited form, which makes for one long line; or it could be an easier-to-read tuple that doesn't wrap:

SCOPES = (
    'https://www.googleapis.com/auth/plus.me',
    'https://www.googleapis.com/auth/youtube',
)
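
Since both forms are accepted, it can be handy to convert between them, for example to log or compare the scopes an app requests. Here is a minimal sketch; normalize_scopes is a hypothetical helper written for this illustration, not a function from the client library:

```python
def normalize_scopes(scopes):
    """Return a list of scope strings from either accepted form.

    Accepts one space-delimited string or any iterable of strings.
    """
    if isinstance(scopes, str):
        return scopes.split()
    return list(scopes)


as_string = ('https://www.googleapis.com/auth/plus.me '
             'https://www.googleapis.com/auth/youtube')
as_tuple = (
    'https://www.googleapis.com/auth/plus.me',
    'https://www.googleapis.com/auth/youtube',
)

# Both spellings describe the same two scopes.
print(normalize_scopes(as_string) == normalize_scopes(as_tuple))
# Joining with spaces recovers the single-string form.
print(' '.join(normalize_scopes(as_tuple)) == as_string)
```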

Our example command-line script will just list the files on your Google Drive, so we only need the read-only Drive metadata scope, meaning our SCOPES variable will be just this:

SCOPES = 'https://www.googleapis.com/auth/drive.metadata.readonly'

The next section of boilerplate represents the security code:

store = file.Storage('storage.json')
creds = store.get()
if not creds or creds.invalid:
    flow = client.flow_from_clientsecrets('client_secret.json', SCOPES)
    creds = tools.run_flow(flow, store)

Once the user has authorized access to their personal data by your app, a special "access token" is given to your app. This precious resource must be stored somewhere local for the app to use. In our case, we'll store it in a file called "storage.json". The lines setting the store and creds variables are attempting to get a valid access token with which to make an authorized API call.

If the credentials are missing or invalid, such as being expired, the authorization flow (using the client secret you downloaded along with a set of requested scopes) must be created (by client.flow_from_clientsecrets()) and executed (by tools.run_flow()) to ensure possession of valid credentials. The client_secret.json file is the credentials file you saved when you clicked "Download JSON" from the DevConsole after you've created your OAuth2 client ID.

If you don't have credentials at all, the user must explicitly grant permission. I'm sure you've all seen the OAuth2 dialog describing the type of access an app is requesting (remember those scopes?). Once the user clicks "Accept" to grant permission, a valid access token is returned and saved into the storage file (because you passed a handle to it when you called tools.run_flow()).
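
To make the "use the cached token, otherwise run the flow" logic easier to see in isolation, here is a minimal sketch that runs without any Google libraries. FakeCreds, FakeStore, and fake_flow are stand-ins invented for this illustration; they only mimic the parts of the credentials object, file.Storage, and tools.run_flow() that the boilerplate relies on:

```python
def get_credentials(store, run_flow):
    """Return cached credentials, running the auth flow only if needed.

    `store` needs a get() method; `run_flow` is any callable that takes
    the store, obtains fresh credentials, and saves them there.
    """
    creds = store.get()
    if not creds or creds.invalid:
        creds = run_flow(store)
    return creds


class FakeCreds(object):
    """Stand-in for a credentials object (illustration only)."""
    def __init__(self, invalid=False):
        self.invalid = invalid


class FakeStore(object):
    """In-memory stand-in for file.Storage (illustration only)."""
    def __init__(self, creds=None):
        self._creds = creds

    def get(self):
        return self._creds

    def put(self, creds):
        self._creds = creds


def fake_flow(store):
    """Pretend the user clicked "Accept"; save and return new creds."""
    creds = FakeCreds()
    store.put(creds)
    return creds


store = FakeStore()                          # nothing cached yet
first = get_credentials(store, fake_flow)    # flow runs, creds saved
second = get_credentials(store, fake_flow)   # cache hit, flow skipped
print(first is second)                       # True
```

The second call never reaches fake_flow, which mirrors why the browser prompt only appears the first time you run the real script.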

Note: tools.run() deprecated by tools.run_flow()
You may have seen usage of the older tools.run() function, but it has been deprecated by tools.run_flow(). We explain this in more detail in another blogpost specifically geared towards migration.

Once the user grants access and valid credentials are saved, you can create one or more endpoints to the secure service(s) desired with googleapiclient.discovery.build(), just like with simple API access. Its call will look slightly different, mainly that you need to sign your HTTP requests with your credentials rather than passing an API key:

DRIVE = discovery.build(API, VERSION, http=creds.authorize(Http()))

In our example, we're going to list your files and folders in your Google Drive, so for API, use the string 'drive'. The API is currently on version 3 so use 'v3' for VERSION:

DRIVE = discovery.build('drive', 'v3', http=creds.authorize(Http()))

If you want to get comfortable with OAuth2, what its flow is and how it works, we recommend that you experiment at the OAuth Playground. There you can choose from any number of APIs to access and experience first-hand how your app must be authorized to access personal data.

Going back to our working example, once you have an established service endpoint, you can use the list() method of the files service to request the file data:

files = DRIVE.files().list().execute().get('files', [])

If there's any data to read, the response dict will contain an iterable of files that we can loop over (or default to an empty list so the loop doesn't fail), displaying file names and types:

for f in files:
    print(f['name'], f['mimeType'])
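
A single list() call returns only one page of results, so a large Drive may not be listed in full. As a hedged sketch, assuming the v3 response carries a nextPageToken key whenever more results remain and that list() accepts pageToken and fields parameters, a generator can walk every page. The name iter_files is hypothetical:

```python
def iter_files(service, fields='nextPageToken, files(name, mimeType)'):
    """Yield every file dict, following nextPageToken across pages.

    `service` is assumed to be a Drive v3 service endpoint such as the
    DRIVE object created with discovery.build().
    """
    token = None
    while True:
        res = service.files().list(pageToken=token, fields=fields).execute()
        for f in res.get('files', []):
            yield f
        token = res.get('nextPageToken')
        if not token:       # no token means this was the last page
            break
```

It would be used in place of the one-shot call, e.g. looping with "for f in iter_files(DRIVE):" and printing f['name'] and f['mimeType'] as before.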

Conclusion

To find out more about the input parameters as well as all the fields that are in the response, take a look at the docs for files().list(). For more information on what other operations you can execute with the Google Drive API, take a look at the reference docs and check out the companion video for this code sample. Don't forget the codelab and this sample's GitHub repo. That's it!

Below is the entire script for your convenience:

'''
drive_list.py -- Google Drive API demo; maintained at:
    http://github.com/googlecodelabs/gsuite-apis-intro
'''
from __future__ import print_function

from googleapiclient import discovery
from httplib2 import Http
from oauth2client import file, client, tools

SCOPES = 'https://www.googleapis.com/auth/drive.metadata.readonly'
store = file.Storage('storage.json')
creds = store.get()
if not creds or creds.invalid:
    flow = client.flow_from_clientsecrets('client_secret.json', SCOPES)
    creds = tools.run_flow(flow, store)

DRIVE = discovery.build('drive', 'v3', http=creds.authorize(Http()))
files = DRIVE.files().list().execute().get('files', [])
for f in files:
    print(f['name'], f['mimeType'])

When you run it, you should see pretty much what you'd expect, a list of file or folder names followed by their MIMEtypes — I named my script drive_list.py:

$ python3 drive_list.py
Google Maps demo application/vnd.google-apps.spreadsheet
Overview of Google APIs - Sep 2014 application/vnd.google-apps.presentation
tiresResearch.xls application/vnd.google-apps.spreadsheet
6451_Core_Python_Schedule.doc application/vnd.google-apps.document
out1.txt application/vnd.google-apps.document
tiresResearch.xls application/vnd.ms-excel
6451_Core_Python_Schedule.doc application/msword
out1.txt text/plain
Maps and Sheets demo application/vnd.google-apps.spreadsheet
ProtoRPC Getting Started Guide application/vnd.google-apps.document
gtaskqueue-1.0.2_public.tar.gz application/x-gzip
Pull Queues application/vnd.google-apps.folder
gtaskqueue-1.0.1_public.tar.gz application/x-gzip
appengine-java-sdk.zip application/zip
taskqueue.py text/x-python-script
Google Apps Security Whitepaper 06/10/2010.pdf application/pdf

Obviously your output will be different, depending on what files are in your Google Drive. But that's it... hope this is useful. You can now customize this code for your own needs and/or to access other Google APIs. Thanks for reading!

EXTRA CREDIT: To test your skills, add functionality to this code that also displays the last modified timestamp, the file (byte)size, and perhaps shave the MIMEtype a bit as it's slightly harder to read in its entirety... perhaps take just the final path element? One last challenge: in the output above, we have both Microsoft Office documents as well as their auto-converted versions for Google Apps... perhaps only show the filename once and have a double-entry for the filetypes!
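
One possible starting point for the first part of that challenge is sketched below. It assumes the list() call was asked for the extra v3 fields, e.g. fields='files(name,mimeType,modifiedTime,size)', and that size can be absent for some entries. The helper names short_mime and describe are hypothetical, and the sample entry is hand-written rather than real API output:

```python
def short_mime(mimetype):
    """Keep only the part after the last '/' of a MIME type."""
    return mimetype.rsplit('/', 1)[-1]


def describe(f):
    """Format one Drive file dict as a single display line.

    Assumes v3-style keys; 'modifiedTime' and 'size' are only present
    if requested via the fields parameter, so both default to '-'.
    """
    return '%s | %s | %s | %s' % (
        f['name'],
        short_mime(f['mimeType']),
        f.get('modifiedTime', '-'),
        f.get('size', '-'),
    )


# Hand-written sample entry for illustration, not real API output.
sample = {'name': 'out1.txt', 'mimeType': 'text/plain',
          'modifiedTime': '2014-11-06T00:00:00.000Z', 'size': '42'}
print(describe(sample))
```

Grouping the Office files with their converted counterparts is left as the remaining part of the exercise.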
