Google Drive Connector Overview
About the Google Drive Connector
The Glean Google Drive connector enables secure and efficient data fetching from your Google Drive tenant within Google Workspace. User permissions are strictly enforced, and all data remains securely within your environment.
- Glean must authenticate to your GDrive instance in order to fetch content.
- Authentication is done by creating a Service Account in GDrive.
- Glean understands all user access permissions and strictly enforces them at query time, so users never see results that they do not have access to.
- Quicklinks let you create Docs, Sheets, and Slides content in GDrive directly.
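To illustrate what enforcing permissions at query time means, here is a minimal sketch. It is not Glean's implementation; the `IndexedDoc` type, the `visible_results` function, and the example addresses are hypothetical names introduced only for this illustration. The idea is that each indexed item carries the users and groups allowed to read it, and every query filters results against the identity of the person searching.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class IndexedDoc:
    """Hypothetical indexed item plus the principals allowed to read it."""
    title: str
    allowed_users: frozenset = field(default_factory=frozenset)
    allowed_groups: frozenset = field(default_factory=frozenset)


def visible_results(results, user_email, user_groups):
    """Keep only results the querying user may read, checked per query."""
    groups = set(user_groups)
    return [
        doc for doc in results
        if user_email in doc.allowed_users or groups & doc.allowed_groups
    ]


# Illustrative data only; none of these items or addresses come from Glean.
docs = [
    IndexedDoc("Roadmap", allowed_users=frozenset({"ana@example.com"})),
    IndexedDoc("Handbook", allowed_groups=frozenset({"all-staff@example.com"})),
    IndexedDoc("Payroll", allowed_groups=frozenset({"finance@example.com"})),
]

hits = visible_results(docs, "ana@example.com", ["all-staff@example.com"])
print([d.title for d in hits])  # ['Roadmap', 'Handbook']
```

Because the check runs on every query rather than once at indexing time, a user who is not listed on an item and belongs to none of its groups simply never sees it in results.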
Integration Features
For GDrive, Glean will capture the following content:
- Folders
- Documents
- Native file types such as Docs, Slides, and Sheets
- All supported files in GDrive
- People data and identity information as described in the associated GDrive People Data Connector document.
API Usage & Permissions
Glean will use the Google Directory, Drive, and Reports APIs to ingest data.
A new OAuth Client will be created with the following API scopes:
- https://www.googleapis.com/auth/admin.directory.group.readonly
- https://www.googleapis.com/auth/admin.directory.user.readonly
- https://www.googleapis.com/auth/drive.readonly
- https://www.googleapis.com/auth/admin.reports.audit.readonly
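All four scopes listed above are read-only. As a small sketch, the snippet below keeps them in one place, checks that each one ends in `.readonly`, and joins them into a comma-delimited string, a form that is often convenient when pasting scopes into an admin form. The constant and function names are hypothetical, and whether your setup flow expects a comma-delimited string is an assumption to verify against the setup instructions.

```python
# Scopes copied from the list above; the names below are illustrative only.
GLEAN_DRIVE_SCOPES = [
    "https://www.googleapis.com/auth/admin.directory.group.readonly",
    "https://www.googleapis.com/auth/admin.directory.user.readonly",
    "https://www.googleapis.com/auth/drive.readonly",
    "https://www.googleapis.com/auth/admin.reports.audit.readonly",
]


def all_read_only(scopes):
    """Return True if every scope URL ends with the '.readonly' suffix."""
    return all(scope.endswith(".readonly") for scope in scopes)


def as_comma_delimited(scopes):
    """Join scopes into one comma-separated string with no spaces."""
    return ",".join(scopes)


if __name__ == "__main__":
    if not all_read_only(GLEAN_DRIVE_SCOPES):
        raise SystemExit("Unexpected write-capable scope in the list")
    print(as_comma_delimited(GLEAN_DRIVE_SCOPES))
```

A check like this can help catch an accidentally broadened scope (for example `.../auth/drive` without the `.readonly` suffix) before it is granted.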
More information: API Endpoints
Setup Prerequisites
Setting up the GDrive connector requires the creation of a service account with a custom role scoped as follows:
Admin Console Privileges
- Organization Units > Read
- Users > Read
- Services > Drive and Docs > Settings
- Reports
Admin API Privileges
- Organization Units > Read
- Users > Read
- Groups > Read
Crawling Restrictions
The Google Drive connector will crawl your entire Google Drive tenant by default. You can optionally greenlist (explicitly include) content that you want Glean to crawl, or redlist (explicitly exclude) content that you do not.
Specifically, content can be restricted based on:
- Shared Drives only
- Shared Drive ID
- Folder ID
- User / AD Group
- Time
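The restriction types above can be pictured as a filter applied to each item before it is crawled. The following is a rough sketch only: `DriveItem`, `CrawlRules`, and `should_crawl` are hypothetical names, not Glean configuration keys, and the precedence shown (a redlist entry always wins over a greenlist entry) is an assumption made for this illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class DriveItem:
    """Hypothetical view of one file as seen by a crawler."""
    item_id: str
    owner: str
    modified: datetime
    folder_ids: tuple = ()                  # IDs of ancestor folders
    shared_drive_id: Optional[str] = None   # None means a user's My Drive


@dataclass
class CrawlRules:
    """Hypothetical restriction settings mirroring the list above."""
    shared_drives_only: bool = False
    greenlist_drive_ids: set = field(default_factory=set)
    redlist_drive_ids: set = field(default_factory=set)
    redlist_folder_ids: set = field(default_factory=set)
    redlist_users: set = field(default_factory=set)
    modified_after: Optional[datetime] = None


def should_crawl(item: DriveItem, rules: CrawlRules) -> bool:
    """Return True if the item passes every configured restriction."""
    if rules.shared_drives_only and item.shared_drive_id is None:
        return False
    if item.shared_drive_id in rules.redlist_drive_ids:
        return False  # assumed precedence: redlist beats greenlist
    if rules.greenlist_drive_ids and item.shared_drive_id not in rules.greenlist_drive_ids:
        return False
    if rules.redlist_folder_ids & set(item.folder_ids):
        return False
    if item.owner in rules.redlist_users:
        return False
    if rules.modified_after and item.modified < rules.modified_after:
        return False
    return True


# Illustrative values only.
rules = CrawlRules(
    shared_drives_only=True,
    redlist_folder_ids={"folder-hr"},
    modified_after=datetime(2022, 1, 1, tzinfo=timezone.utc),
)
recent = datetime(2023, 6, 1, tzinfo=timezone.utc)
ok = DriveItem("a", "ana@example.com", recent, ("folder-eng",), "drive-1")
hr = DriveItem("b", "ana@example.com", recent, ("folder-hr",), "drive-1")
personal = DriveItem("c", "ana@example.com", recent)
print(should_crawl(ok, rules), should_crawl(hr, rules), should_crawl(personal, rules))
```

Under these assumed rules, the first item is crawled, the second is skipped because it sits in a redlisted folder, and the third is skipped because it is not in a Shared Drive.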
More information: Restricting Content