Configure SharePoint using Site-specific FullControl
Warning
Applying the FullControl SharePoint permission at an individual site level, instead of tenant-wide, means that your SharePoint administrators will need to explicitly authorize Glean for all current (and future) sites that you want to crawl.
Glean will be unable to automatically crawl sites unless they have been explicitly specified.
More information: Permission Alternatives.
During normal configuration of the SharePoint connector, Glean requests the FullControl permission at a SharePoint tenant level.
- This allows Glean to crawl and index each site collection within your SharePoint tenant without them having to be explicitly specified.
- In turn, this means that as sites are added or removed, your Glean configuration does not need to change.
Some organizations have concerns around the breadth of access this permission gives Glean's SharePoint crawler. As an alternative, the FullControl permission can be granted at a site-level rather than for the entire tenant.
However, this comes with the following caveats:
- All site collections that you wish to crawl must have their URLs explicitly defined in the Glean UI.
- You must apply the FullControl permission explicitly, both for every Glean-SharePoint app created in Entra ID, AND for each site you want to crawl.
  - This can be time consuming and puts higher operational overhead on your organization if you want Glean to provide search capability for additional sites in the future.
Why is the FullControl permission required?
FullControl permissions are required to fetch role assignments and access permissions for the site pages and associated web components of each site. The Graph API only exposes access permissions for Document Library items, hence it cannot be used to obtain the information needed by Glean.
The SharePoint REST API endpoint responsible for returning this data returns an HTTP 403 Forbidden response when it is queried with any permission other than FullControl (e.g. the Read permission).
Glean does not perform any write actions to your SharePoint tenant. Only read actions (i.e. HTTP GET) are performed.
For more information, please see this StackOverflow post.
What alternatives are there to FullControl?
Unfortunately, there are no alternatives to FullControl at this time. Some of the data required by Glean can only be obtained using:
- Endpoints only present in v1 of the SharePoint REST API, and
- SharePoint API v1 endpoints for which FullControl is the permission of least privilege that still returns data.
If either of these changes in the future, FullControl will no longer be required, and Glean will deprecate its use.
For customers that have a Glean cloud-prem deployment, you can implement WAF rules to restrict the Glean SharePoint crawler to only be able to perform HTTP GET (i.e. read) requests towards the SharePoint REST API endpoints documented here.
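The intent of such a rule can be sketched as a simple predicate. This is only an illustration in Python, not WAF syntax: the function name `is_request_allowed`, the tenant host, and the `/_api/` path fragment are assumptions made for the example, and the real allow-list should come from the endpoint documentation referenced above.

```python
from urllib.parse import urlsplit


def is_request_allowed(method, url, tenant_host, allowed_path_fragments):
    """Return True only for HTTPS GET requests to an allowed path on the tenant host.

    All arguments are illustrative; a real WAF expresses this in its own rule language.
    """
    # Only read requests are permitted.
    if method.upper() != "GET":
        return False
    parts = urlsplit(url)
    # Require HTTPS and an exact match on the SharePoint tenant host.
    if parts.scheme != "https":
        return False
    if (parts.hostname or "").lower() != tenant_host.lower():
        return False
    # The path must contain one of the allowed fragments (placeholder values).
    return any(fragment in parts.path for fragment in allowed_path_fragments)


# Hypothetical values for demonstration only.
HOST = "company.sharepoint.com"
FRAGMENTS = ["/_api/"]
print(is_request_allowed("GET", "https://company.sharepoint.com/sites/mysite/_api/web", HOST, FRAGMENTS))
print(is_request_allowed("POST", "https://company.sharepoint.com/sites/mysite/_api/web", HOST, FRAGMENTS))
```

A deny-by-default shape like this keeps the rule easy to audit: anything that is not an explicit GET to a listed path on the tenant host is rejected.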
Requirements
- The user setting up this connector must have the Global Admin role.
- A list of the SharePoint site URLs that are in scope to be crawled by Glean.
- PowerShell 7 (with the SharePoint PnP.PowerShell module, v2.3.0+, installed).
Process
1 - Configure the SharePoint Connector
Proceed through the standard procedure for configuring the SharePoint connector until you reach Step 8 - Configure SharePoint REST API Permissions.
2 - Grant SharePoint REST API permissions for each Site and App
Error prevention
You will need to follow this section to enable the SharePoint REST API permissions for all of:
- Each of the additional apps created above, AND
- Each of the sites you wish to crawl.
Failing to complete these steps will cause crawling to fail.
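Because every app must be authorized on every site, the amount of work grows as sites multiplied by apps. A minimal Python sketch for generating a checklist of those pairs is shown here; the function names, the `company` domain, the site names, and the client IDs are all placeholders, and the URL pattern is the appinv.aspx address used in this section.

```python
from itertools import product


def appinv_url(sharepoint_domain, site_name):
    """Build the permission request page URL for one site."""
    return (
        f"https://{sharepoint_domain}.sharepoint.com"
        f"/sites/{site_name}/_layouts/15/appinv.aspx"
    )


def grant_checklist(sharepoint_domain, site_names, app_client_ids):
    """Return one (appinv URL, app client ID) pair per site and app combination."""
    return [
        (appinv_url(sharepoint_domain, site), app_id)
        for site, app_id in product(site_names, app_client_ids)
    ]


# Placeholder inputs for illustration only.
sites = ["mysite", "engineering"]
apps = [
    "00000000-0000-0000-0000-000000000001",
    "00000000-0000-0000-0000-000000000002",
]
for url, app_id in grant_checklist("company", sites, apps):
    print(f"{url}  <- authorize app {app_id}")
```

With two sites and two apps this prints four entries, one for each grant that has to be applied by hand.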
- Navigate to the permission request page for the target site:
  https://<sharepoint_domain>.sharepoint.com/sites/<site_name>/_layouts/15/appinv.aspx
  - E.g. If your SharePoint domain is company.sharepoint.com, and the site you are applying the permission to is called mysite, navigate to: https://company.sharepoint.com/sites/mysite/_layouts/15/appinv.aspx
- For each Glean-SharePoint app created in Entra ID (the parent app and all additional apps), complete the following:
  a. For App Id, paste in the Application (client) ID value and click the Lookup button. The Title field will automatically populate with the name of the associated App Registration (e.g. Glean SharePoint Crawler, Glean SharePoint Crawler - 2, etc.).
  b. For App Domain enter: glean.com
  c. For Redirect URL enter: https://glean.com
  d. In the Permission Request XML field, paste the following:
     <AppPermissionRequests AllowAppOnlyPolicy="true">
       <AppPermissionRequest Scope="http://sharepoint/content/sitecollection" Right="FullControl" />
       <AppPermissionRequest Scope="http://sharepoint/content/sitecollection/web" Right="FullControl" />
     </AppPermissionRequests>
  e. Click Create to apply the permissions.
  f. Repeat steps a-e for each additional app.
Heads Up!
You can check which of the Glean-SharePoint apps have been authorized for a specific site by navigating to:
https://<sharepoint_domain>.sharepoint.com/sites/<site_name>/_layouts/15/appprincipals.aspx?Scope=Web
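Before pasting the Permission Request XML into many sites, it can help to confirm the text has not been mangled by copy and paste. The following Python sketch parses the XML from this section and lists the scopes it requests; it also builds the appprincipals.aspx check URL. The function names and the `company`/`mysite` values are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# The Permission Request XML exactly as given in this section.
PERMISSION_REQUEST_XML = """<AppPermissionRequests AllowAppOnlyPolicy="true">
  <AppPermissionRequest Scope="http://sharepoint/content/sitecollection" Right="FullControl" />
  <AppPermissionRequest Scope="http://sharepoint/content/sitecollection/web" Right="FullControl" />
</AppPermissionRequests>"""


def summarize_permission_request(xml_text):
    """Parse the XML and return (app_only_enabled, [(scope, right), ...]).

    Raises xml.etree.ElementTree.ParseError if the text is not well-formed.
    """
    root = ET.fromstring(xml_text)
    app_only = root.get("AllowAppOnlyPolicy") == "true"
    grants = [
        (element.get("Scope"), element.get("Right"))
        for element in root.findall("AppPermissionRequest")
    ]
    return app_only, grants


def appprincipals_url(sharepoint_domain, site_name):
    """Build the URL of the page listing apps authorized for one site."""
    return (
        f"https://{sharepoint_domain}.sharepoint.com"
        f"/sites/{site_name}/_layouts/15/appprincipals.aspx?Scope=Web"
    )


app_only, grants = summarize_permission_request(PERMISSION_REQUEST_XML)
print("App-only policy allowed:", app_only)
for scope, right in grants:
    print(f"{right} on {scope}")
print(appprincipals_url("company", "mysite"))
```

This only checks the text locally. It does not contact SharePoint, so the appprincipals page remains the way to confirm what was actually granted.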
3 - Validate Settings
Back in the Glean UI, click Save. Glean will go through and validate that the required permissions for each Glean-SharePoint app (both in the Graph API and SharePoint REST API) have been granted.
Error: Unable to fetch O365 Sharepoint site groups.
Depending upon the age of your SharePoint Online tenant, you might receive the following error:
Unable to fetch O365 Sharepoint site groups. Please check that the sharepoint/content/sitecollection scopes are enabled with FullControl for Sharepoint REST API.
This is normal!
If your SharePoint Online tenant is newer (typically 2020 onwards), then the method of authenticating to the SharePoint REST API (Azure Access Control Services (ACS)) is disabled by default. This was enabled by default in older tenants to assist with migration from SharePoint on-premises.
To use the SharePoint REST API, you need to enable ACS. You can enable ACS using PowerShell:
- Install the required modules (PowerShell 7+ is required):
  Install-Module -Name PnP.PowerShell -RequiredVersion 2.3.0
  Install-Module -Name Microsoft.Online.SharePoint.PowerShell
  - You can check the latest stable version of the PnP.PowerShell module here.
  - The default version of PowerShell that comes with Windows 10 and 11 is PowerShell 5.1. You can install PowerShell 7.X alongside PowerShell 5.1.
  - To check your PowerShell version, run the $PSVersionTable command in PowerShell and review the version next to the PSVersion field.
  - Microsoft have installation (and migration) instructions located here.
- Connect to your SharePoint domain:
  Connect-PnPOnline -Url https://<sharepointdomain>-admin.sharepoint.com -Interactive
  - The -Interactive flag will open a browser window for you to authenticate using SSO. This allows MFA to be used (if configured).
- Enable ACS:
  Set-PnPTenant -DisableCustomAppAuthentication $false
  - You can check the status of this flag at any time by using the Get-PnPTenant command:
    PS /Users/username> Get-PnPTenant
    [...snip...]
    DisableCustomAppAuthentication : False
    [...snip...]
- Click Save again in the Glean UI. DO NOT start crawling just yet.
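If the Get-PnPTenant output has been saved as text, the flag can be checked programmatically. The sketch below is in Python and assumes the "Name : Value" layout shown above; the function names are hypothetical and the actual output formatting may differ between module versions.

```python
def parse_tenant_properties(output):
    """Parse 'Name : Value' lines into a dict, ignoring lines without a colon."""
    properties = {}
    for line in output.splitlines():
        # Split on the first colon only, so values containing ':' stay intact.
        key, separator, value = line.partition(":")
        if separator and key.strip():
            properties[key.strip()] = value.strip()
    return properties


def acs_enabled(output):
    """Return True if ACS is enabled, False if disabled, None if the flag is absent."""
    value = parse_tenant_properties(output).get("DisableCustomAppAuthentication")
    if value is None:
        return None
    # ACS is enabled when the "disable" flag is False.
    return value.lower() == "false"


# Sample text modeled on the output shown in this section.
sample_output = """[...snip...]
DisableCustomAppAuthentication : False
[...snip...]"""
print("ACS enabled:", acs_enabled(sample_output))
```

Note the inverted sense of the flag: a value of False for DisableCustomAppAuthentication means ACS is enabled.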
4 - List the Site URLs to be Crawled
Glean cannot automatically determine the sites that have been granted the FullControl permission, so you must explicitly tell Glean to crawl these sites.
- Navigate to SharePoint > Manage Data > Inclusion rules
- Provide a comma-separated list of all the Site URLs to be crawled.
  - This can also just be the subsites of the site collections with permissions.
  - If a site collection and all associated subsites should be crawled, provide all the URLs explicitly in the inclusion rules list.
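For a long list of sites, the comma-separated value can be assembled with a short script rather than by hand. This Python sketch trims whitespace and trailing slashes and drops duplicates before joining. Those normalization choices, the function name, and the example URLs are assumptions, so check the result against what the Inclusion rules field accepts.

```python
def inclusion_rule_value(site_urls):
    """Join site URLs into one comma-separated string, skipping blanks and duplicates."""
    seen = set()
    cleaned = []
    for url in site_urls:
        # Assumed normalization: trim whitespace and any trailing slash.
        normalized = url.strip().rstrip("/")
        if not normalized:
            continue
        # Compare case-insensitively so the same site is not listed twice.
        key = normalized.lower()
        if key not in seen:
            seen.add(key)
            cleaned.append(normalized)
    return ",".join(cleaned)


# Placeholder URLs: a site collection, a duplicate of it, and one subsite.
urls = [
    "https://company.sharepoint.com/sites/mysite/",
    " https://company.sharepoint.com/sites/mysite",
    "https://company.sharepoint.com/sites/mysite/subsite",
]
print(inclusion_rule_value(urls))
```

The site collection and its subsite are both kept, since subsites must be listed explicitly alongside the site collection.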

5 - Start Crawling
Click on the Overview tab, followed by the Start Crawling button to begin indexing your organization's SharePoint content.
Success
You have successfully connected SharePoint and OneDrive to Glean!
You can check the status of your crawl by navigating to Workspace Settings > Setup > Apps, and examining the Items Indexed, Crawler, and Crawling status fields.
Depending on the amount of content in your SharePoint and OneDrive environments, crawling can take anywhere from 24 hours to 1+ week to fully complete.