Box Connector Support for Content Restrictions
Overview
- Greenlist restrictions limit Glean to crawling and indexing only the specified content (specific include).
- Redlist restrictions let Glean crawl and index everything except the specified content (specific exclude).
| Restriction Type | Greenlist | Redlist | Details |
|---|---|---|---|
| Time-based Restrictions | ✅ | ❌ | Restrict crawling to include/exclude content created/modified/viewed after a certain date. |
| User-based Restrictions | ❌ | ✅ | Restrict crawling to include/exclude content created/modified/viewed by specific users or a specific group (plus public content). |
| Content-based Restrictions | ❌ | ✅ | Restrict crawling to include/exclude specific content, documents, messages, or objects (see below). |
Supported Restrictions
| Restriction | Greenlist | Redlist | Details |
|---|---|---|---|
| Date | ✅ | ❌ | Restrict crawling to only content created/modified/viewed after a specific date, e.g. YYYY-MM-DD |
| User (Owner) | ❌ | ⚠️ | Restrict crawling to exclude content owned by specific users. Supported with caveats; see Limitations. |
| Content (Folder) | ❌ | ⚠️ | Restrict crawling to exclude content within specific folders. Supported with caveats; see Limitations. |
| Content (File) | ❌ | ✅ | Restrict crawling to exclude specific files. |
Info
When specifying restrictions for owners, folders, or files, you must provide the Box ID of each owner, folder, or file as a comma-separated list. For example:
Owner IDs:
23400261190,23401260091
Folder IDs:
119142000606,518142000607
File IDs:
31600504200,31600504201
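Because restrictions are applied by Glean Support, it can help to check an ID list before submitting it. The following is a minimal sketch, assuming IDs are purely numeric and comma-separated as in the examples above. The function name `parse_box_ids` is hypothetical and is not part of any Glean or Box API.

```python
def parse_box_ids(raw: str) -> list[str]:
    """Split a comma-separated string of Box IDs and validate each entry.

    Assumption: Box IDs consist only of ASCII digits, as in the examples above.
    """
    ids = []
    for part in raw.split(","):
        candidate = part.strip()
        if not candidate:
            continue  # tolerate trailing commas or doubled separators
        if not (candidate.isascii() and candidate.isdigit()):
            raise ValueError(f"Not a numeric Box ID: {candidate!r}")
        ids.append(candidate)
    return ids


owner_ids = parse_box_ids("23400261190,23401260091")
folder_ids = parse_box_ids("119142000606,518142000607")
file_ids = parse_box_ids("31600504200,31600504201")
```

IDs are kept as strings rather than converted to integers so that they are passed along exactly as they appear in Box.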
Error Prevention
When specifying folder IDs for crawl exclusions, you must specify the ID of the parent folder as well as the IDs of all of its child folders.
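One way to assemble such a list is to walk the folder tree from the parent and collect every folder ID encountered. The sketch below assumes a caller-supplied `list_subfolders` function that returns the IDs of a folder's direct subfolders. That function, `collect_folder_ids`, and the example tree are all hypothetical; in practice the lookup might be backed by Box's "list items in folder" endpoint, but that wiring is not shown here.

```python
from collections import deque
from typing import Callable, Iterable


def collect_folder_ids(
    root_id: str,
    list_subfolders: Callable[[str], Iterable[str]],
) -> list[str]:
    """Return root_id followed by the IDs of all folders nested beneath it."""
    seen = {root_id}
    ordered = [root_id]
    queue = deque([root_id])
    while queue:
        current = queue.popleft()
        for child in list_subfolders(current):
            if child not in seen:  # guard against duplicates in the lookup
                seen.add(child)
                ordered.append(child)
                queue.append(child)
    return ordered


# Hypothetical in-memory folder tree standing in for a real Box lookup.
EXAMPLE_TREE = {
    "119142000606": ["119142000607", "119142000608"],
    "119142000607": ["119142000609"],
}


def example_lookup(folder_id: str) -> list[str]:
    return EXAMPLE_TREE.get(folder_id, [])


redlist_value = ",".join(collect_folder_ids("119142000606", example_lookup))
print(redlist_value)
```

The resulting comma-separated string lists the parent first, followed by every nested folder, which matches the format used for folder IDs above.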
Applying Restrictions
| Method | Supported | Details |
|---|---|---|
| Admin UI | ❌ | Restrictions cannot currently be applied in the Admin UI. |
| Glean Support | ✅ | Restrictions can be applied by Glean Support on request. |
Limitations
Folder & File Redlisting
Glean monitors the event feed for your Box environment to pick up new content and changes to existing content.
When Glean detects a new event in Box (the creation, modification, or viewing of a file), it currently has no way to determine whether the content is within a redlisted folder. In this scenario, the content is crawled even if its parent folder is redlisted. This is corrected on the next incremental crawl.
Glean is currently investigating solutions to the above with assistance from Box. Please contact Glean Support for more information.
User (Owner) Redlisting
Any descendant files created by other users in a folder created by a redlisted user will still be indexed by Glean.
For example:
- There are 2 users: User A and User B. User A is redlisted, User B is not.
- User A creates a folder in Box and shares it with User B.
- User B adds a file to the folder created by User A.
- The file created by User B will be indexed by Glean, despite being in a folder owned by redlisted User A, because User B has ownership of the file.
Glean is currently investigating solutions to the above with assistance from Box. Please contact Glean Support for more information.
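The owner redlisting behavior in the example above can be modeled as a check against the owner of the file itself, ignoring who owns the containing folder. The following sketch only illustrates the documented behavior and is not Glean's implementation; `BoxItem`, `is_indexed`, and the user names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoxItem:
    name: str
    owner: str         # user who owns the file
    folder_owner: str  # user who owns the containing folder


def is_indexed(item: BoxItem, redlisted_owners: set[str]) -> bool:
    """Model the documented behavior: only the file's own owner is checked."""
    # folder_owner is deliberately ignored, which is the limitation described above.
    return item.owner not in redlisted_owners


redlist = {"user_a"}
file_by_b = BoxItem(name="notes.docx", owner="user_b", folder_owner="user_a")
file_by_a = BoxItem(name="plan.docx", owner="user_a", folder_owner="user_a")

print(is_indexed(file_by_b, redlist))  # file owned by User B in User A's folder
print(is_indexed(file_by_a, redlist))  # file owned by redlisted User A
```

In this model, the file owned by User B is indexed even though it sits in a folder owned by redlisted User A, while the file owned by User A is excluded.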