Skip to content

Conversation

maxesse
Copy link

@maxesse maxesse commented Aug 20, 2025

Closes #3603

The change excludes specific /contentstorage/ urls from the sync in all API calls to Sharepoint. These URLs should not be attempted to be accessed as they're created internally by Sharepoint for Teams private channels, loop components, etc. (it's fairly undocumented what they're used for to be honest), and have a different permission model that will cause 401 errors and the connector to stop syncing.

Checklists

Pre-Review Checklist

  • this PR does NOT contain credentials of any kind, such as API keys or username/passwords (double check config.yml.example)
  • this PR has a meaningful title
  • this PR links to all relevant github issues that it fixes or partially addresses
  • this PR has a thorough description
  • Tested the changes locally
  • For bugfixes: backport safely to all minor branches still receiving patch releases

Release Note

Fixes an issue where a Sharepoint Online sync configured to crawl the entire tenant by selecting * in the site list, might stop with 401 errors when trying to access URLs containing /contentstorage/.

@maxesse maxesse requested a review from a team as a code owner August 20, 2025 09:13
Copy link

cla-checker-service bot commented Aug 20, 2025

💚 CLA has been signed

@artem-shelkovnikov
Copy link
Member

buildkite test this

@maxesse
Copy link
Author

maxesse commented Aug 20, 2025

Ok I signed the contributor agreement too, hopefully the cla-checker-service will do another pass.

@artem-shelkovnikov
Copy link
Member

CLA checked is happy, but test coverage fell below 92% now - you can verify it by running make ftest.

I also see that you've added the check in a lot of places - is it needed?

Theoretically, if you just ignore the site itself, you won't need to propagate the change down to all entities, or is it not the case?

@maxesse maxesse force-pushed the fix/sharepoint-contentstorage-urls branch from 319658c to a433cff Compare October 6, 2025 14:59
@maxesse
Copy link
Author

maxesse commented Oct 6, 2025

CLA checked is happy, but test coverage fell below 92% now - you can verify it by running make ftest.

I also see that you've added the check in a lot of places - is it needed?

Theoretically, if you just ignore the site itself, you won't need to propagate the change down to all entities, or is it not the case?

Hi Artem - sorry for the long silence, between holidays and lots of other stuff I didn't have time to fix this. You're right, it should be enough to keep the check in the core places. I updated it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Sharepoint Connector fails sync due to crawling /contentstorage/ URLs
2 participants