Cloudflare’s Firewall and Accessibility Scanning
Cloudflare is a popular service that, among other things, helps protect your site against malicious actors. Unfortunately, services like tidyDOM that crawl your site programmatically can sometimes trigger Cloudflare’s protections, which prevents us from scanning your site.
Fortunately, it’s easy to prevent this from happening by updating a few settings in your Cloudflare account.
Here are the steps:
- Go to the Cloudflare dashboard and choose the domain name of the site you are scanning with tidyDOM.
- Go to the "Firewall" tab, and select "Firewall Rules".
- Click "Create a Firewall Rule".
- Match the settings below, using the tidyDOM IP addresses noted in this article in the "Value" field. We may add IP addresses in the future, which will require you to update this firewall rule in Cloudflare. If that happens, you will receive a notification in advance and the list here will be updated.
  - Enter a name for the rule.
  - Inside the "When incoming requests match" section:
    - Set "Field" to "IP Address".
    - Set "Operator" to "is in".
    - In the "Value" field, add the tidyDOM IP addresses noted in this article.
  - Under "Then", choose "Bypass".
    - Select all the available features.
- Click "Deploy".
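If you prefer to paste the rule as an expression rather than fill in the fields one by one, the "is in" match on IP addresses corresponds to an `ip.src in {...}` expression in Cloudflare's rules language. The sketch below builds that expression from a list of addresses. The function name `build_ip_allow_expression` is hypothetical, and the addresses are placeholders from the reserved documentation ranges, not tidyDOM's real IP addresses; substitute the addresses noted in this article.

```python
import ipaddress


def build_ip_allow_expression(addresses):
    """Return a rule expression that matches requests from any of the given IPs."""
    normalized = []
    for addr in addresses:
        # ip_network validates the entry and accepts single IPs as well as CIDR ranges.
        net = ipaddress.ip_network(addr.strip(), strict=False)
        if net.num_addresses == 1:
            # Render a single host without its /32 or /128 suffix.
            normalized.append(str(net.network_address))
        else:
            normalized.append(str(net))
    if not normalized:
        raise ValueError("at least one IP address is required")
    # The rules language separates set members with spaces, not commas.
    return "ip.src in {" + " ".join(normalized) + "}"


# Placeholder addresses (reserved documentation ranges), NOT tidyDOM's real IPs.
PLACEHOLDER_IPS = ["192.0.2.10", "198.51.100.24"]

print(build_ip_allow_expression(PLACEHOLDER_IPS))
```

Validating each entry before building the expression catches typos early, which is useful if the address list changes and the rule has to be updated.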
Cloudflare may cache public pages, so depending on your deployment process, changes to your HTML may not be immediately visible to tidyDOM.
If you experience this, here are some options:
- Purge the Cloudflare cache using their API as part of your deployment process. This solution will require a developer.
- Manually purge the Cloudflare cache after deploying, but before starting a manual scan.
- If you are triggering a scan with a webhook, you can set a delay in the webhook settings that is longer than your Cloudflare cache duration.
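As a starting point for the first option, here is a minimal sketch of a deployment step that calls Cloudflare's `purge_cache` API endpoint. It assumes you have a zone ID and an API token with permission to purge the cache; the environment variable names `CLOUDFLARE_ZONE_ID` and `CLOUDFLARE_API_TOKEN` and the function names are hypothetical, so adapt them to your own deployment tooling.

```python
import json
import os
import urllib.request

API_BASE = "https://api.cloudflare.com/client/v4"


def build_purge_request(zone_id, api_token, urls=None):
    """Build a purge request: specific URLs if given, otherwise the whole zone."""
    body = {"files": list(urls)} if urls else {"purge_everything": True}
    return urllib.request.Request(
        f"{API_BASE}/zones/{zone_id}/purge_cache",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def purge_cache(zone_id, api_token, urls=None):
    """Send the purge request and report whether Cloudflare accepted it."""
    request = build_purge_request(zone_id, api_token, urls)
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    return bool(payload.get("success"))


if __name__ == "__main__":
    # Hypothetical variable names; use whatever your deployment environment provides.
    zone = os.environ.get("CLOUDFLARE_ZONE_ID")
    token = os.environ.get("CLOUDFLARE_API_TOKEN")
    if zone and token:
        print("purged" if purge_cache(zone, token) else "purge failed")
    else:
        print("Set CLOUDFLARE_ZONE_ID and CLOUDFLARE_API_TOKEN to run the purge.")
```

Passing a list of URLs purges only those pages, which may be preferable on large sites where purging everything would put extra load on your origin server.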
See https://support.cloudflare.com/hc/en-us/articles/200169246-Purging-cached-resources-from-Cloudflare for more details on purging the Cloudflare cache.
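To check whether a page is still being served from Cloudflare's cache before starting a scan, you can inspect the `CF-Cache-Status` response header that Cloudflare adds to responses. The sketch below is one possible way to do that; the function names are hypothetical, and the set of statuses treated as "served from cache" is our reading of Cloudflare's documented values.

```python
import urllib.request

# Statuses where Cloudflare answered from its cache rather than fetching from the origin.
SERVED_FROM_CACHE = {"HIT", "STALE", "UPDATING", "REVALIDATED"}


def served_from_cache(headers):
    """Return True if the CF-Cache-Status header indicates a cached response."""
    status = headers.get("CF-Cache-Status") or ""
    return status.strip().upper() in SERVED_FROM_CACHE


def fetch_headers(url):
    """Fetch the page and return its response headers (performs a network request)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.headers


# Example with headers captured from a response, so no network access is needed here.
print(served_from_cache({"CF-Cache-Status": "HIT"}))
print(served_from_cache({"CF-Cache-Status": "MISS"}))
```

If the check still reports a cached response right after a deployment, purging the cache or waiting longer before scanning are the options to try.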