Yeah, this is a good call-out. If the site is being used for drive-by or targeted malware, other checks may be happening alongside the redirect, such as user agent, country of origin (like you mentioned), installed plugins, OS, or even time of day.
If they detect something that matches what they want, they may throw in some intermediate 301s to pages that attempt to infect the user, while still ultimately redirecting to the "normal" page.
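One way to check for this kind of conditional redirect is to walk the chain one hop at a time with different User-Agent strings and compare the results. Here is a minimal sketch using only the Python standard library; `trace_redirects` and the example User-Agent strings are hypothetical names for illustration, and a real investigation would also want to vary source IP, headers, and timing.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def trace_redirects(url, user_agent, max_hops=10):
    """Follow redirects manually and return a list of (status, url) hops."""
    hops = []
    for _ in range(max_hops):
        parts = urlsplit(url)
        conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                    else http.client.HTTPConnection)
        conn = conn_cls(parts.hostname, parts.port, timeout=10)
        try:
            path = parts.path or "/"
            if parts.query:
                path += "?" + parts.query
            # http.client never follows redirects on its own, so every
            # intermediate response is visible to us.
            conn.request("GET", path, headers={"User-Agent": user_agent})
            resp = conn.getresponse()
            resp.read()
            status = resp.status
            location = resp.getheader("Location")
        finally:
            conn.close()
        hops.append((status, url))
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    return hops
```

Running it twice against the same URL, say once with a desktop browser User-Agent and once with something like `curl/8.0`, and diffing the two hop lists shows whether the site treats the two clients differently. Since nothing is cached between calls, sticky 301s don't skew the comparison.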
Just a note: 301s are super sticky, and browsers cache them even across incognito modes. Your best bet is to use a new browser after reconnecting to avoid false results.
On Chromium-based browsers, if you open the Developer Tools (F12, or right-click and choose Inspect) and go to the Network tab, you can tick 'Disable cache'. Note that it only applies while the Developer Tools are open.
In my experience, this solves the sticky 301 issue and you should have no issues with cached 301s anymore.
Works perfectly for these kinds of investigations, or if you made a mistake during site development.
I'm not GP, but a decade ago, when I started out as a web developer, I made the mistake of using 301s in production, and at the time we never figured out how to get browsers to re-learn the responses for those pages without drastic measures.
I still never use 301s for that reason. Things may have changed, but I dare not try!
> I still never use 301s for that reason. Things may have changed, but I dare not try!
I use 301 for http:->https: redirects because (a) I doubt we're going back, (b) it prevents some cleartext leaks (like the Host header), and (c) it is slightly cheaper.
> we never figured out how to get the browser to re-learn the responses for those pages without drastic measures.
If you control the target URL, it's easy: just redirect back. Seriously: the browser won't loop. It'll fetch the original URL again and, now that it no longer sees a 301 there, forget that nonsense ever happened. This is why 301 is usually a fine default for same-site redirects, or if the redirect target is encoded in the URL (such as in tracking URLs).
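To make the "redirect back" trick concrete, here is a rough sketch of what the server side might look like after the fix. The paths `/old` and `/new`, the `respond` helper, and the choice of a 302 with `no-store` for the way back are all assumptions for illustration; the only idea taken from the comment above is that the old target now points back at the original URL, which itself no longer answers with a 301.

```python
# Hypothetical situation: "/old" used to answer 301 -> "/new" by mistake,
# and some browsers still have that 301 cached.
ROUTES = {
    # The old redirect target now sends visitors back where they came from.
    # A temporary, uncached redirect is used here so this hop never sticks.
    "/new": (302, {"Location": "/old", "Cache-Control": "no-store"}),
    # The original URL serves content again instead of a 301.
    "/old": (200, {"Cache-Control": "no-cache"}),
}


def respond(path):
    """Return (status, headers) for a request path; 404 for unknown paths."""
    return ROUTES.get(path, (404, {}))
```

A browser holding the stale 301 goes `/old` (from cache) to `/new`, gets bounced back to `/old`, and then has to ask the server for `/old` again, at which point it sees the 200.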
The big no-no: don't 301 to a URL you can't control unless you have appropriate Cache-Control headers on the redirect.
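For what "appropriate Cache-Control headers" could look like in practice, here is a small sketch with Python's built-in `http.server`. The `make_handler` and `make_server` helpers and the one-hour `max-age` are assumptions picked for the example, not a recommendation; the point is only that the 301 carries an explicit lifetime instead of leaving it to the browser's heuristics.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

REDIRECT_MAX_AGE = 3600  # seconds; an assumed value, tune to taste


def make_handler(target_url, max_age=REDIRECT_MAX_AGE):
    """Build a handler class that 301s every GET to target_url."""

    class RedirectHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(301)
            self.send_header("Location", target_url)
            # Explicit freshness lifetime: after max_age seconds the
            # browser has to come back and ask us again.
            self.send_header("Cache-Control", f"max-age={max_age}")
            self.send_header("Content-Length", "0")
            self.end_headers()

        def log_message(self, fmt, *args):
            pass  # keep the example quiet

    return RedirectHandler


def make_server(target_url, port=0):
    """Create (but do not start) a local server; port 0 picks a free port."""
    return HTTPServer(("127.0.0.1", port), make_handler(target_url))
```

Calling `make_server("https://example.com/").serve_forever()` would start it. If the target ever goes bad, you change the redirect on your side and browsers pick up the change once the cached copy expires.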
Yeah, that's a good point, but one way to think about a CDN is as a web browser that you control, so I say do it even with a CDN, and remember you can always just flush the "browser" cache! (Or in CloudFront's case: create an invalidation and wait a few seconds.)
You can disable caching in Firefox's developer tools too; this covers such cached redirects. It's very useful combined with a persistent log of network activity, so the log isn't cleared after each redirect.