I’ve had a look, and it seems that robots.txt is served by an Edge Function/Middleware that fails on invocation. You can test this yourself:
→ curl -I https://atlantawhereyoubelong.com/robots.txt
HTTP/2 500
cache-control: public, max-age=0, must-revalidate
content-type: text/plain; charset=utf-8
date: Thu, 01 Aug 2024 05:05:58 GMT
server: Vercel
strict-transport-security: max-age=63072000
x-vercel-error: EDGE_FUNCTION_INVOCATION_FAILED
x-vercel-id: fra1::pq8s7-1722488758778-c08f7de0f7fe
content-length: 61
I recommend checking your Runtime Logs; you will likely see that the locale information (the accept-language
header) is missing from the failing requests. I believe Googlebot doesn’t send this header either, which is what causes the crawl to fail.
Once you add the header to the request, the actual content is returned. The recommended mitigation is to not rely solely on the accept-language
header, but to fall back to a default locale whenever the header is absent (see the sketch after the successful request below):
→ curl -I https://atlantawhereyoubelong.com/robots.txt -H "accept-language: en-US"
HTTP/2 200
accept-ranges: bytes
access-control-allow-origin: *
age: 78
cache-control: public, max-age=0, must-revalidate
content-disposition: inline
content-type: text/plain
date: Thu, 01 Aug 2024 05:10:39 GMT
etag: "8a10c1ad8c7af097f987274f026391f4"
server: Vercel
strict-transport-security: max-age=63072000
vary: RSC, Next-Router-State-Tree, Next-Router-Prefetch, Next-Url
x-matched-path: /robots.txt
x-vercel-cache: HIT
x-vercel-id: fra1::qdb49-1722489039839-5ba7582cc320
content-length: 404
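If the failing Edge Function is a Next.js middleware (for example an i18n routing layer), a defensive locale lookup avoids the crash. The sketch below is a minimal example under assumptions about your setup: the LOCALES list, DEFAULT_LOCALE, and the matcher are hypothetical placeholders, not your actual configuration. The idea is simply to skip machine-readable paths such as robots.txt entirely and to default to a fixed locale whenever accept-language is missing, instead of assuming the header exists.

```ts
// middleware.ts — minimal sketch of the suggested fallback (assumed setup, not your code)
import { NextResponse } from 'next/server'
import type { NextRequest } from 'next/server'

const LOCALES = ['en-US', 'de-DE'] // hypothetical locale list
const DEFAULT_LOCALE = 'en-US'     // hypothetical default

function resolveLocale(request: NextRequest): string {
  // Read the header defensively: Googlebot and plain `curl` requests may not send it.
  const header = request.headers.get('accept-language')
  if (!header) return DEFAULT_LOCALE

  // Take the client's first preference, e.g. "en-US,en;q=0.9" -> "en-US".
  const preferred = (header.split(',')[0] ?? '').trim()
  return LOCALES.includes(preferred) ? preferred : DEFAULT_LOCALE
}

export function middleware(request: NextRequest) {
  const { pathname } = request.nextUrl

  // Never localize machine-readable files.
  if (pathname === '/robots.txt' || pathname === '/sitemap.xml') {
    return NextResponse.next()
  }

  const locale = resolveLocale(request)

  // Redirect to a locale-prefixed path when the prefix is missing, e.g. "/" -> "/en-US".
  const hasLocalePrefix = LOCALES.some(
    (l) => pathname === `/${l}` || pathname.startsWith(`/${l}/`)
  )
  if (!hasLocalePrefix) {
    return NextResponse.redirect(new URL(`/${locale}${pathname}`, request.url))
  }

  return NextResponse.next()
}

export const config = {
  // Skip Next.js internals; adjust the matcher to your project.
  matcher: ['/((?!_next).*)'],
}
```

With a fallback like this in place, the earlier curl request without an accept-language header should return 200 instead of 500, and Googlebot should be able to fetch robots.txt again.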