June 4, 2025 · 5 min read

Using Sitemaps for Website Crawling

Point PageGPT at your XML sitemap to automatically index all pages at once.

Instead of adding URLs one by one, you can use a sitemap to crawl and index your entire website in one go.

What is an XML Sitemap?

A sitemap is an XML file (usually at /sitemap.xml) that lists the URLs of the pages on your website. Most CMS platforms, including WordPress, Webflow, and Shopify, generate one automatically.
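
For reference, a sitemap that follows the sitemaps.org protocol has a `urlset` root with one `url` entry per page, each holding a `loc`. The Python sketch below parses such a file with the standard library. The example.com URLs and the `list_page_urls` helper are made up for illustration, and nothing here describes how PageGPT reads sitemaps internally.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical two-page sitemap, inlined so the example runs offline.
SAMPLE_SITEMAP = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2025-06-01</lastmod>
  </url>
  <url>
    <loc>https://example.com/pricing</loc>
  </url>
</urlset>
"""


def list_page_urls(sitemap_xml: str) -> list[str]:
    """Return every <loc> value found under the sitemap's <url> entries."""
    # Parse from bytes so the XML declaration's encoding is honoured.
    root = ET.fromstring(sitemap_xml.encode("utf-8"))
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


print(list_page_urls(SAMPLE_SITEMAP))
# ['https://example.com/', 'https://example.com/pricing']
```

Running the same helper on your own /sitemap.xml is a quick way to see which pages a crawler would be handed.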

How to Use It

  • Open your chatbot and click the Links tab.
  • Paste your sitemap URL (e.g. https://yoursite.com/sitemap.xml) in the URL field.
  • Click Add URL. PageGPT detects that the URL is a sitemap and expands it into the URLs it lists.
  • Each page is added as a separate source and queued for crawling.

Sitemap Index Files

If your site uses a sitemap index file (a sitemap of sitemaps), PageGPT expands it recursively and indexes the pages listed in every child sitemap.
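
To make "recursively expand" concrete, here is a minimal Python sketch of the general technique, assuming the standard sitemaps.org format: a `sitemapindex` root points at child sitemaps, and a `urlset` root lists pages. The `expand_sitemap` function, the `fetch` parameter, and the in-memory `FAKE_SITE` are hypothetical names for this example; PageGPT's own implementation is not documented here.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Optional

# Namespace prefix in ElementTree's "{uri}tag" notation.
NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def expand_sitemap(
    url: str,
    fetch: Callable[[str], bytes],
    seen: Optional[set[str]] = None,
) -> list[str]:
    """Return all page URLs reachable from a sitemap or sitemap index."""
    seen = set() if seen is None else seen
    if url in seen:  # guard against index files that reference each other
        return []
    seen.add(url)

    root = ET.fromstring(fetch(url))
    if root.tag == NS + "sitemapindex":
        pages: list[str] = []
        for loc in root.findall(f"{NS}sitemap/{NS}loc"):
            if loc.text:
                pages.extend(expand_sitemap(loc.text.strip(), fetch, seen))
        return pages
    # Anything else is treated as a plain <urlset>.
    return [loc.text.strip() for loc in root.findall(f"{NS}url/{NS}loc") if loc.text]


# Hypothetical site held in memory so the example runs without network access.
FAKE_SITE = {
    "https://example.com/sitemap.xml": b"""<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://example.com/sitemap-pages.xml</loc></sitemap>
  <sitemap><loc>https://example.com/sitemap-blog.xml</loc></sitemap>
</sitemapindex>""",
    "https://example.com/sitemap-pages.xml": b"""<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/pricing</loc></url>
</urlset>""",
    "https://example.com/sitemap-blog.xml": b"""<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/blog/sitemaps</loc></url>
</urlset>""",
}

pages = expand_sitemap("https://example.com/sitemap.xml", FAKE_SITE.__getitem__)
print(pages)
# ['https://example.com/', 'https://example.com/pricing', 'https://example.com/blog/sitemaps']
```

Passing the fetcher in as a parameter keeps the example offline. Against a real site you could supply something like `lambda u: urllib.request.urlopen(u).read()`, with the caveat that this sketch does not handle gzipped sitemaps.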

Crawl Limits

Plan        Pages per chatbot
Free        50
Starter     500
Pro         5,000
Business    50,000
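
If you already know how many URLs your sitemap contains, a few lines of Python can tell you which plan covers them. This is only a sketch: the limits are copied from the Crawl Limits figures in this post, `smallest_plan_for` is a hypothetical helper, and the post does not say what PageGPT does when a sitemap exceeds the limit.

```python
from typing import Optional

# Pages per chatbot, copied from the Crawl Limits figures in this post.
PLAN_LIMITS = {"Free": 50, "Starter": 500, "Pro": 5_000, "Business": 50_000}


def smallest_plan_for(page_count: int) -> Optional[str]:
    """Return the first plan whose limit covers page_count, or None if none does."""
    # Dicts preserve insertion order, so plans are checked smallest first.
    for plan, limit in PLAN_LIMITS.items():
        if page_count <= limit:
            return plan
    return None


print(smallest_plan_for(320))     # Starter
print(smallest_plan_for(75_000))  # None
```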

Re-crawling Updated Content

Content on your website changes over time. Use the Retrain button to re-crawl all sources and update the chatbot's knowledge base. Pro and Business plans support scheduled auto-retraining.

Tips

  • Make sure your sitemap is publicly accessible (no login or authentication required).
  • If some pages render their content with JavaScript, PageGPT may not capture all of the text. Use the Text tab to supplement those pages with custom content.
