
Troubleshooting article import failures

Bot9.ai users may occasionally be unable to import articles for bot training, most often because the source site cannot be crawled or indexed. This article lists the common causes and the steps that help resolve them.

Understanding the issue

Crawling/indexing failures can occur for various reasons, impacting the bot's ability to access and retrieve data from specified sources. Some common causes include:

  1. Cloudflare Blocks: Security services like Cloudflare can block automated crawlers.

  2. JavaScript Rendering: Sites built with frameworks like ReactJS often require client-side rendering, which simple crawlers can’t handle.

  3. Robots.txt: The website’s robots.txt file might disallow crawling.

  4. CAPTCHAs: Some sites have CAPTCHAs to block automated access.

  5. User-Agent Filtering: Sites may block or redirect certain user-agents.

  6. AJAX Requests: Content loaded via AJAX may not be easily crawlable.

  7. Rate Limiting: Some websites limit the number of requests from a single IP.

  8. Login Required: Some content is behind a login wall.

  9. Infinite Scroll: Crawlers may find it hard to scrape sites with infinite scrolling.

  10. iFrames: Content within iFrames may not be crawlable.

  11. Cookies: Some sites require cookies to be enabled.

  12. Regional Blocks: Geo-blocking restricts content to certain locations.

  13. HTTP Headers: Incorrect or missing headers can lead to failed requests.

  14. Redirects: Long redirect chains or redirect loops can cause a crawler to give up before it reaches the content.

  15. Session Expiry: Some sites have short session lifetimes, causing crawlers to be kicked out.

  16. Network Errors: General network issues can also cause failures.

  17. Data Format: Non-standard data formats may be hard to parse.

  18. Broken Links: 404s or other errors prevent crawling.

  19. Heavy Computation: Some sites employ heavy client-side computations before displaying content.

  20. Ad Blockers: Some sites detect ad-blockers and restrict content.
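
Several of the causes above leave recognizable traces in the HTTP response, so a quick check of a failing URL can narrow things down. The following is a minimal sketch, not a description of how Bot9's crawler works: the function name `guess_failure_cause`, the labels it returns, and the 20-word threshold are all assumptions made for illustration. It inspects a response you have already fetched (status code, headers, body) and returns a best guess.

```python
import re


def guess_failure_cause(status: int, headers: dict, body: str) -> str:
    """Return a best-guess label for why a page may fail to import.

    Hypothetical helper: the labels and thresholds are illustrative only.
    """
    hdrs = {k.lower(): str(v).lower() for k, v in headers.items()}
    text = body.lower()

    if status == 429:
        return "rate limiting"
    if status == 401:
        return "login required"
    if status == 404:
        return "broken link"
    # Responses proxied by Cloudflare typically carry "Server: cloudflare".
    if status in (403, 503) and "cloudflare" in hdrs.get("server", ""):
        return "security service block"
    if 300 <= status < 400:
        return "redirect"
    if "captcha" in text:
        return "captcha"

    # Drop script blocks and tags. If scripts are present but almost no
    # visible text remains, the content is probably rendered client-side.
    without_scripts = re.sub(r"<script\b.*?</script>", " ", text, flags=re.S)
    visible_words = re.sub(r"<[^>]+>", " ", without_scripts).split()
    if status == 200 and "<script" in text and len(visible_words) < 20:
        return "javascript rendering"

    return "no obvious cause"
```

A result of "no obvious cause" does not mean the page is crawlable; causes such as geo-blocking or session expiry depend on where and when the request is made and will not show up in a single response.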

Consequences

If Bot9 fails to index a source, the bot cannot draw on that source's content, which leaves gaps in its knowledge base and can degrade its answers.

Recommended solutions for article import failures

  1. Manual Import Mode:

    • Use this mode to manually input or upload content for the bot's training.

  2. Check Source Accessibility:

    • Ensure the sources are accessible and not restricted by security measures like Cloudflare.

    • Regularly check for any changes in the website structures of your sources.

  3. Network and Connectivity Checks:

    • Ensure a stable and robust network connection.

  4. Review Robots.txt Files:

    • Check if the source websites have any crawling restrictions in their robots.txt files.

  5. Contact Support:

    • Reach out to Bot9's support team at helpme@bot9.ai for further assistance.
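
The robots.txt review in step 4 can be done by hand, or with a short script. The sketch below uses Python's standard `urllib.robotparser` module to test whether a given URL is disallowed. The user-agent string `ExampleBot` is a placeholder, since this article does not state the user-agent Bot9's crawler sends, and the sample rules and URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Illustrative rules: every crawler is barred from /private/.
sample_rules = """User-agent: *
Disallow: /private/
"""

# "ExampleBot" is a hypothetical user-agent, not Bot9's actual crawler name.
print(is_crawl_allowed(sample_rules, "ExampleBot", "https://example.com/private/page"))  # False
print(is_crawl_allowed(sample_rules, "ExampleBot", "https://example.com/docs/page"))     # True
```

To check a live site, paste the contents of its `/robots.txt` file into `sample_rules`, or call `set_url()` and `read()` on the parser to fetch the file directly.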

Understanding the common causes of import failures and applying these solutions should help in resolving the issue effectively.

"Write manually" remains a reliable alternative to ensure continuous bot training.

Copyright © 2025 Bot9. All rights reserved.