Handling Dynamic Content
Some websites load their content dynamically using JavaScript. The default faraday strategy fetches the page over HTTP without executing JavaScript, so it might not see this content.
Solution
Use the browserless strategy to render JavaScript-heavy websites with a headless browser.
Keep the strategy at the top level and put request-specific options under request:
strategy: browserless
request:
  max_redirects: 5
  max_requests: 6
  browserless:
    preload:
      wait_after_ms: 5000
channel:
  url: https://example.com/app
selectors:
  items:
    selector: .article
  title:
    selector: h2
  url:
    selector: a
    extractor: href
When to Use Browserless
The browserless strategy is necessary when:
- Content loads after page load - JavaScript fetches data from APIs
- Single Page Applications (SPAs) - React, Vue, Angular apps
- Infinite scroll - Content loads as you scroll
- Dynamic forms - Content changes based on user interaction
Preload Actions
For dynamic sites, rendering once is often not enough. Use request.browserless.preload to wait, click, or scroll before the HTML snapshot is taken.
Wait Before Capturing Dynamic Content
strategy: browserless
request:
  browserless:
    preload:
      wait_after_ms: 4000
Click “Load More” Buttons
strategy: browserless
request:
  browserless:
    preload:
      wait_after_ms: 3000
      click_selectors:
        - selector: ".load-more"
          max_clicks: 3
          wait_after_ms: 250
Scroll Infinite Lists
strategy: browserless
request:
  browserless:
    preload:
      scroll_down:
        iterations: 5
        wait_after_ms: 200
      wait_after_ms: 2500
These preload steps can be combined in a single config when a site needs several interactions before all items appear.
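A combined config might look like the following sketch. It reuses only the keys from the separate examples and assumes they compose under a single preload block without conflict; the selector and timing values are illustrative, and the order in which the steps run is not specified here.

```yaml
strategy: browserless
request:
  browserless:
    preload:
      # Top-level wait, as in the first example (value is illustrative).
      wait_after_ms: 3000
      # Click the ".load-more" button up to three times, pausing after each click.
      click_selectors:
        - selector: ".load-more"
          max_clicks: 3
          wait_after_ms: 250
      # Scroll down five times, pausing after each scroll.
      scroll_down:
        iterations: 5
        wait_after_ms: 200
```

If a site only needs one of these interactions, dropping the unused keys keeps the render time down.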
Performance Considerations
The browserless strategy is slower than the default faraday strategy because it:
- Launches a headless Chrome browser
- Renders the full page with JavaScript
- Takes more memory and CPU resources
Use faraday for static content and only switch to browserless when necessary.
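For comparison, a config for a static page could look like this sketch. It assumes that strategy: faraday can be set explicitly; since faraday is the default, omitting the line should behave the same. The URL and selectors are placeholders.

```yaml
# Assumption: faraday is the default strategy, so this line could be omitted.
strategy: faraday
channel:
  url: https://example.com/blog  # placeholder URL
selectors:
  items:
    selector: .article
  title:
    selector: h2
  url:
    selector: a
    extractor: href
```

The selectors are the same as in the browserless example; only the strategy changes, so trying faraday first and switching later requires little rework.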
Related Topics
- Strategy Reference - Complete strategy documentation
- Troubleshooting - Common issues with dynamic content
- Advanced Features - Performance optimization tips