Handling Dynamic Content

Some websites load their content dynamically using JavaScript. The default html2rss strategy, faraday, fetches the HTML without running JavaScript, so it might not see this content.

Use the browserless strategy to render JavaScript-heavy websites with a headless browser.

Keep the strategy at the top level and put request-specific options under request:

strategy: browserless
request:
  max_redirects: 5
  max_requests: 6
  browserless:
    preload:
      wait_after_ms: 5000
channel:
  url: https://example.com/app
selectors:
  items:
    selector: .article
  title:
    selector: h2
  url:
    selector: a
    extractor: href

The browserless strategy is necessary when:

  • Content loads after page load - JavaScript fetches data from APIs
  • Single Page Applications (SPAs) - React, Vue, Angular apps
  • Infinite scroll - Content loads as you scroll
  • Dynamic forms - Content changes based on user interaction

For dynamic sites, rendering once is often not enough. Use request.browserless.preload to wait, click, or scroll before the HTML snapshot is taken.

Wait for JavaScript-rendered content:

strategy: browserless
request:
  browserless:
    preload:
      wait_after_ms: 4000

Click a "load more" button:

strategy: browserless
request:
  browserless:
    preload:
      wait_after_ms: 3000
      click_selectors:
        - selector: ".load-more"
          max_clicks: 3
          wait_after_ms: 250

Scroll down to trigger lazy loading:

strategy: browserless
request:
  browserless:
    preload:
      scroll_down:
        iterations: 5
        wait_after_ms: 200
      wait_after_ms: 2500

These preload steps can be combined in a single config when a site needs several interactions before all items appear.
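
As a minimal sketch of such a combination, the config below merges the wait, click, and scroll options shown above under one preload key. The .load-more selector and the timing values are placeholders carried over from those examples, and the order in which the steps run is an assumption, not something this page specifies.

```yaml
strategy: browserless
request:
  browserless:
    preload:
      # Initial pause so the first batch of content can render
      wait_after_ms: 3000
      # Click the (placeholder) load-more button up to three times
      click_selectors:
        - selector: ".load-more"
          max_clicks: 3
          wait_after_ms: 250
      # Then scroll to trigger any lazy-loaded items
      scroll_down:
        iterations: 5
        wait_after_ms: 200
```

Start with small values for max_clicks and iterations and raise them only if items are still missing, since each extra interaction adds to the render time.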

The browserless strategy is slower than the default faraday strategy because it:

  • Launches a headless Chrome browser
  • Renders the full page with JavaScript
  • Takes more memory and CPU resources

Use faraday for static content and only switch to browserless when necessary.
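
For comparison, a static page could be handled with a sketch like the following, which reuses the selectors from the first example and drops the browserless options. The URL and selectors are placeholders, and setting strategy explicitly is assumed to be optional because faraday is the default.

```yaml
# Static content: no headless browser needed
strategy: faraday
channel:
  url: https://example.com/app
selectors:
  items:
    selector: .article
  title:
    selector: h2
  url:
    selector: a
    extractor: href
```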