Use automatic feed generation

Automatic feed generation lets html2rss-web create a stable feed from a page URL. It is useful when the included config set does not already cover the site you want.

Use this only after you have verified your instance with an included feed. In production, this feature is disabled by default, so enable it deliberately and only on your own instance.

This flow depends on three separate things:

  • AUTO_SOURCE_ENABLED=true on the server
  • a bearer token that the instance accepts for feed creation
  • Browserless configured if the target page needs JavaScript rendering
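
The first and third prerequisites can be checked on a running stack by inspecting the container's environment. The sketch below is a hypothetical helper, not part of html2rss-web: check_env_dump is an illustrative name, and it only looks at the variable names mentioned in this guide. It cannot tell whether a bearer token will be accepted.

```shell
# Hypothetical helper: reads `env`-style KEY=VALUE lines on stdin.
# Returns non-zero when AUTO_SOURCE_ENABLED is not exactly "true";
# missing Browserless settings only produce warnings.
check_env_dump() {
  dump="$(cat)"
  rc=0
  if ! printf '%s\n' "$dump" | grep -qx 'AUTO_SOURCE_ENABLED=true'; then
    echo "error: AUTO_SOURCE_ENABLED is not set to true"
    rc=1
  fi
  for name in BROWSERLESS_IO_WEBSOCKET_URL BROWSERLESS_IO_API_TOKEN; do
    if ! printf '%s\n' "$dump" | grep -q "^${name}=."; then
      echo "warning: ${name} is unset or empty"
    fi
  done
  return "$rc"
}

# Usage (the service name html2rss-web is an assumption; use the name
# from your own docker-compose.yml):
#   docker compose exec html2rss-web env | check_env_dump
```

Silent output with a zero exit status means all three variables are present; a warning about Browserless only matters if the target page needs JavaScript rendering.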

The generated API contract for this flow is published at /openapi.yaml.
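
To read that contract from a running instance, something like the following should work. It is a minimal sketch: openapi_url and fetch_openapi are illustrative helper names, and http://localhost:4000 is the local address used later in this guide.

```shell
# Build the contract URL for an instance base URL; a trailing slash is tolerated.
openapi_url() {
  printf '%s/openapi.yaml\n' "${1%/}"
}

# Download the contract. -f turns HTTP errors into a non-zero exit status,
# -sS hides the progress meter but still prints errors.
fetch_openapi() {
  curl -fsS "$(openapi_url "$1")"
}

# Usage (needs a running instance):
#   fetch_openapi http://localhost:4000 | grep -n '/api/v1/feeds'
```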

Edit your docker-compose.yml and enable automatic feed generation:

environment:
  AUTO_SOURCE_ENABLED: "true"

Keep the existing BROWSERLESS_IO_WEBSOCKET_URL and BROWSERLESS_IO_API_TOKEN settings if you want JavaScript-heavy pages to work reliably.

Then restart the stack:

docker compose up -d

  1. Open your instance at http://localhost:4000
  2. Paste a page URL into Create a feed
  3. Add a valid access token when prompted
  4. Choose a strategy if needed, then submit
  5. Copy the generated feed URL or open it directly

When the flow works, you should see:

  • a generated feed URL
  • a copy action
  • an open-feed action
  • a preview of recent entries when available
  • the same feed staying available at its tokenized URL

That is enough to confirm the self-hosted flow is working.
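
Feed creation goes through POST /api/v1/feeds, so the same check can in principle be scripted. The sketch below rests on assumptions that this guide does not confirm: the JSON field names url and strategy, and sending the token as an Authorization: Bearer header. Treat /openapi.yaml as authoritative. feed_request_body and create_feed are hypothetical helper names.

```shell
# Hypothetical helper: JSON body for feed creation.
# ASSUMPTION: the fields are named "url" and "strategy"; check /openapi.yaml.
# Values must not contain double quotes or backslashes (no JSON escaping here).
feed_request_body() {
  printf '{"url":"%s","strategy":"%s"}\n' "$1" "${2:-faraday}"
}

# Hypothetical helper: create_feed BASE_URL TOKEN PAGE_URL [STRATEGY]
# ASSUMPTION: the token is sent as an Authorization: Bearer header.
create_feed() {
  curl -fsS -X POST "${1%/}/api/v1/feeds" \
    -H "Authorization: Bearer $2" \
    -H "Content-Type: application/json" \
    -d "$(feed_request_body "$3" "${4:-faraday}")"
}

# Usage (needs a running instance and a valid token):
#   create_feed http://localhost:4000 "$ACCESS_TOKEN" https://example.com/news
```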

  • faraday is the default strategy and should be your first try for most pages.
  • During the feed-creation API request (POST /api/v1/feeds) from the web UI, a faraday submission may be retried once with browserless when the first failure looks retryable.
  • If that fallback attempt fails, or if the first failure is clearly auth/URL/unsupported-strategy related, the UI stops and shows an error.
  • This retry behavior is scoped to feed creation. It is not a general retry layer for later feed rendering (GET /api/v1/feeds/:token) or preview loading.
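
The retry rule above can be sketched as a small decision function. This models the described behavior only; it is not the UI's implementation. The attempt command and the exit-code convention (0 for success, 75 for a retryable failure, anything else for a final failure) are assumptions made for the sketch.

```shell
# create_with_fallback ATTEMPT URL
# ATTEMPT is a hypothetical command called as: ATTEMPT URL STRATEGY
# Sketch convention (an assumption): exit 0 = created, 75 = retryable failure,
# any other status = final failure (auth, URL, unsupported strategy).
create_with_fallback() {
  if "$1" "$2" faraday; then
    return 0
  else
    rc=$?
  fi
  if [ "$rc" -ne 75 ]; then
    return "$rc"        # clearly non-retryable: stop and report
  fi
  "$1" "$2" browserless # exactly one fallback attempt; its status is final
}
```

The function never loops: at most two attempts are made, which mirrors the "retried once" wording above.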

Automatic generation is most successful when the input URL is already a listing/update surface.

  • Higher-success inputs:
      • newsroom/press listing pages
      • category/tag/archive/listing pages
      • changelog/release/update pages
  • Lower-success inputs:
      • generic homepages
      • search pages
      • app-shell entrypoints (client-rendered shells)

If output quality is poor, switch the input to a direct listing/update URL before assuming the feature is broken.

The backend runtime classifies common extraction failures into categories that indicate the likely cause:

  • blocked/interstitial surface likely
  • app-shell surface likely
  • unsupported extraction surface for auto mode

In the current web product flow, these categories are mostly internal/operator-level signals (runtime/logging). They are not guaranteed to appear as labeled categories in the UI.

What users typically see today:

  • feed-creation API errors (for example auth/URL/unsupported strategy)
  • preview-level fallback text such as "Preview unavailable right now."
  • feed render error payloads when opening feed URLs directly
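
To see a render error payload outside the browser, the feed URL can be requested with curl. This is a minimal sketch: feed_url and inspect_feed are illustrative helper names, and the path follows the GET /api/v1/feeds/:token route named in this guide.

```shell
# Build the feed URL for a token (path taken from GET /api/v1/feeds/:token).
feed_url() {
  printf '%s/api/v1/feeds/%s\n' "${1%/}" "$2"
}

# Print response headers and body so an error payload is visible.
# No -f here: with -f, curl would suppress the body of an error response.
inspect_feed() {
  curl -sS -i "$(feed_url "$1" "$2")"
}

# Usage (needs a running instance; replace the token placeholder):
#   inspect_feed http://localhost:4000 "$FEED_TOKEN"
```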

Browserless Troubleshooting In html2rss-web

If Browserless-backed attempts fail:

  • verify the Browserless container/service is running
  • verify BROWSERLESS_IO_WEBSOCKET_URL is reachable from the web container
  • verify BROWSERLESS_IO_API_TOKEN matches the Browserless TOKEN

For local Compose-based setups, check container health/logs with:

docker compose ps browserless && docker compose logs browserless
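
For the reachability check, it can help to split BROWSERLESS_IO_WEBSOCKET_URL into a host and port and probe that from the web container. The helper below is a hypothetical sketch: ws_host_port is an illustrative name, ws://browserless:3000 is an example value rather than a documented default, and the probe assumes nc is available in the container image. IPv6 literals and user:password@ prefixes are not handled.

```shell
# Hypothetical helper: print "HOST PORT" for a ws:// or wss:// URL.
# Falls back to port 443 for wss and 80 for ws when no port is given.
ws_host_port() {
  rest="${1#*://}"       # drop the scheme
  rest="${rest%%[/?]*}"  # drop any path or query string
  host="${rest%%:*}"
  case "$rest" in
    *:*) port="${rest##*:}" ;;
    *)
      case "$1" in
        wss://*) port=443 ;;
        *) port=80 ;;
      esac
      ;;
  esac
  printf '%s %s\n' "$host" "$port"
}

# Usage (service name html2rss-web and the presence of nc are assumptions):
#   ws_host_port ws://browserless:3000        # prints: browserless 3000
#   docker compose exec html2rss-web nc -z browserless 3000 && echo reachable
```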

Automatic feed generation is the fast first pass, not the final answer for every site.

Move on to Creating Custom Feeds when:

  • the generated feed misses important fields
  • the wrong items are extracted
  • the site needs a stable, reviewable setup
  • you need repeatable selector-level control to make the feed reliable