Getting Started

Run html2rss-web locally with Docker and verify one included feed before enabling direct feed generation.

After this guide, you should have:

  • html2rss-web running at http://localhost:4000
  • the web interface loading correctly
  • a first included feed URL you can copy into your reader
  • a clear path to either token-gated feed generation or custom configs

This guide uses a local Docker Compose stack. You need:

  • Docker
  • About 10 minutes

If you do not already have Docker, install it first.

Create a new folder for html2rss-web:

mkdir html2rss-web && cd html2rss-web

Step 2: Create a Minimal Configuration File

Create a file called docker-compose.yml in that folder and start with the minimal local stack:

services:
  html2rss-web:
    image: html2rss/web:1
    restart: unless-stopped
    ports:
      - "127.0.0.1:4000:4000"
    env_file:
      - path: .env
        required: false
    environment:
      RACK_ENV: production
      PORT: 4000
      HTML2RSS_SECRET_KEY: ${HTML2RSS_SECRET_KEY:?set HTML2RSS_SECRET_KEY}
      HEALTH_CHECK_TOKEN: ${HEALTH_CHECK_TOKEN:?set HEALTH_CHECK_TOKEN}
      SENTRY_DSN: ${SENTRY_DSN:-}
      BROWSERLESS_IO_WEBSOCKET_URL: ws://browserless:4002
      BROWSERLESS_IO_API_TOKEN: ${BROWSERLESS_IO_API_TOKEN:?set BROWSERLESS_IO_API_TOKEN}
      BOTASAURUS_SCRAPER_URL: http://botasaurus:4010
  botasaurus:
    image: html2rss/botasaurus-scrape-api:latest
    restart: unless-stopped
  browserless:
    image: "ghcr.io/browserless/chromium"
    restart: unless-stopped
    ports:
      - "127.0.0.1:4002:4002"
    environment:
      PORT: 4002
      CONCURRENT: 10
      TOKEN: ${BROWSERLESS_IO_API_TOKEN:?set BROWSERLESS_IO_API_TOKEN}

Add automatic updates, reverse proxying, or your own config file after this first run works.

Create a .env file in the same folder with the minimum values this stack requires:

cat > .env <<EOF
HTML2RSS_SECRET_KEY=$(openssl rand -hex 32)
HEALTH_CHECK_TOKEN=$(openssl rand -hex 24)
BROWSERLESS_IO_API_TOKEN=trial-browserless-token
EOF
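
The compose file marks three variables as required (the ${VAR:?...} form), so the stack refuses to start when any of them is missing or empty. A quick check of .env beforehand can catch that early. The following is a minimal sketch; the function name check_env_file is illustrative and not part of html2rss-web.

```shell
# Report which variables required by docker-compose.yml are missing or
# empty in an env file. check_env_file is a hypothetical helper name.
check_env_file() {
  env_file="${1:-.env}"
  missing=0
  for key in HTML2RSS_SECRET_KEY HEALTH_CHECK_TOKEN BROWSERLESS_IO_API_TOKEN; do
    # Match "KEY=" followed by at least one character.
    if ! grep -Eq "^${key}=.+" "$env_file" 2>/dev/null; then
      echo "missing or empty: ${key}" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, check_env_file .env && docker compose up -d starts the stack only when all three values are present.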

Then run:

docker compose up -d
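
The containers may need a few moments before the web app answers. As a rough sketch, a small polling helper can wait for http://localhost:4000 to respond; the name wait_for_url and the default attempt count and delay are assumptions chosen for illustration.

```shell
# Poll a URL until it answers successfully or the attempts run out.
# wait_for_url is a hypothetical helper; defaults are arbitrary.
wait_for_url() {
  url="$1"
  attempts="${2:-30}"
  delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f: treat HTTP errors as failures; -s: silent; -o /dev/null: discard body.
    if curl -fs -o /dev/null "$url"; then
      echo "ready after ${i} attempt(s): ${url}"
      return 0
    fi
    # Pause only when another attempt follows.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "not ready after ${attempts} attempt(s): ${url}" >&2
  return 1
}
```

Calling wait_for_url http://localhost:4000 after docker compose up -d blocks until the app responds or roughly a minute has passed.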

Once the containers are running, verify the setup:

  1. Open http://localhost:4000
  2. Confirm the web interface loads
  3. Open one of the included feed URLs from your own instance:
    • http://localhost:4000/microsoft.com/azure-products.rss
    • http://localhost:4000/phys.org/weekly.rss
    • http://localhost:4000/softwareleadweekly.com/issues.rss
  4. Confirm the feed opens
  5. Copy that feed URL into your reader if you want to keep it

If that works, the local app and included-feed path are ready.
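
The same feed check can be run from a terminal. This sketch fetches a feed URL and looks for an opening rss tag in the response; that the included feeds are served as RSS containing that tag is an assumption here, and the helper names looks_like_rss and check_feed are illustrative.

```shell
# Succeed when stdin contains an opening <rss tag.
# Assumption: included feeds are served as RSS documents.
looks_like_rss() {
  grep '<rss' >/dev/null
}

# Fetch a feed URL and report whether the response looks like RSS.
check_feed() {
  url="$1"
  # -f: fail on HTTP errors; -sS: quiet, but still print curl errors.
  if curl -fsS "$url" | looks_like_rss; then
    echo "ok: ${url}"
  else
    echo "failed: ${url}" >&2
    return 1
  fi
}
```

For example, check_feed http://localhost:4000/phys.org/weekly.rss prints an ok line when the feed is reachable and looks like RSS.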

What Changes If You Enable Feed Generation

Automatic feed generation is off by default in production. When you enable it later:

  • the web app creates feeds through POST /api/v1/feeds
  • that API requires a bearer token
  • the UI first fetches the page with the faraday strategy and automatically retries once with browserless when appropriate
  • Browserless still needs to be configured for JavaScript-heavy pages

If you are integrating this flow programmatically, the generated OpenAPI is available at /openapi.yaml.
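
A bearer-token call to the endpoint might look like the sketch below. Only the path POST /api/v1/feeds and the bearer-token requirement come from this guide. The JSON content type, the default base URL, and the helper names bearer_header and create_feed are assumptions, and the request body is deliberately left to a file you write against the schema in /openapi.yaml.

```shell
# Build the Authorization header line for a bearer token.
bearer_header() {
  printf 'Authorization: Bearer %s' "$1"
}

# POST a request body file to the feed creation endpoint.
# Assumptions: JSON content type and a local base URL. The body schema
# is not shown here; take it from /openapi.yaml on your instance.
create_feed() {
  token="$1"
  body_file="$2"
  base_url="${3:-http://localhost:4000}"
  curl -fsS -X POST \
    -H "$(bearer_header "$token")" \
    -H 'Content-Type: application/json' \
    --data @"$body_file" \
    "${base_url}/api/v1/feeds"
}
```

To see the expected request shape first, curl -fsS http://localhost:4000/openapi.yaml prints the generated OpenAPI document.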

Where to go next:

  1. Use the included configs: understand how built-in feed paths work
  2. Use automatic feed generation: enable direct feed creation from page URLs when you want that workflow
  3. Create Custom Feeds: write your own configs when you need reviewable extraction rules
  4. Need help? Troubleshoot startup and extraction problems