Custom HTTP Requests
Some sites only work when requests carry the headers, tokens, or cookies your browser uses. html2rss supports those cases without changing the rest of your feed workflow.
Keep this structure in mind:
- headers stays top-level
- strategy stays top-level
- request-specific controls such as budgets and Browserless options live under request
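The layout above can be sketched as a single config. This is a minimal illustration, not a reference: the strategy value, URL, and selectors are assumptions chosen for the example, so check the Strategy Selection and Headers Reference pages for the values your version supports.

```yaml
# Sketch only: URL, selectors, and the strategy value are placeholders.
headers:                  # top-level
  User-Agent: "Mozilla/5.0 (compatible; html2rss/1.0)"
strategy: browserless     # top-level (assumed value, for illustration)
request:                  # request-specific controls live here
  max_redirects: 5
  max_requests: 6
channel:
  url: https://example.com/articles
selectors:
  items:
    selector: article
  title:
    selector: h2
  url:
    selector: a
    extractor: href
```

Keeping headers and strategy at the top level means a feed can change how it fetches pages without touching the request budgets.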
When You Need Custom Headers
You might need custom HTTP requests when:
- APIs require authentication (Bearer tokens, API keys)
- Websites block default user agents (need to appear as a real browser)
- Content is behind login (session cookies, authorization headers)
- Rate limiting (custom headers to identify your requests)
- Content negotiation (specific Accept headers for different formats)
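For the login case, one possible approach is to copy the session cookie from a logged-in browser session into a Cookie header. The sketch below assumes the site accepts a single cookie; the cookie name session_id, its value, the URL, and the selectors are all hypothetical placeholders.

```yaml
# Sketch only: cookie name, value, URL, and selectors are hypothetical.
headers:
  Cookie: "session_id=YOUR_SESSION_COOKIE"
  User-Agent: "Mozilla/5.0 (compatible; html2rss/1.0)"
channel:
  url: https://example.com/members/articles
selectors:
  items:
    selector: article
  title:
    selector: h2
  url:
    selector: a
    extractor: href
```

Session cookies usually expire, so a feed built this way may stop working until the value is refreshed.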
Basic Configuration
Add a headers section to your feed configuration. This example is a complete, valid config:
headers:
  User-Agent: "Mozilla/5.0 (compatible; html2rss/1.0)"
  Authorization: "Bearer YOUR_API_TOKEN"
  Accept: "application/json"
channel:
  url: https://api.example.com/posts
selectors:
  items:
    selector: "array > object"
  title:
    selector: "title"
  url:
    selector: "url"
Request Controls
Request budgets are configured under request, not as top-level keys:
headers:
  User-Agent: "Mozilla/5.0 (compatible; html2rss/1.0)"
request:
  max_redirects: 5
  max_requests: 6
channel:
  url: https://example.com/articles
selectors:
  items:
    selector: article
  title:
    selector: h2
  url:
    selector: a
    extractor: href
- request.max_redirects limits redirect hops
- request.max_requests limits the total request budget for the feed build
- request.browserless.* is reserved for Browserless-only behavior such as preload actions
Common Use Cases
API Authentication
Many APIs require authentication tokens:
headers:
  Authorization: "Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
  X-API-Key: "your-api-key-here"
channel:
  url: "https://api.example.com/posts"
selectors:
  items:
    selector: "array > object"
  title:
    selector: "title"
  url:
    selector: "url"
User Agent Spoofing
Some websites block requests that don’t look like real browsers:
headers:
  User-Agent: "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
  Accept: "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
  Accept-Language: "en-US,en;q=0.5"
  Accept-Encoding: "gzip, deflate"
channel:
  url: "https://example.com/articles"
selectors:
  items:
    selector: "article"
  title:
    selector: "h2"
  url:
    selector: "a"
    extractor: "href"
Content Type Negotiation
Request specific content types:
headers:
  Accept: "application/json"
channel:
  url: "https://api.example.com/posts"
selectors:
  items:
    selector: "array > object"
  title:
    selector: "title"
  url:
    selector: "url"
Custom API Headers
Some APIs require specific headers:
headers:
  X-Requested-With: "XMLHttpRequest"
  X-Custom-Header: "your-value"
  Content-Type: "application/json"
channel:
  url: "https://api.example.com/posts"
selectors:
  items:
    selector: "array > object"
  title:
    selector: "title"
  url:
    selector: "url"
Dynamic Headers
You can use dynamic parameters in headers for runtime values:
headers:
  Authorization: "Bearer %<api_token>s"
  X-User-ID: "%<user_id>s"
channel:
  url: "https://api.example.com/users/%<user_id>s/posts"
selectors:
  items:
    selector: "array > object"
  title:
    selector: "title"
  url:
    selector: "url"
See our Dynamic Parameters guide for more details.
- Header examples that target third-party APIs are illustrative. Authentication requirements, header names, and response shapes can change independently of html2rss.
- For JSON APIs, validate the response structure before assuming selectors like array > object or html_url will match.
- If you document or share a config for reuse, prefer placeholder values and parameterized headers over embedding real tokens.
Testing Your Headers
Test your configuration to ensure headers work correctly:
# Test with curl first
curl -H "Authorization: Bearer YOUR_TOKEN" https://api.example.com/posts
# Then test with html2rss
html2rss feed your-config.yml
Troubleshooting
Common Issues
- 401 Unauthorized: Check your authentication headers
- 403 Forbidden: Verify API keys and permissions
- 429 Too Many Requests: Make fewer or less frequent requests, or try a different User-Agent if the site throttles by client
- Empty responses: Some APIs require specific Accept headers
Debug Tips
- Use browser developer tools to see what headers successful requests use
- Test with curl before configuring html2rss
- Check API documentation for required headers
- Enable debug logging to see what headers are being sent
Advanced Examples
GitHub API
headers:
  Authorization: "token YOUR_GITHUB_TOKEN"
  Accept: "application/vnd.github.v3+json"
  User-Agent: "html2rss/1.0"
channel:
  url: https://api.github.com/repos/owner/repo/issues
selectors:
  items:
    selector: "array > object"
  title:
    selector: "title"
  url:
    selector: "html_url"
Reddit API
headers:
  User-Agent: "html2rss/1.0 by your-username"
  Accept: "application/json"
channel:
  url: https://www.reddit.com/r/programming.json
selectors:
  items:
    selector: "data > children > object > data"
  title:
    selector: "title"
  url:
    selector: "url"
Related Topics
- Headers Reference - Complete headers documentation
- Dynamic Parameters - Runtime header values
- Scraping JSON APIs - Working with JSON responses
- Strategy Selection - Choose the right strategy for your needs
- Troubleshooting - Common issues and solutions
Need More Help?
- Community Discussions - Ask for help
- Advanced Features - Performance optimization
- Ruby Gem Documentation - Complete API reference