Data Sources
n8n Pulse collects data from multiple sources to provide comprehensive insights into the n8n ecosystem. All data is fetched automatically via daily scripts and stored as version-controlled JSON files.
- Daily Updates: Automated collection runs every day at 06:00 UTC via GitHub Actions
- Transparent: All data is version-controlled and sources are clearly attributed
- Open Data: JSON files are publicly accessible for your own analysis
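The daily schedule above could be expressed as a GitHub Actions cron trigger. This is a hypothetical sketch of such a workflow trigger, not the project's actual workflow file:

```yaml
# Hypothetical trigger; the project's real workflow file may differ.
name: daily-data-collection
on:
  schedule:
    - cron: "0 6 * * *"   # every day at 06:00 UTC
  workflow_dispatch: {}    # also allow manual runs
```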
Primary Data Sources
These are the official APIs we query directly for up-to-date metrics.
| Source | Data Collected | Frequency | Notes |
|---|---|---|---|
| n8n Templates API | Templates, views, nodes, categories, creator info | Daily | Official API, complete data |
| GitHub API | Stars, forks, releases, issues, contributors | Daily | Uses token for higher rate limits |
| Discourse Forum API | Users, topics, posts, likes, contributors | Daily | Public endpoints, no auth needed |
| Discord API | Member count, online count | Daily | Requires bot token |
| npm API | Package download counts (n8n core) | Daily | Public API |
| npm Registry (Community) | Community node packages, downloads, authors | Weekly | Keyword search: n8n-community-node-package |
| Luma Events | Community events, locations, registrations | Daily | Scraped from n8n community calendar |
| Notion Directory | Ambassador profiles, join dates, locations | Weekly | Scraped via Playwright (Sundays 05:00 UTC) |
| Bluesky API | Posts mentioning n8n, engagement metrics | Daily | Keyword search via AT Protocol |
| Reddit API | Subscribers, active users, posts, comments | Daily | Public API, no auth needed |
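To illustrate how little is needed to consume one of these public endpoints, the sketch below parses the JSON shape returned by npm's point-downloads API (`GET https://api.npmjs.org/downloads/point/last-week/n8n`). The sample payload is fabricated; only the field names follow the real API:

```python
import json

# Fabricated sample matching the shape of npm's public downloads API
# (GET https://api.npmjs.org/downloads/point/last-week/n8n).
sample = '{"downloads": 123456, "start": "2025-01-01", "end": "2025-01-07", "package": "n8n"}'

def parse_downloads(payload: str) -> int:
    """Extract the download count from a point-downloads response."""
    return json.loads(payload)["downloads"]

print(parse_downloads(sample))
```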
Community Data Sources
Some metrics aren't available from official APIs. We integrate data from trusted community projects with proper attribution. These are fetched automatically alongside the primary sources.
n8n Arena
n8narena.com. Creator metrics including template inserters (imports), rich creator profiles, and engagement data. The n8n API doesn't expose inserter metrics, making this an essential community data source. Fetched daily via Ted's GitHub repository.
Historical Backfill Sources
One-off data collection scripts used to backfill historical data from before we started tracking. These are not automated - they were run manually to seed historical time series.
| Source | Data Backfilled | Period | Notes |
|---|---|---|---|
| star-history.com | GitHub star counts over time | 2019-2025 | CSV export of star growth curve |
| ossinsight.io | GitHub stars, forks, issues (event-level) | 2011-2025 | BigQuery GitHub Archive data |
| Wayback Machine | GitHub repo stats, Reddit subscribers, Forum stats | Various | Scraped from archived HTML snapshots |
| npm API (historical) | Weekly download counts | 2019-2025 | Range endpoint for historical data |
| Bluesky API (backfill) | n8n mentions since Bluesky launch | Feb 2024+ | One-time historical search |
Note: Historical backfill data is marked as estimated or wayback in our data files. Charts display this data differently to distinguish it from live API measurements.
Data Quality & Resolution
Understanding how data is collected and what limitations exist helps you interpret the metrics correctly.
Granularity
- Daily: GitHub stars, forum stats, Discord members - collected every day at 06:00 UTC
- Weekly: Full template catalog, community node packages, ambassador profiles - refreshed on Sundays
- Monthly: Aggregated time series for historical trends and milestone predictions
Data Provenance
Each data point includes a source field indicating its origin:
- api: Direct from official API - highest reliability
- external: From attributed third-party source (e.g., n8n Arena)
- estimated: Historical backfill or interpolated data - shown differently in charts
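In practice, filtering on that field is a one-liner. The records below are invented purely to show the convention:

```python
# Invented data points following the source-field convention described above.
points = [
    {"date": "2019-06-01", "stars": 4200,  "source": "estimated"},
    {"date": "2025-06-01", "stars": 52000, "source": "api"},
    {"date": "2025-06-01", "inserts": 900, "source": "external"},
]

# Keep only measurements taken directly from an official API.
live = [p for p in points if p["source"] == "api"]
```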
Data Gaps
When a source is temporarily unavailable, we use the last known value and mark the gap.
The measuredSince field on each dataset shows when we started tracking that metric. Historical data before that date comes from backfill sources and is marked accordingly.
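A minimal sketch of that gap-filling rule, assuming points are ordered by date and a missing measurement arrives as None:

```python
def fill_gaps(series):
    """Carry the last known value forward and flag the filled-in points."""
    filled, last = [], None
    for point in series:
        if point["value"] is None and last is not None:
            # Source was unavailable: reuse the last known value, mark the gap.
            filled.append({**point, "value": last, "gap": True})
        else:
            filled.append({**point, "gap": False})
            if point["value"] is not None:
                last = point["value"]
    return filled

series = [
    {"date": "2025-01-01", "value": 100},
    {"date": "2025-01-02", "value": None},  # source was down that day
    {"date": "2025-01-03", "value": 120},
]
```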
Using This Data
All data files are publicly accessible and you're welcome to use them for your own analysis.
Access the Data
- JSON files in the /data/ directory
- Version controlled in the GitHub repository
- Licensed under MIT - use freely, attribution is always nice
Explore Visually
Use the Data Playground to visualize and compare any metrics without writing code.
Open Playground

Technical Documentation
For developers wanting to understand the ETL pipeline, data format specifications, or contribute to data collection:
View DATA-STRATEGY.md on GitHub