Automated weekly digest of oncology literature: Phase 2/3 trials, FDA approvals, and key reviews — summarized by Claude and served as a searchable web dashboard.
PubMed API  ──┐
FDA RSS     ──┼──► fetch_and_summarize.py ──► data.json ──► index.html (dashboard)
Journal RSS ──┘         (Claude API)
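The flow in the diagram can be sketched roughly as follows. This is a minimal illustration, not the actual contents of fetch_and_summarize.py: the function name `run_pipeline`, the article fields, and the `{"articles": [...]}` layout of data.json are all hypothetical, and the summarizer is passed in as a plain callable so no Claude API call is shown.

```python
import json
from pathlib import Path
from typing import Callable, Iterable

# Hypothetical article shape: {"id": ..., "title": ..., "abstract": ..., "source": ...}
Article = dict


def run_pipeline(
    fetchers: Iterable[Callable[[], list]],
    summarize: Callable[[Article], str],
    output_dir: str,
    max_articles: int = 60,
) -> Path:
    """Collect articles from every source, summarize them, and write data.json."""
    seen = set()
    articles = []
    for fetch in fetchers:
        for article in fetch():
            # Drop duplicates that appear in more than one feed.
            if article["id"] not in seen:
                seen.add(article["id"])
                articles.append(article)

    # Cap the number of summarized articles to bound API cost.
    articles = articles[:max_articles]
    for article in articles:
        article["summary"] = summarize(article)

    out_path = Path(output_dir) / "data.json"
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(json.dumps({"articles": articles}, indent=2), encoding="utf-8")
    return out_path
```

Passing the fetchers and the summarizer in as arguments keeps the sketch testable without network access or an API key.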
Every Monday at 6:00 AM UTC, GitHub Actions runs the pipeline and commits the updated data.json to the repo.
git clone https://github.com/YOURUSERNAME/oncopulse.git
cd oncopulse
Go to Settings → Secrets and variables → Actions → New repository secret:
| Secret | Value | Required |
|---|---|---|
| ANTHROPIC_API_KEY | Your Anthropic API key from console.anthropic.com | ✅ Yes |
| PUBMED_API_KEY | From ncbi.nlm.nih.gov/account (free); raises the rate limit from 3 to 10 req/sec | Optional |
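The two rate limits in the table translate into a minimum delay between PubMed requests. Below is a minimal sketch of how a client might pace itself. The names `min_request_interval` and `Throttle` are hypothetical and are not taken from the pipeline script.

```python
import os
import time
from typing import Optional


def min_request_interval(api_key: Optional[str]) -> float:
    """Seconds to wait between requests: 10 req/sec with a key, 3 req/sec without."""
    requests_per_second = 10 if api_key else 3
    return 1.0 / requests_per_second


class Throttle:
    """Sleeps just long enough to keep calls at or below the allowed rate."""

    def __init__(self, interval: float) -> None:
        self.interval = interval
        self._last_call: Optional[float] = None  # monotonic time of the previous call

    def wait(self) -> None:
        now = time.monotonic()
        if self._last_call is not None:
            remaining = self.interval - (now - self._last_call)
            if remaining > 0:
                time.sleep(remaining)
        self._last_call = time.monotonic()


# PUBMED_API_KEY is optional, so fall back to the slower unauthenticated rate.
throttle = Throttle(min_request_interval(os.environ.get("PUBMED_API_KEY")))
```

Calling `throttle.wait()` before each request would keep the client under the limit whether or not a key is configured.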
Go to Settings → Pages:
Branch: main / folder: /dashboard
Your dashboard will be live at: https://YOURUSERNAME.github.io/oncopulse/
Go to Actions → OncoPulse Weekly Literature Update → Run workflow
Or run locally:
cd pipeline
pip install -r requirements.txt
export ANTHROPIC_API_KEY="sk-ant-..."
export OUTPUT_DIR="../dashboard"
python fetch_and_summarize.py --days 7 --max-articles 60
Edit environment variables in .github/workflows/weekly_update.yml or set locally:
| Variable | Default | Description |
|---|---|---|
| LOOKBACK_DAYS | 7 | Days of literature to fetch |
| MAX_ARTICLES | 60 | Max articles to summarize per run (cost control) |
| OUTPUT_DIR | ../dashboard | Where to write data.json |
| CACHE_DIR | .cache | Cache directory (avoids re-summarizing seen articles) |
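One way a script could read these variables with the defaults listed in the table is sketched below. The `Settings` class and `load_settings` function are hypothetical names chosen for illustration. Only the variable names and default values come from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    lookback_days: int
    max_articles: int
    output_dir: str
    cache_dir: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables, falling back to the documented defaults."""
    return Settings(
        lookback_days=int(env.get("LOOKBACK_DAYS", "7")),
        max_articles=int(env.get("MAX_ARTICLES", "60")),
        output_dir=env.get("OUTPUT_DIR", "../dashboard"),
        cache_dir=env.get("CACHE_DIR", ".cache"),
    )
```

Accepting the environment as a parameter makes it easy to try different values without touching the real process environment, for example `load_settings({"MAX_ARTICLES": "10"})`.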
Edit .github/workflows/weekly_update.yml:
- cron: "0 6 * * 1" # Monday 6 AM UTC — change to suit your timezone
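GitHub Actions evaluates cron schedules in UTC, so a local run time has to be converted before it goes into the workflow file. The helper below is a small sketch of that conversion. The function name is hypothetical, and it assumes a fixed UTC offset, so daylight saving time is ignored.

```python
from datetime import datetime, timedelta, timezone


def utc_cron_for_local_time(weekday: int, hour: int, minute: int,
                            utc_offset_hours: float) -> str:
    """Return a weekly cron expression in UTC for a local wall-clock time.

    weekday uses cron numbering: 0 = Sunday ... 6 = Saturday.
    utc_offset_hours is a fixed offset (for example 9 for UTC+9); DST is ignored.
    """
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    # 2024-01-07 was a Sunday, so adding `weekday` days lands on the requested day.
    local = datetime(2024, 1, 7, hour, minute, tzinfo=local_tz) + timedelta(days=weekday)
    utc = local.astimezone(timezone.utc)
    # isoweekday() is 1 = Monday ... 7 = Sunday; cron uses 0 for Sunday.
    cron_weekday = utc.isoweekday() % 7
    return f"{utc.minute} {utc.hour} * * {cron_weekday}"
```

For example, Monday 6:00 AM at UTC+9 falls on Sunday evening in UTC, so `utc_cron_for_local_time(1, 6, 0, 9)` returns `0 21 * * 0`.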
Using Claude Sonnet at ~$3/MTok input + $15/MTok output:
| Articles/week | Est. tokens | Est. cost |
|---|---|---|
| 30 | ~120K | ~$0.50 |
| 60 | ~240K | ~$1.00 |
| 100 | ~400K | ~$1.65 |
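The arithmetic behind the table can be approximated as shown below. The per-article token count is implied by the table (120K tokens for 30 articles). The 90/10 split between input and output tokens is an assumption made for this sketch, chosen because it roughly reproduces the listed costs.

```python
INPUT_USD_PER_MTOK = 3.0    # from the pricing quoted above
OUTPUT_USD_PER_MTOK = 15.0  # from the pricing quoted above
TOKENS_PER_ARTICLE = 4_000  # implied by the table: 120K tokens / 30 articles
OUTPUT_SHARE = 0.10         # ASSUMPTION: about 10% of tokens are generated output


def estimate_weekly_cost(articles: int) -> float:
    """Estimated USD cost for summarizing `articles` articles in one run."""
    total_tokens = articles * TOKENS_PER_ARTICLE
    output_tokens = total_tokens * OUTPUT_SHARE
    input_tokens = total_tokens - output_tokens
    cost = (
        input_tokens / 1_000_000 * INPUT_USD_PER_MTOK
        + output_tokens / 1_000_000 * OUTPUT_USD_PER_MTOK
    )
    return round(cost, 2)
```

Under these assumptions, 30, 60, and 100 articles come out at about $0.50, $1.01, and $1.68, close to the table's estimates. Longer summaries shift the split toward output tokens and raise the cost.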
The .cache/ directory prevents re-summarizing articles already seen, keeping costs low on re-runs.
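A cache of this kind can be as simple as one file per article. The sketch below shows the general idea. It is not the pipeline's actual cache format: the `SummaryCache` class, the hashed file names, and the JSON payload are all assumptions.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, Optional


class SummaryCache:
    """Stores one JSON file per article so repeat runs skip the API call."""

    def __init__(self, cache_dir: str = ".cache") -> None:
        self.dir = Path(cache_dir)
        self.dir.mkdir(parents=True, exist_ok=True)

    def _path(self, article_id: str) -> Path:
        # Hash the ID so DOIs and URLs become safe file names.
        digest = hashlib.sha256(article_id.encode("utf-8")).hexdigest()
        return self.dir / f"{digest}.json"

    def get(self, article_id: str) -> Optional[str]:
        path = self._path(article_id)
        if not path.exists():
            return None
        return json.loads(path.read_text(encoding="utf-8"))["summary"]

    def put(self, article_id: str, summary: str) -> None:
        payload = {"id": article_id, "summary": summary}
        self._path(article_id).write_text(json.dumps(payload), encoding="utf-8")


def summarize_with_cache(article_id: str, text: str, cache: SummaryCache,
                         summarize: Callable[[str], str]) -> str:
    """Return the cached summary if present; otherwise summarize and store it."""
    cached = cache.get(article_id)
    if cached is not None:
        return cached
    summary = summarize(text)
    cache.put(article_id, summary)
    return summary
```

With this shape, a second run over the same article IDs never reaches the `summarize` callable, which is where the API cost would be incurred.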
oncopulse/
├── pipeline/
│   ├── fetch_and_summarize.py   # Main pipeline script
│   ├── requirements.txt
│   └── .cache/                  # Auto-created, cached summaries
├── dashboard/
│   ├── index.html               # The web dashboard
│   └── data.json                # Generated weekly by pipeline
├── .github/
│   └── workflows/
│       └── weekly_update.yml    # GitHub Actions schedule
└── README.md
| Source | What it pulls |
|---|---|
| PubMed | Phase 2/3 RCTs and meta-analyses published within the lookback window (7 days by default) |
| FDA RSS | FDA oncology drug approvals and regulatory actions |
| NEJM | New England Journal of Medicine |
| Lancet Oncol | The Lancet Oncology |
| JCO | Journal of Clinical Oncology |
| Blood | ASH Blood journal |
| Cancer Cell | Cell Press Cancer Cell |
| Nature Cancer | Nature Cancer |
| JAMA Oncol | JAMA Oncology |
| Ann Oncol | Annals of Oncology |
| Cancer Discov | Cancer Discovery (AACR) |
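For the PubMed row, a date-limited search can be expressed as an NCBI E-utilities `esearch` request. The sketch below only builds the URL and makes no network call. The search term is an illustrative guess at a suitable query and is not the one used by the pipeline. `build_esearch_url` is a hypothetical helper name.

```python
from typing import Optional
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# ASSUMPTION: an illustrative query, not the one used by the pipeline.
EXAMPLE_TERM = (
    "(neoplasms[MeSH Terms]) AND ("
    '"Clinical Trial, Phase II"[Publication Type] OR '
    '"Clinical Trial, Phase III"[Publication Type] OR '
    '"Meta-Analysis"[Publication Type])'
)


def build_esearch_url(term: str, days: int = 7, retmax: int = 60,
                      api_key: Optional[str] = None) -> str:
    """Build an esearch URL for PubMed records published in the last `days` days."""
    params = {
        "db": "pubmed",
        "term": term,
        "reldate": days,      # only records from the last `days` days
        "datetype": "pdat",   # filter on publication date
        "retmax": retmax,     # maximum number of IDs to return
        "retmode": "json",
    }
    if api_key:
        # Optional key that raises the allowed request rate.
        params["api_key"] = api_key
    return f"{ESEARCH_URL}?{urlencode(params)}"
```

The response to such a request lists PubMed IDs. Fetching titles and abstracts for those IDs would be a separate step.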
Open dashboard/index.html in a browser. If data.json does not exist yet, the dashboard shows an error message; run the pipeline first to generate it.
For local testing with a small sample:
python fetch_and_summarize.py --days 3 --max-articles 10 --skip-pubmed
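The flags used in these commands could be parsed along the following lines. This is a sketch with `argparse`, not the script's actual argument handling. The defaults mirror the configuration table, and the meaning given for `--skip-pubmed` (leave out the PubMed source) is an assumption based on its name.

```python
import argparse
from typing import Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse the command-line flags shown in the examples above."""
    parser = argparse.ArgumentParser(
        description="Fetch and summarize recent oncology literature."
    )
    parser.add_argument("--days", type=int, default=7,
                        help="days of literature to fetch")
    parser.add_argument("--max-articles", type=int, default=60,
                        help="maximum number of articles to summarize per run")
    # ASSUMPTION: the flag skips the PubMed source and keeps the RSS feeds.
    parser.add_argument("--skip-pubmed", action="store_true",
                        help="skip the PubMed source")
    return parser.parse_args(argv)
```

Calling `parse_args(["--days", "3", "--max-articles", "10", "--skip-pubmed"])` yields a namespace with `days=3`, `max_articles=10`, and `skip_pubmed=True`.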