Bulk PDF Report Generation: Architecture Patterns
Bulk PDF generation (hundreds to thousands of PDFs) breaks with serial processing because memory accumulates after each render, CPU becomes the bottleneck at 3-15 seconds per PDF, and batch jobs hit their timeout limits after 15-60 minutes. If you're generating monthly reports for 1000 customers with Puppeteer in a loop, you'll see memory grow from about 250MB to 1.5GB after 100 renders, the job time out partway through, or the process crash entirely after 200-500 renders.
Scale Challenges
Generating a single PDF is different from generating 1000 PDFs. Here's what breaks:
Memory Accumulation When Generating 1000+ PDFs
The problem: Even with proper browser.close(), memory leaks accumulate.
Memory growth pattern (Puppeteer):
| PDFs Generated | Process Memory | Chrome Zombies | Status |
|---|---|---|---|
| 1 | 250 MB | 0 | ✅ OK |
| 10 | 400 MB | 1-2 | ✅ OK |
| 50 | 800 MB | 5-8 | ⚠️ Slow |
| 100 | 1.5 GB | 12-15 | ⚠️ Very slow |
| 200 | 2.5 GB | 25-30 | ❌ Crashes soon |
| 500 | N/A | N/A | ❌ Already crashed |
Why memory grows:
- Chrome spawns subprocesses (renderer, GPU, network)
- browser.close() sends SIGTERM to main process
- Subprocesses don't always receive signal
- Zombie processes accumulate
- Node.js event loop holds references
- Garbage collection can't free memory
Real-world impact:
// This looks correct but crashes after ~200 PDFs
async function generateBatch(customers) {
for (const customer of customers) {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.setContent(generateHTML(customer));
await page.pdf({ path: `report-${customer.id}.pdf` });
await browser.close(); // Doesn't prevent memory leak
}
}
// Memory after 100 iterations: 1.5GB
// Memory after 200 iterations: Process killed by OS
CPU Bottlenecks in Serial Processing
Single-threaded bottleneck:
- Average PDF render: 5 seconds
- 1000 PDFs × 5 seconds = 5000 seconds = 83 minutes
- Plus overhead (HTML generation, DB queries): 100-120 minutes
Why CPU is the limit:
- Chrome rendering is CPU-intensive
- Node.js runs the loop on one thread and awaits each render, so only one PDF is in progress at a time
- Each PDF blocks the next
Real numbers from production:
| Report Complexity | Render Time | 1000 PDFs (serial) | CPU Usage |
|---|---|---|---|
| Simple (1-2 pages) | 3s | 50 min | 80-100% |
| Medium (5-10 pages) | 7s | 116 min | 90-100% |
| Complex (20+ pages) | 15s | 250 min | 95-100% |
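The serial totals in the table are simply count × render time. A minimal sketch of that arithmetic follows; estimateBatchMinutes and its concurrency parameter are illustrative names, not part of any library, and the estimate ignores HTML generation, DB queries, and uploads.

```javascript
// Rough wall-clock estimate for a batch of PDFs.
// With N renders in flight, the batch needs ceil(count / N) rounds of one render each.
function estimateBatchMinutes(pdfCount, secondsPerPdf, concurrency = 1) {
  if (pdfCount < 0 || secondsPerPdf < 0 || concurrency < 1) {
    throw new RangeError('counts must be non-negative and concurrency must be >= 1');
  }
  const rounds = Math.ceil(pdfCount / concurrency);
  return (rounds * secondsPerPdf) / 60;
}

console.log(estimateBatchMinutes(1000, 5).toFixed(1));     // serial: "83.3"
console.log(estimateBatchMinutes(1000, 5, 25).toFixed(1)); // 25 in flight: "3.3"
```

The second call shows why the later patterns focus on parallelism: the per-PDF render time stays the same, only the number of rounds shrinks.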
Timeout Issues in Batch Jobs
Common timeout scenarios:
1. Cron job timeout (60 min max):
// Runs every month to generate customer reports
// Times out after 60 minutes if >720 PDFs (3600s / 5s each)
async function monthlyReportJob() {
const customers = await db.customers.findAll(); // 1000 customers
for (const customer of customers) {
await generatePDF(customer); // 5s each
}
// Total time: 5000s = 83 minutes
// Job killed at 60 minutes, only 720 PDFs generated
}
2. HTTP request timeout (30s-60s):
// API endpoint: POST /api/reports/generate-all
// Client timeout: 60s
// Actual time for 100 PDFs: 500s (8+ minutes)
app.post('/api/reports/generate-all', async (req, res) => {
const customers = await db.customers.findAll(); // e.g. 100 customers
const pdfs = [];
for (const customer of customers) {
pdfs.push(await generatePDF(customer));
}
res.json({ pdfs }); // Never reaches here, client already timed out
});
3. Lambda timeout (15 min max):
// AWS Lambda max timeout: 15 minutes
// After 900 seconds, function forcibly killed
// Any in-progress PDFs lost
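One way to avoid losing work to a hard timeout is to stop voluntarily before the limit and hand back whatever is left. The sketch below assumes the caller can persist the remaining IDs and re-run the job; processWithinBudget, handler, and safetyMs are hypothetical names. On Lambda, the budget could come from context.getRemainingTimeInMillis().

```javascript
// Processes items until the time budget is nearly used up, then stops cleanly.
// `handler` renders one PDF; `now` is injectable so the logic can be tested.
// `safetyMs` should be larger than the slowest expected render.
async function processWithinBudget(items, handler, { budgetMs, safetyMs = 30000, now = Date.now }) {
  const deadline = now() + budgetMs - safetyMs;
  const done = [];
  let index = 0;
  for (; index < items.length; index++) {
    if (now() >= deadline) break; // not enough time left to start another render
    done.push(await handler(items[index]));
  }
  // The caller stores `remaining` and picks it up in the next invocation.
  return { done, remaining: items.slice(index) };
}
```

A render that has already started is still allowed to finish, which is why the safety margin has to cover at least one full render.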
Storage and Delivery at Scale
The problem: 1000 PDFs = 500MB-2GB total
Where to store:
- Memory: 1000 × 2MB average = 2GB (Node.js crashes)
- Disk: Lambda has 512MB of /tmp by default (insufficient)
- S3: Need to upload 1000 files (adds time)
Delivery challenges:
- Email 1000 PDFs: 1000 SMTP connections (rate limited)
- Zip 1000 PDFs: 2GB zip file (too large for browser download)
- Generate on-demand: User waits 83 minutes (unacceptable)
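If a single 2GB zip is too large to deliver, one option is to split the batch into several smaller archives. This is only a sketch of the grouping step, under the assumption that file sizes are known up front; planArchives and the { key, bytes } shape are illustrative, and the actual zipping is left out.

```javascript
// Greedy split of files into archives that each stay at or under maxBytes.
// A single file larger than maxBytes still gets an archive of its own.
function planArchives(files, maxBytes) {
  const archives = [];
  let current = { files: [], bytes: 0 };
  for (const file of files) {
    // Start a new archive when adding this file would exceed the cap.
    if (current.files.length > 0 && current.bytes + file.bytes > maxBytes) {
      archives.push(current);
      current = { files: [], bytes: 0 };
    }
    current.files.push(file.key);
    current.bytes += file.bytes;
  }
  if (current.files.length > 0) archives.push(current);
  return archives;
}
```

With 1000 PDFs of 2MB each and a 200MB cap, this yields 10 archives of 100 files.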
Architecture Patterns
Four approaches to handle bulk PDF generation:
Pattern 1: Synchronous Batch (Simple but Slow)
Architecture:
API Request → Generate PDF 1 → Generate PDF 2 → ... → Generate PDF 1000 → Response
Code:
// POST /api/reports/batch
async function generateBatch(req, res) {
const { customerIds } = req.body;
const pdfs = [];
for (const customerId of customerIds) {
const customer = await db.customers.findById(customerId);
const html = generateReportHTML(customer);
const pdf = await renderPdf(html); // renderPdf: your own helper wrapping page.setContent() + page.pdf()
pdfs.push({ customerId, pdf });
}
res.json({ pdfs });
}
Pros:
- Simple to implement
- Easy to debug (sequential, predictable)
- No additional infrastructure
Cons:
- Slow: 100 PDFs = 8+ minutes
- Memory leaks: Crashes after 200-500 PDFs
- Timeout: HTTP request times out
- Blocking: Ties up server thread
When to use: <10 PDFs at a time, low volume
Pattern 2: Queue + Workers (Scalable, Complex)
Architecture:
API Request → Add jobs to queue → Return job IDs
↓
[Queue: BullMQ, SQS, etc.]
↓
Worker 1, Worker 2, Worker 3 (parallel)
↓
Generate PDFs → Store in S3 → Notify
Code (using BullMQ):
1. API endpoint (adds jobs to queue):
// POST /api/reports/batch
const { Queue } = require('bullmq');
const pdfQueue = new Queue('pdf-generation', {
connection: { host: 'redis', port: 6379 }
});
async function generateBatch(req, res) {
const { customerIds } = req.body;
const jobIds = [];
for (const customerId of customerIds) {
const job = await pdfQueue.add('generate-report', {
customerId,
requestId: req.id
});
jobIds.push(job.id);
}
res.json({
message: 'PDFs queued for generation',
jobIds,
statusUrl: `/api/reports/batch/${req.id}/status`
});
}
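The handler above makes one Redis round trip per customer and sets no retry policy. A possible refinement is to build all job definitions first and enqueue them with BullMQ's queue.addBulk(). The sketch below only builds the payload; buildReportJobs is a hypothetical helper, and the retry settings and ID scheme are assumptions to tune for your workload.

```javascript
// Builds the array that BullMQ's queue.addBulk() expects: { name, data, opts } per job.
function buildReportJobs(customerIds, requestId) {
  return customerIds.map((customerId) => ({
    name: 'generate-report',
    data: { customerId, requestId },
    opts: {
      jobId: `${requestId}-${customerId}`, // custom ID: re-adding an existing ID does not enqueue a duplicate
      attempts: 3,                         // total tries per job, including the first
      backoff: { type: 'exponential', delay: 2000 }
    }
  }));
}

// In the handler (not run here):
// const jobs = await pdfQueue.addBulk(buildReportJobs(customerIds, req.id));
```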
2. Worker (processes jobs from queue):
// worker.js (run separately, can scale to N instances)
const { Worker } = require('bullmq');
const puppeteer = require('puppeteer');
const s3 = require('./s3');
const worker = new Worker('pdf-generation', async (job) => {
const { customerId, requestId } = job.data;
// Generate PDF
const customer = await db.customers.findById(customerId);
const html = generateReportHTML(customer);
const pdf = await renderPdf(html); // renderPdf: your own helper wrapping page.setContent() + page.pdf()
// Upload to S3
const key = `reports/${requestId}/${customerId}.pdf`;
await s3.upload({ Key: key, Body: pdf });
// Update progress
await db.batchJobs.updateProgress(requestId, customerId, 'completed');
return { customerId, s3Key: key };
}, {
connection: { host: 'redis', port: 6379 },
concurrency: 5 // Process 5 PDFs in parallel per worker
});
// Handle failures
worker.on('failed', async (job, err) => {
console.error(`Job ${job.id} failed:`, err);
await db.batchJobs.updateProgress(job.data.requestId, job.data.customerId, 'failed');
});
3. Status endpoint (check progress):
// GET /api/reports/batch/:requestId/status
async function getBatchStatus(req, res) {
const { requestId } = req.params;
const status = await db.batchJobs.getStatus(requestId);
res.json({
total: status.total,
completed: status.completed,
failed: status.failed,
inProgress: status.inProgress,
downloadUrl: status.completed === status.total
? `/api/reports/batch/${requestId}/download`
: null
});
}
Pros:
- Scalable: Add more workers to increase throughput
- Resilient: Failed jobs can retry
- Non-blocking: API returns immediately
- Parallel: 5 workers × 5 concurrency = 25 PDFs at once
Cons:
- Complex: Redis/SQS, worker processes, job tracking
- Infrastructure: Need queue service, worker servers
- Debugging: Jobs fail silently, need monitoring
- Cost: Always-on workers or Lambda invocations
When to use: >100 PDFs regularly, need reliability, okay with complexity
Pattern 3: Parallel API Calls (Fast, Requires Rate Limiting)
Architecture:
API Request → Split into batches → Parallel API calls (25-100 concurrent)
↓
External PDF API (handles rendering)
↓
PDFs returned
Code:
// POST /api/reports/batch
async function generateBatch(req, res) {
const { customerIds } = req.body;
// Limit concurrency to avoid overwhelming API
const batchSize = 50; // 50 concurrent requests
const results = [];
for (let i = 0; i < customerIds.length; i += batchSize) {
const batch = customerIds.slice(i, i + batchSize);
// Generate 50 PDFs in parallel
const batchResults = await Promise.all(
batch.map(async (customerId) => {
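try {
// Lookup is inside the try so one failed query doesn't reject the whole batch
const customer = await db.customers.findById(customerId);
const response = await fetch('https://api.hundreddocs.com/v1/pdf', {
method: 'POST',
headers: {
'X-API-Key': process.env.API_KEY,
'Content-Type': 'application/json'
},
body: JSON.stringify({
templateId: 'monthly-report',
data: {
customerName: customer.name,
reportMonth: '2025-12',
metrics: customer.metrics
}
})
});
// fetch() resolves on 4xx/5xx responses too, so check the status
// before treating the body as a PDF
if (!response.ok) {
throw new Error(`PDF API returned ${response.status}`);
}
const pdfBuffer = await response.arrayBuffer();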
// Upload to S3 immediately
await s3.upload({
Key: `reports/${customerId}.pdf`,
Body: Buffer.from(pdfBuffer)
});
return { customerId, status: 'success' };
} catch (error) {
console.error(`Failed to generate PDF for ${customerId}:`, error);
return { customerId, status: 'failed', error: error.message };
}
})
);
results.push(...batchResults);
}
res.json({ results });
}
Performance:
- 1000 PDFs, 50 concurrent
- 1000 / 50 = 20 batches
- Each batch: 1-2 seconds (API response time)
- Total: 20-40 seconds (vs 83 minutes serial)
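Fixed batches of 50 wait for the slowest request in each batch before starting the next one. A sliding window keeps a constant number of requests in flight instead. The following is a minimal sketch of that idea with no external dependencies; mapWithConcurrency is an illustrative name, and task stands in for the fetch-and-upload step.

```javascript
// Runs `task` over `items` with at most `limit` promises in flight.
// Results keep input order; a rejected task is recorded instead of aborting the run.
async function mapWithConcurrency(items, limit, task) {
  const results = new Array(items.length);
  let next = 0;
  async function runner() {
    while (next < items.length) {
      const index = next++; // claimed synchronously, so no two runners share an index
      try {
        results[index] = { status: 'success', value: await task(items[index], index) };
      } catch (error) {
        results[index] = { status: 'failed', error: error.message };
      }
    }
  }
  // Start up to `limit` runners; each pulls the next item as soon as it is free.
  const runners = Array.from({ length: Math.min(limit, items.length) }, () => runner());
  await Promise.all(runners);
  return results;
}
```

Usage would look like mapWithConcurrency(customerIds, 50, generateAndUpload), where generateAndUpload is whatever function produces and stores one PDF.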
Pros:
- Fast: 1000 PDFs in about 1-2 minutes once lookups and uploads are included
- No Chrome: Zero memory leaks
- Stateless: No worker infrastructure
- Simple: Standard HTTP requests
Cons:
- API dependency: Requires external service
- Rate limiting: Must respect API limits (429 errors)
- Cost: Per-PDF pricing (vs fixed server cost)
- Network: Need reliable internet
When to use: >100 PDFs, fast turnaround needed, serverless deployment
Pattern 4: Streaming Generation (Memory Efficient)
Architecture:
API Request → Generate PDF 1 → Stream to S3 → Generate PDF 2 → Stream to S3 → ...
Code:
// POST /api/reports/batch (streams PDFs as they're generated)
async function generateBatch(req, res) {
const { customerIds } = req.body;
res.writeHead(200, {
'Content-Type': 'application/x-ndjson', // Newline-delimited JSON
'Transfer-Encoding': 'chunked'
});
for (const customerId of customerIds) {
try {
const customer = await db.customers.findById(customerId);
const html = generateReportHTML(customer);
const pdf = await renderPdf(html); // renderPdf: your own helper wrapping page.setContent() + page.pdf()
// Upload to S3
const key = `reports/${customerId}.pdf`;
await s3.upload({ Key: key, Body: pdf });
// Stream result to client
res.write(JSON.stringify({
customerId,
status: 'success',
url: `https://s3.amazonaws.com/bucket/${key}`
}) + '\n');
} catch (error) {
res.write(JSON.stringify({
customerId,
status: 'failed',
error: error.message
}) + '\n');
}
}
res.end();
}
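On a long-running response, a slow client can cause written lines to pile up in the server's buffer. In Node.js, write() returns false when the buffer is full and the stream emits 'drain' once it has emptied. A small helper along these lines could replace the direct res.write() calls; writeNdjson is a hypothetical name.

```javascript
const { once } = require('node:events');

// Writes one NDJSON line and, if the stream's buffer is full,
// waits for 'drain' before letting the caller continue.
async function writeNdjson(stream, record) {
  const ok = stream.write(JSON.stringify(record) + '\n');
  if (!ok) await once(stream, 'drain');
}

// In the handler (not run here):
// await writeNdjson(res, { customerId, status: 'success', url });
```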
Client (receives streaming updates):
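const customerIds = Array.from({ length: 1000 }, (_, i) => i + 1); // 1..1000
const response = await fetch('/api/reports/batch', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ customerIds })
});
const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffered = ''; // holds a partial line until the rest arrives
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // A chunk can end mid-line (or mid-character), so decode in streaming mode
  // and only parse complete lines
  buffered += decoder.decode(value, { stream: true });
  const lines = buffered.split('\n');
  buffered = lines.pop(); // last element is an incomplete line (or '')
  for (const line of lines.filter(Boolean)) {
    const result = JSON.parse(line);
    console.log(`PDF ${result.customerId}: ${result.status}`);
    updateProgressBar(result.customerId);
  }
}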
Pros:
- Memory efficient: PDFs streamed immediately, not accumulated
- Real-time progress: Client sees updates as PDFs complete
- No response timeout: The client keeps receiving data, so the request stays open (proxies and load balancers may still cap connection lifetime)
Cons:
- Still slow: Serial processing (5s × 1000 = 83 min)
- Connection stability: Long HTTP connection can drop
- Puppeteer issues: Memory leaks still occur
When to use: Moderate volume (50-200 PDFs), need progress updates, can't use queue
Queue vs Batch vs Parallel (Comparison)
| Dimension | Synchronous Batch | Queue + Workers | Parallel API | Streaming |
|---|---|---|---|---|
| Speed (1000 PDFs) | 83 min | 10-20 min | 1-2 min | 83 min |
| Complexity | ⭐ Low | ⭐⭐⭐ High | ⭐⭐ Medium | ⭐⭐ Medium |
| Infrastructure | None | Redis + workers | None | None |
| Memory leaks | ❌ Yes | ❌ Yes | ✅ No | ❌ Yes |
| Scalability | ❌ Poor | ✅ Excellent | ✅ Excellent | ⚠️ Limited |
| Cost (1000 PDFs) | $0-5 | $10-30 | $10-50 | $0-5 |
| Real-time progress | ❌ No | ⚠️ Via polling | ❌ No | ✅ Yes |
| Best for | <10 PDFs | 100-10k PDFs | 100-10k PDFs | 50-200 PDFs |
Memory Management at Scale
Why Puppeteer Fails After 100-200 Renders
Root cause: Chrome doesn't clean up completely.
Memory profile (512MB RAM available):
Render 1: [Chrome: 200MB] [Node: 50MB] Available: 262MB ✅
Render 10: [Chrome: 250MB] [Node: 70MB] Available: 192MB ✅
Render 50: [Chrome: 350MB] [Node: 120MB] Available: 42MB ⚠️
Render 100: [Chrome: 480MB] [Node: 180MB] Available: -148MB ❌ Crash
Mitigation (doesn't fully solve):
// Restart process every N PDFs
let renderCount = 0;
const MAX_RENDERS = 50;
async function generatePDFWithRestart(data) {
if (renderCount >= MAX_RENDERS) {
console.log('Restarting process to prevent memory leak...');
process.exit(0); // Process manager (PM2, Kubernetes) restarts
}
const pdf = await generatePDF(data);
renderCount++;
return pdf;
}
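A lighter-weight variant of the same idea recycles the browser rather than the whole process. The sketch below is an assumption-laden illustration, not a fix for the zombie-process problem described above: it expects serial calls, takes the launcher as a parameter (for example () => puppeteer.launch()), and createRecyclingRenderer is a hypothetical name. A process restart remains the backstop.

```javascript
// Reuses one browser for up to `maxRenders` PDFs, then replaces it with a fresh one.
// Intended for serial use; concurrent calls would need extra coordination.
function createRecyclingRenderer(launch, maxRenders = 50) {
  let browser = null;
  let renders = 0;

  async function shutdown() {
    const old = browser;
    browser = null;
    renders = 0;
    if (old) await old.close();
  }

  async function render(html, pdfOptions = {}) {
    if (browser === null) browser = await launch();
    const page = await browser.newPage();
    try {
      await page.setContent(html);
      return await page.pdf(pdfOptions);
    } finally {
      await page.close(); // always release the page, even when rendering fails
      renders++;
      if (renders >= maxRenders) await shutdown(); // next call launches a fresh browser
    }
  }

  return { render, shutdown };
}
```

Call shutdown() once the batch is finished so the last browser is closed as well.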
Restarting Workers Periodically
Pattern (with PM2 or Kubernetes):
// worker.js
const RESTART_AFTER_RENDERS = 100;
let renderCount = 0;
worker.on('completed', async () => {
  renderCount++;
  if (renderCount === RESTART_AFTER_RENDERS) {
    console.log(`Graceful restart after ${RESTART_AFTER_RENDERS} renders`);
    await worker.close(); // Stop taking new jobs and wait for in-flight jobs to finish
    process.exit(0); // PM2/K8s will restart
  }
});
PM2 config (max_memory_restart restarts a worker once its memory exceeds 1GB):
{
  "apps": [{
    "name": "pdf-worker",
    "script": "worker.js",
    "instances": 4,
    "max_memory_restart": "1G",
    "autorestart": true
  }]
}
Stateless Rendering Advantages
Why external APIs don't have memory leaks:
Traditional (Puppeteer):
Request 1 → [Process A: Chrome launched, 200MB] → [Zombie process: 50MB leaked]
Request 2 → [Process A: Chrome launched, 250MB] → [Zombie process: 100MB leaked]
...memory grows...
Stateless API:
Request 1 → [API container, fresh] → [PDF generated] → [Container destroyed]
Request 2 → [API container, fresh] → [PDF generated] → [Container destroyed]
...no memory accumulation...
Result: Memory use on your side stays flat, whether the batch is 1,000 PDFs or 1 million.
Performance Considerations
Render Time Per PDF
Factors affecting render time:
| Factor | Simple PDF | Complex PDF |
|---|---|---|
| Page count | 1-2 pages | 20+ pages |
| Images | None | 10+ images |
| Tables | 1 small table | Multiple large tables |
| Custom fonts | System fonts | 3+ custom fonts |
| Puppeteer time | 3s | 15s |
| API time | 300ms | 1.2s |
Concurrency Limits
Puppeteer (local):
- Safe concurrency: 2-5 per CPU core
- 4-core server: 8-20 concurrent Chrome instances
- Each Chrome: 200-500MB RAM
- 8 concurrent × 400MB = 3.2GB RAM minimum
API (external):
- Rate limit: 100-1000 req/min (depends on plan)
- Recommended batch size: 50-100 concurrent
- No local RAM constraints
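The Puppeteer limits above come from two constraints, CPU and RAM, and the safe value is the smaller of the two. A rough sketch of that calculation follows; safeChromeConcurrency is an illustrative name, and the 1GB reserved for Node.js and the OS is an assumption rather than a measured figure.

```javascript
const os = require('node:os');

// Picks a Chrome concurrency bounded by both CPU cores and available RAM.
function safeChromeConcurrency({
  cores = os.cpus().length,
  totalMemMb = Math.floor(os.totalmem() / 1024 / 1024),
  perCore = 2,       // low end of the 2-5 per-core range
  chromeMb = 400,    // assumed footprint per Chrome instance (200-500MB range)
  reservedMb = 1024  // assumption: headroom for Node.js and the OS
} = {}) {
  const byCpu = cores * perCore;
  const byMem = Math.floor((totalMemMb - reservedMb) / chromeMb);
  return Math.max(1, Math.min(byCpu, byMem)); // never go below one render at a time
}

console.log(safeChromeConcurrency({ cores: 4, totalMemMb: 8192 })); // CPU-bound: 8
console.log(safeChromeConcurrency({ cores: 4, totalMemMb: 2048 })); // RAM-bound: 2
```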
Rate Limiting External APIs
Exponential backoff pattern:
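// Small helper: resolve after ms milliseconds
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function generatePDFWithRetry(data, maxRetries = 3) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await generatePDF(data);
    } catch (error) {
      // Retry only on 429 (assumes generatePDF attaches the HTTP status to the error),
      // and give up once the last attempt has failed instead of returning undefined
      if (error.status !== 429 || attempt === maxRetries) {
        throw error;
      }
      const delay = Math.pow(2, attempt) * 1000; // 2s, 4s, ...
      console.log(`Rate limited, retrying in ${delay}ms...`);
      await sleep(delay);
    }
  }
}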
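When 50 requests are rate limited at the same moment, identical delays make them all retry at the same moment too. Adding random jitter spreads the retries out. The helper below is a sketch of one common variant (capped exponential backoff with full jitter); backoffDelayMs and its defaults are assumptions, and random is injectable only so the behavior can be tested.

```javascript
// Delay before retry number `attempt` (1-based).
// The ceiling doubles per attempt (2s, 4s, 8s, ...) up to capMs,
// and the actual delay is drawn uniformly from [0, ceiling).
function backoffDelayMs(attempt, { baseMs = 1000, capMs = 30000, random = Math.random } = {}) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}
```

Swapping this in for the fixed Math.pow(2, attempt) * 1000 delay keeps the same growth curve while avoiding synchronized retries.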
Storage I/O Bottlenecks
Uploading 1000 PDFs to S3:
// Slow: Sequential uploads (1000 × 200ms = 200 seconds)
for (const pdf of pdfs) {
await s3.upload({ Body: pdf });
}
// Fast: Parallel uploads (1000 / 50 batches × 500ms = 10 seconds)
const batchSize = 50;
for (let i = 0; i < pdfs.length; i += batchSize) {
const batch = pdfs.slice(i, i + batchSize);
await Promise.all(batch.map(pdf => s3.upload({ Body: pdf })));
}
Monitoring Bulk Jobs
Track Success/Failure Rates
// Database schema for batch jobs
{
batchId: 'batch-2025-12-29-001',
totalJobs: 1000,
completed: 985,
failed: 15,
inProgress: 0,
startedAt: '2025-12-29T10:00:00Z',
completedAt: '2025-12-29T10:05:00Z',
duration: 300, // seconds
failureReasons: {
'timeout': 8,
'invalid_data': 5,
'api_error': 2
}
}
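A record like the one above can be derived from the per-PDF results that the batch code already returns. This is a minimal sketch assuming each result looks like { status: 'success' | 'failed', reason?: string }; summarizeBatch is a hypothetical helper, and successRate is an extra field not present in the schema.

```javascript
// Folds per-job results into a batch summary with a tally of failure reasons.
function summarizeBatch(batchId, results) {
  const summary = {
    batchId,
    totalJobs: results.length,
    completed: 0,
    failed: 0,
    failureReasons: {}
  };
  for (const result of results) {
    if (result.status === 'success') {
      summary.completed++;
    } else {
      summary.failed++;
      const reason = result.reason || 'unknown';
      summary.failureReasons[reason] = (summary.failureReasons[reason] || 0) + 1;
    }
  }
  // Extra convenience field for alerting thresholds (0 for an empty batch).
  summary.successRate = results.length === 0 ? 0 : summary.completed / results.length;
  return summary;
}
```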
Monitor Memory Usage
// Log memory every 10 PDFs
let count = 0;
async function generatePDFWithMonitoring(data) {
const pdf = await generatePDF(data);
count++;
if (count % 10 === 0) {
const mem = process.memoryUsage();
console.log(`After ${count} PDFs:`, {
heapUsed: `${Math.round(mem.heapUsed / 1024 / 1024)}MB`,
external: `${Math.round(mem.external / 1024 / 1024)}MB`,
rss: `${Math.round(mem.rss / 1024 / 1024)}MB`
});
if (mem.rss > 800 * 1024 * 1024) { // > 800MB
console.warn('Memory usage high, consider restarting');
}
}
return pdf;
}
Alert on Stuck Jobs
// Check for jobs stuck >10 minutes
setInterval(async () => {
const stuckJobs = await db.batchJobs.findStuck({
status: 'in_progress',
startedAt: { $lt: Date.now() - 10 * 60 * 1000 }
});
if (stuckJobs.length > 0) {
await alertOps(`${stuckJobs.length} jobs stuck for >10 minutes`, {
jobIds: stuckJobs.map(j => j.id)
});
}
}, 60 * 1000); // Check every minute
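The check above depends on a database query. The same rule can be written as a pure function, which makes the threshold logic easy to unit test. This sketch assumes startedAt is stored as a millisecond timestamp; findStuckJobs is a hypothetical name.

```javascript
// Returns jobs that have been in progress for longer than `thresholdMs`.
function findStuckJobs(jobs, nowMs, thresholdMs = 10 * 60 * 1000) {
  return jobs.filter(
    (job) => job.status === 'in_progress' && nowMs - job.startedAt > thresholdMs
  );
}
```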
Technical takeaway: Bulk PDF generation (hundreds to thousands of PDFs) requires architectural changes from single-PDF patterns. Serial Puppeteer processing is limited by memory leaks (crashes after 200-500 PDFs), a CPU bottleneck (83 minutes for 1000 PDFs), and timeouts. Solutions: Queue + Workers (scalable but complex, requires Redis/worker infrastructure, 10-20 min for 1000 PDFs), Parallel API calls (fastest at 1-2 min, stateless, no memory leaks), or Streaming (memory-efficient, real-time progress, but still serial). For >100 PDFs regularly, parallel API calls provide the best balance of speed, simplicity, and reliability.