Puppeteer is a powerful tool capable of simulating human interactions with web pages, enabling various use cases such as webpage screenshots, PDF generation, automated testing, uptime monitoring, web scraping, and content tracking.
There are many scenarios where deploying Puppeteer in the cloud makes sense. For example:
The pay-as-you-go and scalable nature of serverless computing makes it an excellent choice for browser automation tasks. However, most platforms, like DigitalOcean, only provide virtual machines, forcing you to pay for idle time (which would waste a lot of money!). Only a few platforms currently support running Puppeteer in a serverless manner: Leapcell, AWS Lambda, and Cloudflare Browser Rendering.
This article explores these platforms: how to use them to accomplish a typical Puppeteer task, and their pros and cons.
Let’s take a common Puppeteer use case as our example: capturing a screenshot of a web page.
The task involves these steps:
Code Example:
const puppeteer = require('puppeteer');
const { Hono } = require('hono');
const { serve } = require('@hono/node-server');

const screenshot = async (url) => {
  const browser = await puppeteer.launch({ args: ['--single-process'] });
  const page = await browser.newPage();
  await page.goto(url);
  const img = await page.screenshot();
  await browser.close();
  return img;
};

const app = new Hono();

app.get('/', async (c) => {
  const url = c.req.query('url');
  if (url) {
    const img = await screenshot(url);
    return c.body(img, { headers: { 'Content-Type': 'image/png' } });
  } else {
    return c.text('Please add an ?url=https://example.com/ parameter');
  }
});

const port = 8080;
serve({ fetch: app.fetch, port }).on('listening', () => {
  console.log(`Server is running on port ${port}`);
});
Leapcell is a versatile platform that allows you to deploy any application in a serverless manner. However, because it’s not designed exclusively for HTTP requests, its setup can be slightly more involved: you'll need to create the HTTP request handler yourself.
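One detail the screenshot function above leaves open is cleanup on failure: if page.goto throws, browser.close() is never reached and the Chromium process may linger. The following is a minimal sketch of one way to guarantee cleanup, not part of the original example. The names withBrowser, launch, and task are hypothetical and chosen for illustration.

```javascript
// Hypothetical helper: runs `task` with a browser and always closes it.
// `launch` is any function returning a Promise of an object with close(),
// for example () => puppeteer.launch({ args: ['--single-process'] }).
async function withBrowser(launch, task) {
  const browser = await launch();
  try {
    return await task(browser);
  } finally {
    // Runs on success and on failure, so the browser is not left open.
    await browser.close();
  }
}

// How the screenshot function could use it (assumes `puppeteer` is required):
// const screenshot = (url) =>
//   withBrowser(
//     () => puppeteer.launch({ args: ['--single-process'] }),
//     async (browser) => {
//       const page = await browser.newPage();
//       await page.goto(url);
//       return page.screenshot();
//     }
//   );
```

Passing the launcher in as a function keeps the helper independent of any particular Puppeteer package, so the same pattern should carry over to the other platforms discussed here.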
Debugging is straightforward. As with any other Node.js application, run node index.js and it's done!
To deploy, specify the build command, run command, and service port.
Once the deployment is complete, your application is ready to use online.
✅ Pros:
❌ Cons:
Code Example:
const chromium = require('chrome-aws-lambda');

exports.handler = async (event) => {
  let browser = null;
  try {
    // Launch the slimmed-down Chromium shipped with chrome-aws-lambda.
    browser = await chromium.puppeteer.launch({
      args: chromium.args,
      defaultViewport: chromium.defaultViewport,
      executablePath: await chromium.executablePath,
      headless: chromium.headless,
    });
    const page = await browser.newPage();
    await page.goto(event.url);
    const screenshot = await page.screenshot();
    return {
      statusCode: 200,
      // page.screenshot() produces PNG by default, so the header must say PNG.
      headers: { 'Content-Type': 'image/png' },
      body: screenshot.toString('base64'),
      isBase64Encoded: true,
    };
  } catch (error) {
    return {
      statusCode: 500,
      body: 'Failed to capture screenshot.',
    };
  } finally {
    // Always close the browser, even when the capture fails.
    if (browser !== null) {
      await browser.close();
    }
  }
};
AWS Lambda requires the use of puppeteer-core paired with a third-party Chromium library, such as alixaxel/chrome-aws-lambda. This is because AWS imposes a 250MB limit on the unzipped size of a Lambda function's deployment package. The default Chromium bundled with Puppeteer (~170MB on macOS, ~282MB on Linux, ~280MB on Windows) exceeds this limit on the Linux environment that Lambda runs, so a slimmed-down Chromium is needed.
Local debugging requires complex configuration because the local and Lambda runtime environments differ, as you can see in alixaxel/chrome-aws-lambda's guide.
To deploy, you need to upload your node_modules as a ZIP file. Depending on your use case, you might also need to configure Lambda Layers. The main business logic can be written directly in the AWS console; save it to execute.
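The handler above reads the target from event.url, which fits a direct invocation with a JSON payload. If the function is instead exposed through API Gateway's proxy integration or a Lambda function URL (an assumption, since the article does not say how the function is invoked), query parameters arrive in event.queryStringParameters. A minimal sketch of a helper that covers both cases, with the hypothetical name getTargetUrl:

```javascript
// Hypothetical helper: find the target URL in a Lambda event.
// Direct invocations (as in the handler above) carry it as event.url;
// API Gateway proxy and function URL events carry query parameters
// in event.queryStringParameters, which can be null or absent.
function getTargetUrl(event) {
  if (event && typeof event.url === 'string') {
    return event.url;
  }
  const params = (event && event.queryStringParameters) || {};
  return typeof params.url === 'string' ? params.url : null;
}

// Inside the handler, this could replace the direct read of event.url:
// const url = getTargetUrl(event);
// if (!url) return { statusCode: 400, body: 'Missing url parameter.' };
```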
✅ Pros:
❌ Cons:
Code Example:
import puppeteer from '@cloudflare/puppeteer';

export default {
  async fetch(request, env) {
    const { searchParams } = new URL(request.url);
    let url = searchParams.get('url');
    if (url) {
      url = new URL(url).toString(); // normalize
      const browser = await puppeteer.launch(env.MYBROWSER);
      const page = await browser.newPage();
      await page.goto(url);
      const img = await page.screenshot();
      await browser.close();
      return new Response(img, {
        headers: {
          'content-type': 'image/png',
        },
      });
    } else {
      return new Response('Please add an ?url=https://example.com/ parameter');
    }
  },
};
Cloudflare Browser Rendering is a relatively new serverless Puppeteer solution. Similar to AWS Lambda, it does not support the official Puppeteer library. Instead, it uses a Puppeteer version provided by Cloudflare.
While Cloudflare’s library is more secure than any third-party option, its slow update cycle can be frustrating: it hasn’t been updated in over five months!
Additionally, Cloudflare Browser Rendering has several limitations:
Local debugging requires complex configurations.
To deploy, write your function online and save it to run.
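In the Worker code above, new URL(url) throws a TypeError when the ?url= value is malformed, which would surface as an unhandled error rather than a helpful response. The sketch below shows one possible way to validate the parameter first. The name parseHttpUrl and the restriction to http and https are illustrative assumptions, not something the platform requires.

```javascript
// Hypothetical helper: returns a normalized http(s) URL string, or null.
function parseHttpUrl(raw) {
  if (typeof raw !== 'string' || raw === '') return null;
  let parsed;
  try {
    parsed = new URL(raw); // throws TypeError on malformed input
  } catch {
    return null;
  }
  // Only accept web pages; rejects schemes such as file: or data:.
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    return null;
  }
  return parsed.toString();
}

// In the fetch handler, this could replace the bare `new URL(url)` call:
// const url = parseHttpUrl(searchParams.get('url'));
// if (!url) {
//   return new Response('Invalid or missing ?url= parameter', { status: 400 });
// }
```

Because the helper uses only the standard URL class, the same check could also guard the Leapcell and AWS Lambda examples.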
✅ Pros:
❌ Cons:
This article has compared the three main serverless Puppeteer deployment platforms: Leapcell, AWS Lambda, and Cloudflare Browser Rendering. Each platform has its pros and cons.
Which one do you prefer? Are there other serverless Puppeteer deployment solutions you know of? Share your thoughts in the comments!
If you're planning to deploy your Puppeteer project online, Leapcell would be a good choice based on the comparison above.
For a deployment guide, visit our documentation.