Building PDF Tools with WebAssembly
When we started building OxygenPDF, the assumption was that serious PDF processing needs a server. Poppler, QPDF, Ghostscript — C/C++ libraries that have been doing this for decades. Browsers just display PDFs, right?
Turns out, the browser can do a lot more than people give it credit for.
The Browser PDF Stack
pdf-lib
pdf-lib is a pure JavaScript library for creating and modifying PDFs. It handles merging, splitting, adding text and images, modifying metadata, and embedding fonts with Unicode support. Everything runs in memory using ArrayBuffer and Uint8Array, which makes it a natural fit for the browser.
PDF.js
Mozilla's PDF.js is the same engine Firefox uses to display PDFs. We lean on it for rendering page previews, extracting text, parsing document structure, and handling encrypted files.
WebAssembly
OCR, image processing, and compression need more horsepower, so we bring in WASM modules. Tesseract.js compiles the Tesseract OCR engine for text recognition. PaddleOCR runs through ONNX Runtime Web as an alternative engine. Custom WASM modules handle image codec operations.
Architecture Decisions
Web Workers
PDF operations can peg the CPU. Running them on the main thread would freeze the entire UI, so everything goes through Web Workers.
The UI thread never blocks. Users can scroll, click around, or switch tabs while a 200-page PDF processes in the background.
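The worker protocol boils down to a job message in, a result message out. A sketch of that shape (the type names and ops here are illustrative, not our real protocol):

```typescript
// Illustrative job/result shapes for the worker message protocol.
type PdfJob = { id: number; op: string; pageCount: number };
type PdfResult = { id: number; ok: boolean; detail: string };

// Pure dispatcher: the worker calls this for every incoming job, so all
// the heavy lifting happens off the main thread.
function runJob(job: PdfJob): PdfResult {
  switch (job.op) {
    case 'merge':
      return { id: job.id, ok: true, detail: `merged ${job.pageCount} pages` };
    case 'split':
      return { id: job.id, ok: true, detail: `split into ${job.pageCount} files` };
    default:
      return { id: job.id, ok: false, detail: `unknown op: ${job.op}` };
  }
}

// Worker wiring; only meaningful inside an actual Web Worker context.
declare const self: any;
if (typeof self !== 'undefined' && typeof self.postMessage === 'function') {
  self.onmessage = (e: { data: PdfJob }) => self.postMessage(runJob(e.data));
}
```

One practical note: large buffers should be passed to `postMessage` as transferables, which moves ownership to the worker instead of structured-cloning megabytes of PDF data.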
Streaming for Large Files
Loading a 100+ page PDF into memory all at once is asking for trouble. We process pages one at a time, release memory after each page completes, and report progress back to the UI as we go.
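The loop itself is simple; the discipline is in never holding more than one page's buffers at a time. A generic sketch of the pattern (function names are ours for illustration):

```typescript
// Process pages one at a time, reporting progress after each page.
// Because each iteration awaits before the next begins, only one page's
// working buffers are live at any moment, and the previous page's
// allocations become garbage-collectable as soon as it completes.
async function processSequentially<T>(
  pageCount: number,
  processPage: (index: number) => Promise<T>,
  onProgress: (done: number, total: number) => void,
): Promise<T[]> {
  const results: T[] = [];
  for (let i = 0; i < pageCount; i++) {
    results.push(await processPage(i));
    onProgress(i + 1, pageCount);
  }
  return results;
}
```

The deliberate non-optimization here is the lack of `Promise.all`: parallelizing pages would be faster on paper but would keep every page's buffers alive simultaneously, which is exactly the failure mode we're avoiding.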
Progressive Enhancement
Not every browser supports every feature. Core operations (merge, split, reorder) are plain JavaScript. OCR and advanced compression use WebAssembly when available. For operations that genuinely exceed what a browser can do, we offer an optional cloud fallback.
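The tier selection starts with feature detection. One way to check for usable WebAssembly support (the tiers are from the text above; this detection code is a sketch, not our production implementation):

```typescript
// Detect WebAssembly support by validating the smallest legal module:
// the "\0asm" magic number followed by version 1.
function supportsWasm(): boolean {
  try {
    if (typeof WebAssembly !== 'object') return false;
    const minimalModule = new Uint8Array([
      0x00, 0x61, 0x73, 0x6d, // "\0asm"
      0x01, 0x00, 0x00, 0x00, // version 1
    ]);
    return WebAssembly.validate(minimalModule);
  } catch {
    return false;
  }
}

// Pick the highest tier the current browser can handle.
function pickTier(): 'core-js' | 'wasm' {
  return supportsWasm() ? 'wasm' : 'core-js';
}
```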
Problems We Ran Into
Memory
Browsers cap memory per tab at roughly 2-4GB. Large PDFs with lots of embedded images can bump against that ceiling. We process pages sequentially, explicitly null out ArrayBuffer references so the GC can reclaim them, and warn the user before we get close to the limit.
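The warning needs an estimate before any decoding happens, so it has to be a heuristic. A sketch of the kind of budget check we mean (every constant below is an illustrative assumption, not a measured number):

```typescript
// Stay well under the low end of the per-tab ceiling. (Assumed budget.)
const TAB_MEMORY_BUDGET = 1.5 * 1024 ** 3; // 1.5 GiB

// Rough heuristic: decoded pages cost far more than the raw file,
// because compressed streams and images expand in memory.
// Both factors here are illustrative guesses.
function estimatePdfMemory(fileBytes: number, pageCount: number): number {
  const DECODE_FACTOR = 3;                  // assumed expansion ratio
  const PER_PAGE_OVERHEAD = 2 * 1024 ** 2;  // assumed 2 MiB of page buffers
  return fileBytes * DECODE_FACTOR + pageCount * PER_PAGE_OVERHEAD;
}

// Warn once the estimate crosses 80% of the budget, leaving headroom
// for the allocations the estimate inevitably misses.
function shouldWarn(fileBytes: number, pageCount: number): boolean {
  return estimatePdfMemory(fileBytes, pageCount) > TAB_MEMORY_BUDGET * 0.8;
}
```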
Fonts
PDF font handling is genuinely awful. Embedded subsets, CID fonts, Type1, TrueType collections — the format has accumulated decades of font technology. PDF.js handles the rendering side. For embedding, we use pdf-lib's fontkit integration. When a font is too exotic to process, we fall back gracefully instead of crashing.
Encryption
A lot of PDF libraries offer an ignoreEncryption flag that sounds like it decrypts the file. It doesn't. It just skips the encryption marker and produces corrupted output. We use PDF.js for real decryption and re-export through pdf-lib to get a clean result.
Performance Numbers
Skipping the network round-trip matters more than the raw CPU numbers suggest:
| Operation | Time |
|---|---|
| Merge 10 PDFs | ~200ms |
| Split a 100-page PDF | ~500ms |
| Compress with image optimization | 2-5s (varies with image count) |
| OCR a single page | 3-8s (WASM Tesseract) |
CPU time per operation can be higher than a beefy server, but wall-clock time is often lower because there's no upload or download.
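A back-of-envelope model makes the trade-off concrete (the numbers in the usage note are illustrative assumptions, not measurements):

```typescript
// Total wall-clock time for a server round-trip: upload the file,
// process it, download the result. File size in MB, link speeds in Mbps.
function serverWallClockMs(
  fileMB: number,
  uploadMbps: number,
  downloadMbps: number,
  serverCpuMs: number,
): number {
  const uploadMs = (fileMB * 8 * 1000) / uploadMbps;
  const downloadMs = (fileMB * 8 * 1000) / downloadMbps;
  return uploadMs + serverCpuMs + downloadMs;
}
```

For a 10 MB file on an assumed 10 Mbps uplink and 50 Mbps downlink, even a server that merges in 100 ms spends 8 seconds on upload and 1.6 seconds on download, which is why the ~200 ms in-browser merge above wins on wall-clock time.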
What We're Watching
The browser platform keeps shipping useful primitives. WebGPU opens the door to GPU-accelerated image processing. The Origin Private File System gives us faster file I/O than the download-to-disk flow. SharedArrayBuffer, available once a page opts into cross-origin isolation, makes real multi-threaded processing possible.
The gap between what a server can do and what a browser can do gets smaller with every Chrome release. We're betting on that trend continuing.
Rohman

