PDF Handling API – Merge, Split, and Auto-Rotate PDFs

Use the MaraDocs PDF manipulation API to combine, split, and auto-rotate PDFs. Thumbnail generation, page selection, orientation correction – all in one API.

Martin Kurtz
Tags: API, PDF, Document Processing, Developer
Ever needed to merge multiple PDFs, select specific pages, or fix incorrectly rotated scans? At our law firm, case files often arrived as separate PDFs – expert reports, client letters, court documents – that had to be merged and reordered by hand. An API that merges and splits PDFs and also fixes orientation would have removed most of that manual work.

It sounds straightforward – until you try wiring PyMuPDF, pikepdf, or similar libraries into a robust pipeline with virus scanning and validation.

Why Building a Merge Split PDF Solution Yourself Takes Weeks

PyMuPDF, pikepdf, PyPDF2, and reportlab each handle parts of PDF manipulation. Merging works until you hit encrypted files, corrupted streams, or odd encodings. If you try to build this yourself, you'll quickly find that orientation detection typically requires OCR or layout analysis – more dependencies, more infrastructure. Thumbnail generation means rendering pages to images, resizing, and encoding. Building a reliable, validated PDF manipulation API takes time and hides complexity in edge cases.

How the MaraDocs Merge Split PDF API Solves This in Minutes

The MaraDocs API provides PDF handling as a service. Upload, validate (virus and format), then compose (merge/split by page), detect and fix orientation, generate thumbnails, optimize, and OCR – all through one API. No per-step uploads; files stay in your workspace. You specify which pages from which PDFs to combine, and the API returns a single composed handle. Orientation correction uses text-based analysis, so rotated scans are fixed automatically without manual intervention.
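To make the "split by page" idea concrete, here is a minimal sketch that builds one compose request body per chunk of pages. It is an illustration under assumptions: `split_requests` is a hypothetical helper, the caller is assumed to already know the page count, and the body shape mirrors the compose examples in this post rather than the full API reference.

```python
from typing import Any, Dict, List


def split_requests(pdf_handle: str, page_count: int, chunk_size: int) -> List[Dict[str, Any]]:
    """Build one compose request body per chunk of `chunk_size` pages.

    Assumption: splitting is expressed as several compose calls, each
    selecting a contiguous range of zero-based page numbers.
    """
    if page_count < 0 or chunk_size < 1:
        raise ValueError("page_count must be >= 0 and chunk_size >= 1")
    bodies: List[Dict[str, Any]] = []
    for start in range(0, page_count, chunk_size):
        end = min(start + chunk_size, page_count)
        pages = [{"page_number": n} for n in range(start, end)]
        bodies.append({"pdfs": [{"pdf_handle": pdf_handle, "pages": pages}]})
    return bodies
```

Each returned body could then be posted to the compose endpoint in turn, yielding one output PDF per chunk.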

Document Processing Workflow: Upload, Validate, Compose, Orient

Every MaraDocs workflow starts with upload and validation. Upload a PDF (or take one from a previous step), validate it for viruses and format, then chain operations. The compose endpoint lets you merge multiple PDFs and select specific pages. Page numbers are zero-based: { pdf_handle: p1, pages: [{ page_number: 0 }, { page_number: 2 }] } selects pages 1 and 3 of the first document, and { pdf_handle: p2 } adds the entire second document. Orientation detection uses text-based analysis to fix rotated pages automatically. For thumbnails, render a page to an image, create the thumbnail, then convert it to JPEG – all with server-side handles, no re-upload.
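Because the page numbers in the example are zero-based while people usually count pages from 1, a small conversion helper can prevent off-by-one mistakes. The following is a sketch, not part of the MaraDocs SDK: `compose_body` is a hypothetical name, and the output shape is copied from the example above.

```python
from typing import Any, Dict, List, Optional, Sequence, Tuple


def compose_body(parts: Sequence[Tuple[str, Optional[Sequence[int]]]]) -> Dict[str, Any]:
    """Turn (handle, 1-based page numbers or None) pairs into a compose body.

    None means "take the whole document"; otherwise each human-readable
    page number is converted to the zero-based form used in the example.
    """
    pdfs: List[Dict[str, Any]] = []
    for handle, pages in parts:
        entry: Dict[str, Any] = {"pdf_handle": handle}
        if pages is not None:
            if any(p < 1 for p in pages):
                raise ValueError("page numbers are 1-based and must be >= 1")
            entry["pages"] = [{"page_number": p - 1} for p in pages]
        pdfs.append(entry)
    return {"pdfs": pdfs}
```

Calling `compose_body([("p1", [1, 3]), ("p2", None)])` reproduces the request from the paragraph above: pages 1 and 3 of the first document followed by the whole second document.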

Get your API key in under a minute

Register for a free account to get your API key. New accounts include some developer credits.

Try MaraDocs API now →

Why MaraDocs is Different: Workspaces, Webview, and German Data Privacy

Most document APIs force you to upload, process, download, then re-upload for the next step. With MaraDocs workspaces, files remain server-side. You pass handles between operations: validate → compose → orientation → optimize. Fewer round-trips, simpler code, and no need to track file identities across steps.
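The handle-passing pattern can be sketched without any network calls. In the example below, each step is a plain function from one handle to the next; in a real integration these would be thin wrappers around the API calls. All names here (`run_chain`, the step labels, the stub steps) are hypothetical and only illustrate the flow.

```python
from typing import Callable, Iterable, List, Tuple

# A step is a label plus a function that maps one handle to the next handle.
Step = Tuple[str, Callable[[str], str]]


def run_chain(handle: str, steps: Iterable[Step]) -> Tuple[str, List[Tuple[str, str]]]:
    """Thread a single handle through the steps; return the final handle and a trace."""
    trace: List[Tuple[str, str]] = []
    for name, step in steps:
        handle = step(handle)  # only the handle travels, never the file bytes
        trace.append((name, handle))
    return handle, trace


# Stub steps standing in for real API wrappers (assumption for illustration).
stub_steps: List[Step] = [
    ("validate", lambda h: h + ":validated"),
    ("compose", lambda h: h + ":composed"),
    ("orientation", lambda h: h + ":rotated"),
]
final_handle, trace = run_chain("upload-1", stub_steps)
```

The trace is useful for logging which handle each step produced, which helps when a later step needs manual review.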

When automation hits an edge case – wrong page order, a corrupted stream, or an odd encoding – you can open app.maradocs.io with your workspace secret for manual review and editing. Your users get full manual control when the pipeline needs a human touch.

All processing runs in Germany (Maramia GmbH), with encryption at rest (SSE-C) and in transit (TLS). Workspaces expire after 7 days. No data leaves the EU. For GDPR- and BDSG-sensitive workloads, this matters.

TypeScript Code for Merging and Auto-Rotating PDFs

API reference: data/upload, pdf/validate, pdf/compose, pdf/orientation, pdf/to/img, img/thumbnail, data/download/pdf

import { MaraDocsClient } from "@maramia/maradocs-sdk-ts";
import { okPdf } from "@maramia/maradocs-sdk-ts/models/pdf";

// Assumes workspace_secret, pdf1File, and pdf2File are defined by the caller.
const client = new MaraDocsClient({ workspaceSecret: workspace_secret });

// Upload and validate both PDFs
const up1 = await client.data.upload(pdf1File);
const up2 = await client.data.upload(pdf2File);
const val1 = await client.pdf.validate({ unvalidated_file_handle: up1.unvalidated_file_handle });
const val2 = await client.pdf.validate({ unvalidated_file_handle: up2.unvalidated_file_handle });
const pdf1 = okPdf(val1);
const pdf2 = okPdf(val2);

// Merge PDFs, selecting specific pages
const composed = await client.pdf.compose({
  pdfs: [
    { pdf_handle: pdf1, pages: [{ page_number: 0 }, { page_number: 2 }] },
    { pdf_handle: pdf2 },
  ],
});

// Auto-detect and fix orientation
const oriented = await client.pdf.orientation({
  pdf_handle: composed.pdf_handle,
});

// Download result
const blob = await client.data.downloadPdf({ pdf_handle: oriented.rotated_pdf_handle });

// Optional: Thumbnail (render page to image, then thumbnail)
const imgResult = await client.pdf.toImg({
  pdf_handle: oriented.rotated_pdf_handle,
  pages: [0],
});
const thumb = await client.img.thumbnail({
  img_handle: imgResult.img_handles[0],
});

Python Code for PDF Merge and Orientation

API reference: data/upload, pdf/validate, pdf/compose, pdf/orientation, data/download/pdf

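import os
import time

import requests

API_URL = "https://api.maradocs.io/v1"
# Assumption: the workspace secret is supplied via an environment variable of this name.
WORKSPACE_SECRET = os.environ["MARADOCS_WORKSPACE_SECRET"]
headers = {"Authorization": f"Bearer {WORKSPACE_SECRET}"}

def poll(url, job_id):
    """Poll a job once per second until it reports status "complete"."""
    while True:
        resp = requests.get(f"{url}/{job_id}", headers=headers)
        resp.raise_for_status()  # surface HTTP errors instead of looping on them
        r = resp.json()
        if r["status"] == "complete":
            return r["response"]["response"]
        time.sleep(1)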

# 1. Upload and validate both PDFs
with open("doc1.pdf", "rb") as f1, open("doc2.pdf", "rb") as f2:
    up1 = requests.post(f"{API_URL}/data/upload", headers=headers, files={"file": f1}).json()
    up2 = requests.post(f"{API_URL}/data/upload", headers=headers, files={"file": f2}).json()
v1 = requests.post(f"{API_URL}/pdf/validate", headers=headers,
    json={"unvalidated_file_handle": up1["unvalidated_file_handle"]}).json()
v2 = requests.post(f"{API_URL}/pdf/validate", headers=headers,
    json={"unvalidated_file_handle": up2["unvalidated_file_handle"]}).json()
pdf1 = poll(f"{API_URL}/pdf/validate", v1["job_id"])["pdf_handle"]
pdf2 = poll(f"{API_URL}/pdf/validate", v2["job_id"])["pdf_handle"]

# 2. Compose (merge/split)
compose = requests.post(f"{API_URL}/pdf/compose", headers=headers,
    json={"pdfs": [{"pdf_handle": pdf1, "pages": [{"page_number": 0}, {"page_number": 2}]}, {"pdf_handle": pdf2}]}).json()
composed = poll(f"{API_URL}/pdf/compose", compose["job_id"])

# 3. Orientation, 4. Download
orient = requests.post(f"{API_URL}/pdf/orientation", headers=headers,
    json={"pdf_handle": composed["pdf_handle"]}).json()
oriented = poll(f"{API_URL}/pdf/orientation", orient["job_id"])
pdf_resp = requests.get(f"{API_URL}/data/download/pdf", headers=headers,
    params={"pdf_handle": oriented["rotated_pdf_handle"]})
pdf_resp.raise_for_status()  # avoid writing an error body to merged.pdf
with open("merged.pdf", "wb") as out:
    out.write(pdf_resp.content)
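The polling loop above waits indefinitely, so a job that never completes would hang the script. A hedged variant with a deadline and exponential backoff might look like the sketch below. `poll_until_complete` is a hypothetical helper; it takes a `fetch` callable so that it makes no assumptions about the HTTP layer, and it only relies on the `"status": "complete"` convention used in the code above.

```python
import time
from typing import Any, Callable, Dict


def poll_until_complete(
    fetch: Callable[[], Dict[str, Any]],
    timeout_s: float = 60.0,
    interval_s: float = 1.0,
    max_interval_s: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call `fetch` until it reports status "complete" or the deadline passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch()
        if job.get("status") == "complete":
            return job
        # Give up if the next wait would run past the deadline.
        if time.monotonic() + interval_s > deadline:
            raise TimeoutError(f"job not complete after {timeout_s} s")
        sleep(interval_s)
        interval_s = min(interval_s * 2, max_interval_s)  # back off, capped
```

With the script above, `fetch` could be `lambda: requests.get(f"{url}/{job_id}", headers=headers).json()`. Injecting `sleep` keeps the helper easy to test without real delays.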

Summary and Next Steps

A merge split PDF API with orientation correction and thumbnails is ready to use. MaraDocs handles validation, composition, orientation, and optimization in one workflow. See Document Scanner App, Auto-Rotation, PDF Compression, and Image on Blank Page for more.


Try it: MaraDocs API | TypeScript SDK

