pumpzera/pdf-word-replacer

PDF Word Replacer

Python CLI tool that replaces visible text in text-based PDF files.

What it does

  • Replaces one or more words or short phrases in a PDF
  • Saves the edited PDF as a new file
  • Works best on selectable-text PDFs
  • Keeps the original font, font size, color, and text baseline when the original PDF font can be safely reused
  • Falls back to the closest built-in PDF font family when a custom embedded font would otherwise render invisibly
  • Tries to keep nearby word spacing natural when a shorter replacement would otherwise leave an ugly gap

Super Simple Explanation

If you have a PDF and want to change a word inside it, this tool does that.

Example:

  • Your PDF says John
  • You want it to say Jane
  • This tool opens the PDF, finds John, and writes Jane in the same place

You choose:

  • which PDF to edit
  • where the new edited PDF will be saved
  • which word should change
  • what the new word should be

The original PDF is not changed unless you overwrite it on purpose.

Limitations

  • Scanned/image-only PDFs are not supported unless OCR text already exists
  • Complex layouts and text split across multiple drawing objects may still need manual review
  • This tool replaces visible text areas; it is not meant for legal or tamper-proof redaction
  • PDF line layout is not reflowed; longer replacements can overlap nearby content
  • No automatic font shrinking or stretching is applied, so the output keeps the original text sizing
  • Some embedded Type0 / Identity-H / subset fonts may be replaced with the closest readable built-in PDF font instead of the exact original font
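
The overlap risk from longer replacements can be estimated with simple arithmetic in the easiest case, a monospace Courier font, where every glyph advances 600/1000 of the font size. The sketch below is illustrative only: the function names and the 12 pt size are assumptions, not part of this tool, and proportional fonts would need real glyph metrics.

```python
# Rough overflow estimate for text set in standard Courier (illustrative only).
# Every Courier glyph has an advance width of 600 units per 1000-unit em.
COURIER_ADVANCE = 0.6


def courier_width(text: str, font_size: float) -> float:
    """Return the approximate rendered width of text, in points."""
    return len(text) * COURIER_ADVANCE * font_size


def overflow(old: str, new: str, font_size: float) -> float:
    """Return how many points the replacement extends past the original."""
    extra = courier_width(new, font_size) - courier_width(old, font_size)
    return max(0.0, extra)


if __name__ == "__main__":
    # Hypothetical case: "John" replaced by "Jonathan" at 12 pt.
    print(f"{overflow('John', 'Jonathan', 12):.1f} pt of extra width")
```

Because the layout is not reflowed, any positive result is space the replacement takes from whatever follows it on the line.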

Supported Font Strategy

This tool uses two font modes:

  • Direct reuse mode: if the PDF font can be safely reused, the tool keeps the original font style
  • Safe fallback mode: if the PDF uses a risky custom embedded font, the tool writes the replacement with the closest readable built-in PDF font

Directly reused when possible

  • Standard PDF fonts such as Helvetica, Times-Roman, and Courier
  • Many embedded TrueType fonts that PyMuPDF can reuse safely

Fallback font families

If the original font cannot be safely reused, the tool maps it to one of the standard PDF families below.

Sans-style fonts map to the Helvetica family:

  • Arial
  • Helvetica
  • Calibri
  • Segoe UI
  • Microsoft Sans Serif
  • Verdana
  • Tahoma
  • Trebuchet MS
  • Geneva
  • DejaVu Sans
  • Noto Sans
  • Bitstream Vera Sans
  • Frutiger
  • Univers
  • Gothic / KakuGo style names

Serif-style fonts map to the Times family:

  • Times New Roman
  • Georgia
  • Garamond
  • Cambria
  • Baskerville
  • Palatino
  • Bookman
  • DejaVu Serif
  • Noto Serif
  • Libertine
  • Mincho / Song style names

Monospace fonts map to the Courier family:

  • Courier New
  • Consolas
  • Monaco
  • Lucida Console
  • Menlo
  • DejaVu Sans Mono
  • Fira Code
  • Source Code Pro
  • JetBrains Mono
  • IBM Plex Mono

Bold and italic styles are preserved when the original font name exposes that information.
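
The family mapping above could be sketched roughly as follows. This is a hypothetical illustration, not the tool's actual code: the keyword lists are abbreviated from the tables above, the function name is invented, and treating unknown names as serif is an assumption.

```python
# Hypothetical sketch of a font-family fallback; not the tool's actual code.
import re

# Monospace is checked first because names such as "DejaVu Sans Mono" also
# contain "sans"; sans is checked before serif because
# "Microsoft Sans Serif" contains both words.
MONO_KEYS = ("courier", "consolas", "monaco", "console", "menlo", "mono", "code")
SANS_KEYS = ("arial", "helvetica", "calibri", "segoe", "sans", "verdana",
             "tahoma", "trebuchet", "geneva", "frutiger", "univers",
             "gothic", "kakugo")

# Standard PDF font names in the order: regular, bold, italic, bold italic.
STANDARD = {
    "sans": ("Helvetica", "Helvetica-Bold",
             "Helvetica-Oblique", "Helvetica-BoldOblique"),
    "serif": ("Times-Roman", "Times-Bold",
              "Times-Italic", "Times-BoldItalic"),
    "mono": ("Courier", "Courier-Bold",
             "Courier-Oblique", "Courier-BoldOblique"),
}


def fallback_font(font_name: str) -> str:
    """Map an arbitrary PDF font name to one of the standard PDF fonts."""
    # Drop a subset prefix such as "ABCDEF+" that embedded fonts often carry.
    name = re.sub(r"^[A-Z]{6}\+", "", font_name).lower()
    if any(key in name for key in MONO_KEYS):
        family = "mono"
    elif any(key in name for key in SANS_KEYS):
        family = "sans"
    else:
        family = "serif"  # assumption: unknown names are treated as serif
    bold = "bold" in name
    italic = "italic" in name or "oblique" in name
    return STANDARD[family][int(bold) + 2 * int(italic)]
```

Style detection here relies only on the words "bold", "italic", and "oblique" appearing in the font name, which mirrors the note that styles are preserved only when the name exposes them.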

Install

python -m pip install -e .

For tests:

python -m pip install -e ".[dev]"

Step By Step

1. Open PowerShell

Open Windows PowerShell.

2. Go to this project folder

cd "C:\path\to\pdf-word-replacer"

3. Install the tool

You only need to do this once:

python -m pip install -e .

4. Run the command

Basic format:

python -m pdf_word_replacer.cli "INPUT_PDF" "OUTPUT_PDF" --replace old=new

What each part means:

  • INPUT_PDF = the PDF you want to edit
  • OUTPUT_PDF = the new PDF that will be created
  • --replace old=new = change the word old into new

5. Real example

python -m pdf_word_replacer.cli "C:\Users\Bugra\Documents\contract.pdf" "C:\Users\Bugra\Documents\contract-edited.pdf" --replace John=Jane

This means:

  • open contract.pdf
  • find John
  • replace it with Jane
  • save the result as contract-edited.pdf

Quick Start Examples

Replace a single word:

python -m pdf_word_replacer.cli "input.pdf" "output.pdf" --replace old=new

Replace multiple words:

python -m pdf_word_replacer.cli "input.pdf" "output.pdf" --replace old=new --replace hello=world

Replace a whole phrase for cleaner layout (quote the argument because it contains a space):

python -m pdf_word_replacer.cli "input.pdf" "output.pdf" --replace "Meadow City=Garden City"

Use case-insensitive matching:

python -m pdf_word_replacer.cli "input.pdf" "output.pdf" --replace old=new --ignore-case
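
The difference between the default matching and --ignore-case can be illustrated on a plain string. This is only a sketch of the general idea using Python's re module; how the tool matches text internally is not documented here, and count_matches is a made-up helper.

```python
import re


def count_matches(text: str, word: str, ignore_case: bool = False) -> int:
    """Count literal occurrences of word in text."""
    flags = re.IGNORECASE if ignore_case else 0
    # re.escape keeps characters such as "." or "(" literal.
    return len(re.findall(re.escape(word), text, flags))


if __name__ == "__main__":
    sample = "Old house, old barn"
    print(count_matches(sample, "old"))
    print(count_matches(sample, "old", ignore_case=True))
```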

If You Want To Edit Your Own PDF

Use this template and just change the file paths and words:

python -m pdf_word_replacer.cli "C:\PATH\TO\YOUR\INPUT.pdf" "C:\PATH\TO\YOUR\OUTPUT.pdf" --replace oldword=newword

Example:

python -m pdf_word_replacer.cli "C:\Users\Bugra\Desktop\file.pdf" "C:\Users\Bugra\Desktop\file-edited.pdf" --replace Apple=Orange

What Happens After You Run It

  • The tool reads the input PDF
  • It searches for the word or phrase you gave
  • It replaces the text in the PDF
  • It saves a new PDF at the output path

If the command succeeds, you will see a message like this:

Saved: C:\path\to\output.pdf
Total replacements: 1
- John: 1
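
If you call the tool from a script, the summary can be turned into structured data. The sketch below assumes the output looks exactly like the sample above; the real format may differ, and parse_summary is a hypothetical helper, not something the tool provides.

```python
import re


def parse_summary(output: str) -> dict:
    """Parse the summary text into the saved path, total, and per-word counts."""
    # Normalize Windows line endings so "$" matches cleanly at each line end.
    output = output.replace("\r\n", "\n")
    saved = re.search(r"^Saved: (.+)$", output, re.MULTILINE)
    total = re.search(r"^Total replacements: (\d+)$", output, re.MULTILINE)
    counts = {
        word: int(n)
        for word, n in re.findall(r"^- (.+): (\d+)$", output, re.MULTILINE)
    }
    return {
        "saved": saved.group(1) if saved else None,
        "total": int(total.group(1)) if total else None,
        "counts": counts,
    }
```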

Common Mistakes

  • If Python says the file does not exist, your PDF path is wrong
  • If nothing changes, the PDF may be scanned or image-based, or the word may differ in letter case (try --ignore-case)
  • If the new word is much longer, it may visually collide with nearby text
  • If your path has spaces, keep it inside quotes like "C:\My Folder\file.pdf"
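
Quoting problems can also be avoided by launching the tool from Python with an argument list, since each list item is passed as one argument. This is a minimal sketch that uses only the command-line form documented in this README; build_command and run_replacer are hypothetical helper names.

```python
import subprocess
import sys


def build_command(input_pdf, output_pdf, replacements, ignore_case=False):
    """Build the argument list for the CLI documented in this README."""
    cmd = [sys.executable, "-m", "pdf_word_replacer.cli",
           str(input_pdf), str(output_pdf)]
    for source, target in replacements.items():
        # One --replace flag per SOURCE=TARGET pair.
        cmd += ["--replace", f"{source}={target}"]
    if ignore_case:
        cmd.append("--ignore-case")
    return cmd


def run_replacer(input_pdf, output_pdf, replacements, ignore_case=False):
    """Run the tool and raise CalledProcessError if it exits with an error."""
    cmd = build_command(input_pdf, output_pdf, replacements, ignore_case)
    return subprocess.run(cmd, check=True)
```

For example, run_replacer(r"C:\My Folder\file.pdf", r"C:\My Folder\file-edited.pdf", {"Meadow City": "Garden City"}) passes both paths and the phrase through unchanged, spaces included.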

Command Reference

python -m pdf_word_replacer.cli INPUT_PDF OUTPUT_PDF --replace SOURCE=TARGET [--replace SOURCE=TARGET ...] [--ignore-case]
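
A command line of this shape could be parsed with argparse roughly as shown below. This is a sketch under assumptions, not the tool's actual parser: the function names are invented, and splitting on the first "=" is a guess about how pairs containing "=" might be handled.

```python
import argparse


def parse_pair(value: str) -> tuple:
    """Split SOURCE=TARGET on the first "=" so the target may contain "="."""
    source, sep, target = value.partition("=")
    if not sep or not source:
        raise argparse.ArgumentTypeError(
            f"expected SOURCE=TARGET, got {value!r}")
    return source, target


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the command reference above."""
    parser = argparse.ArgumentParser(prog="pdf_word_replacer")
    parser.add_argument("input_pdf")
    parser.add_argument("output_pdf")
    # action="append" collects every --replace flag into one list.
    parser.add_argument("--replace", action="append", type=parse_pair,
                        required=True, metavar="SOURCE=TARGET")
    parser.add_argument("--ignore-case", action="store_true")
    return parser
```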

Notes

  • The original file is not modified unless you choose the same input and output path
  • The edited PDF is saved exactly where you tell the tool to save it
  • Exact visual results still depend on how the original PDF stores its text
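
If you script the tool and want to be sure the original is never overwritten by accident, a small check before running it can help. This is a hypothetical guard using pathlib, not a feature of the tool, and it compares resolved paths only, so it does not account for case-insensitive file systems.

```python
from pathlib import Path


def same_file(input_pdf: str, output_pdf: str) -> bool:
    """Return True when both paths resolve to the same location."""
    return Path(input_pdf).resolve() == Path(output_pdf).resolve()


if __name__ == "__main__":
    if same_file("contract.pdf", "./contract.pdf"):
        print("Output path equals input path; the original would be overwritten.")
```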

