# Code Library Documentation - CER Test Fixture

## Purpose

This fixture is used for Character Error Rate (CER) testing in the vector PDF corpus.

## Files

- `source.pdf` - Clean vector PDF with embedded text
- `ground_truth.txt` - Exact text content for CER comparison
- `README.md` - This file

## Content

libpdf - PDF Processing Library

Installation

pip install libpdf

Quick Example

from libpdf import Document
doc = Document('example.pdf')
text = doc.extract_text()

API Reference

Document.open(path) Open...

## Expected CER

Target: < 0.5% character error rate when extracted by pdftract.

## Metadata

- Title: Code Library Documentation
- Author: Open Source Contributors
- Creator: Markdown
- Generated by: generate_vector_cer_corpus.py
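
## Checking the CER target

The < 0.5% target under Expected CER can be checked with a sketch like the one below. It assumes CER is defined in the usual way, as the character-level Levenshtein distance between the extracted text and the reference, divided by the reference length. The function names (`levenshtein`, `cer`, `meets_target`) are illustrative and are not part of pdftract or of this corpus's tooling. Whether the real harness normalizes whitespace or Unicode before comparing is not stated here, so this sketch compares the strings as given.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Character-level edit distance; insert, delete and substitute each cost 1."""
    # Only the previous row of the dynamic-programming table is kept.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # drop a reference character
                curr[j - 1] + 1,         # accept an extra extracted character
                prev[j - 1] + (r != h),  # match (cost 0) or substitute (cost 1)
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate of hyp against ref, as a fraction (0.005 == 0.5%)."""
    if not ref:
        raise ValueError("reference text must not be empty")
    return levenshtein(ref, hyp) / len(ref)


def meets_target(ref: str, hyp: str, threshold: float = 0.005) -> bool:
    """True if the CER is strictly below the threshold (assumed 0.5% here)."""
    return cer(ref, hyp) < threshold
```

In practice, `ref` would be the contents of `ground_truth.txt` and `hyp` the text that pdftract produced for `source.pdf`, both read as UTF-8. How that output is obtained is left out because the pdftract invocation is not documented in this fixture.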