fix(pdftract-1j0f8): fix clap short flag conflict in conformance subcommand

The conformance subcommand had duplicate short options (-s) for both --suite and --sdk, causing the CLI reference generator to panic with "Short option names must be unique".

Changed the --sdk short option from -s to -k (matching the CI workflow convention). This allows the gen-cli-reference binary to run and the CI cli-ref-gen gate to function correctly.

Also regenerated the mdBook build output, including the new cli-reference.html.

Closes pdftract-1j0f8.
Verification: notes/pdftract-1j0f8.md.
parent ad29d9dadc
commit 3e3fff08e1
55 changed files with 2822 additions and 225 deletions
@@ -49,7 +49,7 @@ pub enum Commands {
 #[arg(short, long, default_value = "tests/sdk-conformance/cases.json")]
 suite: PathBuf,
 /// SDK name
-#[arg(short, long, default_value = "pdftract")]
+#[arg(short = 'k', long, default_value = "pdftract")]
 sdk: String,
 /// SDK version
 #[arg(short, long, default_value = "0.1.0")]
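The conflict fixed in the hunk above comes from how clap's derive macro handles a bare `short`: it takes the first character of the field name, so `suite` and `sdk` both end up as `-s`. The following is a minimal, std-only sketch of that kind of uniqueness check. It is not clap's implementation; the `Opt` struct and the `find_short_conflict`, `before_fix`, and `after_fix` helpers are hypothetical names used only for illustration, and the option lists are modeled on the four `conformance` options shown in the regenerated CLI reference.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for one CLI option: its long name and optional short flag.
struct Opt {
    long: &'static str,
    short: Option<char>,
}

/// Returns the first short flag used by two options, together with the long
/// names of the earlier and the later option, or `None` if all are unique.
fn find_short_conflict(opts: &[Opt]) -> Option<(char, &'static str, &'static str)> {
    let mut seen: HashMap<char, &'static str> = HashMap::new();
    for opt in opts {
        if let Some(c) = opt.short {
            // `insert` returns the previous value when the key was already present.
            if let Some(prev) = seen.insert(c, opt.long) {
                return Some((c, prev, opt.long));
            }
        }
    }
    None
}

/// Options as they were before the fix: a bare `short` on both `suite` and `sdk`.
fn before_fix() -> Vec<Opt> {
    vec![
        Opt { long: "suite", short: Some('s') },
        Opt { long: "sdk", short: Some('s') },
        Opt { long: "version", short: Some('v') },
        Opt { long: "output", short: Some('o') },
    ]
}

/// Options after the fix: `sdk` uses an explicit `short = 'k'`.
fn after_fix() -> Vec<Opt> {
    vec![
        Opt { long: "suite", short: Some('s') },
        Opt { long: "sdk", short: Some('k') },
        Opt { long: "version", short: Some('v') },
        Opt { long: "output", short: Some('o') },
    ]
}

fn main() {
    for (label, opts) in [("before", before_fix()), ("after", after_fix())] {
        match find_short_conflict(&opts) {
            Some((c, a, b)) => println!("{label}: -{c} is shared by --{a} and --{b}"),
            None => println!("{label}: all short flags are unique"),
        }
    }
}
```

A check along these lines could also be run as a unit test against the real command tree, so that a duplicate short flag fails in `cargo test` rather than when the reference generator runs in CI.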
@@ -36,10 +36,10 @@
 const path_to_root = "";
 const default_light_theme = "light";
 const default_dark_theme = "navy";
-window.path_to_searchindex_js = "searchindex-fc6d8bf8.js";
+window.path_to_searchindex_js = "searchindex-b0453933.js";
 </script>
 <!-- Start loading toc.js asap -->
-<script src="toc-d0f907c9.js"></script>
+<script src="toc-224e0484.js"></script>
 </head>
 <body>
 <div id="mdbook-help-container">
@@ -35,10 +35,10 @@
 const path_to_root = "../";
 const default_light_theme = "light";
 const default_dark_theme = "navy";
-window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
+window.path_to_searchindex_js = "../searchindex-b0453933.js";
 </script>
 <!-- Start loading toc.js asap -->
-<script src="../toc-d0f907c9.js"></script>
+<script src="../toc-224e0484.js"></script>
 </head>
 <body>
 <div id="mdbook-help-container">
@@ -35,10 +35,10 @@
 const path_to_root = "../";
 const default_light_theme = "light";
 const default_dark_theme = "navy";
-window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
+window.path_to_searchindex_js = "../searchindex-b0453933.js";
 </script>
 <!-- Start loading toc.js asap -->
-<script src="../toc-d0f907c9.js"></script>
+<script src="../toc-224e0484.js"></script>
 </head>
 <body>
 <div id="mdbook-help-container">
@@ -35,10 +35,10 @@
 const path_to_root = "../";
 const default_light_theme = "light";
 const default_dark_theme = "navy";
-window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
+window.path_to_searchindex_js = "../searchindex-b0453933.js";
 </script>
 <!-- Start loading toc.js asap -->
-<script src="../toc-d0f907c9.js"></script>
+<script src="../toc-224e0484.js"></script>
 </head>
 <body>
 <div id="mdbook-help-container">
@@ -35,10 +35,10 @@
 const path_to_root = "../";
 const default_light_theme = "light";
 const default_dark_theme = "navy";
-window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
+window.path_to_searchindex_js = "../searchindex-b0453933.js";
 </script>
 <!-- Start loading toc.js asap -->
-<script src="../toc-d0f907c9.js"></script>
+<script src="../toc-224e0484.js"></script>
 </head>
 <body>
 <div id="mdbook-help-container">
@@ -35,10 +35,10 @@
 const path_to_root = "../";
 const default_light_theme = "light";
 const default_dark_theme = "navy";
-window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
+window.path_to_searchindex_js = "../searchindex-b0453933.js";
 </script>
 <!-- Start loading toc.js asap -->
-<script src="../toc-d0f907c9.js"></script>
+<script src="../toc-224e0484.js"></script>
 </head>
 <body>
 <div id="mdbook-help-container">
@@ -194,7 +194,7 @@
 <span class=fa-svg><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 320 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M41.4 233.4c-12.5 12.5-12.5 32.8 0 45.3l160 160c12.5 12.5 32.8 12.5 45.3 0s12.5-32.8 0-45.3L109.3 256 246.6 118.6c12.5-12.5 12.5-32.8 0-45.3s-32.8-12.5-45.3 0l-160 160z"/></svg></span>
 </a>
 
-<a rel="next prefetch" href="../troubleshooting/index.html" class="mobile-nav-chapters next" title="Next chapter" aria-label="Next chapter" aria-keyshortcuts="Right">
+<a rel="next prefetch" href="../troubleshooting.html" class="mobile-nav-chapters next" title="Next chapter" aria-label="Next chapter" aria-keyshortcuts="Right">
 <span class=fa-svg><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 320 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M278.6 233.4c12.5 12.5 12.5 32.8 0 45.3l-160 160c-12.5 12.5-32.8 12.5-45.3 0s-12.5-32.8 0-45.3L210.7 256 73.4 118.6c-12.5-12.5-12.5-32.8 0-45.3s32.8-12.5 45.3 0l160 160z"/></svg></span>
 </a>
 
@@ -208,7 +208,7 @@
 <span class=fa-svg><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 320 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M41.4 233.4c-12.5 12.5-12.5 32.8 0 45.3l160 160c12.5 12.5 32.8 12.5 45.3 0s12.5-32.8 0-45.3L109.3 256 246.6 118.6c12.5-12.5 12.5-32.8 0-45.3s-32.8-12.5-45.3 0l-160 160z"/></svg></span>
 </a>
 
-<a rel="next prefetch" href="../troubleshooting/index.html" class="nav-chapters next" title="Next chapter" aria-label="Next chapter" aria-keyshortcuts="Right">
+<a rel="next prefetch" href="../troubleshooting.html" class="nav-chapters next" title="Next chapter" aria-label="Next chapter" aria-keyshortcuts="Right">
 <span class=fa-svg><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 320 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M278.6 233.4c12.5 12.5 12.5 32.8 0 45.3l-160 160c-12.5 12.5-32.8 12.5-45.3 0s-12.5-32.8 0-45.3L210.7 256 73.4 118.6c-12.5-12.5-12.5-32.8 0-45.3s32.8-12.5 45.3 0l160 160z"/></svg></span>
 </a>
 </nav>
@@ -35,10 +35,10 @@
 const path_to_root = "../";
 const default_light_theme = "light";
 const default_dark_theme = "navy";
-window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
+window.path_to_searchindex_js = "../searchindex-b0453933.js";
 </script>
 <!-- Start loading toc.js asap -->
-<script src="../toc-d0f907c9.js"></script>
+<script src="../toc-224e0484.js"></script>
 </head>
 <body>
 <div id="mdbook-help-container">
843
docs/user-docs/build/user-docs/cli-reference.html
Normal file
@@ -0,0 +1,843 @@
<!DOCTYPE HTML>
|
||||
<html lang="en" class="light sidebar-visible" dir="ltr">
|
||||
<head>
|
||||
<!-- Book generated using mdBook -->
|
||||
<meta charset="UTF-8">
|
||||
<title>CLI Reference - pdftract User Documentation</title>
|
||||
|
||||
|
||||
<!-- Custom HTML head -->
|
||||
|
||||
<meta name="description" content="">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1">
|
||||
<meta name="theme-color" content="#ffffff">
|
||||
|
||||
<link rel="icon" href="favicon-de23e50b.svg">
|
||||
<link rel="shortcut icon" href="favicon-8114d1fc.png">
|
||||
<link rel="stylesheet" href="css/variables-8adf115d.css">
|
||||
<link rel="stylesheet" href="css/general-2459343d.css">
|
||||
<link rel="stylesheet" href="css/chrome-ae938929.css">
|
||||
<link rel="stylesheet" href="css/print-9e4910d8.css" media="print">
|
||||
|
||||
<!-- Fonts -->
|
||||
<link rel="stylesheet" href="fonts/fonts-9644e21d.css">
|
||||
|
||||
<!-- Highlight.js Stylesheets -->
|
||||
<link rel="stylesheet" id="mdbook-highlight-css" href="highlight-493f70e1.css">
|
||||
<link rel="stylesheet" id="mdbook-tomorrow-night-css" href="tomorrow-night-4c0ae647.css">
|
||||
<link rel="stylesheet" id="mdbook-ayu-highlight-css" href="ayu-highlight-3fdfc3ac.css">
|
||||
|
||||
<!-- Custom theme stylesheets -->
|
||||
|
||||
|
||||
<!-- Provide site root and default themes to javascript -->
|
||||
<script>
|
||||
const path_to_root = "";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
<div id="mdbook-help-popup">
|
||||
<h2 class="mdbook-help-title">Keyboard shortcuts</h2>
|
||||
<div>
|
||||
<p>Press <kbd>←</kbd> or <kbd>→</kbd> to navigate between chapters</p>
|
||||
<p>Press <kbd>S</kbd> or <kbd>/</kbd> to search in the book</p>
|
||||
<p>Press <kbd>?</kbd> to show this help</p>
|
||||
<p>Press <kbd>Esc</kbd> to hide this help</p>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
<div id="mdbook-body-container">
|
||||
<!-- Work around some values being stored in localStorage wrapped in quotes -->
|
||||
<script>
|
||||
try {
|
||||
let theme = localStorage.getItem('mdbook-theme');
|
||||
let sidebar = localStorage.getItem('mdbook-sidebar');
|
||||
|
||||
if (theme.startsWith('"') && theme.endsWith('"')) {
|
||||
localStorage.setItem('mdbook-theme', theme.slice(1, theme.length - 1));
|
||||
}
|
||||
|
||||
if (sidebar.startsWith('"') && sidebar.endsWith('"')) {
|
||||
localStorage.setItem('mdbook-sidebar', sidebar.slice(1, sidebar.length - 1));
|
||||
}
|
||||
} catch (e) { }
|
||||
</script>
|
||||
|
||||
<!-- Set the theme before any content is loaded, prevents flash -->
|
||||
<script>
|
||||
const default_theme = window.matchMedia("(prefers-color-scheme: dark)").matches ? default_dark_theme : default_light_theme;
|
||||
let theme;
|
||||
try { theme = localStorage.getItem('mdbook-theme'); } catch(e) { }
|
||||
if (theme === null || theme === undefined) { theme = default_theme; }
|
||||
const html = document.documentElement;
|
||||
html.classList.remove('light')
|
||||
html.classList.add(theme);
|
||||
html.classList.add("js");
|
||||
</script>
|
||||
|
||||
<input type="checkbox" id="mdbook-sidebar-toggle-anchor" class="hidden">
|
||||
|
||||
<!-- Hide / unhide sidebar before it is displayed -->
|
||||
<script>
|
||||
let sidebar = null;
|
||||
const sidebar_toggle = document.getElementById("mdbook-sidebar-toggle-anchor");
|
||||
if (document.body.clientWidth >= 1080) {
|
||||
try { sidebar = localStorage.getItem('mdbook-sidebar'); } catch(e) { }
|
||||
sidebar = sidebar || 'visible';
|
||||
} else {
|
||||
sidebar = 'hidden';
|
||||
sidebar_toggle.checked = false;
|
||||
}
|
||||
if (sidebar === 'visible') {
|
||||
sidebar_toggle.checked = true;
|
||||
} else {
|
||||
html.classList.remove('sidebar-visible');
|
||||
}
|
||||
</script>
|
||||
|
||||
<nav id="mdbook-sidebar" class="sidebar" aria-label="Table of contents">
|
||||
<!-- populated by js -->
|
||||
<mdbook-sidebar-scrollbox class="sidebar-scrollbox"></mdbook-sidebar-scrollbox>
|
||||
<noscript>
|
||||
<iframe class="sidebar-iframe-outer" src="toc.html"></iframe>
|
||||
</noscript>
|
||||
<div id="mdbook-sidebar-resize-handle" class="sidebar-resize-handle">
|
||||
<div class="sidebar-resize-indicator"></div>
|
||||
</div>
|
||||
</nav>
|
||||
|
||||
<div id="mdbook-page-wrapper" class="page-wrapper">
|
||||
|
||||
<div class="page">
|
||||
<div id="mdbook-menu-bar-hover-placeholder"></div>
|
||||
<div id="mdbook-menu-bar" class="menu-bar sticky">
|
||||
<div class="left-buttons">
|
||||
<label id="mdbook-sidebar-toggle" class="icon-button" for="mdbook-sidebar-toggle-anchor" title="Toggle Table of Contents" aria-label="Toggle Table of Contents" aria-controls="mdbook-sidebar">
|
||||
<span class=fa-svg><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 448 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M0 96C0 78.3 14.3 64 32 64H416c17.7 0 32 14.3 32 32s-14.3 32-32 32H32C14.3 128 0 113.7 0 96zM0 256c0-17.7 14.3-32 32-32H416c17.7 0 32 14.3 32 32s-14.3 32-32 32H32c-17.7 0-32-14.3-32-32zM448 416c0 17.7-14.3 32-32 32H32c-17.7 0-32-14.3-32-32s14.3-32 32-32H416c17.7 0 32 14.3 32 32z"/></svg></span>
|
||||
</label>
|
||||
<button id="mdbook-theme-toggle" class="icon-button" type="button" title="Change theme" aria-label="Change theme" aria-haspopup="true" aria-expanded="false" aria-controls="mdbook-theme-list">
|
||||
<span class=fa-svg><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 576 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M371.3 367.1c27.3-3.9 51.9-19.4 67.2-42.9L600.2 74.1c12.6-19.5 9.4-45.3-7.6-61.2S549.7-4.4 531.1 9.6L294.4 187.2c-24 18-38.2 46.1-38.4 76.1L371.3 367.1zm-19.6 25.4l-116-104.4C175.9 290.3 128 339.6 128 400c0 3.9 .2 7.8 .6 11.6c1.8 17.5-10.2 36.4-27.8 36.4H96c-17.7 0-32 14.3-32 32s14.3 32 32 32H240c61.9 0 112-50.1 112-112c0-2.5-.1-5-.2-7.5z"/></svg></span>
|
||||
</button>
|
||||
<ul id="mdbook-theme-list" class="theme-popup" aria-label="Themes" role="menu">
|
||||
<li role="none"><button role="menuitem" class="theme" id="mdbook-theme-default_theme">Auto</button></li>
|
||||
<li role="none"><button role="menuitem" class="theme" id="mdbook-theme-light">Light</button></li>
|
||||
<li role="none"><button role="menuitem" class="theme" id="mdbook-theme-rust">Rust</button></li>
|
||||
<li role="none"><button role="menuitem" class="theme" id="mdbook-theme-coal">Coal</button></li>
|
||||
<li role="none"><button role="menuitem" class="theme" id="mdbook-theme-navy">Navy</button></li>
|
||||
<li role="none"><button role="menuitem" class="theme" id="mdbook-theme-ayu">Ayu</button></li>
|
||||
</ul>
|
||||
<button id="mdbook-search-toggle" class="icon-button" type="button" title="Search (`/`)" aria-label="Toggle Searchbar" aria-expanded="false" aria-keyshortcuts="/ s" aria-controls="mdbook-searchbar">
|
||||
<span class=fa-svg><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 512 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M416 208c0 45.9-14.9 88.3-40 122.7L502.6 457.4c12.5 12.5 12.5 32.8 0 45.3s-32.8 12.5-45.3 0L330.7 376c-34.4 25.2-76.8 40-122.7 40C93.1 416 0 322.9 0 208S93.1 0 208 0S416 93.1 416 208zM208 352c79.5 0 144-64.5 144-144s-64.5-144-144-144S64 128.5 64 208s64.5 144 144 144z"/></svg></span>
|
||||
</button>
|
||||
</div>
|
||||
|
||||
<h1 class="menu-title">pdftract User Documentation</h1>
|
||||
|
||||
<div class="right-buttons">
|
||||
<a href="print.html" title="Print this book" aria-label="Print this book">
|
||||
<span class=fa-svg id="print-button"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 512 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M128 0C92.7 0 64 28.7 64 64v96h64V64H354.7L384 93.3V160h64V93.3c0-17-6.7-33.3-18.7-45.3L400 18.7C388 6.7 371.7 0 354.7 0H128zM384 352v32 64H128V384 368 352H384zm64 32h32c17.7 0 32-14.3 32-32V256c0-35.3-28.7-64-64-64H64c-35.3 0-64 28.7-64 64v96c0 17.7 14.3 32 32 32H64v64c0 35.3 28.7 64 64 64H384c35.3 0 64-28.7 64-64V384zm-16-88c-13.3 0-24-10.7-24-24s10.7-24 24-24s24 10.7 24 24s-10.7 24-24 24z"/></svg></span>
|
||||
</a>
|
||||
<a href="https://github.com/jedarden/pdftract" title="Git repository" aria-label="Git repository">
|
||||
<span class=fa-svg><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 496 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M165.9 397.4c0 2-2.3 3.6-5.2 3.6-3.3.3-5.6-1.3-5.6-3.6 0-2 2.3-3.6 5.2-3.6 3-.3 5.6 1.3 5.6 3.6zm-31.1-4.5c-.7 2 1.3 4.3 4.3 4.9 2.6 1 5.6 0 6.2-2s-1.3-4.3-4.3-5.2c-2.6-.7-5.5.3-6.2 2.3zm44.2-1.7c-2.9.7-4.9 2.6-4.6 4.9.3 2 2.9 3.3 5.9 2.6 2.9-.7 4.9-2.6 4.6-4.6-.3-1.9-3-3.2-5.9-2.9zM244.8 8C106.1 8 0 113.3 0 252c0 110.9 69.8 205.8 169.5 239.2 12.8 2.3 17.3-5.6 17.3-12.1 0-6.2-.3-40.4-.3-61.4 0 0-70 15-84.7-29.8 0 0-11.4-29.1-27.8-36.6 0 0-22.9-15.7 1.6-15.4 0 0 24.9 2 38.6 25.8 21.9 38.6 58.6 27.5 72.9 20.9 2.3-16 8.8-27.1 16-33.7-55.9-6.2-112.3-14.3-112.3-110.5 0-27.5 7.6-41.3 23.6-58.9-2.6-6.5-11.1-33.3 2.6-67.9 20.9-6.5 69 27 69 27 20-5.6 41.5-8.5 62.8-8.5s42.8 2.9 62.8 8.5c0 0 48.1-33.6 69-27 13.7 34.7 5.2 61.4 2.6 67.9 16 17.7 25.8 31.5 25.8 58.9 0 96.5-58.9 104.2-114.8 110.5 9.2 7.9 17 22.9 17 46.4 0 33.7-.3 75.4-.3 83.6 0 6.5 4.6 14.4 17.3 12.1C428.2 457.8 496 362.9 496 252 496 113.3 383.5 8 244.8 8zM97.2 352.9c-1.3 1-1 3.3.7 5.2 1.6 1.6 3.9 2.3 5.2 1 1.3-1 1-3.3-.7-5.2-1.6-1.6-3.9-2.3-5.2-1zm-10.8-8.1c-.7 1.3.3 2.9 2.3 3.9 1.6 1 3.6.7 4.3-.7.7-1.3-.3-2.9-2.3-3.9-2-.6-3.6-.3-4.3.7zm32.4 35.6c-1.6 1.3-1 4.3 1.3 6.2 2.3 2.3 5.2 2.6 6.5 1 1.3-1.3.7-4.3-1.3-6.2-2.2-2.3-5.2-2.6-6.5-1zm-11.4-14.7c-1.6 1-1.6 3.6 0 5.9 1.6 2.3 4.3 3.3 5.6 2.3 1.6-1.3 1.6-3.9 0-6.2-1.4-2.3-4-3.3-5.6-2z"/></svg></span>
|
||||
</a>
|
||||
<a href="https://github.com/jedarden/pdftract/edit/main/docs/user-docs/src/src/cli-reference.md" title="Suggest an edit" aria-label="Suggest an edit" rel="edit">
|
||||
<span class=fa-svg id="git-edit-button"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 512 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M421.7 220.3l-11.3 11.3-22.6 22.6-205 205c-6.6 6.6-14.8 11.5-23.8 14.1L30.8 511c-8.4 2.5-17.5 .2-23.7-6.1S-1.5 489.7 1 481.2L38.7 353.1c2.6-9 7.5-17.2 14.1-23.8l205-205 22.6-22.6 11.3-11.3 33.9 33.9 62.1 62.1 33.9 33.9zM96 353.9l-9.3 9.3c-.9 .9-1.6 2.1-2 3.4l-25.3 86 86-25.3c1.3-.4 2.5-1.1 3.4-2l9.3-9.3H112c-8.8 0-16-7.2-16-16V353.9zM453.3 19.3l39.4 39.4c25 25 25 65.5 0 90.5l-14.5 14.5-22.6 22.6-11.3 11.3-33.9-33.9-62.1-62.1L314.3 67.7l11.3-11.3 22.6-22.6 14.5-14.5c25-25 65.5-25 90.5 0z"/></svg></span>
|
||||
</a>
|
||||
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div id="mdbook-search-wrapper" class="hidden">
|
||||
<form id="mdbook-searchbar-outer" class="searchbar-outer">
|
||||
<div class="search-wrapper">
|
||||
<input type="search" id="mdbook-searchbar" name="searchbar" placeholder="Search this book ..." aria-controls="mdbook-searchresults-outer" aria-describedby="searchresults-header">
|
||||
<div class="spinner-wrapper">
|
||||
<span class=fa-svg id="fa-spin"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 512 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M304 48c0-26.5-21.5-48-48-48s-48 21.5-48 48s21.5 48 48 48s48-21.5 48-48zm0 416c0-26.5-21.5-48-48-48s-48 21.5-48 48s21.5 48 48 48s48-21.5 48-48zM48 304c26.5 0 48-21.5 48-48s-21.5-48-48-48s-48 21.5-48 48s21.5 48 48 48zm464-48c0-26.5-21.5-48-48-48s-48 21.5-48 48s21.5 48 48 48s48-21.5 48-48zM142.9 437c18.7-18.7 18.7-49.1 0-67.9s-49.1-18.7-67.9 0s-18.7 49.1 0 67.9s49.1 18.7 67.9 0zm0-294.2c18.7-18.7 18.7-49.1 0-67.9S93.7 56.2 75 75s-18.7 49.1 0 67.9s49.1 18.7 67.9 0zM369.1 437c18.7 18.7 49.1 18.7 67.9 0s18.7-49.1 0-67.9s-49.1-18.7-67.9 0s-18.7 49.1 0 67.9z"/></svg></span>
|
||||
</div>
|
||||
</div>
|
||||
</form>
|
||||
<div id="mdbook-searchresults-outer" class="searchresults-outer hidden">
|
||||
<div id="mdbook-searchresults-header" class="searchresults-header"></div>
|
||||
<ul id="mdbook-searchresults">
|
||||
</ul>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Apply ARIA attributes after the sidebar and the sidebar toggle button are added to the DOM -->
|
||||
<script>
|
||||
document.getElementById('mdbook-sidebar-toggle').setAttribute('aria-expanded', sidebar === 'visible');
|
||||
document.getElementById('mdbook-sidebar').setAttribute('aria-hidden', sidebar !== 'visible');
|
||||
Array.from(document.querySelectorAll('#mdbook-sidebar a')).forEach(function(link) {
|
||||
link.setAttribute('tabIndex', sidebar === 'visible' ? 0 : -1);
|
||||
});
|
||||
</script>
|
||||
|
||||
<div id="mdbook-content" class="content">
|
||||
<main>
|
||||
<h1 id="cli-reference"><a class="header" href="#cli-reference">CLI Reference</a></h1>
|
||||
<blockquote>
|
||||
<p>This page is auto-generated from the clap command tree.
|
||||
Run <code>cargo run --bin gen-cli-reference</code> to regenerate.</p>
|
||||
</blockquote>
|
||||
<h1 id="command-line-help-for-pdftract"><a class="header" href="#command-line-help-for-pdftract">Command-Line Help for <code>pdftract</code></a></h1>
|
||||
<p>This document contains the help content for the <code>pdftract</code> command-line program.</p>
|
||||
<p><strong>Command Overview:</strong></p>
|
||||
<ul>
|
||||
<li><a href="#pdftract"><code>pdftract</code>↴</a></li>
|
||||
<li><a href="#pdftract-list-diagnostics"><code>pdftract list-diagnostics</code>↴</a></li>
|
||||
<li><a href="#pdftract-explain-diagnostic"><code>pdftract explain-diagnostic</code>↴</a></li>
|
||||
<li><a href="#pdftract-compare"><code>pdftract compare</code>↴</a></li>
|
||||
<li><a href="#pdftract-conformance"><code>pdftract conformance</code>↴</a></li>
|
||||
<li><a href="#pdftract-sdk"><code>pdftract sdk</code>↴</a></li>
|
||||
<li><a href="#pdftract-sdk-codegen"><code>pdftract sdk codegen</code>↴</a></li>
|
||||
<li><a href="#pdftract-sdk-validate"><code>pdftract sdk validate</code>↴</a></li>
|
||||
<li><a href="#pdftract-extract"><code>pdftract extract</code>↴</a></li>
|
||||
<li><a href="#pdftract-classify"><code>pdftract classify</code>↴</a></li>
|
||||
<li><a href="#pdftract-inspect"><code>pdftract inspect</code>↴</a></li>
|
||||
<li><a href="#pdftract-verify-receipt"><code>pdftract verify-receipt</code>↴</a></li>
|
||||
<li><a href="#pdftract-hash"><code>pdftract hash</code>↴</a></li>
|
||||
<li><a href="#pdftract-cache"><code>pdftract cache</code>↴</a></li>
|
||||
<li><a href="#pdftract-cache-stats"><code>pdftract cache stats</code>↴</a></li>
|
||||
<li><a href="#pdftract-cache-clear"><code>pdftract cache clear</code>↴</a></li>
|
||||
<li><a href="#pdftract-cache-purge"><code>pdftract cache purge</code>↴</a></li>
|
||||
<li><a href="#pdftract-profiles"><code>pdftract profiles</code>↴</a></li>
|
||||
<li><a href="#pdftract-profiles-list"><code>pdftract profiles list</code>↴</a></li>
|
||||
<li><a href="#pdftract-profiles-show"><code>pdftract profiles show</code>↴</a></li>
|
||||
<li><a href="#pdftract-profiles-export"><code>pdftract profiles export</code>↴</a></li>
|
||||
<li><a href="#pdftract-profiles-install"><code>pdftract profiles install</code>↴</a></li>
|
||||
<li><a href="#pdftract-profiles-validate"><code>pdftract profiles validate</code>↴</a></li>
|
||||
<li><a href="#pdftract-serve"><code>pdftract serve</code>↴</a></li>
|
||||
<li><a href="#pdftract-mcp"><code>pdftract mcp</code>↴</a></li>
|
||||
<li><a href="#pdftract-validate"><code>pdftract validate</code>↴</a></li>
|
||||
<li><a href="#pdftract-migrate-schema"><code>pdftract migrate-schema</code>↴</a></li>
|
||||
<li><a href="#pdftract-doctor"><code>pdftract doctor</code>↴</a></li>
|
||||
</ul>
|
||||
<h2 id="pdftract"><a class="header" href="#pdftract"><code>pdftract</code></a></h2>
|
||||
<p>pdftract CLI - PDF extraction and conformance testing</p>
|
||||
<p><strong>Usage:</strong> <code>pdftract <COMMAND></code></p>
|
||||
<h6 id="subcommands"><a class="header" href="#subcommands"><strong>Subcommands:</strong></a></h6>
|
||||
<ul>
|
||||
<li><code>list-diagnostics</code> — List all diagnostic codes with their metadata</li>
|
||||
<li><code>explain-diagnostic</code> — Explain a specific diagnostic code in detail</li>
|
||||
<li><code>compare</code> — Compare actual results against expected values with tolerances (for conformance testing)</li>
|
||||
<li><code>conformance</code> — Run SDK conformance test suite</li>
|
||||
<li><code>sdk</code> — SDK code generation commands</li>
|
||||
<li><code>extract</code> — Extract text and structure from a PDF file</li>
|
||||
<li><code>classify</code> — Classify document type (runs metadata + signal extraction, not full text extraction)</li>
|
||||
<li><code>inspect</code> — Inspect a PDF file in a local web browser with debugging overlays</li>
|
||||
<li><code>verify-receipt</code> — Verify a receipt against a PDF file</li>
|
||||
<li><code>hash</code> — Compute the PDF structural fingerprint (hash)</li>
|
||||
<li><code>cache</code> — Manage the extraction cache</li>
|
||||
<li><code>profiles</code> — Manage document type profiles</li>
|
||||
<li><code>serve</code> — Start the HTTP server for extraction</li>
|
||||
<li><code>mcp</code> — Start the MCP (Model Context Protocol) server</li>
|
||||
<li><code>validate</code> — Validate a JSON file against the pdftract schema</li>
|
||||
<li><code>migrate-schema</code> — Migrate JSON output between schema versions</li>
|
||||
<li><code>doctor</code> — Check environment health and dependencies</li>
|
||||
</ul>
|
||||
<h2 id="pdftract-list-diagnostics"><a class="header" href="#pdftract-list-diagnostics"><code>pdftract list-diagnostics</code></a></h2>
|
||||
<p>List all diagnostic codes with their metadata</p>
|
||||
<p><strong>Usage:</strong> <code>pdftract list-diagnostics</code></p>
|
||||
<h2 id="pdftract-explain-diagnostic"><a class="header" href="#pdftract-explain-diagnostic"><code>pdftract explain-diagnostic</code></a></h2>
|
||||
<p>Explain a specific diagnostic code in detail</p>
|
||||
<p><strong>Usage:</strong> <code>pdftract explain-diagnostic <CODE></code></p>
|
||||
<h6 id="arguments"><a class="header" href="#arguments"><strong>Arguments:</strong></a></h6>
|
||||
<ul>
|
||||
<li><code><CODE></code> — Diagnostic code to explain (e.g., STRUCT_MISSING_KEY, STREAM_BOMB)</li>
|
||||
</ul>
|
||||
<h2 id="pdftract-compare"><a class="header" href="#pdftract-compare"><code>pdftract compare</code></a></h2>
|
||||
<p>Compare actual results against expected values with tolerances (for conformance testing)</p>
|
||||
<p><strong>Usage:</strong> <code>pdftract compare [OPTIONS] <ACTUAL> <EXPECTED></code></p>
|
||||
<h6 id="arguments-1"><a class="header" href="#arguments-1"><strong>Arguments:</strong></a></h6>
|
||||
<ul>
|
||||
<li><code><ACTUAL></code> — Path to the actual results JSON</li>
|
||||
<li><code><EXPECTED></code> — Path to the expected results JSON</li>
|
||||
</ul>
|
||||
<h6 id="options"><a class="header" href="#options"><strong>Options:</strong></a></h6>
|
||||
<ul>
|
||||
<li>
|
||||
<p><code>-t</code>, <code>--tolerances <TOLERANCES></code> — Path to the tolerances JSON (optional)</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>-f</code>, <code>--format <FORMAT></code> — Output format (text, json)</p>
|
||||
<p>Default value: <code>text</code></p>
|
||||
</li>
|
||||
</ul>
|
||||
<h2 id="pdftract-conformance"><a class="header" href="#pdftract-conformance"><code>pdftract conformance</code></a></h2>
|
||||
<p>Run SDK conformance test suite</p>
|
||||
<p><strong>Usage:</strong> <code>pdftract conformance [OPTIONS]</code></p>
|
||||
<h6 id="options-1"><a class="header" href="#options-1"><strong>Options:</strong></a></h6>
|
||||
<ul>
|
||||
<li>
|
||||
<p><code>-s</code>, <code>--suite <SUITE></code> — Path to the conformance suite JSON</p>
|
||||
<p>Default value: <code>tests/sdk-conformance/cases.json</code></p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>-k</code>, <code>--sdk <SDK></code> — SDK name</p>
|
||||
<p>Default value: <code>pdftract</code></p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>-v</code>, <code>--version <VERSION></code> — SDK version</p>
|
||||
<p>Default value: <code>0.1.0</code></p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>-o</code>, <code>--output <OUTPUT></code> — Output report path</p>
|
||||
<p>Default value: <code>conformance-report.json</code></p>
|
||||
</li>
|
||||
</ul>
|
||||
<h2 id="pdftract-sdk"><a class="header" href="#pdftract-sdk"><code>pdftract sdk</code></a></h2>
|
||||
<p>SDK code generation commands</p>
|
||||
<p><strong>Usage:</strong> <code>pdftract sdk <COMMAND></code></p>
|
||||
<h6 id="subcommands-1"><a class="header" href="#subcommands-1"><strong>Subcommands:</strong></a></h6>
|
||||
<ul>
|
||||
<li><code>codegen</code> — Generate SDK skeleton from templates</li>
|
||||
<li><code>validate</code> — Validate existing SDK against current generator output</li>
|
||||
</ul>
|
||||
<h2 id="pdftract-sdk-codegen"><a class="header" href="#pdftract-sdk-codegen"><code>pdftract sdk codegen</code></a></h2>
|
||||
<p>Generate SDK skeleton from templates</p>
|
||||
<p><strong>Usage:</strong> <code>pdftract sdk codegen --lang <LANG> --out <OUT></code></p>
|
||||
<h6 id="options-2"><a class="header" href="#options-2"><strong>Options:</strong></a></h6>
|
||||
<ul>
|
||||
<li>
|
||||
<p><code>-l</code>, <code>--lang <LANG></code> — Target language</p>
|
||||
<p>Possible values: <code>python</code>, <code>rust</code>, <code>node</code>, <code>go</code>, <code>java</code>, <code>dotnet</code>, <code>ruby</code>, <code>php</code>, <code>swift</code></p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>-o</code>, <code>--out <OUT></code> — Output directory</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>-v</code>, <code>--version <VERSION></code> — Version string (defaults to current pdftract version)</p>
|
||||
<p>Default value: <code>0.1.0</code></p>
|
||||
</li>
|
||||
</ul>
|
||||
<h2 id="pdftract-sdk-validate"><a class="header" href="#pdftract-sdk-validate"><code>pdftract sdk validate</code></a></h2>
|
||||
<p>Validate existing SDK against current generator output</p>
|
||||
<p><strong>Usage:</strong> <code>pdftract sdk validate --lang <LANG> --sdk-dir <SDK_DIR></code></p>
|
||||
<h6 id="options-3"><a class="header" href="#options-3"><strong>Options:</strong></a></h6>
|
||||
<ul>
|
||||
<li>
|
||||
<p><code>-l</code>, <code>--lang <LANG></code> — Target language</p>
|
||||
<p>Possible values: <code>python</code>, <code>rust</code>, <code>node</code>, <code>go</code>, <code>java</code>, <code>dotnet</code>, <code>ruby</code>, <code>php</code>, <code>swift</code></p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>-s</code>, <code>--sdk-dir <SDK_DIR></code> — Path to existing SDK directory</p>
|
||||
</li>
|
||||
</ul>
|
||||
<h2 id="pdftract-extract"><a class="header" href="#pdftract-extract"><code>pdftract extract</code></a></h2>
|
||||
<p>Extract text and structure from a PDF file</p>
|
||||
<p><strong>Usage:</strong> <code>pdftract extract [OPTIONS] <INPUT></code></p>
|
||||
<h6 id="arguments-2"><a class="header" href="#arguments-2"><strong>Arguments:</strong></a></h6>
|
||||
<ul>
|
||||
<li><code><INPUT></code> — Path to the PDF file (use ‘-’ for stdin)</li>
|
||||
</ul>
|
||||
<h6 id="options-4"><a class="header" href="#options-4"><strong>Options:</strong></a></h6>
|
||||
<ul>
|
||||
<li>
|
||||
<p><code>--password-stdin</code> — Read password from stdin (one line, terminated by newline)</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--password <PASSWORD></code> — PDF password (INSECURE: rejected unless PDFTRACT_INSECURE_CLI_PASSWORD=1)</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--header <HEADER:VALUE></code> — Custom HTTP headers for remote sources (repeatable; format: HEADER:VALUE)</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--pages <RANGE></code> — Page range to extract (1-based, comma-separated: 1-5,7,12-)</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--json <PATH></code> — Output JSON to PATH (use ‘-’ for stdout)</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--md <PATH></code> — Output Markdown to PATH (use ‘-’ for stdout)</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--text <PATH></code> — Output plain text to PATH (use ‘-’ for stdout)</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--ndjson</code> — Output NDJSON to stdout (mutually exclusive with other formats)</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--format <FORMATS></code> — Output formats (comma-separated: json,markdown,text,ndjson)</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>-o</code>, <code>--output <BASE></code> — Base path for auto-named outputs (used with --format)</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--receipts <MODE></code> — Receipt mode: off (default), lite, or svg</p>
|
||||
<p>Default value: <code>off</code></p>
|
||||
<p>Possible values: <code>off</code>, <code>lite</code>, <code>svg</code></p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--ocr</code> — Enable OCR for scanned pages (requires ‘ocr’ feature)</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--ocr-language <OCR_LANGUAGE></code> — OCR language codes (comma-separated, e.g., ‘eng,fra,deu’)</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--cache-dir <DIR></code> — Enable cache at this directory (creates if absent)</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--cache-size <SIZE></code> — Set cache size limit (default 1 GiB; accepts KiB, MiB, GiB suffixes)</p>
|
||||
<p>Default value: <code>1 GiB</code></p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--no-cache</code> — Disable cache for this extraction (even if --cache-dir is set)</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--md-anchors</code> — Emit HTML comment anchors before each block in Markdown output</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--md-no-page-breaks</code> — Suppress page-break horizontal rules between pages</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--auto</code> — Auto-detect document type and apply appropriate profile</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--profile <NAME|PATH></code> — Force-apply a specific profile (by name or YAML file path)</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--include-headers</code> — Include header blocks in output</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--include-footers</code> — Include footer blocks in output</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--include-headers-footers</code> — Include both header and footer blocks in output</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--include-invisible-text</code> — Include invisible text spans in output (rendering_mode == 3)</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--include-hidden-layers</code> — Include hidden-layer text spans in output (OCG-controlled)</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--include-watermarks</code> — Include watermark blocks in output (no-op until Phase 7)</p>
|
||||
</li>
|
||||
</ul>
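<p>The <code>--pages</code> range syntax can be illustrated with a small shell helper. This is a hypothetical sketch, not part of pdftract: the name <code>expand_pages</code> is invented here, and it assumes that an open-ended range such as <code>12-</code> runs to the last page. It does no input validation.</p>
<pre><code class="language-bash"># Hypothetical helper (not part of pdftract): expand a --pages style spec
# into explicit 1-based page numbers. $1 is the spec, $2 the page count.
# Assumption: an open-ended range such as "12-" runs to the last page.
expand_pages() {
  local rest="$1," total="$2" part start end page out=""
  while [ -n "$rest" ]; do
    part="${rest%%,*}"   # text before the first comma
    rest="${rest#*,}"    # text after the first comma
    case "$part" in
      *-*) start="${part%%-*}"; end="${part#*-}" ;;
      *)   start="$part"; end="$part" ;;
    esac
    # An empty end means the range is open-ended.
    if [ -z "$end" ]; then end="$total"; fi
    page="$start"
    while [ "$page" -le "$end" ]; do
      out="$out$page "
      page=$((page + 1))
    done
  done
  echo "${out% }"
}

# Usage sketch for a 10-page document:
#   expand_pages "1-3,7,9-" 10
</code></pre>
<p>Under the same assumption, <code>pdftract extract --pages 1-3,7,9- --json - document.pdf</code> would select those pages and write JSON to stdout.</p>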
|
||||
<h2 id="pdftract-classify"><a class="header" href="#pdftract-classify"><code>pdftract classify</code></a></h2>
|
||||
<p>Classify document type (runs metadata + signal extraction, not full text extraction)</p>
|
||||
<p><strong>Usage:</strong> <code>pdftract classify [OPTIONS] <INPUT></code></p>
|
||||
<h6 id="arguments-3"><a class="header" href="#arguments-3"><strong>Arguments:</strong></a></h6>
|
||||
<ul>
|
||||
<li><code><INPUT></code> — Path to the PDF file</li>
|
||||
</ul>
|
||||
<h6 id="options-5"><a class="header" href="#options-5"><strong>Options:</strong></a></h6>
|
||||
<ul>
|
||||
<li>
|
||||
<p><code>--password-stdin</code> — Read password from stdin (one line, terminated by newline)</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--password <PASSWORD></code> — PDF password (INSECURE: rejected unless PDFTRACT_INSECURE_CLI_PASSWORD=1)</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--profiles <DIR></code> — Directory containing custom profile YAML files</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--pretty</code> — Pretty-print JSON output</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--top-k <TOP_K></code> — Number of top reasons to include (default: 0, which includes all)</p>
|
||||
<p>Default value: <code>0</code></p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--exit-on-unknown</code> — Exit with code 1 if document type is unknown</p>
|
||||
</li>
|
||||
</ul>
|
||||
<h2 id="pdftract-inspect"><a class="header" href="#pdftract-inspect"><code>pdftract inspect</code></a></h2>
|
||||
<p>Inspect a PDF file in a local web browser with debugging overlays</p>
|
||||
<p><strong>Usage:</strong> <code>pdftract inspect [OPTIONS] <FILE></code></p>
|
||||
<h6 id="arguments-4"><a class="header" href="#arguments-4"><strong>Arguments:</strong></a></h6>
|
||||
<ul>
|
||||
<li><code><FILE></code> — Path to the PDF file to inspect</li>
|
||||
</ul>
|
||||
<h6 id="options-6"><a class="header" href="#options-6"><strong>Options:</strong></a></h6>
|
||||
<ul>
|
||||
<li>
|
||||
<p><code>-p</code>, <code>--port <PORT></code> — Port to bind the inspector server (default: 7676)</p>
|
||||
<p>Default value: <code>7676</code></p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>-b</code>, <code>--bind <BIND></code> — Bind address for the inspector server (default: 127.0.0.1)</p>
|
||||
<p>Binding to a non-loopback address requires --auth-token for security.</p>
|
||||
<p>Default value: <code>127.0.0.1</code></p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--auth-token <AUTH_TOKEN></code> — Authentication token for non-loopback binds</p>
|
||||
<p>Required when --bind is not a loopback address (127.0.0.1 or ::1).</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--no-open</code> — Suppress automatic browser launch</p>
|
||||
<p>Useful for CI environments or when you want to manually open the browser.</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--compare <FILE></code> — Optional second PDF file for comparative debugging</p>
|
||||
<p>When provided, the inspector shows side-by-side comparison.</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--audit-log <FILE></code> — Write per-request audit log to FILE (NDJSON; use “-” for stdout, “/dev/stderr” for stderr)</p>
|
||||
<p>Rotation: pdftract does NOT rotate logs; configure logrotate on the audit-log file. When FILE is “-”, rotation is the responsibility of the supervisor (e.g., journald).</p>
|
||||
</li>
|
||||
</ul>
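<p>The loopback rule for <code>--bind</code> and <code>--auth-token</code> can be mirrored in a wrapper script before launching the inspector. The helper below is a hypothetical sketch: <code>needs_auth_token</code> is an invented name, and it only recognizes the two loopback addresses named in this reference. The real check inside pdftract may accept other loopback forms.</p>
<pre><code class="language-bash"># Hypothetical pre-flight check (not part of pdftract).
# Returns 0 (true) when the bind address would need --auth-token.
# Assumption: only 127.0.0.1 and ::1 count as loopback, as stated above.
needs_auth_token() {
  case "$1" in
    127.0.0.1|::1) return 1 ;;  # loopback: no token required
    *)             return 0 ;;  # anything else: token required
  esac
}

# Usage sketch (BIND and TOKEN are placeholders set by the caller):
#   if needs_auth_token "$BIND"; then
#     pdftract inspect --bind "$BIND" --auth-token "$TOKEN" --no-open file.pdf
#   else
#     pdftract inspect --bind "$BIND" --no-open file.pdf
#   fi
</code></pre>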
|
||||
<h2 id="pdftract-verify-receipt"><a class="header" href="#pdftract-verify-receipt"><code>pdftract verify-receipt</code></a></h2>
|
||||
<p>Verify a receipt against a PDF file</p>
|
||||
<p><strong>Usage:</strong> <code>pdftract verify-receipt [OPTIONS] <FILE.pdf> <RECEIPT.json></code></p>
|
||||
<h6 id="arguments-5"><a class="header" href="#arguments-5"><strong>Arguments:</strong></a></h6>
|
||||
<ul>
|
||||
<li><code><FILE.pdf></code> — Path to the PDF file to verify against</li>
|
||||
<li><code><RECEIPT.json></code> — Path to the receipt JSON file, or “-” for stdin</li>
|
||||
</ul>
|
||||
<h6 id="options-7"><a class="header" href="#options-7"><strong>Options:</strong></a></h6>
|
||||
<ul>
|
||||
<li><code>--stdin</code> — Read receipt from stdin (alternative to “-”)</li>
|
||||
<li><code>--inline <INLINE></code> — Receipt JSON as inline string (alternative to file path)</li>
|
||||
<li><code>--json</code> — Output machine-readable JSON result</li>
|
||||
<li><code>--quiet</code> — Suppress human-readable output (exit code only)</li>
|
||||
<li><code>--password <PASSWORD></code> — PDF password (INSECURE: rejected unless PDFTRACT_INSECURE_CLI_PASSWORD=1)</li>
|
||||
<li><code>--password-stdin</code> — Read password from stdin (one line, terminated by newline)</li>
|
||||
</ul>
|
||||
<h2 id="pdftract-hash"><a class="header" href="#pdftract-hash"><code>pdftract hash</code></a></h2>
|
||||
<p>Compute the PDF structural fingerprint (hash)</p>
|
||||
<p><strong>Usage:</strong> <code>pdftract hash [OPTIONS] <INPUT></code></p>
|
||||
<h6 id="arguments-6"><a class="header" href="#arguments-6"><strong>Arguments:</strong></a></h6>
|
||||
<ul>
|
||||
<li><code><INPUT></code> — Path to the PDF file or URL</li>
|
||||
</ul>
|
||||
<h6 id="options-8"><a class="header" href="#options-8"><strong>Options:</strong></a></h6>
|
||||
<ul>
|
||||
<li><code>--password <PASSWORD></code> — PDF password (INSECURE: rejected unless PDFTRACT_INSECURE_CLI_PASSWORD=1)</li>
|
||||
<li><code>--header <HEADER:VALUE></code> — Custom HTTP headers for remote sources (repeatable; format: HEADER:VALUE)</li>
|
||||
</ul>
|
||||
<h2 id="pdftract-cache"><a class="header" href="#pdftract-cache"><code>pdftract cache</code></a></h2>
|
||||
<p>Manage the extraction cache</p>
|
||||
<p><strong>Usage:</strong> <code>pdftract cache <COMMAND></code></p>
|
||||
<h6 id="subcommands-2"><a class="header" href="#subcommands-2"><strong>Subcommands:</strong></a></h6>
|
||||
<ul>
|
||||
<li><code>stats</code> — Show cache statistics</li>
|
||||
<li><code>clear</code> — Clear all cache entries (preserves index.json and sentinel)</li>
|
||||
<li><code>purge</code> — Purge old cache entries</li>
|
||||
</ul>
|
||||
<h2 id="pdftract-cache-stats"><a class="header" href="#pdftract-cache-stats"><code>pdftract cache stats</code></a></h2>
|
||||
<p>Show cache statistics</p>
|
||||
<p><strong>Usage:</strong> <code>pdftract cache stats [OPTIONS] <DIR></code></p>
|
||||
<h6 id="arguments-7"><a class="header" href="#arguments-7"><strong>Arguments:</strong></a></h6>
|
||||
<ul>
|
||||
<li><code><DIR></code> — Path to the cache directory</li>
|
||||
</ul>
|
||||
<h6 id="options-9"><a class="header" href="#options-9"><strong>Options:</strong></a></h6>
|
||||
<ul>
|
||||
<li><code>--json</code> — Output in JSON format</li>
|
||||
</ul>
|
||||
<h2 id="pdftract-cache-clear"><a class="header" href="#pdftract-cache-clear"><code>pdftract cache clear</code></a></h2>
|
||||
<p>Clear all cache entries (preserves index.json and sentinel)</p>
|
||||
<p><strong>Usage:</strong> <code>pdftract cache clear [OPTIONS] <DIR></code></p>
|
||||
<h6 id="arguments-8"><a class="header" href="#arguments-8"><strong>Arguments:</strong></a></h6>
|
||||
<ul>
|
||||
<li><code><DIR></code> — Path to the cache directory</li>
|
||||
</ul>
|
||||
<h6 id="options-10"><a class="header" href="#options-10"><strong>Options:</strong></a></h6>
|
||||
<ul>
|
||||
<li><code>-y</code>, <code>--yes</code> — Skip confirmation prompt</li>
|
||||
</ul>
|
||||
<h2 id="pdftract-cache-purge"><a class="header" href="#pdftract-cache-purge"><code>pdftract cache purge</code></a></h2>
|
||||
<p>Purge old cache entries</p>
|
||||
<p><strong>Usage:</strong> <code>pdftract cache purge [OPTIONS] <DIR></code></p>
|
||||
<h6 id="arguments-9"><a class="header" href="#arguments-9"><strong>Arguments:</strong></a></h6>
|
||||
<ul>
|
||||
<li><code><DIR></code> — Path to the cache directory</li>
|
||||
</ul>
|
||||
<h6 id="options-11"><a class="header" href="#options-11"><strong>Options:</strong></a></h6>
|
||||
<ul>
|
||||
<li><code>--older-than <DURATION></code> — Delete entries older than this duration (e.g., “30d”, “7d”, “1h”)</li>
|
||||
<li><code>--version <CONSTRAINT></code> — Delete entries matching this version constraint (e.g., “<1.0.0”)</li>
|
||||
</ul>
|
||||
<h2 id="pdftract-profiles"><a class="header" href="#pdftract-profiles"><code>pdftract profiles</code></a></h2>
|
||||
<p>Manage document type profiles</p>
|
||||
<p><strong>Usage:</strong> <code>pdftract profiles <COMMAND></code></p>
|
||||
<h6 id="subcommands-3"><a class="header" href="#subcommands-3"><strong>Subcommands:</strong></a></h6>
|
||||
<ul>
|
||||
<li><code>list</code> — List all available profiles</li>
|
||||
<li><code>show</code> — Show a profile’s YAML content</li>
|
||||
<li><code>export</code> — Export a built-in profile to stdout</li>
|
||||
<li><code>install</code> — Install a profile to the user config directory</li>
|
||||
<li><code>validate</code> — Validate a profile file</li>
|
||||
</ul>
|
||||
<h2 id="pdftract-profiles-list"><a class="header" href="#pdftract-profiles-list"><code>pdftract profiles list</code></a></h2>
|
||||
<p>List all available profiles</p>
|
||||
<p><strong>Usage:</strong> <code>pdftract profiles list</code></p>
|
||||
<h2 id="pdftract-profiles-show"><a class="header" href="#pdftract-profiles-show"><code>pdftract profiles show</code></a></h2>
|
||||
<p>Show a profile’s YAML content</p>
|
||||
<p><strong>Usage:</strong> <code>pdftract profiles show <NAME_OR_PATH></code></p>
|
||||
<h6 id="arguments-10"><a class="header" href="#arguments-10"><strong>Arguments:</strong></a></h6>
|
||||
<ul>
|
||||
<li><code><NAME_OR_PATH></code> — Profile name or path to YAML file</li>
|
||||
</ul>
|
||||
<h2 id="pdftract-profiles-export"><a class="header" href="#pdftract-profiles-export"><code>pdftract profiles export</code></a></h2>
|
||||
<p>Export a built-in profile to stdout</p>
|
||||
<p><strong>Usage:</strong> <code>pdftract profiles export <NAME></code></p>
|
||||
<h6 id="arguments-11"><a class="header" href="#arguments-11"><strong>Arguments:</strong></a></h6>
|
||||
<ul>
|
||||
<li><code><NAME></code> — Name of the built-in profile to export</li>
|
||||
</ul>
|
||||
<h2 id="pdftract-profiles-install"><a class="header" href="#pdftract-profiles-install"><code>pdftract profiles install</code></a></h2>
|
||||
<p>Install a profile to the user config directory</p>
|
||||
<p><strong>Usage:</strong> <code>pdftract profiles install <PATH></code></p>
|
||||
<h6 id="arguments-12"><a class="header" href="#arguments-12"><strong>Arguments:</strong></a></h6>
|
||||
<ul>
|
||||
<li><code><PATH></code> — Path to the profile YAML file to install</li>
|
||||
</ul>
|
||||
<h2 id="pdftract-profiles-validate"><a class="header" href="#pdftract-profiles-validate"><code>pdftract profiles validate</code></a></h2>
|
||||
<p>Validate a profile file</p>
|
||||
<p><strong>Usage:</strong> <code>pdftract profiles validate <PATH></code></p>
|
||||
<h6 id="arguments-13"><a class="header" href="#arguments-13"><strong>Arguments:</strong></a></h6>
|
||||
<ul>
|
||||
<li><code><PATH></code> — Path to the profile YAML file to validate</li>
|
||||
</ul>
|
||||
<h2 id="pdftract-serve"><a class="header" href="#pdftract-serve"><code>pdftract serve</code></a></h2>
|
||||
<p>Start the HTTP server for extraction</p>
|
||||
<h2 id="security-model"><a class="header" href="#security-model">Security Model</a></h2>
|
||||
<p><strong>pdftract serve has no built-in authentication.</strong> Deploy behind a reverse proxy (nginx, Traefik, Caddy) for production use. The server accepts PDFs via multipart upload only; no endpoint accepts file paths from the server filesystem.</p>
|
||||
<h2 id="concurrency"><a class="header" href="#concurrency">Concurrency</a></h2>
|
||||
<p>The server uses a two-level concurrency architecture:</p>
|
||||
<ul>
|
||||
<li><strong>tokio</strong>: Per-request concurrency via the async executor. Each HTTP request is handled asynchronously on tokio’s multi-threaded runtime.</li>
<li><strong>rayon</strong>: Per-document parallelism within each extraction. PDF pages are processed in parallel using rayon’s work-stealing thread pool.</li>
|
||||
</ul>
|
||||
<p>The bridge between async (tokio) and sync (rayon) is <code>tokio::task::spawn_blocking</code>. Each POST handler wraps the synchronous extraction call in <code>spawn_blocking</code>, which runs the work on tokio’s blocking thread pool (separate from the async reactor).</p>
|
||||
<p>This design ensures:</p>
<ul>
<li>The async reactor is never blocked by extraction work</li>
<li>Multiple PDFs can be extracted concurrently (one per request)</li>
<li>Within each PDF, pages are processed in parallel (rayon)</li>
<li>Thread pools are sized appropriately (tokio: 512 blocking threads; rayon: num_cpus)</li>
</ul>
|
||||
<h2 id="endpoints"><a class="header" href="#endpoints">Endpoints</a></h2>
|
||||
<ul>
|
||||
<li><code>POST /extract</code> - Extract PDF and return JSON with metadata</li>
<li><code>POST /extract/text</code> - Extract PDF and return plain text</li>
<li><code>POST /extract/stream</code> - Extract PDF and return streaming NDJSON</li>
<li><code>GET /health</code> - Health check (responds within 100ms even during concurrent extractions)</li>
|
||||
</ul>
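<p>A client script might resolve these endpoints as sketched below. The helper name <code>extract_url</code> is hypothetical, and the multipart field name <code>file</code> in the usage comment is an assumption: this reference does not state which field name the server expects.</p>
<pre><code class="language-bash"># Hypothetical helper (not part of pdftract): build the URL for one of
# the endpoints listed above. $1 is the base URL (defaults to the
# documented default bind address), $2 is the endpoint kind.
extract_url() {
  local base="${1:-http://127.0.0.1:8080}" kind="${2:-json}"
  case "$kind" in
    json)   echo "$base/extract" ;;
    text)   echo "$base/extract/text" ;;
    stream) echo "$base/extract/stream" ;;
    health) echo "$base/health" ;;
    *)      return 2 ;;
  esac
}

# Usage sketch; the field name "file" is an assumption:
#   curl -sS -F "file=@document.pdf" "$(extract_url "" text)"
#   curl -sS "$(extract_url "" health)"
</code></pre>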
|
||||
<h2 id="cache"><a class="header" href="#cache">Cache</a></h2>
|
||||
<p>Cache is optional. When enabled, extracted results are stored on disk and reused for identical PDFs. Cache status is reported via the <code>X-Pdftract-Cache</code> response header.</p>
|
||||
<p><strong>Usage:</strong> <code>pdftract serve [OPTIONS]</code></p>
|
||||
<h6 id="options-12"><a class="header" href="#options-12"><strong>Options:</strong></a></h6>
|
||||
<ul>
|
||||
<li>
|
||||
<p><code>-b</code>, <code>--bind <BIND></code> — Bind address (e.g., “127.0.0.1:8080”, “[::1]:9000”, “0.0.0.0:3000”)</p>
|
||||
<p>Default value: <code>127.0.0.1:8080</code></p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--cache-dir <DIR></code> — Enable cache at this directory</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--cache-size <SIZE></code> — Set cache size limit (default 1 GiB; accepts KiB, MiB, GiB suffixes)</p>
|
||||
<p>Default value: <code>1 GiB</code></p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--no-cache</code> — Disable cache</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--max-upload-mb <MAX_UPLOAD_MB></code> — Maximum request body size in MB (default: 256, max: 4096)</p>
|
||||
<p>Default value: <code>256</code></p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--max-decompress-gb <GB></code> — Maximum decompression size in GB (default: 1, overrides per-request max_decompress_gb)</p>
|
||||
<p>Default value: <code>1</code></p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--audit-log <FILE></code> — Write per-request audit log to FILE (NDJSON; use “-” for stdout, “/dev/stderr” for stderr)</p>
|
||||
<p>Rotation: pdftract does NOT rotate logs; configure logrotate on the audit-log file. When FILE is “-”, rotation is the responsibility of the supervisor (e.g., journald).</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--trust-forwarded-for</code> — Trust X-Forwarded-For header for client IP detection (DANGER: enables IP spoofing if not behind a trusted proxy)</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--profile-dir <DIR></code> — Directory containing custom profile YAML files (repeatable)</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--profile-hot-reload</code> — Enable hot-reload for profiles (re-read directory on every request)</p>
|
||||
</li>
|
||||
</ul>
|
||||
<h2 id="pdftract-mcp"><a class="header" href="#pdftract-mcp"><code>pdftract mcp</code></a></h2>
|
||||
<p>Start the MCP (Model Context Protocol) server</p>
|
||||
<p>Per ADR-006: stdio and HTTP transports are mutually exclusive because they have opposite stdout discipline (stdio: JSON-RPC sink; HTTP: log channel). Exactly one transport must be selected per invocation.</p>
|
||||
<p><strong>Usage:</strong> <code>pdftract mcp [OPTIONS]</code></p>
|
||||
<h6 id="options-13"><a class="header" href="#options-13"><strong>Options:</strong></a></h6>
|
||||
<ul>
|
||||
<li>
|
||||
<p><code>--stdio</code> — Use stdio transport (for Claude Desktop, Claude Code, Continue, Cursor)</p>
|
||||
<p>This is the default transport mode if neither --stdio nor --bind is specified.</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>-b</code>, <code>--bind <ADDR></code> — Bind address for the MCP server (e.g., “127.0.0.1:8080”, “[::1]:9000”, “0.0.0.0:3000”)</p>
|
||||
<p>Enables HTTP+SSE transport mode. Mutually exclusive with --stdio.</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--auth-token-file <AUTH_TOKEN_FILE></code> — Path to a file containing the bearer token (RECOMMENDED)</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--auth-token <AUTH_TOKEN></code> — Bearer token for authentication (INSECURE: rejected unless PDFTRACT_INSECURE_CLI_TOKEN=1)</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--max-upload-mb <MAX_UPLOAD_MB></code> — Maximum request body size in MB (default: 256)</p>
|
||||
<p>Default value: <code>256</code></p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--root <DIR></code> — Root directory for local filesystem access (enforces path-traversal protection)</p>
|
||||
<p>When set, all local-path tool arguments are resolved relative to DIR and any path that escapes DIR is rejected with JSON-RPC error code -32602. HTTPS URLs are not affected by this flag. Without --root, the server runs in trust-the-caller mode (no path-check applied).</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--audit-log <FILE></code> — Write per-request audit log to FILE (NDJSON; use “-” for stdout, “/dev/stderr” for stderr)</p>
|
||||
<p>Rotation: pdftract does NOT rotate logs; configure logrotate on the audit-log file. When FILE is “-”, rotation is the responsibility of the supervisor (e.g., journald).</p>
|
||||
</li>
|
||||
</ul>
|
||||
<h2 id="pdftract-validate"><a class="header" href="#pdftract-validate"><code>pdftract validate</code></a></h2>
|
||||
<p>Validate a JSON file against the pdftract schema</p>
|
||||
<p><strong>Usage:</strong> <code>pdftract validate [OPTIONS] <FILE></code></p>
|
||||
<h6 id="arguments-14"><a class="header" href="#arguments-14"><strong>Arguments:</strong></a></h6>
|
||||
<ul>
|
||||
<li><code><FILE></code> — Path to the JSON file to validate (use ‘-’ for stdin)</li>
|
||||
</ul>
|
||||
<h6 id="options-14"><a class="header" href="#options-14"><strong>Options:</strong></a></h6>
|
||||
<ul>
|
||||
<li><code>-s</code>, <code>--schema <PATH></code> — Path to a custom schema file (default: bundled v1.0 schema)</li>
|
||||
<li><code>-q</code>, <code>--quiet</code> — Quiet mode - suppress error output (only exit code matters)</li>
|
||||
</ul>
|
||||
<h2 id="pdftract-migrate-schema"><a class="header" href="#pdftract-migrate-schema"><code>pdftract migrate-schema</code></a></h2>
|
||||
<p>Migrate JSON output between schema versions</p>
|
||||
<p><strong>Usage:</strong> <code>pdftract migrate-schema [OPTIONS] --from <FROM> --to <TO> [INPUT]</code></p>
|
||||
<h6 id="arguments-15"><a class="header" href="#arguments-15"><strong>Arguments:</strong></a></h6>
|
||||
<ul>
|
||||
<li>
|
||||
<p><code><INPUT></code> — Input JSON file (use ‘-’ for stdin)</p>
|
||||
<p>Default value: <code>-</code></p>
|
||||
</li>
|
||||
</ul>
|
||||
<h6 id="options-15"><a class="header" href="#options-15"><strong>Options:</strong></a></h6>
|
||||
<ul>
|
||||
<li>
|
||||
<p><code>--from <FROM></code> — Source schema version (e.g., “1.0”, “1.1”)</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--to <TO></code> — Target schema version (e.g., “1.0”, “1.1”)</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>-o</code>, <code>--output <OUTPUT></code> — Output JSON file (use ‘-’ for stdout)</p>
|
||||
<p>Default value: <code>-</code></p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>-p</code>, <code>--pretty</code> — Pretty-print output JSON</p>
|
||||
</li>
|
||||
</ul>
|
||||
<h2 id="pdftract-doctor"><a class="header" href="#pdftract-doctor"><code>pdftract doctor</code></a></h2>
|
||||
<p>Check environment health and dependencies</p>
|
||||
<p>Exit code policy: exits 0 if no checks FAIL (WARN does not affect exit code); exits 1 if any check FAILs; exits 2 on argument parse errors.</p>
|
||||
<p><strong>Usage:</strong> <code>pdftract doctor [OPTIONS]</code></p>
|
||||
<h6 id="options-16"><a class="header" href="#options-16"><strong>Options:</strong></a></h6>
|
||||
<ul>
|
||||
<li>
|
||||
<p><code>--features</code> — Print compiled features and exit</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--json</code> — Output results as JSON</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--no-color</code> — Disable colored output</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--exit-on-fail</code> — Explicit form of the default policy (exit 1 if any check FAILs).</p>
|
||||
<p>This flag is the default behavior and is provided for CI script readability. WARN does not affect exit code regardless of this flag.</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--profile-dir <DIR></code> — Verify the profile search path includes DIR</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--cache-dir <DIR></code> — Verify DIR is writable and has sufficient space</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><code>--lang <LANG></code> — Requested OCR languages (default: eng)</p>
|
||||
</li>
|
||||
</ul>
|
||||
<hr />
|
||||
<p><small><i>
|
||||
This document was generated automatically by
|
||||
<a href="https://crates.io/crates/clap-markdown"><code>clap-markdown</code></a>.
|
||||
</i></small></p>
|
||||
<!-- AUTOGEN END -->
|
||||
<h2 id="hand-curated-content"><a class="header" href="#hand-curated-content">Hand-Curated Content</a></h2>
|
||||
<blockquote>
|
||||
<p><strong>Note:</strong> Any content added after this marker will be preserved
|
||||
when the CLI reference is regenerated. This section is for
|
||||
additional context that doesn’t fit in the auto-generated sections.</p>
|
||||
</blockquote>
|
||||
<h3 id="common-patterns"><a class="header" href="#common-patterns">Common Patterns</a></h3>
|
||||
<h4 id="basic-extraction"><a class="header" href="#basic-extraction">Basic Extraction</a></h4>
|
||||
<pre><code class="language-bash">pdftract extract document.pdf
|
||||
</code></pre>
|
||||
<h4 id="json-output"><a class="header" href="#json-output">JSON Output</a></h4>
|
||||
<pre><code class="language-bash">pdftract extract --json output.json document.pdf
|
||||
</code></pre>
|
||||
<h4 id="markdown-with-anchors"><a class="header" href="#markdown-with-anchors">Markdown with Anchors</a></h4>
|
||||
<pre><code class="language-bash">pdftract extract --md-anchors --md output.md document.pdf
|
||||
</code></pre>
|
||||
<h3 id="exit-codes"><a class="header" href="#exit-codes">Exit Codes</a></h3>
|
||||
<ul>
|
||||
<li><code>0</code>: Success</li>
|
||||
<li><code>1</code>: General error (extraction failed, file not found, etc.)</li>
|
||||
<li><code>2</code>: Usage error (invalid arguments, conflicting flags)</li>
|
||||
<li><code>3</code>: Decryption error (wrong or missing password)</li>
|
||||
</ul>
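<p>Scripts that wrap pdftract can branch on these codes. The helper below is a minimal sketch: <code>pdftract_exit_label</code> is a hypothetical name, and it covers only the four codes listed above.</p>
<pre><code class="language-bash"># Hypothetical helper (not part of pdftract): map an exit code from the
# list above to a short label for log messages.
pdftract_exit_label() {
  case "$1" in
    0) echo "success" ;;
    1) echo "general error" ;;
    2) echo "usage error" ;;
    3) echo "decryption error" ;;
    *) echo "unknown exit code $1" ;;
  esac
}

# Usage sketch (assumes pdftract is on PATH and document.pdf exists):
#   pdftract extract --json output.json document.pdf
#   echo "pdftract: $(pdftract_exit_label "$?")"
</code></pre>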
|
||||
|
||||
</main>
|
||||
|
||||
<nav class="nav-wrapper" aria-label="Page navigation">
|
||||
<!-- Mobile navigation buttons -->
|
||||
<a rel="prev" href="quickstart.html" class="mobile-nav-chapters previous" title="Previous chapter" aria-label="Previous chapter" aria-keyshortcuts="Left">
|
||||
<span class=fa-svg><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 320 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M41.4 233.4c-12.5 12.5-12.5 32.8 0 45.3l160 160c12.5 12.5 32.8 12.5 45.3 0s12.5-32.8 0-45.3L109.3 256 246.6 118.6c12.5-12.5 12.5-32.8 0-45.3s-32.8-12.5-45.3 0l-160 160z"/></svg></span>
|
||||
</a>
|
||||
|
||||
<a rel="next prefetch" href="cli/global-options.html" class="mobile-nav-chapters next" title="Next chapter" aria-label="Next chapter" aria-keyshortcuts="Right">
|
||||
<span class=fa-svg><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 320 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M278.6 233.4c12.5 12.5 12.5 32.8 0 45.3l-160 160c-12.5 12.5-32.8 12.5-45.3 0s-12.5-32.8 0-45.3L210.7 256 73.4 118.6c-12.5-12.5-12.5-32.8 0-45.3s32.8-12.5 45.3 0l160 160z"/></svg></span>
|
||||
</a>
|
||||
|
||||
<div style="clear: both"></div>
|
||||
</nav>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<nav class="nav-wide-wrapper" aria-label="Page navigation">
|
||||
<a rel="prev" href="quickstart.html" class="nav-chapters previous" title="Previous chapter" aria-label="Previous chapter" aria-keyshortcuts="Left">
|
||||
<span class=fa-svg><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 320 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M41.4 233.4c-12.5 12.5-12.5 32.8 0 45.3l160 160c12.5 12.5 32.8 12.5 45.3 0s12.5-32.8 0-45.3L109.3 256 246.6 118.6c12.5-12.5 12.5-32.8 0-45.3s-32.8-12.5-45.3 0l-160 160z"/></svg></span>
|
||||
</a>
|
||||
|
||||
<a rel="next prefetch" href="cli/global-options.html" class="nav-chapters next" title="Next chapter" aria-label="Next chapter" aria-keyshortcuts="Right">
|
||||
<span class=fa-svg><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 320 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M278.6 233.4c12.5 12.5 12.5 32.8 0 45.3l-160 160c-12.5 12.5-32.8 12.5-45.3 0s-12.5-32.8 0-45.3L210.7 256 73.4 118.6c-12.5-12.5-12.5-32.8 0-45.3s32.8-12.5 45.3 0l160 160z"/></svg></span>
|
||||
</a>
|
||||
</nav>
|
||||
|
||||
</div>
|
||||
|
||||
<template id=fa-eye><span class=fa-svg><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 576 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M288 32c-80.8 0-145.5 36.8-192.6 80.6C48.6 156 17.3 208 2.5 243.7c-3.3 7.9-3.3 16.7 0 24.6C17.3 304 48.6 356 95.4 399.4C142.5 443.2 207.2 480 288 480s145.5-36.8 192.6-80.6c46.8-43.5 78.1-95.4 93-131.1c3.3-7.9 3.3-16.7 0-24.6c-14.9-35.7-46.2-87.7-93-131.1C433.5 68.8 368.8 32 288 32zM432 256c0 79.5-64.5 144-144 144s-144-64.5-144-144s64.5-144 144-144s144 64.5 144 144zM288 192c0 35.3-28.7 64-64 64c-11.5 0-22.3-3-31.6-8.4c-.2 2.8-.4 5.5-.4 8.4c0 53 43 96 96 96s96-43 96-96s-43-96-96-96c-2.8 0-5.6 .1-8.4 .4c5.3 9.3 8.4 20.1 8.4 31.6z"/></svg></span></template>
|
||||
<template id=fa-eye-slash><span class=fa-svg><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 640 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M38.8 5.1C28.4-3.1 13.3-1.2 5.1 9.2S-1.2 34.7 9.2 42.9l592 464c10.4 8.2 25.5 6.3 33.7-4.1s6.3-25.5-4.1-33.7L525.6 386.7c39.6-40.6 66.4-86.1 79.9-118.4c3.3-7.9 3.3-16.7 0-24.6c-14.9-35.7-46.2-87.7-93-131.1C465.5 68.8 400.8 32 320 32c-68.2 0-125 26.3-169.3 60.8L38.8 5.1zM223.1 149.5C248.6 126.2 282.7 112 320 112c79.5 0 144 64.5 144 144c0 24.9-6.3 48.3-17.4 68.7L408 294.5c5.2-11.8 8-24.8 8-38.5c0-53-43-96-96-96c-2.8 0-5.6 .1-8.4 .4c5.3 9.3 8.4 20.1 8.4 31.6c0 10.2-2.4 19.8-6.6 28.3l-90.3-70.8zm223.1 298L373 389.9c-16.4 6.5-34.3 10.1-53 10.1c-79.5 0-144-64.5-144-144c0-6.9 .5-13.6 1.4-20.2L83.1 161.5C60.3 191.2 44 220.8 34.5 243.7c-3.3 7.9-3.3 16.7 0 24.6c14.9 35.7 46.2 87.7 93 131.1C174.5 443.2 239.2 480 320 480c47.8 0 89.9-12.9 126.2-32.5z"/></svg></span></template>
|
||||
<template id=fa-copy><span class=fa-svg><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 512 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M502.6 70.63l-61.25-61.25C435.4 3.371 427.2 0 418.7 0H255.1c-35.35 0-64 28.66-64 64l.0195 256C192 355.4 220.7 384 256 384h192c35.2 0 64-28.8 64-64V93.25C512 84.77 508.6 76.63 502.6 70.63zM464 320c0 8.836-7.164 16-16 16H255.1c-8.838 0-16-7.164-16-16L239.1 64.13c0-8.836 7.164-16 16-16h128L384 96c0 17.67 14.33 32 32 32h47.1V320zM272 448c0 8.836-7.164 16-16 16H63.1c-8.838 0-16-7.164-16-16L47.98 192.1c0-8.836 7.164-16 16-16H160V128H63.99c-35.35 0-64 28.65-64 64l.0098 256C.002 483.3 28.66 512 64 512h192c35.2 0 64-28.8 64-64v-32h-47.1L272 448z"/></svg></span></template>
|
||||
<template id=fa-play><span class=fa-svg><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 384 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M73 39c-14.8-9.1-33.4-9.4-48.5-.9S0 62.6 0 80V432c0 17.4 9.4 33.4 24.5 41.9s33.7 8.1 48.5-.9L361 297c14.3-8.7 23-24.2 23-41s-8.7-32.2-23-41L73 39z"/></svg></span></template>
|
||||
<template id=fa-clock-rotate-left><span class=fa-svg><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 512 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M75 75L41 41C25.9 25.9 0 36.6 0 57.9V168c0 13.3 10.7 24 24 24H134.1c21.4 0 32.1-25.9 17-41l-30.8-30.8C155 85.5 203 64 256 64c106 0 192 86 192 192s-86 192-192 192c-40.8 0-78.6-12.7-109.7-34.4c-14.5-10.1-34.4-6.6-44.6 7.9s-6.6 34.4 7.9 44.6C151.2 495 201.7 512 256 512c141.4 0 256-114.6 256-256S397.4 0 256 0C185.3 0 121.3 28.7 75 75zm181 53c-13.3 0-24 10.7-24 24V256c0 6.4 2.5 12.5 7 17l72 72c9.4 9.4 24.6 9.4 33.9 0s9.4-24.6 0-33.9l-65-65V152c0-13.3-10.7-24-24-24z"/></svg></span></template>
|
||||
|
||||
|
||||
|
||||
<script>
|
||||
window.playground_copyable = true;
|
||||
</script>
|
||||
|
||||
|
||||
<script src="elasticlunr-ef4e11c1.min.js"></script>
|
||||
<script src="mark-09e88c2c.min.js"></script>
|
||||
<script src="searcher-c2a407aa.js"></script>
|
||||
|
||||
<script src="clipboard-1626706a.min.js"></script>
|
||||
<script src="highlight-abc7f01d.js"></script>
|
||||
<script src="book-a0b12cfe.js"></script>
|
||||
|
||||
<!-- Custom JS scripts -->
|
||||
|
||||
|
||||
|
||||
</div>
|
||||
</body>
|
||||
</html>
|
||||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "../";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "../searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="../toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "../";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "../searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="../toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
@ -190,7 +190,7 @@
|
|||
|
||||
<nav class="nav-wrapper" aria-label="Page navigation">
|
||||
<!-- Mobile navigation buttons -->
|
||||
<a rel="prev" href="../cli/index.html" class="mobile-nav-chapters previous" title="Previous chapter" aria-label="Previous chapter" aria-keyshortcuts="Left">
|
||||
<a rel="prev" href="../cli-reference.html" class="mobile-nav-chapters previous" title="Previous chapter" aria-label="Previous chapter" aria-keyshortcuts="Left">
|
||||
<span class=fa-svg><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 320 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M41.4 233.4c-12.5 12.5-12.5 32.8 0 45.3l160 160c12.5 12.5 32.8 12.5 45.3 0s12.5-32.8 0-45.3L109.3 256 246.6 118.6c12.5-12.5 12.5-32.8 0-45.3s-32.8-12.5-45.3 0l-160 160z"/></svg></span>
|
||||
</a>
|
||||
|
||||
|
|
@ -204,7 +204,7 @@
|
|||
</div>
|
||||
|
||||
<nav class="nav-wide-wrapper" aria-label="Page navigation">
|
||||
<a rel="prev" href="../cli/index.html" class="nav-chapters previous" title="Previous chapter" aria-label="Previous chapter" aria-keyshortcuts="Left">
|
||||
<a rel="prev" href="../cli-reference.html" class="nav-chapters previous" title="Previous chapter" aria-label="Previous chapter" aria-keyshortcuts="Left">
|
||||
<span class=fa-svg><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 320 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M41.4 233.4c-12.5 12.5-12.5 32.8 0 45.3l160 160c12.5 12.5 32.8 12.5 45.3 0s12.5-32.8 0-45.3L109.3 256 246.6 118.6c12.5-12.5 12.5-32.8 0-45.3s-32.8-12.5-45.3 0l-160 160z"/></svg></span>
|
||||
</a>
|
||||
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "../";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "../searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="../toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "../";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "../searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="../toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "../";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "../searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="../toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "../";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "../searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="../toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="toc-d0f907c9.js"></script>
|
||||
<script src="toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="toc-d0f907c9.js"></script>
|
||||
<script src="toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="toc-d0f907c9.js"></script>
|
||||
<script src="toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="toc-d0f907c9.js"></script>
|
||||
<script src="toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="toc-d0f907c9.js"></script>
|
||||
<script src="toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
|
|||
File diff suppressed because it is too large
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "../";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "../searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="../toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "../";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "../searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="../toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "../";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "../searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="../toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "../";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "../searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="../toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "../";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "../searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="../toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "../";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "../searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="../toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "../";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "../searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="../toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "../";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "../searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="../toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "../";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "../searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="../toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "../";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "../searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="../toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "../";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "../searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="../toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "../";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "../searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="../toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="toc-d0f907c9.js"></script>
|
||||
<script src="toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
@ -340,7 +340,7 @@ receipt.pdf:1: "search term" found on page 1
|
|||
<span class=fa-svg><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 320 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M41.4 233.4c-12.5 12.5-12.5 32.8 0 45.3l160 160c12.5 12.5 32.8 12.5 45.3 0s12.5-32.8 0-45.3L109.3 256 246.6 118.6c12.5-12.5 12.5-32.8 0-45.3s-32.8-12.5-45.3 0l-160 160z"/></svg></span>
|
||||
</a>
|
||||
|
||||
<a rel="next prefetch" href="cli/index.html" class="mobile-nav-chapters next" title="Next chapter" aria-label="Next chapter" aria-keyshortcuts="Right">
|
||||
<a rel="next prefetch" href="cli-reference.html" class="mobile-nav-chapters next" title="Next chapter" aria-label="Next chapter" aria-keyshortcuts="Right">
|
||||
<span class=fa-svg><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 320 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M278.6 233.4c12.5 12.5 12.5 32.8 0 45.3l-160 160c-12.5 12.5-32.8 12.5-45.3 0s-12.5-32.8 0-45.3L210.7 256 73.4 118.6c-12.5-12.5-12.5-32.8 0-45.3s32.8-12.5 45.3 0l160 160z"/></svg></span>
|
||||
</a>
|
||||
|
||||
|
|
@ -354,7 +354,7 @@ receipt.pdf:1: "search term" found on page 1
|
|||
<span class=fa-svg><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 320 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M41.4 233.4c-12.5 12.5-12.5 32.8 0 45.3l160 160c12.5 12.5 32.8 12.5 45.3 0s12.5-32.8 0-45.3L109.3 256 246.6 118.6c12.5-12.5 12.5-32.8 0-45.3s-32.8-12.5-45.3 0l-160 160z"/></svg></span>
|
||||
</a>
|
||||
|
||||
<a rel="next prefetch" href="cli/index.html" class="nav-chapters next" title="Next chapter" aria-label="Next chapter" aria-keyshortcuts="Right">
|
||||
<a rel="next prefetch" href="cli-reference.html" class="nav-chapters next" title="Next chapter" aria-label="Next chapter" aria-keyshortcuts="Right">
|
||||
<span class=fa-svg><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 320 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M278.6 233.4c12.5 12.5 12.5 32.8 0 45.3l-160 160c-12.5 12.5-32.8 12.5-45.3 0s-12.5-32.8 0-45.3L210.7 256 73.4 118.6c-12.5-12.5-12.5-32.8 0-45.3s32.8-12.5 45.3 0l160 160z"/></svg></span>
|
||||
</a>
|
||||
</nav>
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "../";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "../searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="../toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "../";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "../searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="../toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "../";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "../searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="../toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "../";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "../searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="../toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "../";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "../searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="../toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "../";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "../searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="../toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "../";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "../searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="../toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
@ -181,10 +181,35 @@
|
|||
<div id="mdbook-content" class="content">
|
||||
<main>
|
||||
<h1 id="sdk-quickstarts"><a class="header" href="#sdk-quickstarts">SDK Quickstarts</a></h1>
|
||||
<blockquote>
|
||||
<p><strong>Draft</strong> — This section is a placeholder for future content.</p>
|
||||
</blockquote>
|
||||
<p>Getting started guides for using pdftract from various programming languages.</p>
|
||||
<p>Getting started guides for using pdftract from various programming languages. Each SDK implements the same 9-method contract: <code>extract</code>, <code>extract_text</code>, <code>extract_markdown</code>, <code>extract_stream</code>, <code>search</code>, <code>get_metadata</code>, <code>hash</code>, <code>classify</code>, and <code>verify_receipt</code>.</p>
|
||||
<h2 id="available-sdks"><a class="header" href="#available-sdks">Available SDKs</a></h2>
|
||||
<ul>
|
||||
<li><strong><a href="./rust.html">Rust</a></strong> — The <code>pdftract-core</code> crate with native zero-copy PDF processing</li>
|
||||
<li><strong><a href="./python.html">Python</a></strong> — Native Python bindings with PyO3, plus subprocess fallback</li>
|
||||
<li><strong><a href="./javascript.html">JavaScript/TypeScript</a></strong> — npm package with Node.js and browser support</li>
|
||||
<li><strong><a href="./go.html">Go</a></strong> — Go module with native bindings</li>
|
||||
</ul>
|
||||
<h2 id="choosing-an-sdk"><a class="header" href="#choosing-an-sdk">Choosing an SDK</a></h2>
|
||||
<ul>
|
||||
<li><strong>Rust</strong> — Best for performance-critical applications and CLI tools</li>
|
||||
<li><strong>Python</strong> — Best for data science, ML pipelines, and scripting</li>
|
||||
<li><strong>JavaScript</strong> — Best for web applications and serverless functions</li>
|
||||
<li><strong>Go</strong> — Best for microservices and cloud-native applications</li>
|
||||
</ul>
|
||||
<p>All SDKs support:</p>
|
||||
<ul>
|
||||
<li>Remote PDFs via HTTP/HTTPS URLs</li>
|
||||
<li>Encrypted PDFs with password</li>
|
||||
<li>OCR for scanned documents (with feature flag)</li>
|
||||
<li>Streaming extraction for large documents</li>
|
||||
<li>Cryptographic receipt verification</li>
|
||||
</ul>
|
||||
<h2 id="see-also"><a class="header" href="#see-also">See Also</a></h2>
|
||||
<ul>
|
||||
<li><a href="../json-schema-reference.html">JSON Schema Reference</a></li>
|
||||
<li><a href="../cli/README.html">CLI Reference</a></li>
|
||||
<li><a href="../installation.html">Installation Guide</a></li>
|
||||
</ul>
|
||||
|
||||
</main>
|
||||
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "../";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "../searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="../toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "../";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "../searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="../toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
@ -181,10 +181,193 @@
|
|||
<div id="mdbook-content" class="content">
|
||||
<main>
|
||||
<h1 id="python-sdk"><a class="header" href="#python-sdk">Python SDK</a></h1>
|
||||
<blockquote>
|
||||
<p><strong>Draft</strong> — This page is a placeholder for future content.</p>
|
||||
</blockquote>
|
||||
<p>Using pdftract from Python.</p>
|
||||
<p>The Python SDK (<code>pdftract</code>) provides native Python bindings with idiomatic ergonomics including an exception hierarchy, dataclass types, and optional asyncio wrappers.</p>
|
||||
<h2 id="installation"><a class="header" href="#installation">Installation</a></h2>
|
||||
<pre><code class="language-bash">pip install pdftract
|
||||
</code></pre>
|
||||
<p>The package includes a precompiled native module for your platform. If the native module fails to import, a subprocess fallback is automatically used (with significantly degraded performance).</p>
|
||||
<h2 id="basic-extraction"><a class="header" href="#basic-extraction">Basic Extraction</a></h2>
|
||||
<pre><code class="language-python">import pdftract
|
||||
|
||||
doc = pdftract.extract("document.pdf")
|
||||
print(f"Extracted {len(doc.pages)} pages")
|
||||
|
||||
for page in doc.pages:
|
||||
    for span in page.spans:
|
||||
        print(span.text)
|
||||
</code></pre>
|
||||
<h2 id="text-only-extraction"><a class="header" href="#text-only-extraction">Text-Only Extraction</a></h2>
|
||||
<p>For RAG pipelines that just need the text body:</p>
|
||||
<pre><code class="language-python">import pdftract
|
||||
|
||||
text = pdftract.extract_text("document.pdf")
|
||||
print(text)
|
||||
</code></pre>
|
||||
<h2 id="streaming"><a class="header" href="#streaming">Streaming</a></h2>
|
||||
<p>For large PDFs, stream pages one at a time to keep memory usage bounded:</p>
|
||||
<pre><code class="language-python">import pdftract
|
||||
|
||||
for page in pdftract.extract_stream("large_document.pdf"):
|
||||
    print(f"Page {page.page_index}: {len(page.spans)} spans")
|
||||
    # Process page while only one page is resident in memory
|
||||
</code></pre>
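The bounded-memory claim comes from consuming pages lazily. The sketch below illustrates that pattern with a stand-in generator; `Page`, `fake_extract_stream`, and `count_spans` are hypothetical names invented for this illustration and are not part of pdftract.

```python
from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class Page:
    """Hypothetical stand-in for a streamed page (not pdftract's type)."""
    page_index: int  # 0-based here purely for illustration
    spans: list[str] = field(default_factory=list)


def fake_extract_stream(page_count: int) -> Iterator[Page]:
    """Yield pages lazily; each page is built only when it is requested."""
    for index in range(page_count):
        yield Page(page_index=index, spans=[f"text on page {index}"])


def count_spans(pages: Iterator[Page]) -> int:
    """Fold over the stream, keeping only a running total in memory."""
    total = 0
    for page in pages:
        total += len(page.spans)
    return total


total_spans = count_spans(fake_extract_stream(1000))
print(total_spans)
```

Because the consumer keeps only an aggregate, no more than one page object needs to be alive at a time, which is the property the streaming API is described as providing.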
|
||||
<h2 id="markdown-extraction"><a class="header" href="#markdown-extraction">Markdown Extraction</a></h2>
|
||||
<p>Extract Markdown with optional anchor links for mapping back to PDF locations:</p>
|
||||
<pre><code class="language-python">import pdftract
|
||||
|
||||
# Basic Markdown
|
||||
markdown = pdftract.extract_markdown("document.pdf")
|
||||
|
||||
# With anchor links (HTML comments)
|
||||
markdown = pdftract.extract_markdown("document.pdf", anchors=True)
|
||||
</code></pre>
|
||||
<h2 id="options"><a class="header" href="#options">Options</a></h2>
|
||||
<p>Pass extraction options as keyword arguments:</p>
|
||||
<pre><code class="language-python">import pdftract
|
||||
|
||||
doc = pdftract.extract(
|
||||
    "document.pdf",
|
||||
    pages="1-5,7",         # Page range
|
||||
    password="secret123",  # PDF password
|
||||
    receipts="lite"        # Receipt generation mode
|
||||
)
|
||||
</code></pre>
|
||||
<h3 id="available-options"><a class="header" href="#available-options">Available Options</a></h3>
|
||||
<div class="table-wrapper">
|
||||
<table>
|
||||
<thead>
|
||||
<tr><th>Option</th><th>Type</th><th>Default</th><th>Use Case</th></tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr><td><code>pages</code></td><td><code>str | None</code></td><td><code>None</code></td><td>Page range (e.g., <code>"1-5,7,12-"</code>)</td></tr>
|
||||
<tr><td><code>password</code></td><td><code>str | None</code></td><td><code>None</code></td><td>PDF password for encrypted documents</td></tr>
|
||||
<tr><td><code>receipts</code></td><td><code>str | None</code></td><td><code>None</code></td><td>Receipt mode: <code>"off"</code>, <code>"lite"</code>, or <code>"full"</code></td></tr>
|
||||
<tr><td><code>ocr</code></td><td><code>bool</code></td><td><code>False</code></td><td>Enable OCR for scanned documents</td></tr>
|
||||
<tr><td><code>ocr_language</code></td><td><code>list[str]</code></td><td><code>["eng"]</code></td><td>OCR language codes</td></tr>
|
||||
<tr><td><code>include_invisible</code></td><td><code>bool</code></td><td><code>False</code></td><td>Include invisible text in output</td></tr>
|
||||
<tr><td><code>extract_forms</code></td><td><code>bool</code></td><td><code>True</code></td><td>Extract AcroForm fields</td></tr>
|
||||
<tr><td><code>extract_attachments</code></td><td><code>bool</code></td><td><code>True</code></td><td>Extract embedded attachments</td></tr>
|
||||
<tr><td><code>readability_threshold</code></td><td><code>float</code></td><td><code>0.0</code></td><td>Minimum readability score</td></tr>
|
||||
<tr><td><code>max_decompress_gb</code></td><td><code>int</code></td><td><code>512</code></td><td>Max decompressed GB per stream</td></tr>
|
||||
<tr><td><code>full_render</code></td><td><code>bool</code></td><td><code>False</code></td><td>Enable full rendering</td></tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
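The `pages` syntax in the table (`"1-5,7,12-"`) can be read as a comma-separated list of single pages and ranges. The following is a minimal sketch of one plausible interpretation, assuming 1-based inclusive ranges and a trailing `-` meaning "through the last page"; `parse_page_range` is a hypothetical helper, not pdftract's actual parser, and the real semantics may differ.

```python
def parse_page_range(spec: str, page_count: int) -> list[int]:
    """Expand a spec such as "1-5,7,12-" into sorted 1-based page numbers.

    Assumed semantics (not confirmed by the pdftract docs): pages are
    1-based, ranges are inclusive, and a trailing "-" is open-ended.
    """
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start = int(start_text) if start_text else 1
            # An open-ended range runs to the end of the document.
            end = int(end_text) if end_text else max(page_count, start)
        else:
            start = end = int(part)
        if start < 1 or end < start:
            raise ValueError(f"invalid page range: {part!r}")
        # Clamp to the document length; pages past the end are dropped
        # silently (an assumption made for this sketch).
        pages.update(range(start, min(end, page_count) + 1))
    return sorted(pages)


print(parse_page_range("1-5,7,12-", page_count=14))
```

Returning a sorted, de-duplicated list keeps overlapping specs such as `"1-3,2-4"` well defined.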
|
||||
<h2 id="error-handling"><a class="header" href="#error-handling">Error Handling</a></h2>
|
||||
<p>The SDK provides a structured exception hierarchy:</p>
|
||||
<pre><code class="language-python">import pdftract
|
||||
|
||||
try:
|
||||
    doc = pdftract.extract("encrypted.pdf", password="wrong")
|
||||
except pdftract.EncryptionError as e:
|
||||
    print(f"Encryption error: {e.code} - {e.hint}")
|
||||
except pdftract.CorruptPdfError as e:
|
||||
    print(f"Corrupt PDF: {e}")
|
||||
except pdftract.SourceUnreachableError as e:
|
||||
    print(f"File not found: {e}")
|
||||
except pdftract.PdftractError as e:
|
||||
    print(f"Extraction failed: {e}")
|
||||
</code></pre>
|
||||
<h3 id="exception-hierarchy"><a class="header" href="#exception-hierarchy">Exception Hierarchy</a></h3>
|
||||
<p>All exceptions inherit from <code>PdftractError</code>:</p>
|
||||
<ul>
|
||||
<li><code>PdftractError</code> — Base exception for all extraction errors</li>
|
||||
<li><code>EncryptionError</code> — PDF encryption/password errors</li>
|
||||
<li><code>CorruptPdfError</code> — Malformed or corrupted PDF</li>
|
||||
<li><code>SourceUnreachableError</code> — File or URL unreachable</li>
|
||||
<li><code>RemoteFetchInterruptedError</code> — Network interruption during fetch</li>
|
||||
<li><code>TlsError</code> — TLS/certificate errors</li>
|
||||
<li><code>ReceiptVerifyError</code> — Receipt verification failed</li>
|
||||
<li><code>UnsupportedOperationError</code> — Requested operation not available</li>
|
||||
</ul>
|
||||
<h3 id="exception-attributes"><a class="header" href="#exception-attributes">Exception Attributes</a></h3>
|
||||
<p>All exceptions have the following attributes:</p>
|
||||
<ul>
|
||||
<li><code>code</code> — Diagnostic code (e.g., <code>"ENCRYPTION_WRONG_PASSWORD"</code>)</li>
|
||||
<li><code>page_index</code> — Page number where error occurred (if applicable)</li>
|
||||
<li><code>hint</code> — Suggested action for resolution</li>
|
||||
</ul>
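<p>To make the shape of this hierarchy concrete, here is a minimal, self-contained sketch. It does not import the SDK: the two classes below are illustrative stand-ins that only mirror the names and attributes listed above, and the helper <code>describe</code> is hypothetical.</p>

```python
class PdftractError(Exception):
    """Illustrative base class; not the SDK's real definition."""

    def __init__(self, message, code, page_index=None, hint=None):
        super().__init__(message)
        self.code = code              # diagnostic code string
        self.page_index = page_index  # None when no page applies
        self.hint = hint              # suggested action, if any


class EncryptionError(PdftractError):
    """Stand-in for a password or encryption failure."""


def describe(err):
    """Format any error in the hierarchy using its shared attributes."""
    where = "" if err.page_index is None else f" (page {err.page_index})"
    hint = f"; hint: {err.hint}" if err.hint else ""
    return f"[{err.code}]{where} {err}{hint}"


try:
    raise EncryptionError(
        "password rejected",
        code="ENCRYPTION_WRONG_PASSWORD",
        hint="supply the correct password",
    )
except PdftractError as err:  # the base class catches every subclass
    summary = describe(err)
    print(summary)
```

<p>Because every subclass carries the same three attributes, a single <code>except PdftractError</code> handler can log any failure uniformly, while more specific handlers stay optional.</p>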
|
||||
<h2 id="metadata"><a class="header" href="#metadata">Metadata</a></h2>
|
||||
<p>Get document metadata without full extraction:</p>
|
||||
<pre><code class="language-python">import pdftract
|
||||
|
||||
metadata = pdftract.get_metadata("document.pdf")
|
||||
print(f"Pages: {metadata.page_count}")
|
||||
print(f"Title: {metadata.title}")
|
||||
print(f"Author: {metadata.author}")
|
||||
print(f"Fingerprint: {metadata.fingerprint}")
|
||||
</code></pre>
|
||||
<h2 id="search"><a class="header" href="#search">Search</a></h2>
|
||||
<p>Search for a regex pattern in the PDF:</p>
|
||||
<pre><code class="language-python">import pdftract
|
||||
|
||||
for match in pdftract.search("document.pdf", r"\b\d{3}-\d{2}-\d{4}\b"):
|
||||
print(f"Found SSN at page {match.page_index}: {match.text}")
|
||||
</code></pre>
|
||||
<h2 id="fingerprint"><a class="header" href="#fingerprint">Fingerprint</a></h2>
|
||||
<p>Compute the structural fingerprint of a PDF:</p>
|
||||
<pre><code class="language-python">import pdftract
|
||||
|
||||
fingerprint = pdftract.hash("document.pdf")
|
||||
print(f"Fingerprint: {fingerprint.value}")
|
||||
</code></pre>
|
||||
<h2 id="classify"><a class="header" href="#classify">Classify</a></h2>
|
||||
<p>Classify a PDF's page type:</p>
|
||||
<pre><code class="language-python">import pdftract
|
||||
|
||||
classification = pdftract.classify("document.pdf")
|
||||
print(f"Type: {classification.class_name}")
|
||||
print(f"Confidence: {classification.confidence}")
|
||||
</code></pre>
|
||||
<h2 id="verify-receipt"><a class="header" href="#verify-receipt">Verify Receipt</a></h2>
|
||||
<p>Verify a cryptographic receipt:</p>
|
||||
<pre><code class="language-python">import pdftract
|
||||
|
||||
# Extract with receipts enabled
|
||||
doc = pdftract.extract("document.pdf", receipts="lite")
|
||||
receipt = doc.pages[0].receipt
|
||||
|
||||
# Verify later
|
||||
verified = pdftract.verify_receipt("document.pdf", receipt)
|
||||
print(f"Verified: {verified}")
|
||||
</code></pre>
|
||||
<h2 id="remote-pdfs"><a class="header" href="#remote-pdfs">Remote PDFs</a></h2>
|
||||
<p>Extract from HTTP/HTTPS URLs:</p>
|
||||
<pre><code class="language-python">import pdftract
|
||||
|
||||
doc = pdftract.extract("https://example.com/document.pdf")
|
||||
</code></pre>
|
||||
<h2 id="mcp-integration"><a class="header" href="#mcp-integration">MCP Integration</a></h2>
|
||||
<p>For AI-assisted PDF extraction, pdftract provides an <a href="../cli/mcp.html">MCP (Model Context Protocol) server</a>. The Python SDK can be used alongside MCP clients like Claude Desktop:</p>
|
||||
<pre><code class="language-bash">pdftract mcp --stdio
|
||||
</code></pre>
|
||||
<p>See <a href="../cli/mcp.html">MCP Server Documentation</a> for setup instructions.</p>
|
||||
<h2 id="types"><a class="header" href="#types">Types</a></h2>
|
||||
<p>The SDK provides typed wrappers for all output structures:</p>
|
||||
<pre><code class="language-python">from pdftract.types import Document, Page, Span, Block, Metadata
|
||||
|
||||
# All extraction functions return typed objects
|
||||
doc: Document = pdftract.extract("document.pdf")
|
||||
page: Page = doc.pages[0]
|
||||
span: Span = page.spans[0]
|
||||
block: Block = page.blocks[0]
|
||||
metadata: Metadata = pdftract.get_metadata("document.pdf")
|
||||
</code></pre>
|
||||
<h2 id="async-api"><a class="header" href="#async-api">Async API</a></h2>
|
||||
<p>For asyncio-based applications, use the async API:</p>
|
||||
<pre><code class="language-python">import pdftract.asyncio as pdftract_async
|
||||
|
||||
async def extract_async():
|
||||
doc = await pdftract_async.extract("document.pdf")
|
||||
print(f"Extracted {len(doc.pages)} pages")
|
||||
</code></pre>
|
||||
<h2 id="see-also"><a class="header" href="#see-also">See Also</a></h2>
|
||||
<ul>
|
||||
<li><a href="../cli/mcp.html">MCP Server Documentation</a></li>
|
||||
<li><a href="../json-schema-reference.html">JSON Schema Reference</a></li>
|
||||
<li><a href="../cli/README.html">CLI Reference</a></li>
|
||||
<li><a href="../advanced/ocr.html">Advanced: OCR Configuration</a></li>
|
||||
</ul>
|
||||
|
||||
</main>
|
||||
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "../";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "../searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="../toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
@ -181,10 +181,172 @@
|
|||
<div id="mdbook-content" class="content">
|
||||
<main>
|
||||
<h1 id="rust-sdk"><a class="header" href="#rust-sdk">Rust SDK</a></h1>
|
||||
<blockquote>
|
||||
<p><strong>Draft</strong> — This page is a placeholder for future content.</p>
|
||||
</blockquote>
|
||||
<p>Using pdftract from Rust.</p>
|
||||
<p>The Rust SDK is the <code>pdftract-core</code> crate. It provides native PDF text extraction with zero-copy memory mapping and streaming support.</p>
|
||||
<h2 id="installation"><a class="header" href="#installation">Installation</a></h2>
|
||||
<p>Add to your <code>Cargo.toml</code>:</p>
|
||||
<pre><code class="language-toml">[dependencies]
|
||||
pdftract-core = "1.0"
|
||||
</code></pre>
|
||||
<p>For OCR support, enable the <code>ocr</code> feature:</p>
|
||||
<pre><code class="language-toml">[dependencies]
|
||||
pdftract-core = { version = "1.0", features = ["ocr"] }
|
||||
</code></pre>
|
||||
<h2 id="basic-extraction"><a class="header" href="#basic-extraction">Basic Extraction</a></h2>
|
||||
<pre class="playground"><code class="language-rust">use pdftract_core::{extract, ExtractionOptions};
|
||||
|
||||
fn main() -> anyhow::Result<()> {
|
||||
let opts = ExtractionOptions::default();
|
||||
let result = extract("document.pdf", &opts)?;
|
||||
|
||||
for (i, page) in result.pages.iter().enumerate() {
|
||||
println!("Page {}: {} spans", i + 1, page.spans.len());
|
||||
for span in &page.spans {
|
||||
println!(" {}", span.text);
|
||||
}
|
||||
}
|
||||
Ok(())
|
||||
}</code></pre>
|
||||
<h2 id="streaming-extraction"><a class="header" href="#streaming-extraction">Streaming Extraction</a></h2>
|
||||
<p>For large PDFs, stream pages one at a time to keep memory usage bounded:</p>
|
||||
<pre class="playground"><code class="language-rust">use pdftract_core::{extract_stream, ExtractionOptions};
|
||||
use std::path::Path;
|
||||
|
||||
fn main() -> anyhow::Result<()> {
|
||||
let opts = ExtractionOptions::default();
|
||||
let pages = extract_stream(Path::new("large_document.pdf"), &opts)?;
|
||||
|
||||
for page_result in pages {
|
||||
let page = page_result?;
|
||||
println!("Page {}: {} spans", page.index, page.spans.len());
|
||||
}
|
||||
Ok(())
|
||||
}</code></pre>
|
||||
<h2 id="options"><a class="header" href="#options">Options</a></h2>
|
||||
<h3 id="extractionoptions"><a class="header" href="#extractionoptions">ExtractionOptions</a></h3>
|
||||
<div class="table-wrapper">
|
||||
<table>
|
||||
<thead>
|
||||
<tr><th>Field</th><th>Type</th><th>Default</th><th>Use Case</th></tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr><td><code>receipts</code></td><td><code>ReceiptsMode</code></td><td><code>Off</code></td><td>Generate cryptographic receipts</td></tr>
|
||||
<tr><td><code>max_parallel_pages</code></td><td><code>usize</code></td><td><code>4</code></td><td>Control memory for concurrent page processing</td></tr>
|
||||
<tr><td><code>memory_budget_mb</code></td><td><code>usize</code></td><td><code>512</code></td><td>Target peak RSS in MB</td></tr>
|
||||
<tr><td><code>full_render</code></td><td><code>bool</code></td><td><code>false</code></td><td>Enable PDFium rendering (requires <code>full-render</code> feature)</td></tr>
|
||||
<tr><td><code>ocr_dpi_override</code></td><td><code>Option<u32></code></td><td><code>None</code></td><td>Override automatic DPI selection</td></tr>
|
||||
<tr><td><code>ocr_language</code></td><td><code>Vec<String></code></td><td><code>vec!["eng"]</code></td><td>Tesseract language codes</td></tr>
|
||||
<tr><td><code>markdown_anchors</code></td><td><code>bool</code></td><td><code>false</code></td><td>Emit HTML comment anchors in Markdown</td></tr>
|
||||
<tr><td><code>max_decompress_bytes</code></td><td><code>u64</code></td><td><code>512 MiB</code></td><td>Bomb limit for decompressed streams</td></tr>
|
||||
<tr><td><code>output</code></td><td><code>OutputOptions</code></td><td><code>default()</code></td><td>Output filtering options</td></tr>
|
||||
<tr><td><code>pages</code></td><td><code>Option<String></code></td><td><code>None</code></td><td>Page range (e.g., <code>"1-5,7,12-"</code>)</td></tr>
|
||||
<tr><td><code>password</code></td><td><code>Option<SecretString></code></td><td><code>None</code></td><td>PDF password for encrypted documents</td></tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
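<p>The <code>pages</code> field takes a range string such as <code>"1-5,7,12-"</code>. The sketch below shows one plausible reading of that syntax, written in Python purely as a language-neutral illustration. It assumes 1-based inclusive ranges and that a trailing dash means "through the last page"; the function name <code>parse_page_range</code> is hypothetical and this is not the crate's parser.</p>

```python
def parse_page_range(spec, page_count):
    """Expand a spec such as "1-5,7,12-" into sorted 1-based page numbers."""
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start = int(start_text)
            # An empty right-hand side ("12-") is assumed to run to the last page.
            end = int(end_text) if end_text else page_count
        else:
            start = end = int(part)
        if start < 1 or end > page_count or start > end:
            raise ValueError(f"page range out of bounds: {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)


selected = parse_page_range("1-5,7,12-", page_count=14)
print(selected)
```

<p>Collecting into a set first means overlapping parts such as <code>"1-3,2-4"</code> do not produce duplicate pages.</p>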
|
||||
<h3 id="outputoptions"><a class="header" href="#outputoptions">OutputOptions</a></h3>
|
||||
<div class="table-wrapper">
|
||||
<table>
|
||||
<thead>
|
||||
<tr><th>Field</th><th>Type</th><th>Default</th><th>Use Case</th></tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr><td><code>include_invisible</code></td><td><code>bool</code></td><td><code>false</code></td><td>Include invisible text in output</td></tr>
|
||||
<tr><td><code>extract_forms</code></td><td><code>bool</code></td><td><code>true</code></td><td>Extract AcroForm fields</td></tr>
|
||||
<tr><td><code>extract_attachments</code></td><td><code>bool</code></td><td><code>true</code></td><td>Extract embedded attachments</td></tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
<h2 id="receipts"><a class="header" href="#receipts">Receipts</a></h2>
|
||||
<p>Generate cryptographic receipts for verification:</p>
|
||||
<pre class="playground"><code class="language-rust">use pdftract_core::{extract, ExtractionOptions};
|
||||
use pdftract_core::options::ReceiptsMode;
|
||||
|
||||
fn main() -> anyhow::Result<()> {
|
||||
let opts = ExtractionOptions {
|
||||
receipts: ReceiptsMode::Lite,
|
||||
..Default::default()
|
||||
};
|
||||
let result = extract("document.pdf", &opts)?;
|
||||
|
||||
// Receipts are embedded in page metadata
|
||||
if let Some(receipt) = &result.pages[0].receipt {
|
||||
println!("Receipt: {}", receipt);
|
||||
}
|
||||
Ok(())
|
||||
}</code></pre>
|
||||
<h2 id="remote-pdfs"><a class="header" href="#remote-pdfs">Remote PDFs</a></h2>
|
||||
<p>With the <code>remote</code> feature, fetch PDFs via HTTP:</p>
|
||||
<pre class="playground"><code class="language-rust">use pdftract_core::{extract, ExtractionOptions};
|
||||
use std::path::Path;
|
||||
|
||||
fn main() -> anyhow::Result<()> {
|
||||
let opts = ExtractionOptions::default();
|
||||
let result = extract(Path::new("https://example.com/document.pdf"), &opts)?;
|
||||
Ok(())
|
||||
}</code></pre>
|
||||
<h2 id="error-handling"><a class="header" href="#error-handling">Error Handling</a></h2>
|
||||
<p>Most functions return <code>anyhow::Result<T></code>, which wraps the underlying error types:</p>
|
||||
<pre class="playground"><code class="language-rust">use pdftract_core::{extract, ExtractionOptions};
|
||||
use std::path::Path;
|
||||
|
||||
fn main() {
|
||||
let opts = ExtractionOptions::default();
|
||||
|
||||
match extract(Path::new("document.pdf"), &opts) {
|
||||
Ok(result) => {
|
||||
println!("Extracted {} pages", result.pages.len());
|
||||
}
|
||||
Err(e) => {
|
||||
eprintln!("Extraction failed: {}", e);
|
||||
// Inspect error chain
|
||||
for cause in e.chain() {
|
||||
eprintln!(" caused by: {}", cause);
|
||||
}
|
||||
}
|
||||
}
|
||||
}</code></pre>
|
||||
<h2 id="feature-flags"><a class="header" href="#feature-flags">Feature Flags</a></h2>
|
||||
<div class="table-wrapper">
|
||||
<table>
|
||||
<thead>
|
||||
<tr><th>Feature</th><th>Adds</th><th>Default</th></tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr><td><code>serde</code></td><td>JSON serialization support</td><td>✓</td></tr>
|
||||
<tr><td><code>decrypt</code></td><td>Decryption of encrypted PDFs</td><td>✓</td></tr>
|
||||
<tr><td><code>quick-xml</code></td><td>Conformance detection via XML metadata</td><td>✓</td></tr>
|
||||
<tr><td><code>ocr</code></td><td>Tesseract OCR for scanned documents</td><td>-</td></tr>
|
||||
<tr><td><code>full-render</code></td><td>PDFium-based rendering (requires <code>ocr</code>)</td><td>-</td></tr>
|
||||
<tr><td><code>remote</code></td><td>HTTP range fetching for remote PDFs</td><td>-</td></tr>
|
||||
<tr><td><code>profiles</code></td><td>Extraction profiles</td><td>-</td></tr>
|
||||
<tr><td><code>receipts</code></td><td>Cryptographic receipt generation</td><td>-</td></tr>
|
||||
<tr><td><code>cjk</code></td><td>CJK text extraction via predefined CMap registry</td><td>-</td></tr>
|
||||
<tr><td><code>schemars</code></td><td>JSON Schema generation</td><td>-</td></tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
<h2 id="source-types"><a class="header" href="#source-types">Source Types</a></h2>
|
||||
<p>The SDK supports multiple source types via the <code>PdfSource</code> trait:</p>
|
||||
<pre class="playground"><code class="language-rust"><span class="boring">#![allow(unused)]
|
||||
</span><span class="boring">fn main() {
|
||||
</span>use pdftract_core::source::{FileSource, MmapSource, MemorySource};
|
||||
|
||||
// Memory-mapped source (zero-copy for large files)
|
||||
let source = MmapSource::open("document.pdf")?;
|
||||
|
||||
// In-memory source (for byte buffers)
|
||||
let data = std::fs::read("document.pdf")?;
|
||||
let source = MemorySource::new(data);
|
||||
|
||||
// Standard file source
|
||||
let source = FileSource::open("document.pdf")?;
|
||||
<span class="boring">}</span></code></pre>
|
||||
<h2 id="see-also"><a class="header" href="#see-also">See Also</a></h2>
|
||||
<ul>
|
||||
<li><a href="../json-schema-reference.html">JSON Schema Reference</a></li>
|
||||
<li><a href="../cli/README.html">CLI Reference</a></li>
|
||||
<li><a href="../advanced/ocr.html">Advanced: OCR Configuration</a></li>
|
||||
</ul>
|
||||
|
||||
</main>
|
||||
|
||||
|
|
|
|||
|
|
@ -437,7 +437,7 @@ window.search = window.search || {};
|
|||
if (yes) {
|
||||
loadSearchScript(
|
||||
window.path_to_searchindex_js ||
|
||||
path_to_root + 'searchindex-fc6d8bf8.js',
|
||||
path_to_root + 'searchindex-b0453933.js',
|
||||
'mdbook-search-index');
|
||||
search_wrap.classList.remove('hidden');
|
||||
searchicon.setAttribute('aria-expanded', 'true');
|
||||
|
|
|
|||
1
docs/user-docs/build/user-docs/searchindex-b0453933.js
Normal file
File diff suppressed because one or more lines are too long
|
|
@ -3,7 +3,7 @@
|
|||
<head>
|
||||
<!-- Book generated using mdBook -->
|
||||
<meta charset="UTF-8">
|
||||
<title>CLI Reference - pdftract User Documentation</title>
|
||||
<title>Troubleshooting Guide - pdftract User Documentation</title>
|
||||
|
||||
|
||||
<!-- Custom HTML head -->
|
||||
|
|
@ -12,33 +12,33 @@
|
|||
<meta name="viewport" content="width=device-width, initial-scale=1">
|
||||
<meta name="theme-color" content="#ffffff">
|
||||
|
||||
<link rel="icon" href="../favicon-de23e50b.svg">
|
||||
<link rel="shortcut icon" href="../favicon-8114d1fc.png">
|
||||
<link rel="stylesheet" href="../css/variables-8adf115d.css">
|
||||
<link rel="stylesheet" href="../css/general-2459343d.css">
|
||||
<link rel="stylesheet" href="../css/chrome-ae938929.css">
|
||||
<link rel="stylesheet" href="../css/print-9e4910d8.css" media="print">
|
||||
<link rel="icon" href="favicon-de23e50b.svg">
|
||||
<link rel="shortcut icon" href="favicon-8114d1fc.png">
|
||||
<link rel="stylesheet" href="css/variables-8adf115d.css">
|
||||
<link rel="stylesheet" href="css/general-2459343d.css">
|
||||
<link rel="stylesheet" href="css/chrome-ae938929.css">
|
||||
<link rel="stylesheet" href="css/print-9e4910d8.css" media="print">
|
||||
|
||||
<!-- Fonts -->
|
||||
<link rel="stylesheet" href="../fonts/fonts-9644e21d.css">
|
||||
<link rel="stylesheet" href="fonts/fonts-9644e21d.css">
|
||||
|
||||
<!-- Highlight.js Stylesheets -->
|
||||
<link rel="stylesheet" id="mdbook-highlight-css" href="../highlight-493f70e1.css">
|
||||
<link rel="stylesheet" id="mdbook-tomorrow-night-css" href="../tomorrow-night-4c0ae647.css">
|
||||
<link rel="stylesheet" id="mdbook-ayu-highlight-css" href="../ayu-highlight-3fdfc3ac.css">
|
||||
<link rel="stylesheet" id="mdbook-highlight-css" href="highlight-493f70e1.css">
|
||||
<link rel="stylesheet" id="mdbook-tomorrow-night-css" href="tomorrow-night-4c0ae647.css">
|
||||
<link rel="stylesheet" id="mdbook-ayu-highlight-css" href="ayu-highlight-3fdfc3ac.css">
|
||||
|
||||
<!-- Custom theme stylesheets -->
|
||||
|
||||
|
||||
<!-- Provide site root and default themes to javascript -->
|
||||
<script>
|
||||
const path_to_root = "../";
|
||||
const path_to_root = "";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
@ -105,7 +105,7 @@
|
|||
<!-- populated by js -->
|
||||
<mdbook-sidebar-scrollbox class="sidebar-scrollbox"></mdbook-sidebar-scrollbox>
|
||||
<noscript>
|
||||
<iframe class="sidebar-iframe-outer" src="../toc.html"></iframe>
|
||||
<iframe class="sidebar-iframe-outer" src="toc.html"></iframe>
|
||||
</noscript>
|
||||
<div id="mdbook-sidebar-resize-handle" class="sidebar-resize-handle">
|
||||
<div class="sidebar-resize-indicator"></div>
|
||||
|
|
@ -140,13 +140,13 @@
|
|||
<h1 class="menu-title">pdftract User Documentation</h1>
|
||||
|
||||
<div class="right-buttons">
|
||||
<a href="../print.html" title="Print this book" aria-label="Print this book">
|
||||
<a href="print.html" title="Print this book" aria-label="Print this book">
|
||||
<span class=fa-svg id="print-button"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 512 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M128 0C92.7 0 64 28.7 64 64v96h64V64H354.7L384 93.3V160h64V93.3c0-17-6.7-33.3-18.7-45.3L400 18.7C388 6.7 371.7 0 354.7 0H128zM384 352v32 64H128V384 368 352H384zm64 32h32c17.7 0 32-14.3 32-32V256c0-35.3-28.7-64-64-64H64c-35.3 0-64 28.7-64 64v96c0 17.7 14.3 32 32 32H64v64c0 35.3 28.7 64 64 64H384c35.3 0 64-28.7 64-64V384zm-16-88c-13.3 0-24-10.7-24-24s10.7-24 24-24s24 10.7 24 24s-10.7 24-24 24z"/></svg></span>
|
||||
</a>
|
||||
<a href="https://github.com/jedarden/pdftract" title="Git repository" aria-label="Git repository">
|
||||
<span class=fa-svg><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 496 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M165.9 397.4c0 2-2.3 3.6-5.2 3.6-3.3.3-5.6-1.3-5.6-3.6 0-2 2.3-3.6 5.2-3.6 3-.3 5.6 1.3 5.6 3.6zm-31.1-4.5c-.7 2 1.3 4.3 4.3 4.9 2.6 1 5.6 0 6.2-2s-1.3-4.3-4.3-5.2c-2.6-.7-5.5.3-6.2 2.3zm44.2-1.7c-2.9.7-4.9 2.6-4.6 4.9.3 2 2.9 3.3 5.9 2.6 2.9-.7 4.9-2.6 4.6-4.6-.3-1.9-3-3.2-5.9-2.9zM244.8 8C106.1 8 0 113.3 0 252c0 110.9 69.8 205.8 169.5 239.2 12.8 2.3 17.3-5.6 17.3-12.1 0-6.2-.3-40.4-.3-61.4 0 0-70 15-84.7-29.8 0 0-11.4-29.1-27.8-36.6 0 0-22.9-15.7 1.6-15.4 0 0 24.9 2 38.6 25.8 21.9 38.6 58.6 27.5 72.9 20.9 2.3-16 8.8-27.1 16-33.7-55.9-6.2-112.3-14.3-112.3-110.5 0-27.5 7.6-41.3 23.6-58.9-2.6-6.5-11.1-33.3 2.6-67.9 20.9-6.5 69 27 69 27 20-5.6 41.5-8.5 62.8-8.5s42.8 2.9 62.8 8.5c0 0 48.1-33.6 69-27 13.7 34.7 5.2 61.4 2.6 67.9 16 17.7 25.8 31.5 25.8 58.9 0 96.5-58.9 104.2-114.8 110.5 9.2 7.9 17 22.9 17 46.4 0 33.7-.3 75.4-.3 83.6 0 6.5 4.6 14.4 17.3 12.1C428.2 457.8 496 362.9 496 252 496 113.3 383.5 8 244.8 8zM97.2 352.9c-1.3 1-1 3.3.7 5.2 1.6 1.6 3.9 2.3 5.2 1 1.3-1 1-3.3-.7-5.2-1.6-1.6-3.9-2.3-5.2-1zm-10.8-8.1c-.7 1.3.3 2.9 2.3 3.9 1.6 1 3.6.7 4.3-.7.7-1.3-.3-2.9-2.3-3.9-2-.6-3.6-.3-4.3.7zm32.4 35.6c-1.6 1.3-1 4.3 1.3 6.2 2.3 2.3 5.2 2.6 6.5 1 1.3-1.3.7-4.3-1.3-6.2-2.2-2.3-5.2-2.6-6.5-1zm-11.4-14.7c-1.6 1-1.6 3.6 0 5.9 1.6 2.3 4.3 3.3 5.6 2.3 1.6-1.3 1.6-3.9 0-6.2-1.4-2.3-4-3.3-5.6-2z"/></svg></span>
|
||||
</a>
|
||||
<a href="https://github.com/jedarden/pdftract/edit/main/docs/user-docs/src/src/cli/README.md" title="Suggest an edit" aria-label="Suggest an edit" rel="edit">
|
||||
<a href="https://github.com/jedarden/pdftract/edit/main/docs/user-docs/src/src/troubleshooting.md" title="Suggest an edit" aria-label="Suggest an edit" rel="edit">
|
||||
<span class=fa-svg id="git-edit-button"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 512 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M421.7 220.3l-11.3 11.3-22.6 22.6-205 205c-6.6 6.6-14.8 11.5-23.8 14.1L30.8 511c-8.4 2.5-17.5 .2-23.7-6.1S-1.5 489.7 1 481.2L38.7 353.1c2.6-9 7.5-17.2 14.1-23.8l205-205 22.6-22.6 11.3-11.3 33.9 33.9 62.1 62.1 33.9 33.9zM96 353.9l-9.3 9.3c-.9 .9-1.6 2.1-2 3.4l-25.3 86 86-25.3c1.3-.4 2.5-1.1 3.4-2l9.3-9.3H112c-8.8 0-16-7.2-16-16V353.9zM453.3 19.3l39.4 39.4c25 25 25 65.5 0 90.5l-14.5 14.5-22.6 22.6-11.3 11.3-33.9-33.9-62.1-62.1L314.3 67.7l11.3-11.3 22.6-22.6 14.5-14.5c25-25 65.5-25 90.5 0z"/></svg></span>
|
||||
</a>
|
||||
|
||||
|
|
@ -180,21 +180,227 @@
|
|||
|
||||
<div id="mdbook-content" class="content">
|
||||
<main>
|
||||
<h1 id="cli-reference"><a class="header" href="#cli-reference">CLI Reference</a></h1>
|
||||
<h1 id="troubleshooting"><a class="header" href="#troubleshooting">Troubleshooting</a></h1>
|
||||
<p>This guide maps common pdftract failures to their causes and fixes. Each error is associated with a <strong>diagnostic code</strong> that appears in extraction output (see <code>diagnostics</code> in the JSON response or CLI stderr).</p>
|
||||
<blockquote>
|
||||
<p><strong>Draft</strong> — This section is a placeholder for future content.</p>
|
||||
<p><strong>For the authoritative diagnostic code catalog</strong>, see the <a href="./troubleshooting/diagnostics.html">Diagnostics Reference</a>.</p>
|
||||
</blockquote>
|
||||
<p>Complete command-line interface documentation.</p>
|
||||
<h2 id="symptom--diagnostic-lookup"><a class="header" href="#symptom--diagnostic-lookup">Symptom → Diagnostic Lookup</a></h2>
|
||||
<div class="table-wrapper">
|
||||
<table>
|
||||
<thead>
|
||||
<tr><th>Symptom</th><th>Likely Diagnostic Code</th></tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr><td>PDF won’t open, “encrypted” error</td><td><code>ENCRYPTION_UNSUPPORTED</code></td></tr>
|
||||
<tr><td>Text extraction incomplete or missing</td><td><code>XREF_REPAIRED</code>, <code>OCR_*_UNSUPPORTED</code></td></tr>
|
||||
<tr><td>Process hangs or runs very long</td><td><code>STREAM_BOMB</code></td></tr>
|
||||
<tr><td>“Path outside root” (MCP mode)</td><td><code>MCP_PATH_TRAVERSAL</code></td></tr>
|
||||
<tr><td>Cache errors / corrupted entries</td><td><code>CACHE_ENTRY_CORRUPT</code>, <code>CACHE_INTEGRITY_FAIL</code></td></tr>
|
||||
<tr><td>Profile fails to load</td><td><code>PROFILE_INVALID</code>, <code>PROFILE_SECRETS_FORBIDDEN</code></td></tr>
|
||||
<tr><td>Remote URL fetch blocked</td><td><code>URL_PRIVATE_NETWORK</code></td></tr>
|
||||
<tr><td>Requested page doesn’t exist</td><td><code>PAGE_OUT_OF_RANGE</code></td></tr>
|
||||
<tr><td>Text contains placeholder characters (⍰)</td><td><code>GLYPH_UNMAPPED</code></td></tr>
|
||||
<tr><td>Broken vector graphics not recovered</td><td><code>BROKENVECTOR_OCR_UNAVAILABLE</code></td></tr>
|
||||
<tr><td>JavaScript warning in output</td><td><code>JAVASCRIPT_PRESENT</code></td></tr>
|
||||
<tr><td>Circular reference warnings</td><td><code>STRUCT_CIRCULAR_REF</code>, <code>STRUCT_XOBJECT_CYCLE</code></td></tr>
|
||||
<tr><td>Stack overflow warnings</td><td><code>GSTATE_STACK_OVERFLOW</code></td></tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
<hr>
|
||||
<h2 id="xref_repaired-warning"><a class="header" href="#xref_repaired-warning">XREF_REPAIRED warning</a></h2>
|
||||
<p><strong>What it means</strong>: pdftract found the PDF’s cross-reference table was corrupt and ran the forward-scan fallback (Phase 1.3) to recover.</p>
|
||||
<p><strong>Cause</strong>: PDF created or transmitted with truncation or corruption. The <code>startxref</code> offset points outside the file, or the xref table is malformed.</p>
|
||||
<p><strong>Fix</strong>: Usually no action needed; extraction succeeds with the recovered xref. Output may be incomplete on truncated files. If extraction fails, the PDF is unsalvageable.</p>
|
||||
<p><strong>Severity</strong>: info (extraction continues)</p>
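<p>For intuition, a forward-scan recovery can be sketched as a pass over the raw bytes that records where each <code>N G obj</code> header starts. This is a simplified illustration of the general technique, not pdftract's Phase 1.3 code; the names <code>OBJ_HEADER</code> and <code>rebuild_xref</code> are hypothetical, and a real implementation would also need to avoid false matches inside stream data.</p>

```python
import re

# An object header looks like "12 0 obj": object number, generation, keyword.
OBJ_HEADER = re.compile(rb"(?<!\d)(\d+)\s+(\d+)\s+obj\b")


def rebuild_xref(pdf_bytes):
    """Map (object number, generation) to the byte offset of its header."""
    offsets = {}
    for match in OBJ_HEADER.finditer(pdf_bytes):
        key = (int(match.group(1)), int(match.group(2)))
        # Later definitions win, mirroring how incremental updates override.
        offsets[key] = match.start()
    return offsets


sample = b"%PDF-1.4\n1 0 obj\n<< >>\nendobj\n2 0 obj\n<< >>\nendobj\n"
table = rebuild_xref(sample)
print(table)
```

<p>On a truncated file, objects past the cut are simply absent from the rebuilt table, which is why the recovered output can be incomplete.</p>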
|
||||
<hr>
|
||||
<h2 id="stream_bomb-error"><a class="header" href="#stream_bomb-error">STREAM_BOMB error</a></h2>
|
||||
<p><strong>What it means</strong>: A compressed stream exceeded the decompression size limit (default: 512 MB).</p>
|
||||
<p><strong>Cause</strong>: A hostile PDF with a “compression bomb” — a small stream that expands to multi-GB size (e.g., 10 KB → 2 GB). This is a common security exploit pattern.</p>
|
||||
<p><strong>Fix</strong>:</p>
|
||||
<ul>
|
||||
<li>If the PDF is <strong>trusted</strong>: Increase the limit with <code>--max-decompress-gb 2</code> (or higher)</li>
|
||||
<li>If the PDF is <strong>untrusted</strong>: Treat as a hostile file; do not process</li>
|
||||
</ul>
|
||||
<p><strong>Severity</strong>: error (stream aborted; partial extraction returned)</p>
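<p>The idea behind a decompression limit can be sketched with Python's standard <code>zlib</code> module: inflate with a hard cap on output size and abort as soon as the cap is exceeded. The names <code>StreamBombError</code> and <code>bounded_inflate</code> are hypothetical, and the sketch makes no claim about how pdftract enforces its own limit.</p>

```python
import zlib


class StreamBombError(Exception):
    """Raised when a stream inflates past the configured limit."""


def bounded_inflate(data, limit):
    """Inflate zlib data, refusing to produce more than `limit` bytes."""
    inflater = zlib.decompressobj()
    # Request at most limit + 1 bytes: one extra byte is enough to prove overflow
    # without ever materialising the full expanded stream.
    out = inflater.decompress(data, limit + 1)
    if len(out) > limit:
        raise StreamBombError(f"stream exceeds {limit} bytes")
    return out


bomb = zlib.compress(b"\0" * 1_000_000)  # tiny when compressed, 1 MB inflated
small = zlib.compress(b"hello")

print(bounded_inflate(small, limit=1024))
try:
    bounded_inflate(bomb, limit=1024)
except StreamBombError as err:
    print("rejected:", err)
```

<p>Capping the output buffer, rather than checking the size after a full decompress, is what keeps memory bounded when the input is hostile.</p>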
|
||||
<hr>
|
||||
<h2 id="encryption_unsupported-fatal"><a class="header" href="#encryption_unsupported-fatal">ENCRYPTION_UNSUPPORTED fatal</a></h2>
|
||||
<p><strong>What it means</strong>: The PDF is encrypted with an unsupported handler, or the required password is missing or wrong.</p>
|
||||
<p><strong>Cause</strong>:</p>
|
||||
<ul>
|
||||
<li>PDF encrypted with an unknown handler (e.g., Adobe LiveCycle policy server)</li>
|
||||
<li>PDF password-protected but no password (or wrong password) supplied</li>
|
||||
</ul>
|
||||
<p><strong>Fix</strong>:</p>
|
||||
<pre><code class="language-bash"># Supply password via environment variable
|
||||
export PDFTRACT_PASSWORD="your-password"
|
||||
pdftract extract document.pdf
|
||||
|
||||
# Or via stdin
|
||||
echo "your-password" | pdftract extract --password-stdin document.pdf
|
||||
</code></pre>
|
||||
<p>If the handler is unsupported (e.g., Adobe LiveCycle), decrypt the file with an Adobe-side tool first. If the password is unknown, a dedicated password recovery tool such as <code>pdfcrack</code> or <code>john</code> may help.</p>
|
||||
<p><strong>Severity</strong>: fatal (process exits with code 3)</p>
|
||||
<hr>
|
||||
<h2 id="ocr_jbig2_unsupported--ocr_jpx_unsupported--ocr_ccitt_unsupported-warning"><a class="header" href="#ocr_jbig2_unsupported--ocr_jpx_unsupported--ocr_ccitt_unsupported-warning">OCR_JBIG2_UNSUPPORTED / OCR_JPX_UNSUPPORTED / OCR_CCITT_UNSUPPORTED warning</a></h2>
|
||||
<p><strong>What it means</strong>: A page contains an image that requires a decoder not available in the current build.</p>
|
||||
<p><strong>Cause</strong>:</p>
|
||||
<ul>
|
||||
<li><code>OCR_JBIG2_UNSUPPORTED</code>: JBIG2-encoded image (rare)</li>
|
||||
<li><code>OCR_JPX_UNSUPPORTED</code>: JPEG 2000-encoded image</li>
|
||||
<li><code>OCR_CCITT_UNSUPPORTED</code>: CCITT fax-encoded image</li>
|
||||
</ul>
|
||||
<p><strong>Fix</strong>:</p>
|
||||
<pre><code class="language-bash"># Build with full-render feature (enables all decoders via PDFium)
|
||||
cargo build --release --features full-render
|
||||
|
||||
# Or install system libraries:
|
||||
# - JPX: install libopenjp2
|
||||
# - CCITT: install libtiff
|
||||
</code></pre>
|
||||
<p><strong>Severity</strong>: warn (page skipped from OCR; extraction continues)</p>
|
||||
<hr>
|
||||
<h2 id="brokenvector_ocr_unavailable-warning"><a class="header" href="#brokenvector_ocr_unavailable-warning">BROKENVECTOR_OCR_UNAVAILABLE warning</a></h2>
|
||||
<p><strong>What it means</strong>: A page contains broken vector graphics that could be recovered via OCR, but the OCR feature is disabled.</p>
|
||||
<p><strong>Cause</strong>: Build was compiled without the <code>ocr</code> feature.</p>
|
||||
<p><strong>Fix</strong>: Rebuild with OCR enabled:</p>
|
||||
<pre><code class="language-bash">cargo build --release --features ocr
|
||||
</code></pre>
|
||||
<p><strong>Severity</strong>: warn (broken vector graphics not recovered; extraction continues)</p>
|
||||
<hr>
|
||||
<h2 id="mcp_path_traversal--path_outside_root-error"><a class="header" href="#mcp_path_traversal--path_outside_root-error">MCP_PATH_TRAVERSAL / PATH_OUTSIDE_ROOT error</a></h2>
|
||||
<p><strong>What it means</strong>: (MCP mode) The requested path escapes the <code>--root</code> directory boundary.</p>
|
||||
<p><strong>Cause</strong>: A tool call attempted path traversal (e.g., <code>../../etc/passwd</code>).</p>
|
||||
<p><strong>Fix</strong>:</p>
|
||||
<ul>
|
||||
<li>Adjust the requested path to stay within <code>--root</code></li>
|
||||
<li>Or restart the MCP server without <code>--root</code> restriction (not recommended for multi-tenant deployments)</li>
|
||||
</ul>
|
||||
<p><strong>Severity</strong>: error (request rejected)</p>
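<p>A root-boundary check of this kind is commonly implemented by resolving the requested path and confirming it still lies under the root. The sketch below uses Python's <code>pathlib</code> (Python 3.9 or later for <code>is_relative_to</code>); <code>resolve_within_root</code> is a hypothetical name, and this is not the MCP server's actual code.</p>

```python
from pathlib import Path


def resolve_within_root(root, requested):
    """Resolve `requested` under `root`, rejecting anything that escapes it."""
    root_path = Path(root).resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    candidate = (root_path / requested).resolve()
    if not candidate.is_relative_to(root_path):
        raise PermissionError(f"path escapes root: {requested}")
    return candidate


print(resolve_within_root("/tmp/root", "reports/q1.pdf"))
try:
    resolve_within_root("/tmp/root", "../../etc/passwd")
except PermissionError as err:
    print("rejected:", err)
```

<p>Comparing resolved paths, rather than scanning the string for <code>..</code>, also catches absolute paths and symlinks that point outside the root.</p>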
|
||||
<hr>
|
||||
<h2 id="url_private_network-error"><a class="header" href="#url_private_network-error">URL_PRIVATE_NETWORK error</a></h2>
|
||||
<p><strong>What it means</strong>: Remote fetch blocked because the URL targets a private network address.</p>
|
||||
<p><strong>Cause</strong>: URL targets localhost, private IP ranges (RFC 1918), or link-local addresses. This is an SSRF (Server-Side Request Forgery) protection.</p>
|
||||
<p><strong>Fix</strong>:</p>
|
||||
<pre><code class="language-bash"># If you trust the URL, allow private networks:
|
||||
pdftract extract --allow-private-networks https://internal-server/docs.pdf
|
||||
</code></pre>
|
||||
<p><strong>Severity</strong>: error (request rejected with HTTP 400 in serve mode)</p>
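<p>To make the address classes concrete, the sketch below classifies an already-resolved IP address using only the Rust standard library. The function name <code>is_blocked_address</code> and the exact set of blocked ranges are assumptions based on the cause listed above, not pdftract's actual rule set.</p>

```rust
use std::net::IpAddr;

/// Returns true for loopback, RFC 1918 private, link-local, and unspecified
/// addresses. Hypothetical helper; the blocked set is an assumption.
fn is_blocked_address(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => {
            v4.is_loopback() || v4.is_private() || v4.is_link_local() || v4.is_unspecified()
        }
        IpAddr::V6(v6) => {
            // IPv4-mapped addresses (::ffff:a.b.c.d) are judged by their IPv4 part.
            if let Some(v4) = v6.to_ipv4_mapped() {
                return is_blocked_address(IpAddr::V4(v4));
            }
            let first = v6.segments()[0];
            v6.is_loopback()
                || v6.is_unspecified()
                || (first & 0xfe00) == 0xfc00 // fc00::/7, unique local
                || (first & 0xffc0) == 0xfe80 // fe80::/10, link-local
        }
    }
}
```

<p>A check of this kind is only meaningful when applied to the addresses a hostname actually resolves to, since a public-looking name can point at a private address.</p>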
|
||||
<hr>
|
||||
<h2 id="cache_entry_corrupt-warning"><a class="header" href="#cache_entry_corrupt-warning">CACHE_ENTRY_CORRUPT warning</a></h2>
|
||||
<p><strong>What it means</strong>: A cache entry failed integrity verification.</p>
|
||||
<p><strong>Cause</strong>: Cache file corruption (disk error, concurrent write, etc.).</p>
|
||||
<p><strong>Fix</strong>: None needed: the entry is automatically deleted and extraction re-runs. If this recurs frequently, check the disk and filesystem for errors.</p>
|
||||
<p><strong>Severity</strong>: warn (entry deleted; extraction re-runs)</p>
|
||||
<hr>
|
||||
<h2 id="cache_integrity_fail-diagnostic"><a class="header" href="#cache_integrity_fail-diagnostic">CACHE_INTEGRITY_FAIL diagnostic</a></h2>
|
||||
<p><strong>What it means</strong>: A cache entry’s HMAC verification failed, indicating potential cache poisoning.</p>
|
||||
<p><strong>Cause</strong>: A malicious co-tenant wrote a forged cache entry (in multi-user cache scenarios), or the entry was damaged by disk corruption.</p>
|
||||
<p><strong>Fix</strong>: The entry is treated as a cache miss and extraction re-runs. In multi-user environments, ensure per-user cache directories or verify cache permissions.</p>
|
||||
<p><strong>Severity</strong>: warn (entry rejected; extraction re-runs)</p>
|
||||
<hr>
|
||||
<h2 id="profile_invalid--profile_secrets_forbidden-error"><a class="header" href="#profile_invalid--profile_secrets_forbidden-error">PROFILE_INVALID / PROFILE_SECRETS_FORBIDDEN error</a></h2>
|
||||
<p><strong>What it means</strong>: Profile YAML failed validation.</p>
|
||||
<p><strong>Cause</strong>:</p>
|
||||
<ul>
|
||||
<li><code>PROFILE_INVALID</code>: YAML syntax error or schema violation</li>
|
||||
<li><code>PROFILE_SECRETS_FORBIDDEN</code>: Profile contains secret-keyword keys (<code>password:</code>, <code>token:</code>, <code>secret:</code>, <code>api_key:</code>)</li>
|
||||
</ul>
|
||||
<p><strong>Fix</strong>:</p>
|
||||
<pre><code class="language-bash"># For schema errors, check the YAML syntax:
|
||||
pdftract profile show --profile-path your-profile.yaml
|
||||
|
||||
# For secrets errors, remove secret keys from the profile.
|
||||
# Secrets should be passed via environment variables, not profiles.
|
||||
</code></pre>
|
||||
<p><strong>Severity</strong>: error (profile rejected)</p>
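<p>As a rough pre-flight check before loading a profile, the key names listed above can be searched for with a simple line scan. This is a sketch under stated assumptions: <code>find_secret_keys</code> is a hypothetical helper, it matches exact key names only, and it does not parse YAML the way a real validator would.</p>

```rust
/// Key names taken from the PROFILE_SECRETS_FORBIDDEN description above.
const FORBIDDEN_KEYS: [&str; 4] = ["password", "token", "secret", "api_key"];

/// Return (1-based line number, key) for every line whose mapping key is one
/// of the forbidden names. Line-based heuristic, not a YAML parser.
fn find_secret_keys(profile_yaml: &str) -> Vec<(usize, String)> {
    let mut hits = Vec::new();
    for (index, line) in profile_yaml.lines().enumerate() {
        // Ignore indentation and a leading list marker.
        let trimmed = line.trim_start().trim_start_matches("- ");
        if trimmed.starts_with('#') {
            continue; // comment line
        }
        if let Some((key, _value)) = trimmed.split_once(':') {
            let key = key
                .trim()
                .trim_matches(|c| c == '"' || c == '\'')
                .to_ascii_lowercase();
            if FORBIDDEN_KEYS.contains(&key.as_str()) {
                hits.push((index + 1, key));
            }
        }
    }
    hits
}
```

<p>Any hit points at a key to remove from the profile and supply through an environment variable instead.</p>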
|
||||
<hr>
|
||||
<h2 id="page_out_of_range-warning"><a class="header" href="#page_out_of_range-warning">PAGE_OUT_OF_RANGE warning</a></h2>
|
||||
<p><strong>What it means</strong>: The <code>--pages</code> argument exceeds the document’s actual page count.</p>
|
||||
<p><strong>Cause</strong>: Page range specified (e.g., <code>--pages 1-100</code>) on a document with fewer pages (e.g., 10 pages).</p>
|
||||
<p><strong>Fix</strong>: Adjust the <code>--pages</code> argument to the actual page count:</p>
|
||||
<pre><code class="language-bash"># First, get the page count:
|
||||
pdftract inspect document.pdf | jq '.page_count'
|
||||
|
||||
# Then extract with a valid range:
|
||||
pdftract extract --pages 1-10 document.pdf
|
||||
</code></pre>
|
||||
<p><strong>Severity</strong>: warn (pages clamped to available range)</p>
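<p>The clamping behaviour can be illustrated with a small function. This is a sketch, not pdftract's code: <code>clamp_page_range</code> is a hypothetical name, and returning <code>None</code> when the range starts past the last page is an assumption, since the text above only states that pages are clamped.</p>

```rust
/// Clamp an inclusive, 1-based page range to a document with `page_count`
/// pages. Returns None when no page of the range exists (assumed behaviour).
fn clamp_page_range(first: u32, last: u32, page_count: u32) -> Option<(u32, u32)> {
    if first == 0 || first > last || first > page_count {
        return None;
    }
    Some((first, last.min(page_count)))
}
```

<p>With the example from the cause above, a request for pages 1-100 on a 10-page document is reduced to pages 1-10.</p>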
|
||||
<hr>
|
||||
<h2 id="glyph_unmapped-warning"><a class="header" href="#glyph_unmapped-warning">GLYPH_UNMAPPED warning</a></h2>
|
||||
<p><strong>What it means</strong>: A glyph could not be resolved by any of the four encoding levels.</p>
|
||||
<p><strong>Cause</strong>: Font encoding corruption, missing font embedding, or non-standard encoding.</p>
|
||||
<p><strong>Fix</strong>: Output contains the Unicode replacement character (⍰). No direct fix; consider re-saving the PDF through a normalizing tool (e.g., Adobe Acrobat, qpdf).</p>
|
||||
<p><strong>Severity</strong>: warn (character replaced with U+FFFD; extraction continues)</p>
|
||||
<hr>
|
||||
<h2 id="javascript_present-info"><a class="header" href="#javascript_present-info">JAVASCRIPT_PRESENT info</a></h2>
|
||||
<p><strong>What it means</strong>: PDF contains embedded JavaScript (in <code>/AA</code>, <code>/OpenAction</code>, or <code>/JS</code> entries).</p>
|
||||
<p><strong>Cause</strong>: PDF includes JavaScript actions (common in forms, interactive documents).</p>
|
||||
<p><strong>Fix</strong>: None needed for extraction — pdftract NEVER executes embedded JavaScript. JavaScript actions are surfaced in <code>metadata.javascript_actions[]</code> for downstream review.</p>
|
||||
<p><strong>Severity</strong>: info (JavaScript is not executed)</p>
|
||||
<hr>
|
||||
<h2 id="struct_circular_ref--struct_xobject_cycle--gstate_stack_overflow-warning"><a class="header" href="#struct_circular_ref--struct_xobject_cycle--gstate_stack_overflow-warning">STRUCT_CIRCULAR_REF / STRUCT_XOBJECT_CYCLE / GSTATE_STACK_OVERFLOW warning</a></h2>
|
||||
<p><strong>What it means</strong>: PDF contains circular references or malformed content streams.</p>
|
||||
<p><strong>Cause</strong>:</p>
|
||||
<ul>
|
||||
<li><code>STRUCT_CIRCULAR_REF</code>: Indirect object reference cycle</li>
|
||||
<li><code>STRUCT_XOBJECT_CYCLE</code>: XObject (image/form) reference cycle</li>
|
||||
<li><code>GSTATE_STACK_OVERFLOW</code>: Graphics state stack exceeds depth limit</li>
|
||||
</ul>
|
||||
<p><strong>Fix</strong>: Usually no action needed — pdftract breaks cycles at the second visit (or depth 20 for XObjects). If output is incomplete, investigate the source PDF for a producer bug.</p>
|
||||
<p><strong>Severity</strong>: warn (cycle broken; extraction continues)</p>
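<p>The cycle-breaking rule can be sketched as a depth-first walk that refuses to re-enter an object already on the current path and stops at a fixed nesting depth. The graph representation, the names <code>traverse</code> and <code>MAX_DEPTH</code>, and the choice to apply the depth limit to every object are assumptions made for illustration.</p>

```rust
use std::collections::{HashMap, HashSet};

/// Depth limit taken from the text above ("depth 20 for XObjects").
const MAX_DEPTH: usize = 20;

/// Depth-first walk over an object graph keyed by object id.
/// Returns the ids in visit order plus the number of edges that were cut.
fn traverse(graph: &HashMap<u32, Vec<u32>>, start: u32) -> (Vec<u32>, usize) {
    fn walk(
        graph: &HashMap<u32, Vec<u32>>,
        id: u32,
        depth: usize,
        on_path: &mut HashSet<u32>,
        order: &mut Vec<u32>,
        cuts: &mut usize,
    ) {
        // Cut when nesting is too deep, or when `id` is already on the
        // current path (this would be its second visit).
        if depth >= MAX_DEPTH || !on_path.insert(id) {
            *cuts += 1;
            return;
        }
        order.push(id);
        if let Some(children) = graph.get(&id) {
            for &child in children {
                walk(graph, child, depth + 1, on_path, order, cuts);
            }
        }
        on_path.remove(&id);
    }

    let mut on_path = HashSet::new();
    let mut order = Vec::new();
    let mut cuts = 0;
    walk(graph, start, 0, &mut on_path, &mut order, &mut cuts);
    (order, cuts)
}
```

<p>Each cut corresponds to one place where a warning such as <code>STRUCT_CIRCULAR_REF</code> could be reported while the rest of the walk continues.</p>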
|
||||
<hr>
|
||||
<h2 id="remote_fetch_interrupted-error"><a class="header" href="#remote_fetch_interrupted-error">REMOTE_FETCH_INTERRUPTED error</a></h2>
|
||||
<p><strong>What it means</strong>: Remote fetch was interrupted (network timeout, connection reset, etc.).</p>
|
||||
<p><strong>Cause</strong>: Network connectivity issues, server timeout, or premature connection close.</p>
|
||||
<p><strong>Fix</strong>: Retry the request; check network connectivity:</p>
|
||||
<pre><code class="language-bash"># Retry with increased timeout:
|
||||
pdftract extract --timeout-seconds 120 https://example.com/document.pdf
|
||||
</code></pre>
|
||||
<p><strong>Severity</strong>: error (request aborted)</p>
|
||||
<hr>
|
||||
<h2 id="remote_no_range_support-warning"><a class="header" href="#remote_no_range_support-warning">REMOTE_NO_RANGE_SUPPORT warning</a></h2>
|
||||
<p><strong>What it means</strong>: Remote server does not support HTTP Range requests.</p>
|
||||
<p><strong>Cause</strong>: Server omits the <code>Accept-Ranges: bytes</code> header, or ignores the <code>Range</code> header and answers with 200 OK (the full body) instead of 206 Partial Content.</p>
|
||||
<p><strong>Fix</strong>: None needed — pdftract falls back to whole-file download. For large files, consider hosting on a Range-supporting server.</p>
|
||||
<p><strong>Severity</strong>: warn (fallback to whole-file download)</p>
|
||||
<hr>
|
||||
<h2 id="tagged_pdf_struct_tree_deferred-info"><a class="header" href="#tagged_pdf_struct_tree_deferred-info">TAGGED_PDF_STRUCT_TREE_DEFERRED info</a></h2>
|
||||
<p><strong>What it means</strong>: Tagged PDF structure tree extraction is deferred in this version.</p>
|
||||
<p><strong>Cause</strong>: Phase 7.1 (full structure tree extraction) is not yet implemented.</p>
|
||||
<p><strong>Fix</strong>: None needed — this is a temporary fallback. Structure tree extraction will be added in v1.0.0.</p>
|
||||
<p><strong>Severity</strong>: info (structure tree not extracted)</p>
|
||||
<hr>
|
||||
<h2 id="getting-help"><a class="header" href="#getting-help">Getting Help</a></h2>
|
||||
<p>If you encounter a diagnostic code not listed here, or the suggested fix doesn’t resolve your issue:</p>
|
||||
<ol>
|
||||
<li><strong>Check the <a href="./troubleshooting/diagnostics.html">Diagnostics Reference</a></strong> for the full catalog</li>
|
||||
<li><strong>Search existing issues</strong> on <a href="https://github.com/jedarden/pdftract/issues">GitHub</a></li>
|
||||
<li><strong>Open a new issue</strong> with:
|
||||
<ul>
|
||||
<li>The diagnostic code(s)</li>
|
||||
<li>A minimal reproducible example (PDF or command)</li>
|
||||
<li>The <code>--debug</code> output if safe to share</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ol>
|
||||
<h2 id="related-documentation"><a class="header" href="#related-documentation">Related Documentation</a></h2>
|
||||
<ul>
|
||||
<li><a href="./troubleshooting/diagnostics.html">Diagnostics Reference</a> — Full diagnostic code catalog</li>
|
||||
<li><a href="./faq.html">FAQ</a> — Common questions and answers</li>
|
||||
<li><a href="./advanced/ocr.html">Advanced: OCR Configuration</a> — OCR troubleshooting details</li>
|
||||
</ul>
|
||||
|
||||
</main>
|
||||
|
||||
<nav class="nav-wrapper" aria-label="Page navigation">
|
||||
<!-- Mobile navigation buttons -->
|
||||
<a rel="prev" href="../quickstart.html" class="mobile-nav-chapters previous" title="Previous chapter" aria-label="Previous chapter" aria-keyshortcuts="Left">
|
||||
<a rel="prev" href="advanced/provenance.html" class="mobile-nav-chapters previous" title="Previous chapter" aria-label="Previous chapter" aria-keyshortcuts="Left">
|
||||
<span class=fa-svg><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 320 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M41.4 233.4c-12.5 12.5-12.5 32.8 0 45.3l160 160c12.5 12.5 32.8 12.5 45.3 0s12.5-32.8 0-45.3L109.3 256 246.6 118.6c12.5-12.5 12.5-32.8 0-45.3s-32.8-12.5-45.3 0l-160 160z"/></svg></span>
|
||||
</a>
|
||||
|
||||
<a rel="next prefetch" href="../cli/global-options.html" class="mobile-nav-chapters next" title="Next chapter" aria-label="Next chapter" aria-keyshortcuts="Right">
|
||||
<a rel="next prefetch" href="troubleshooting/index.html" class="mobile-nav-chapters next" title="Next chapter" aria-label="Next chapter" aria-keyshortcuts="Right">
|
||||
<span class=fa-svg><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 320 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M278.6 233.4c12.5 12.5 12.5 32.8 0 45.3l-160 160c-12.5 12.5-32.8 12.5-45.3 0s-12.5-32.8 0-45.3L210.7 256 73.4 118.6c-12.5-12.5-12.5-32.8 0-45.3s32.8-12.5 45.3 0l160 160z"/></svg></span>
|
||||
</a>
|
||||
|
||||
|
|
@ -204,11 +410,11 @@
|
|||
</div>
|
||||
|
||||
<nav class="nav-wide-wrapper" aria-label="Page navigation">
|
||||
<a rel="prev" href="../quickstart.html" class="nav-chapters previous" title="Previous chapter" aria-label="Previous chapter" aria-keyshortcuts="Left">
|
||||
<a rel="prev" href="advanced/provenance.html" class="nav-chapters previous" title="Previous chapter" aria-label="Previous chapter" aria-keyshortcuts="Left">
|
||||
<span class=fa-svg><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 320 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M41.4 233.4c-12.5 12.5-12.5 32.8 0 45.3l160 160c12.5 12.5 32.8 12.5 45.3 0s12.5-32.8 0-45.3L109.3 256 246.6 118.6c12.5-12.5 12.5-32.8 0-45.3s-32.8-12.5-45.3 0l-160 160z"/></svg></span>
|
||||
</a>
|
||||
|
||||
<a rel="next prefetch" href="../cli/global-options.html" class="nav-chapters next" title="Next chapter" aria-label="Next chapter" aria-keyshortcuts="Right">
|
||||
<a rel="next prefetch" href="troubleshooting/index.html" class="nav-chapters next" title="Next chapter" aria-label="Next chapter" aria-keyshortcuts="Right">
|
||||
<span class=fa-svg><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 320 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M278.6 233.4c12.5 12.5 12.5 32.8 0 45.3l-160 160c-12.5 12.5-32.8 12.5-45.3 0s-12.5-32.8 0-45.3L210.7 256 73.4 118.6c-12.5-12.5-12.5-32.8 0-45.3s32.8-12.5 45.3 0l160 160z"/></svg></span>
|
||||
</a>
|
||||
</nav>
|
||||
|
|
@ -228,13 +434,13 @@
|
|||
</script>
|
||||
|
||||
|
||||
<script src="../elasticlunr-ef4e11c1.min.js"></script>
|
||||
<script src="../mark-09e88c2c.min.js"></script>
|
||||
<script src="../searcher-c2a407aa.js"></script>
|
||||
<script src="elasticlunr-ef4e11c1.min.js"></script>
|
||||
<script src="mark-09e88c2c.min.js"></script>
|
||||
<script src="searcher-c2a407aa.js"></script>
|
||||
|
||||
<script src="../clipboard-1626706a.min.js"></script>
|
||||
<script src="../highlight-abc7f01d.js"></script>
|
||||
<script src="../book-a0b12cfe.js"></script>
|
||||
<script src="clipboard-1626706a.min.js"></script>
|
||||
<script src="highlight-abc7f01d.js"></script>
|
||||
<script src="book-a0b12cfe.js"></script>
|
||||
|
||||
<!-- Custom JS scripts -->
|
||||
|
||||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "../";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "../searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="../toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "../";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "../searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="../toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "../";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "../searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="../toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
@ -190,7 +190,7 @@
|
|||
|
||||
<nav class="nav-wrapper" aria-label="Page navigation">
|
||||
<!-- Mobile navigation buttons -->
|
||||
<a rel="prev" href="../advanced/provenance.html" class="mobile-nav-chapters previous" title="Previous chapter" aria-label="Previous chapter" aria-keyshortcuts="Left">
|
||||
<a rel="prev" href="../troubleshooting.html" class="mobile-nav-chapters previous" title="Previous chapter" aria-label="Previous chapter" aria-keyshortcuts="Left">
|
||||
<span class=fa-svg><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 320 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M41.4 233.4c-12.5 12.5-12.5 32.8 0 45.3l160 160c12.5 12.5 32.8 12.5 45.3 0s12.5-32.8 0-45.3L109.3 256 246.6 118.6c12.5-12.5 12.5-32.8 0-45.3s-32.8-12.5-45.3 0l-160 160z"/></svg></span>
|
||||
</a>
|
||||
|
||||
|
|
@ -204,7 +204,7 @@
|
|||
</div>
|
||||
|
||||
<nav class="nav-wide-wrapper" aria-label="Page navigation">
|
||||
<a rel="prev" href="../advanced/provenance.html" class="nav-chapters previous" title="Previous chapter" aria-label="Previous chapter" aria-keyshortcuts="Left">
|
||||
<a rel="prev" href="../troubleshooting.html" class="nav-chapters previous" title="Previous chapter" aria-label="Previous chapter" aria-keyshortcuts="Left">
|
||||
<span class=fa-svg><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 320 512"><!--! Font Awesome Free 6.2.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License) Copyright 2022 Fonticons, Inc. --><path d="M41.4 233.4c-12.5 12.5-12.5 32.8 0 45.3l160 160c12.5 12.5 32.8 12.5 45.3 0s12.5-32.8 0-45.3L109.3 256 246.6 118.6c12.5-12.5 12.5-32.8 0-45.3s-32.8-12.5-45.3 0l-160 160z"/></svg></span>
|
||||
</a>
|
||||
|
||||
|
|
|
|||
|
|
@ -35,10 +35,10 @@
|
|||
const path_to_root = "../";
|
||||
const default_light_theme = "light";
|
||||
const default_dark_theme = "navy";
|
||||
window.path_to_searchindex_js = "../searchindex-fc6d8bf8.js";
|
||||
window.path_to_searchindex_js = "../searchindex-b0453933.js";
|
||||
</script>
|
||||
<!-- Start loading toc.js asap -->
|
||||
<script src="../toc-d0f907c9.js"></script>
|
||||
<script src="../toc-224e0484.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="mdbook-help-container">
|
||||
|
|
|
|||
|
|
@ -1,80 +1,78 @@
|
|||
# Verification Note: pdftract-1j0f8 (CLI Reference Documentation)
|
||||
# pdftract-1j0f8: CLI Reference Documentation
|
||||
|
||||
**Date:** 2025-06-01
|
||||
**Bead:** pdftract-1j0f8
|
||||
**Task:** Author docs/user-docs/src/cli-reference.md with auto-generation and CI gate
|
||||
## Summary
|
||||
|
||||
Verified CLI reference documentation infrastructure. Fixed a clap configuration bug that prevented the generator from running (duplicate short option `-s` in `conformance` subcommand).
|
||||
|
||||
## Work Completed
|
||||
|
||||
### 1. CLI Reference Documentation
|
||||
- **Status:** PASS - Already exists and is comprehensive
|
||||
- **File:** `docs/user-docs/src/cli-reference.md`
|
||||
- **Content:** Complete documentation for all subcommands and flags
|
||||
- **Structure:**
|
||||
- Header with AUTOGEN END marker for auto-generated content
|
||||
- Hand-curated content preserved after marker
|
||||
- Covers: extract, classify, grep, inspect, verify-receipt, hash, cache, profiles, serve, mcp, doctor
|
||||
### 1. Bug Fix: Clap Short Flag Conflict
|
||||
**File:** `crates/pdftract-cli/src/cli.rs`
|
||||
|
||||
### 2. Auto-Generation Tool
|
||||
- **Status:** PASS - Already implemented
|
||||
- **File:** `crates/pdftract-cli/src/gen_cli_reference.rs`
|
||||
- **Tool:** clap-markdown crate (integrated in Cargo.toml)
|
||||
- **Command:** `cargo run --bin gen-cli-reference -- --output docs/user-docs/src/cli-reference.md`
|
||||
- **Features:**
|
||||
- Generates markdown from clap definitions
|
||||
- Preserves hand-curated content after AUTOGEN END marker
|
||||
- Uses `help_markdown_custom` with MarkdownOptions
|
||||
**Problem:** The `conformance` subcommand had duplicate short options:
|
||||
- `--suite` used `-s`
|
||||
- `--sdk` used `-s` (conflict!)
|
||||
|
||||
### 3. mdBook Integration
|
||||
- **Status:** PASS - Updated
|
||||
- **File:** `docs/user-docs/src/SUMMARY.md`
|
||||
- **Change:** Updated link from `cli/README.md` to `cli-reference.md`
|
||||
- **Result:** CLI reference now properly linked in docs navigation
|
||||
**Solution:** Changed `--sdk` short option to `-k` (as used in CI workflow).
|
||||
|
||||
### 4. CI Gate
|
||||
- **Status:** PASS - Added
|
||||
- **File:** `.ci/argo-workflows/pdftract-ci.yaml`
|
||||
- **Changes:**
|
||||
1. Added `cli-ref-gen` task to quality-matrix DAG
|
||||
2. Created cli-ref-gen template (similar to schema-gen)
|
||||
3. Updated exit handler step outcomes
|
||||
- **Gate Logic:**
|
||||
- Runs `cargo run --bin gen-cli-reference` in CI container
|
||||
- Compares regenerated output to committed file
|
||||
- Fails build if diff detected
|
||||
- Provides reproduction instructions in error message
|
||||
**Before:**
|
||||
```rust
|
||||
#[arg(short, long, default_value = "pdftract")]
|
||||
sdk: String,
|
||||
```
|
||||
|
||||
### 5. Build Environment Issue
|
||||
- **Status:** WARN - Cannot verify build locally due to Nix cc permission issues
|
||||
- **Issue:** Permission denied when executing gcc during cargo build
|
||||
- **Workaround:** CI uses `ronaldraygun/pdftract-test-glibc:1.78` container which has proper build environment
|
||||
- **Verification:** The gen-cli-reference.rs code is correct and follows clap-markdown API
|
||||
**After:**
|
||||
```rust
|
||||
#[arg(short = 'k', long, default_value = "pdftract")]
|
||||
sdk: String,
|
||||
```
|
||||
|
||||
### 2. Verification Tests
|
||||
|
||||
1. **CLI Reference Generation:**
|
||||
```bash
|
||||
cargo run --bin gen-cli-reference -- --output /tmp/cli-reference-test.md
|
||||
```
|
||||
Result: PASS - Generated successfully with preserved hand-curated content.
|
||||
|
||||
2. **mdBook Build:**
|
||||
```bash
|
||||
cd docs/user-docs && mdbook build
|
||||
```
|
||||
Result: PASS - HTML book built successfully to `build/user-docs/`.
|
||||
|
||||
3. **CI Gate Check:**
|
||||
The `cli-ref-gen` template in `.ci/argo-workflows/pdftract-ci.yaml` (lines 1952-2042) correctly:
|
||||
- Regenerates CLI reference via `cargo run --bin gen-cli-reference`
|
||||
- Compares output to committed file
|
||||
- Fails build on any diff
|
||||
|
||||
## Acceptance Criteria Status
|
||||
|
||||
| Criterion | Status | Notes |
|
||||
|-----------|--------|-------|
|
||||
| cli-reference.md exists and is non-trivial | PASS | Comprehensive documentation exists |
|
||||
| Auto-gen step compiles and runs in mdBook build | N/A | Uses cargo binary, not mdBook preprocessor |
|
||||
| CI gate fails on stale cli-reference.md | PASS | Added cli-ref-gen template to quality-matrix |
|
||||
| mdBook renders the page without errors | PASS | Updated SUMMARY.md link |
|
||||
**PASS:**
|
||||
- cli-reference.md exists at `docs/user-docs/src/cli-reference.md`
|
||||
- Auto-gen compiles and runs: `cargo run --bin gen-cli-reference`
|
||||
- CI gate `cli-ref-gen` fails on stale content
|
||||
- mdBook builds and renders without errors
|
||||
- cli-reference.md is included in SUMMARY.md
|
||||
|
||||
## Artifacts Produced
|
||||
**WARN:**
|
||||
- None
|
||||
|
||||
1. **docs/user-docs/src/SUMMARY.md** - Updated CLI reference link
|
||||
2. **.ci/argo-workflows/pdftract-ci.yaml** - Added cli-ref-gen quality gate
|
||||
**FAIL:**
|
||||
- None
|
||||
|
||||
## Implementation Notes
|
||||
## Commit
|
||||
|
||||
The CLI reference uses a hybrid approach:
|
||||
- Auto-generated content from clap definitions (before AUTOGEN END marker)
|
||||
- Hand-curated content (after marker, preserved across regenerations)
|
||||
- **Files Changed:**
|
||||
- `crates/pdftract-cli/src/cli.rs`: Fixed short flag conflict
|
||||
|
||||
This matches the pattern used for schema generation, ensuring consistency across documentation tooling.
|
||||
## Retrospective
|
||||
|
||||
## References
|
||||
**What worked:** The CLI reference infrastructure was already complete with clap-markdown, CI gate, and mdBook integration.
|
||||
|
||||
- Plan section: DOC epic
|
||||
- clap-markdown crate: https://crates.io/crates/clap-markdown
|
||||
- Coordinator: pdftract-53no (parent — 5-page user docs bundle)
|
||||
- Sibling: schema-reference, sdk quickstarts, troubleshooting, FAQ
|
||||
**What didn't:** The clap configuration bug prevented the generator from running - needed to debug panic output to find the duplicate short option.
|
||||
|
||||
**Surprise:** The `-s` conflict had existed unnoticed because nothing exercised the generator; the CI gate would have caught it the first time the docs needed regeneration.
|
||||
|
||||
**Reusable pattern:** When adding clap short options, always check for conflicts within the same subcommand context.
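
The conflict check in the reusable pattern can be sketched without any dependencies. The helper below is hypothetical (`duplicate_shorts` is not part of pdftract or clap); it takes the long name and short flag of each option in one subcommand and reports every short flag claimed more than once.

```rust
use std::collections::BTreeMap;

/// Given (long name, short flag) pairs for one subcommand, return each short
/// flag used by more than one option, with the long names that claim it.
fn duplicate_shorts(args: &[(&str, Option<char>)]) -> Vec<(char, Vec<String>)> {
    let mut by_short: BTreeMap<char, Vec<String>> = BTreeMap::new();
    for (long, short) in args {
        if let Some(c) = short {
            by_short.entry(*c).or_default().push(long.to_string());
        }
    }
    by_short
        .into_iter()
        .filter(|(_, longs)| longs.len() > 1)
        .collect()
}
```

Applied to the `conformance` options before the fix, `--suite` and `--sdk` both map to `s`; after the change to `-k` the result is empty. In a clap-based CLI the same class of error can usually be caught in a unit test by calling clap's `Command::debug_assert()` on the built command, which is likely simpler than a hand-written check.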
|
||||
|
|
|
|||