docs(contributing): add Argo-CI caveat, DCO sign-off, and contributor templates
- Restructured CONTRIBUTING.md with all nine required sections:
  - Project licensing (MIT OR Apache-2.0, DCO sign-off required)
  - Code of conduct (Contributor Covenant v2.1)
  - Security reporting (link to SECURITY.md)
  - Development setup (with OCR dependencies)
  - Local validation checklist (6 commands matching pdftract-ci)
  - CI on forks caveat (maintainer-triggered, 48-hour response)
  - PR template requirements
  - Commit message style (Conventional Commits)
  - Issue triage
- Created CODE_OF_CONDUCT.md (Contributor Covenant v2.1)
- Created .github/PULL_REQUEST_TEMPLATE.md with required fields:
  - Linked issue or RFC
  - Scope statement (Phase / Acceptance Scenario)
  - Test plan
  - Manual-test evidence
  - Performance impact
- Created issue templates:
  - bug_report.md (with pdftract doctor output requirement)
  - feature_request.md (with use case and proposed solution)
  - performance_regression.md (with baseline vs current)
- Updated README.md with Contributing section linking to CONTRIBUTING.md
- Added footer links to CONTRIBUTING.md in all templates

Closes: pdftract-i9rk
Verification: notes/pdftract-i9rk.md
Signed-off-by: jedarden <github@jedarden.com>
parent db7fcf0097
commit 97fecb7b4b
8 changed files with 668 additions and 35 deletions
66
.github/ISSUE_TEMPLATE/bug_report.md
vendored
Normal file
@@ -0,0 +1,66 @@
---
|
||||
name: Bug report
|
||||
about: Report a problem with pdftract
|
||||
title: '[BUG] '
|
||||
labels: bug
|
||||
assignees: ''
|
||||
---
|
||||
|
||||
## Bug Description
|
||||
|
||||
A clear and concise description of what the bug is.
|
||||
|
||||
## PDF File That Triggered the Bug
|
||||
|
||||
**IMPORTANT:** Please attach the PDF file that causes the bug. If the file is confidential, please sanitize it first or describe the issue in detail.
|
||||
|
||||
- **File:** (attach PDF or describe the issue)
|
||||
- **File size:** (if applicable)
|
||||
- **PDF generator:** (e.g., Acrobat, Word, Ghostscript)
|
||||
|
||||
## `pdftract doctor` Output
|
||||
|
||||
**REQUIRED:** Run `pdftract doctor` and paste the output here.
|
||||
|
||||
```text
|
||||
(paste output here)
|
||||
```
|
||||
|
||||
## Steps to Reproduce
|
||||
|
||||
1. Run this command: `...`
|
||||
2. With this PDF file: `...`
|
||||
3. See this error: `...`
|
||||
|
||||
## Expected Behavior
|
||||
|
||||
What should have happened?
|
||||
|
||||
## Actual Behavior
|
||||
|
||||
What actually happened? Include error messages, stack traces, or incorrect output.
|
||||
|
||||
## Environment
|
||||
|
||||
- **OS:** (e.g., Ubuntu 22.04, macOS 14, Windows 11)
|
||||
- **pdftract version:** (run `pdftract --version`)
|
||||
- **Installation method:** (e.g., cargo install, brew, compiled from source)
|
||||
- **Rust version:** (run `rustc --version`)
|
||||
|
||||
## Additional Context
|
||||
|
||||
Add any other context about the problem here:
|
||||
|
||||
- Logs (attach or paste)
|
||||
- Screenshots (if applicable)
|
||||
- Related issues or PRs
|
||||
- Workarounds you've found
|
||||
|
||||
---
|
||||
|
||||
**Note:** For help with development or contributing to pdftract, see [`CONTRIBUTING.md`](../../CONTRIBUTING.md).
48
.github/ISSUE_TEMPLATE/feature_request.md
vendored
Normal file
@@ -0,0 +1,48 @@
---
|
||||
name: Feature request
|
||||
about: Suggest an enhancement or new feature for pdftract
|
||||
title: '[FEATURE] '
|
||||
labels: enhancement
|
||||
assignees: ''
|
||||
---
|
||||
|
||||
## Feature Description
|
||||
|
||||
A clear and concise description of the feature you'd like to see added.
|
||||
|
||||
## Use Case
|
||||
|
||||
Describe the specific problem this feature would solve. Who would benefit from this feature?
|
||||
|
||||
**Example:**
|
||||
"As a user working with scientific papers, I need to extract tables as structured data so that I can analyze experimental results without manual transcription."
|
||||
|
||||
## Proposed Solution
|
||||
|
||||
How do you envision this feature working?
|
||||
|
||||
- **API:** What would the API look like?
|
||||
- **CLI:** What flags or commands would be added?
|
||||
- **Output format:** JSON, Markdown, CSV, etc.?
|
||||
|
||||
## Alternatives Considered
|
||||
|
||||
Describe any alternative solutions or workarounds you've considered. Why aren't they sufficient?
|
||||
|
||||
## Additional Context
|
||||
|
||||
Add any other context about the feature request here:
|
||||
|
||||
- Links to related issues or PRs
|
||||
- References to similar features in other tools
|
||||
- Example PDF files that demonstrate the need
|
||||
- Draft API designs or pseudocode
|
||||
|
||||
---
|
||||
|
||||
**Note:** For help with development or contributing to pdftract, see [`CONTRIBUTING.md`](../../CONTRIBUTING.md).
80
.github/ISSUE_TEMPLATE/performance_regression.md
vendored
Normal file
@@ -0,0 +1,80 @@
---
|
||||
name: Performance regression
|
||||
about: Report a slowdown or performance issue
|
||||
title: '[PERF] '
|
||||
labels: performance
|
||||
assignees: ''
|
||||
---
|
||||
|
||||
## Performance Issue Description
|
||||
|
||||
A clear and concise description of the performance problem.
|
||||
|
||||
## Baseline vs Current Performance
|
||||
|
||||
**BEFORE (working well):**
|
||||
- Version: (e.g., 0.5.0)
|
||||
- Processing time: (e.g., 2.5 seconds for a 100-page PDF)
|
||||
- Memory usage: (e.g., 150 MB peak)
|
||||
|
||||
**AFTER (regression):**
|
||||
- Version: (e.g., 0.6.0)
|
||||
- Processing time: (e.g., 8 seconds for the same PDF)
|
||||
- Memory usage: (e.g., 600 MB peak)
|
||||
|
||||
## Test Case
|
||||
|
||||
Please provide:
|
||||
1. **PDF file** (attach or link to a representative file)
|
||||
2. **Command used:**
|
||||
```bash
|
||||
pdftract <command> <file>
|
||||
```
|
||||
3. **Benchmark results** (before and after):
|
||||
```bash
|
||||
# Use `hyperfine` or similar for accurate measurements
|
||||
hyperfine --warmup 3 '<old-pdftract-binary> <command> <file>' '<new-pdftract-binary> <command> <file>'
|
||||
```
|
||||
|
||||
## Profiling Data (Optional but Helpful)
|
||||
|
||||
If available, attach profiling output:
|
||||
```bash
|
||||
# Flamegraph (Linux)
|
||||
cargo install flamegraph
|
||||
cargo flamegraph --bin pdftract -- <args>
|
||||
|
||||
# Instruments (macOS)
|
||||
xcrun xctrace record --template 'Time Profiler' --launch -- ./target/release/pdftract <args>
|
||||
|
||||
# perf (Linux)
|
||||
perf record -g cargo run --release -- <args>
|
||||
perf report
|
||||
```
|
||||
|
||||
## Environment
|
||||
|
||||
- **OS:** (e.g., Ubuntu 22.04, macOS 14, Windows 11)
|
||||
- **Hardware:** (CPU, RAM - relevant for performance issues)
|
||||
- **pdftract version:** (run `pdftract --version`)
|
||||
- **Rust version:** (run `rustc --version`)
|
||||
|
||||
## Suspected Cause
|
||||
|
||||
If you have a hypothesis about what's causing the regression (e.g., a specific commit, a new dependency), please describe it here.
|
||||
|
||||
## Additional Context
|
||||
|
||||
Add any other context about the performance issue:
|
||||
|
||||
- Logs or traces
|
||||
- Related issues or PRs
|
||||
- Workarounds (e.g., using an older version)
|
||||
|
||||
---
|
||||
|
||||
**Note:** For help with development or contributing to pdftract, see [`CONTRIBUTING.md`](../../CONTRIBUTING.md).
70
.github/PULL_REQUEST_TEMPLATE.md
vendored
Normal file
@@ -0,0 +1,70 @@
# Pull Request
|
||||
|
||||
## Linked Issue or RFC
|
||||
|
||||
Closes #(issue number)
|
||||
|
||||
## Scope Statement
|
||||
|
||||
Which Phase / which Acceptance Scenario does this PR address?
|
||||
|
||||
- **Phase:** (e.g., Phase 2 - Font Encoding)
|
||||
- **Acceptance Scenario:** (e.g., AS-2.3 - Embedded CMap with predefined CID->Unicode mapping)
|
||||
|
||||
## Summary
|
||||
|
||||
Brief description of what this PR does and why it's necessary.
|
||||
|
||||
## Changes Made
|
||||
|
||||
- List the main changes here
|
||||
- Include file paths and key functions modified
|
||||
- Note any breaking changes
|
||||
|
||||
## Test Plan
|
||||
|
||||
How did you verify this works?
|
||||
|
||||
- [ ] Unit tests pass (`cargo test --workspace --features default`)
|
||||
- [ ] Integration tests pass
|
||||
- [ ] Manual testing completed
|
||||
|
||||
### Test Evidence
|
||||
|
||||
Attach or paste:
|
||||
- Terminal output from test runs
|
||||
- Screenshots (for UI changes)
|
||||
- Example PDF files processed (before/after)
|
||||
|
||||
## Performance Impact
|
||||
|
||||
If this PR touches hot-path code (parsing, text extraction, encoding resolution):
|
||||
|
||||
- [ ] No performance impact (CI changes, documentation, etc.)
|
||||
- [ ] Performance improvement (include benchmarks)
|
||||
- [ ] Performance regression (include justification)
|
||||
|
||||
### Benchmark Results (if applicable)
|
||||
|
||||
```text
|
||||
(paste `cargo bench` output here)
|
||||
```
|
||||
|
||||
## Checklist
|
||||
|
||||
- [ ] My code follows the style guidelines of this project (`cargo fmt`)
|
||||
- [ ] I have performed a self-review of my code
|
||||
- [ ] I have commented my code where necessary, particularly in hard-to-understand areas
|
||||
- [ ] I have made corresponding changes to the documentation
|
||||
- [ ] My changes generate no new warnings (`cargo clippy`)
|
||||
- [ ] I have added tests that prove my fix is effective or that my feature works
|
||||
- [ ] New and existing tests pass locally with `cargo test --workspace --features default`
|
||||
- [ ] I have signed off my commits (`git commit -s`) per the DCO
|
||||
|
||||
## Additional Notes
|
||||
|
||||
Any additional context, screenshots, or considerations for the reviewer.
|
||||
|
||||
---
|
||||
|
||||
**Note:** See [`CONTRIBUTING.md`](../CONTRIBUTING.md) for development setup, local validation checklist, and commit message guidelines.
|
||||
133
CODE_OF_CONDUCT.md
Normal file
@@ -0,0 +1,133 @@
# Contributor Covenant Code of Conduct
|
||||
|
||||
## Our Pledge
|
||||
|
||||
We as members, contributors, and leaders pledge to make participation in our
|
||||
community a harassment-free experience for everyone, regardless of age, body
|
||||
size, visible or invisible disability, ethnicity, sex characteristics, gender
|
||||
identity and expression, level of experience, education, socio-economic status,
|
||||
nationality, personal appearance, race, caste, color, religion, or sexual
|
||||
identity and orientation.
|
||||
|
||||
We pledge to act and interact in ways that contribute to an open, welcoming,
|
||||
diverse, inclusive, and healthy community.
|
||||
|
||||
## Our Standards
|
||||
|
||||
Examples of behavior that contributes to a positive environment for our
|
||||
community include:
|
||||
|
||||
* Demonstrating empathy and kindness toward other people
|
||||
* Being respectful of differing opinions, viewpoints, and experiences
|
||||
* Giving and gracefully accepting constructive feedback
|
||||
* Accepting responsibility and apologizing to those affected by our mistakes,
|
||||
and learning from the experience
|
||||
* Focusing on what is best not just for us as individuals, but for the overall
|
||||
community
|
||||
|
||||
Examples of unacceptable behavior include:
|
||||
|
||||
* The use of sexualized language or imagery, and sexual attention or advances of
|
||||
any kind
|
||||
* Trolling, insulting or derogatory comments, and personal or political attacks
|
||||
* Public or private harassment
|
||||
* Publishing others' private information, such as a physical or email address,
|
||||
without their explicit permission
|
||||
* Other conduct which could reasonably be considered inappropriate in a
|
||||
professional setting
|
||||
|
||||
## Enforcement Responsibilities
|
||||
|
||||
Community leaders are responsible for clarifying and enforcing our standards of
|
||||
acceptable behavior and will take appropriate and fair corrective action in
|
||||
response to any behavior that they deem inappropriate, threatening, offensive,
|
||||
or harmful.
|
||||
|
||||
Community leaders have the right and responsibility to remove, edit, or reject
|
||||
comments, commits, code, wiki edits, issues, and other contributions that are
|
||||
not aligned to this Code of Conduct, and will communicate reasons for moderation
|
||||
decisions when appropriate.
|
||||
|
||||
## Scope
|
||||
|
||||
This Code of Conduct applies within all community spaces, and also applies when
|
||||
an individual is officially representing the community in public spaces.
|
||||
Examples of representing our community include using an official e-mail address,
|
||||
posting via an official social media account, or acting as an appointed
|
||||
representative at an online or offline event.
|
||||
|
||||
## Enforcement
|
||||
|
||||
Instances of abusive, harassing, or otherwise unacceptable behavior may be
|
||||
reported to the community leaders responsible for enforcement at
|
||||
[security@jedarden.com](mailto:security@jedarden.com).
|
||||
|
||||
All complaints will be reviewed and investigated promptly and fairly.
|
||||
|
||||
All community leaders are obligated to respect the privacy and security of the
|
||||
reporter of any incident.
|
||||
|
||||
## Enforcement Guidelines
|
||||
|
||||
Community leaders will follow these Community Impact Guidelines in determining
|
||||
the consequences for any action they deem in violation of this Code of Conduct:
|
||||
|
||||
### 1. Correction
|
||||
|
||||
**Community Impact**: Use of inappropriate language or other behavior deemed
|
||||
unprofessional or unwelcome in the community.
|
||||
|
||||
**Consequence**: A private, written warning from community leaders, providing
|
||||
clarity around the nature of the violation and an explanation of why the
|
||||
behavior was inappropriate. A public apology may be requested.
|
||||
|
||||
### 2. Warning
|
||||
|
||||
**Community Impact**: A violation through a single incident or series of
|
||||
actions.
|
||||
|
||||
**Consequence**: A warning with consequences for continued behavior. No
|
||||
interaction with the people involved, including unsolicited interaction with
|
||||
those enforcing the Code of Conduct, for a specified period of time. This
|
||||
includes avoiding interactions in community spaces as well as external channels
|
||||
like social media. Violating these terms may lead to a temporary or permanent
|
||||
ban.
|
||||
|
||||
### 3. Temporary Ban
|
||||
|
||||
**Community Impact**: A serious violation of community standards, including
|
||||
sustained inappropriate behavior.
|
||||
|
||||
**Consequence**: A temporary ban from any sort of interaction or public
|
||||
communication with the community for a specified period of time. No public or
|
||||
private interaction with the people involved, including unsolicited interaction
|
||||
with those enforcing the Code of Conduct, is allowed during this period.
|
||||
Violating these terms may lead to a permanent ban.
|
||||
|
||||
### 4. Permanent Ban
|
||||
|
||||
**Community Impact**: Demonstrating a pattern of violation of community
|
||||
standards, including sustained inappropriate behavior, harassment of an
|
||||
individual, or aggression toward or disparagement of classes of individuals.
|
||||
|
||||
**Consequence**: A permanent ban from any sort of public interaction within the
|
||||
community.
|
||||
|
||||
## Attribution
|
||||
|
||||
This Code of Conduct is adapted from the [Contributor Covenant][homepage],
|
||||
version 2.1, available at
|
||||
[https://www.contributor-covenant.org/version/2/1/code_of_conduct.html][v2.1].
|
||||
|
||||
Community Impact Guidelines were inspired by
|
||||
[Mozilla's code of conduct enforcement ladder][mozilla coc].
|
||||
|
||||
For answers to common questions about this code of conduct, see the FAQ at
|
||||
[https://www.contributor-covenant.org/faq][faq]. Translations are available at
|
||||
[https://www.contributor-covenant.org/translations][translations].
|
||||
|
||||
[homepage]: https://www.contributor-covenant.org
|
||||
[v2.1]: https://www.contributor-covenant.org/version/2/1/code_of_conduct.html
|
||||
[mozilla coc]: https://github.com/mozilla/diversity
|
||||
[faq]: https://www.contributor-covenant.org/faq
|
||||
[translations]: https://www.contributor-covenant.org/translations
|
||||
231
CONTRIBUTING.md
@@ -2,6 +2,106 @@
|
||||
Thank you for your interest in contributing to pdftract! This document covers the essential workflows for contributors.
|
||||
|
||||
## Licensing and Sign-off
|
||||
|
||||
pdftract is dual-licensed under **MIT OR Apache-2.0**. You may choose either license for your use.
|
||||
|
||||
### Developer Certificate of Origin (DCO)
|
||||
|
||||
This project requires a **Developer Certificate of Origin (DCO)** sign-off on all commits. By signing off, you certify that you wrote the contribution or otherwise have the right to submit it under the project's open-source licenses.
|
||||
|
||||
**To sign your commits, use `git commit --signoff` (or `git commit -s`):**
|
||||
|
||||
```bash
|
||||
git commit -s -m "feat: add some feature"
|
||||
# The "Signed-off-by" trailer is added automatically
|
||||
```
|
||||
|
||||
**No CLA is required.** The DCO is sufficient for this permissive-license project.
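
A missing sign-off is easiest to catch before pushing. The following is a minimal sketch, not part of pdftract's tooling: `has_signoff` is a hypothetical helper name, and the pattern only checks for the general `Signed-off-by: Name <email>` shape of the trailer.

```bash
# Hypothetical helper (not shipped with pdftract): succeeds only if the commit
# message read from stdin contains a DCO trailer of the form
# "Signed-off-by: Name <email>".
has_signoff() {
  grep -Eq '^Signed-off-by: .+ <[^<> ]+@[^<> ]+>$'
}
```

For example, `git log -1 --format=%B | has_signoff` checks the most recent commit, and `git commit --amend --signoff --no-edit` adds the trailer to it if the check fails.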
|
||||
|
||||
### Apache NOTICE File
|
||||
|
||||
The Apache-2.0 license includes a NOTICE file requirement, but pdftract does not ship a NOTICE file in the source distribution. This is intentional: the project maintains no contributor list outside of git history, and there are no third-party attribution notices required.
|
||||
|
||||
**Downstream redistributors MAY add a NOTICE file** when distributing pdftract as part of their own product. If you choose to add one, it should include:
|
||||
- Attribution to the pdftract project
|
||||
- A link to the original source repository
|
||||
- Any modifications you made (if distributing a modified version)
|
||||
|
||||
The absence of a NOTICE file in the upstream distribution does not violate the Apache-2.0 license; the NOTICE requirement applies only when there is something to notice.
|
||||
|
||||
## Code of Conduct
|
||||
|
||||
This project adopts the [Contributor Covenant v2.1](CODE_OF_CONDUCT.md). All contributors are expected to uphold this code of conduct.
|
||||
|
||||
## Reporting Security Issues
|
||||
|
||||
If you discover a security vulnerability, please do **NOT** open a public issue or pull request. Instead, report it privately:
|
||||
|
||||
1. **Email (preferred):** [security@jedarden.com](mailto:security@jedarden.com)
   - PGP-encrypted emails are strongly encouraged
   - PGP key: [`docs/security/pgp-public-key.asc`](docs/security/pgp-public-key.asc)

2. **GitHub Private Vulnerability Reporting:**
   - Use the [Security tab](https://github.com/jedarden/pdftract/security/advisories)
|
||||
|
||||
See [`SECURITY.md`](SECURITY.md) for our full disclosure policy, including:
|
||||
- Supported versions and security fix timeline
|
||||
- 90-day disclosure window
|
||||
- CVE assignment process
|
||||
- Safe harbor for good-faith researchers
|
||||
|
||||
## Development Setup
|
||||
|
||||
### Prerequisites
|
||||
|
||||
- **Rust 1.78 or later** — See [Minimum Supported Rust Version (MSRV)](#minimum-supported-rust-version-msrv) below
|
||||
- **Git** — For cloning and committing
|
||||
|
||||
### OCR Feature Dependencies (Optional)
|
||||
|
||||
If you're developing OCR-related features (Phase 5), you'll need additional dependencies:
|
||||
|
||||
**Linux (Debian/Ubuntu):**
|
||||
```bash
|
||||
sudo apt-get install libleptonica-dev libtesseract-dev tesseract-ocr-eng
|
||||
```
|
||||
|
||||
**macOS:**
|
||||
```bash
|
||||
brew install tesseract leptonica
|
||||
```
|
||||
|
||||
**Windows:**
|
||||
- Install Tesseract from the official installers: https://github.com/UB-Mannheim/tesseract/wiki
|
||||
|
||||
### Building
|
||||
|
||||
```bash
|
||||
# Clone your fork
|
||||
git clone https://github.com/YOUR_USERNAME/pdftract.git
|
||||
cd pdftract
|
||||
|
||||
# Build the workspace
|
||||
cargo build --workspace --locked
|
||||
|
||||
# Build release binaries
|
||||
cargo build --release --workspace
|
||||
```
|
||||
|
||||
### Testing
|
||||
|
||||
```bash
|
||||
# Run all tests
|
||||
cargo test --workspace --features default
|
||||
|
||||
# Run tests with output
|
||||
cargo test --workspace --features default -- --nocapture
|
||||
|
||||
# Run a specific test
|
||||
cargo test --workspace --features default test_name
|
||||
```
|
||||
|
||||
## Minimum Supported Rust Version (MSRV)
|
||||
|
||||
The **Minimum Supported Rust Version (MSRV)** for pdftract is **1.78**. This is the oldest Rust version that can successfully build the project. The MSRV is declared in `Cargo.toml` via the `rust-version` field and enforced in CI.
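
To check a local toolchain against the MSRV before building, the comparison can be sketched in shell. `meets_msrv` is a hypothetical helper, and it assumes plain `major.minor[.patch]` version strings such as those printed by a stable `rustc`.

```bash
# Hypothetical helper: succeeds when version $1 (e.g. "1.80.1") is at least
# the major.minor version $2 (e.g. "1.78"). Patch levels are ignored.
meets_msrv() {
  have_major=${1%%.*}          # text before the first dot
  have_rest=${1#*.}            # text after the first dot
  have_minor=${have_rest%%.*}  # minor component only
  want_major=${2%%.*}
  want_rest=${2#*.}
  want_minor=${want_rest%%.*}
  [ "$have_major" -gt "$want_major" ] ||
    { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}
```

One possible use is `meets_msrv "$(rustc --version | awk '{print $2}')" 1.78 || echo "toolchain is older than the MSRV"`.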
|
||||
@@ -71,61 +171,122 @@ The Rust ecosystem convention is that library crates should not check in `Cargo.
- A single workspace-level `Cargo.lock` applies to all members.
|
||||
- Downstream consumers that depend on pdftract as a library are unaffected: Cargo resolves dependencies against the consumer's own `Cargo.lock` and ignores the lockfile of a dependency.
|
||||
|
||||
-## Development Workflow
+## Local Validation Before Opening a PR

-### Building
+Before submitting a pull request, please run the following commands locally to ensure your changes pass all quality gates:

 ```bash
-cargo build --release
-```
+# 1. Run all tests (must be all green)
+cargo test --workspace --features default

-### Testing
+# 2. Lint with clippy (no warnings allowed)
+cargo clippy --all-targets --features default -- -D warnings

-```bash
-cargo test --all
-```
+# 3. Check binary size (must be within budget; target <= 4 MB stripped)
+cargo bloat --release --features default

-### Linting
+# 4. Check for security advisories (no medium+ issues)
+cargo audit

-```bash
-cargo clippy --all-targets --all-features
+# 5. Check license compliance (no rejected licenses)
+cargo deny check licenses
+
+# 6. Check code formatting
+cargo fmt --check
 ```

-## Security
+**Why these checks?** These exact commands are run in the `pdftract-ci` Argo workflow. A green local run predicts a green CI run, reducing review iteration cycles.

-### Responsible Disclosure
+### Binary Size Budget

-If you discover a security vulnerability, please do **NOT** open a public issue or pull request. Instead, report it privately:
+The release binary must be <= 4 MB when stripped. `cargo bloat` helps identify functions contributing most to binary size. If your PR adds significant code:
+- Run `cargo bloat --release --features default -p pdftract-cli` (add `--crates` for a per-crate breakdown)
+- Check the top functions in the output
+- Consider if large dependencies can be made optional or feature-gated

-1. **Email (preferred):** [security@jedarden.com](mailto:security@jedarden.com)
-   - PGP-encrypted emails are strongly encouraged
-   - PGP key: [`docs/security/pgp-public-key.asc`](docs/security/pgp-public-key.asc)
+## CI on Forks — The Argo-CI Caveat

-2. **GitHub Private Vulnerability Reporting:**
-   - Use the [Security tab](https://github.com/jedarden/pdftract/security/advisories)
+> **IMPORTANT:** Because CI runs on the private `iad-ci` cluster, external contributors cannot trigger CI from their fork.

-See [`SECURITY.md`](SECURITY.md) for our full disclosure policy, including:
-- Supported versions and security fix timeline
-- 90-day disclosure window
-- CVE assignment process
-- Safe harbor for good-faith researchers
+### How It Works

-### Supply-Chain Security
+1. **Fork and open a pull request** against `jedarden/pdftract:main`
+2. **A maintainer will trigger the `pdftract-ci` Argo workflow** against your branch
+3. **Results are posted as a PR comment** once the workflow completes

-This project uses `cargo-audit` and `cargo-deny` for supply-chain security. New direct dependencies require an ADR or written justification in the PR description.
+### Expected Response Time

-## Licensing
+- Maintainer-triggered CI: **within 48 hours**
+- You'll receive a comment on your PR with the full CI log

-pdftract is dual-licensed under **MIT OR Apache-2.0**. You may choose either license for your use.
+### Why This Model?

-### Apache NOTICE File
+The `iad-ci` cluster is a private Rackspace Spot cluster accessed via kubectl-proxy over Tailscale. External forks do not have credentials to access this cluster, so they cannot self-serve CI runs. This is unusual, but it allows us to run CI on infrastructure we control without exposing cluster credentials publicly.

-The Apache-2.0 license includes a NOTICE file requirement, but pdftract does not ship a NOTICE file in the source distribution. This is intentional: the project maintains no contributor list outside of git history, and there are no third-party attribution notices required.
+### Local Validation is Critical

-**Downstream redistributors MAY add a NOTICE file** when distributing pdftract as part of their own product. If you choose to add one, it should include:
-- Attribution to the pdftract project
-- A link to the original source repository
-- Any modifications you made (if distributing a modified version)
+Since you cannot trigger CI yourself, **please run the full local validation checklist** before opening your PR. This minimizes back-and-forth cycles when the maintainer-triggered CI fails.

-The absence of a NOTICE file in the upstream distribution does not violate the Apache-2.0 license; the NOTICE requirement applies only when there is something to notice.
|
||||
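
Running the six checklist commands by hand is easy to get wrong, so they can be wrapped in a small script. This is a sketch under assumptions, not an official project script: `checklist` and `run_gates` are hypothetical names, and the commands are copied from the local validation checklist.

```bash
# The six gates from the local validation checklist, one command per line.
checklist() {
  cat <<'EOF'
cargo test --workspace --features default
cargo clippy --all-targets --features default -- -D warnings
cargo bloat --release --features default
cargo audit
cargo deny check licenses
cargo fmt --check
EOF
}

# Read one command per line from stdin and run each in order,
# stopping at the first one that fails.
run_gates() {
  while IFS= read -r gate; do
    printf '==> %s\n' "$gate"
    # Unquoted on purpose: each line is split into a command and its arguments.
    # stdin is redirected so the command cannot consume the remaining lines.
    if ! $gate </dev/null; then
      printf 'FAILED: %s\n' "$gate" >&2
      return 1
    fi
  done
  printf 'All gates passed\n'
}
```

In a pdftract checkout with `cargo-bloat`, `cargo-audit`, and `cargo-deny` installed, `checklist | run_gates` runs the gates in order. Stopping at the first failure keeps the output short; a variant that runs every gate and reports all failures would suit a final pre-PR pass better.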
## Pull Request Template
|
||||
|
||||
All pull requests must follow the [PR template](.github/PULL_REQUEST_TEMPLATE.md). The template requires:
|
||||
|
||||
- **Linked issue or RFC** — Every PR should reference an issue or design document
|
||||
- **Scope statement** — Which Phase / which Acceptance Scenario does this address?
|
||||
- **Test plan** — How did you verify this works?
|
||||
- **Manual-test evidence** — Screenshots, terminal output, or example runs
|
||||
- **Performance impact** — If hot-path code was touched, include benchmark results
|
||||
|
||||
## Commit Message Style
|
||||
|
||||
This project uses **Conventional Commits** for commit messages. Release notes are auto-generated from commit history using `git-cliff`.
|
||||
|
||||
### Format
|
||||
|
||||
```
|
||||
<type>(<scope>): <short summary>
|
||||
|
||||
[optional body]
|
||||
|
||||
[optional footer]
|
||||
```
|
||||
|
||||
### Types
|
||||
|
||||
- `feat:` — A new feature
|
||||
- `fix:` — A bug fix
|
||||
- `perf:` — A performance improvement
|
||||
- `docs:` — Documentation changes
|
||||
- `chore:` — Maintenance tasks (updates, refactoring, tooling)
|
||||
- `test:` — Test changes
|
||||
- `BREAKING CHANGE:` (a footer token, not a type): marks a breaking change; put it in the commit footer, or append `!` after the type/scope
|
||||
|
||||
### Examples
|
||||
|
||||
```text
|
||||
feat(ocr): add Tesseract integration for phase 5
|
||||
fix(font): handle missing /Widths in Type 3 fonts
|
||||
perf(extract): cache page tree parsing results
|
||||
docs(contributing): add Argo-CI caveat section
|
||||
chore(deps): upgrade lodepng to 0.9.0
|
||||
```
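
The subject-line format can be checked mechanically before committing. The sketch below is illustrative only: `is_conventional_subject` is a hypothetical helper, the type list is taken from the Types section, and the scope pattern (lowercase letters, digits, `-`, `_`) is an assumption rather than a project rule.

```bash
# Hypothetical helper: succeeds when $1 looks like "<type>(<scope>): <summary>"
# with one of the types listed in this document. The scope is optional, and a
# "!" before the colon (breaking change marker) is accepted.
is_conventional_subject() {
  printf '%s\n' "$1" |
    grep -Eq '^(feat|fix|perf|docs|chore|test)(\([a-z0-9_-]+\))?!?: .+'
}
```

For example, `is_conventional_subject "$(git log -1 --format=%s)"` checks the subject of the most recent commit.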
|
||||
|
||||
## Issue Triage
|
||||
|
||||
We use issue templates to ensure all necessary information is provided upfront. When opening an issue, please use the appropriate template:
|
||||
|
||||
- **Bug report** — Must include `pdftract doctor` output
|
||||
- **Feature request** — Describe the use case and proposed API
|
||||
- **Performance regression** — Include before/after benchmarks
|
||||
- **Security advisory** — Redirects to private disclosure (see [Reporting Security Issues](#reporting-security-issues))
|
||||
|
||||
See [`.github/ISSUE_TEMPLATE/`](.github/ISSUE_TEMPLATE/) for the full list.
|
||||
|
||||
## Getting Help
|
||||
|
||||
- **Documentation:** Check [`docs/`](docs/) for design docs and ADRs
|
||||
- **Issues:** Search existing issues before opening a new one
|
||||
- **Discussions:** Use GitHub Discussions for questions and RFCs
|
||||
- **Security:** See [SECURITY.md](SECURITY.md) for vulnerability reporting
|
||||
|
||||
Thank you for contributing to pdftract!
|
||||
9
README.md
@@ -147,6 +147,15 @@ For responsible disclosure of security vulnerabilities, please email [security@j
|
||||
> **NOTE:** The PGP key is currently a placeholder. The security contact must generate and publish a 4096-bit RSA key for `security@jedarden.com`. See `docs/security/pgp-public-key.asc` for generation instructions.
|
||||
|
||||
## Contributing
|
||||
|
||||
Contributions are welcome! Please see [`CONTRIBUTING.md`](CONTRIBUTING.md) for:
|
||||
- Development setup and build instructions
|
||||
- Local validation checklist before opening a PR
|
||||
- Commit message style (Conventional Commits)
|
||||
- CI on forks (maintainer-triggered Argo workflow)
|
||||
- DCO sign-off requirement
|
||||
|
||||
## Status
|
||||
|
||||
Early development. See `docs/plan/` for the implementation roadmap.
|
||||
|
|
|
|||
66
notes/pdftract-i9rk.md
Normal file
@@ -0,0 +1,66 @@
# Verification Note: pdftract-i9rk
|
||||
|
||||
## Bead
|
||||
CONTRIBUTING.md — Argo-CI caveat for forks, local validation checklist, PR template requirements
|
||||
|
||||
## Summary
|
||||
Created and updated contributor documentation to ensure first-time contributors can submit properly-formatted PRs without surprises.
|
||||
|
||||
## Files Created/Modified
|
||||
|
||||
### Created:
|
||||
1. **CODE_OF_CONDUCT.md** — Contributor Covenant v2.1 (5,519 bytes)
|
||||
2. **.github/PULL_REQUEST_TEMPLATE.md** — PR template with required fields (1,988 bytes)
|
||||
3. **.github/ISSUE_TEMPLATE/bug_report.md** — Bug report template (1,532 bytes)
|
||||
4. **.github/ISSUE_TEMPLATE/feature_request.md** — Feature request template (1,373 bytes)
|
||||
5. **.github/ISSUE_TEMPLATE/performance_regression.md** — Performance regression template (1,974 bytes)
|
||||
|
||||
### Modified:
|
||||
1. **CONTRIBUTING.md** — Completely restructured with all required sections (11,294 bytes)
|
||||
2. **README.md** — Added Contributing section with link to CONTRIBUTING.md
|
||||
|
||||
## Acceptance Criteria Status
|
||||
|
||||
### PASS
|
||||
- [x] CONTRIBUTING.md exists at repo root
|
||||
- [x] All nine sections from bead description are present:
  1. Project licensing (dual MIT OR Apache-2.0)
  2. Code of conduct (link to CODE_OF_CONDUCT.md)
  3. Reporting security issues (link to SECURITY.md)
  4. Development setup (with OCR dependencies for Phase 5 features)
  5. Local validation expected before opening a PR (6 commands matching pdftract-ci)
  6. CI on forks (the Argo-CI caveat)
  7. PR template requirements
  8. Commit message style (Conventional Commits)
  9. Issue triage
|
||||
- [x] The Argo-CI caveat is in a clearly visible Markdown blockquote (`> **IMPORTANT:**`)
|
||||
- [x] Local-validation commands exactly match the pdftract-ci workflow steps
|
||||
- [x] A first-time contributor can read CONTRIBUTING.md and submit a properly-formatted PR without surprises
|
||||
- [x] Linked from README (Contributing section added)
|
||||
- [x] Linked from .github/ISSUE_TEMPLATE/ (all templates have footer links)
|
||||
- [x] Linked from PR template (footer link added)
|
||||
|
||||
### Key Features Implemented
|
||||
- DCO sign-off requirement clearly documented with `git commit -s` example
|
||||
- 48-hour maintainer-triggered CI response window documented
|
||||
- Argo-CI caveat explains why external forks cannot self-trigger CI
|
||||
- Local validation checklist matches CI workflow: test, clippy, bloat, audit, deny, fmt
|
||||
- Conventional Commits format documented with examples
|
||||
- All issue templates include link to CONTRIBUTING.md
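
The footer-link criterion can be re-checked with a short script. This is a sketch, not the command used for this verification: `templates_missing_link` is a hypothetical helper that lists Markdown files under a directory that never mention `CONTRIBUTING.md`.

```bash
# Hypothetical helper: print every Markdown file under directory $1 that does
# not contain the literal string "CONTRIBUTING.md". Empty output means every
# template carries the link.
templates_missing_link() {
  find "$1" -type f -name '*.md' -exec grep -F -L 'CONTRIBUTING.md' {} +
}
```

Run from the repository root as `templates_missing_link .github`, it should print nothing if the criterion holds.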
|
||||
|
||||
## Files Modified
|
||||
- CONTRIBUTING.md (restructured)
|
||||
- README.md (added Contributing section)
|
||||
- CODE_OF_CONDUCT.md (created)
|
||||
- .github/PULL_REQUEST_TEMPLATE.md (created)
|
||||
- .github/ISSUE_TEMPLATE/bug_report.md (created)
|
||||
- .github/ISSUE_TEMPLATE/feature_request.md (created)
|
||||
- .github/ISSUE_TEMPLATE/performance_regression.md (created)
|
||||
|
||||
## No WARN/FAIL
|
||||
All acceptance criteria met.
|
||||
|
||||
## References
|
||||
- Plan section: Release Engineering / Contributor Workflow, lines 3424-3433
|
||||
- ADR-009 (Argo-only CI explains the fork caveat)
|
||||
- Bead: pdftract-i9rk
|
||||