11136 Commits

Author SHA1 Message Date
dependabot[bot]
b0a4bc7c41
Bump ocrmypdf from 16.13.0 to 17.3.0 in the document-processing group
Bumps the document-processing group with 1 update: [ocrmypdf](https://github.com/ocrmypdf/OCRmyPDF).


Updates `ocrmypdf` from 16.13.0 to 17.3.0
- [Release notes](https://github.com/ocrmypdf/OCRmyPDF/releases)
- [Commits](https://github.com/ocrmypdf/OCRmyPDF/compare/v16.13.0...v17.3.0)

---
updated-dependencies:
- dependency-name: ocrmypdf
  dependency-version: 17.3.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: document-processing
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-03-09 19:49:26 +00:00
dependabot[bot]
6955d6c07f
Chore(deps): Bump the utilities-patch group across 1 directory with 6 updates (#12291)
* Chore(deps): Bump the utilities-patch group across 1 directory with 6 updates

Bumps the utilities-patch group with 6 updates in the / directory:

| Package | From | To |
| --- | --- | --- |
| llama-index-embeddings-openai | `0.5.1` | `0.5.2` |
| llama-index-llms-openai | `0.6.21` | `0.6.26` |
| [python-dotenv](https://github.com/theskumar/python-dotenv) | `1.2.1` | `1.2.2` |
| [regex](https://github.com/mrabarnett/mrab-regex) | `2026.2.19` | `2026.2.28` |
| [prek](https://github.com/j178/prek) | `0.3.3` | `0.3.5` |
| [ruff](https://github.com/astral-sh/ruff) | `0.15.4` | `0.15.5` |



Updates `llama-index-embeddings-openai` from 0.5.1 to 0.5.2

Updates `llama-index-llms-openai` from 0.6.21 to 0.6.26

Updates `python-dotenv` from 1.2.1 to 1.2.2
- [Release notes](https://github.com/theskumar/python-dotenv/releases)
- [Changelog](https://github.com/theskumar/python-dotenv/blob/main/CHANGELOG.md)
- [Commits](https://github.com/theskumar/python-dotenv/compare/v1.2.1...v1.2.2)

Updates `regex` from 2026.2.19 to 2026.2.28
- [Changelog](https://github.com/mrabarnett/mrab-regex/blob/hg/changelog.txt)
- [Commits](https://github.com/mrabarnett/mrab-regex/compare/2026.2.19...2026.2.28)

Updates `prek` from 0.3.3 to 0.3.5
- [Release notes](https://github.com/j178/prek/releases)
- [Changelog](https://github.com/j178/prek/blob/master/CHANGELOG.md)
- [Commits](https://github.com/j178/prek/compare/v0.3.3...v0.3.5)

Updates `ruff` from 0.15.4 to 0.15.5
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/ruff/compare/0.15.4...0.15.5)

---
updated-dependencies:
- dependency-name: llama-index-embeddings-openai
  dependency-version: 0.5.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: utilities-patch
- dependency-name: llama-index-llms-openai
  dependency-version: 0.6.26
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: utilities-patch
- dependency-name: python-dotenv
  dependency-version: 1.2.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: utilities-patch
- dependency-name: regex
  dependency-version: 2026.2.28
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: utilities-patch
- dependency-name: prek
  dependency-version: 0.3.5
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: utilities-patch
- dependency-name: ruff
  dependency-version: 0.15.5
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: utilities-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update .pre-commit-config.yaml

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
2026-03-09 19:47:19 +00:00
shamoon
d85ee29976
Fix ci gate base 2026-03-09 11:16:46 -07:00
GitHub Actions
0c7d56c5e7 Auto translate strings 2026-03-09 17:45:53 +00:00
Trenton H
0bcf904e3a
Chore: Finish settings refactor (#12263) 2026-03-09 17:43:51 +00:00
Trenton H
bcc2f11152
Performance: Stream JSON during import for memory improvements (#12276)
* Perf: stream manifest parsing with ijson in document_importer

Replace bulk json.load of the full manifest (which materializes the
entire JSON array into memory) with incremental ijson streaming.
Eliminates self.manifest entirely — records are never all in memory
at once.

- Add ijson>=3.2 dependency
- New module-level iter_manifest_records() generator
- load_manifest_files() collects paths only; no parsing at load time
- check_manifest_validity() streams without accumulating records
- decrypt_secret_fields() streams each manifest to a .decrypted.json
  temp file record-by-record; temp files cleaned up after file copy
- _import_files_from_manifest() collects only document records (small
  fraction of manifest) for the tqdm progress bar

Measured on 200 docs + 200 CustomFieldInstances:
- Streaming validation: peak memory 3081 KiB -> 333 KiB (89% reduction)
- Stream-decrypt to file: peak memory 3081 KiB -> 549 KiB (82% reduction)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Perf: slim dict in _import_files_from_manifest, discard fields

When collecting document records for the file-copy step, extract only
the 4 keys the loop actually uses (pk + 3 exported filename keys) and
discard the full fields dict (content, checksum, tags, etc.).

Peak memory for the document-record list: 939 KiB -> 375 KiB (60% reduction).
Wall time unchanged.
2026-03-09 10:20:48 -07:00
shamoon
e18b1fd99d
Chore: use unified "gates" for ci tests and docs checks (#12277) 2026-03-09 17:02:34 +00:00
Trenton H
e30676f889
Feature: Migrate import/export to rich progress (#12260)
* Refactor: migrate exporter/importer from tqdm to PaperlessCommand.track()

Replace direct tqdm usage in document_exporter and document_importer with
the PaperlessCommand base class and its track() method, which is backed by
Rich and handles --no-progress-bar automatically. Also removes the unused
ProgressBarMixin from mixins.py.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Refactor: add explicit supports_progress_bar and supports_multiprocessing to all PaperlessCommand subclasses

Each management command now explicitly declares both class attributes
rather than relying on defaults, making intent unambiguous at a glance.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 08:59:17 -07:00
Martin Kleine
2a28549c5a
Documentation: Update development commands and pnpm for Angular build commands (#12283)
---------

Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
2026-03-09 07:06:16 -07:00
GitHub Actions
4badf0e7c2 Auto translate strings 2026-03-09 01:52:08 +00:00
Paul Gessinger
bc26d94593
Chore: Add saved view compatibility in API version 9 (#12280)
---------

Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
2026-03-08 18:50:31 -07:00
shamoon
93cbbf34b7
Merge branch 'main' into dev 2026-03-07 23:30:08 -08:00
shamoon
1e8622494d
Documentation: remove broken link 2026-03-07 23:29:42 -08:00
GitHub Actions
0c3298f030 Auto translate strings 2026-03-08 03:06:59 +00:00
Sven-Hendrik Haase
2b288c094d
Enhancement: Show correspondent in document merge dialog (#12271)
---------

Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
2026-03-07 19:05:28 -08:00
Trenton H
2cdb1424ef
Performance: Further export memory improvements (#12273)
* Perf: streaming manifest writer for document exporter (Phase 3)

Replaces the in-memory manifest dict accumulation with a
StreamingManifestWriter that writes records to manifest.json
incrementally, keeping only one batch resident in memory at a time.

Key changes:
- Add StreamingManifestWriter: writes to .tmp atomically, BLAKE2b
  compare for --compare-json, discard() on exception
- Add _encrypt_record_inline(): per-record encryption replacing the
  bulk encrypt_secret_fields() call; crypto setup moved before streaming
- Add _write_split_manifest(): extracted per-document manifest writing
- Refactor dump(): non-doc records streamed during transaction, documents
  accumulated then written after filenames are assigned
- Upgrade check_and_write_json() from MD5 to BLAKE2b
- Remove encrypt_secret_fields() and unused itertools.chain import
- Add profiling marker to pyproject.toml

Measured improvement (200 docs + 200 CustomFieldInstances, same
dump() code path, only writer differs):
- Peak memory: ~50% reduction
- Memory delta: ~70% reduction
- Wall time and query count: unchanged

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Refactor: O(1) lookup table for CRYPT_FIELDS in per-record encryption

Add CRYPT_FIELDS_BY_MODEL to CryptMixin, derived from CRYPT_FIELDS at
class definition time. _encrypt_record_inline() now does a single dict
lookup instead of a linear scan per record, eliminating the loop and
break pattern.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-07 14:24:50 -08:00
Trenton H
f5c0c21922
Chore: Lazy imports of the heavy AI modules (#12275) 2026-03-07 12:53:22 -08:00
Trenton H
91ddda9256
Fix: Uploaded digest artifact name for Docker build (#12272) 2026-03-06 13:15:45 -08:00
Trenton H
9d5e618de8
Chore: pytest style paperless tests (#12254) 2026-03-06 13:04:23 -08:00
Trenton H
50ae49c7da
Chore: Uploads the digests as just files, no zips (#12264) 2026-03-06 12:56:34 -08:00
shamoon
ba023ef332
Chore: Add anti-slop job to PR workflow (#12248) 2026-03-06 20:36:24 +00:00
GitHub Actions
7345f2e81c Auto translate strings 2026-03-06 20:01:12 +00:00
shamoon
731448a8f9
Fixhancement: support version-specific edits (#12233) 2026-03-06 11:59:26 -08:00
shamoon
1c2d5483c2
Chore: set fetch depth for bundle analysis (#12257) 2026-03-05 23:54:05 -08:00
shamoon
815e598218
Chore: update ESLint to v10 (#12256) 2026-03-05 22:59:47 -08:00
dependabot[bot]
a5a267fe49
Bump django-allauth from 65.14.0 to 65.14.1 (#12253)
Bumps [django-allauth](https://github.com/sponsors/pennersr) from 65.14.0 to 65.14.1.
- [Commits](https://github.com/sponsors/pennersr/commits)

---
updated-dependencies:
- dependency-name: django-allauth
  dependency-version: 65.14.1
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-05 14:32:04 -08:00
shamoon
24a2cfd957
Change: use explicit doc creation instead of clone for versions (#12226) 2026-03-04 15:57:44 -08:00
GitHub Actions
7cf2ef6398 Auto translate strings 2026-03-04 23:29:54 +00:00
shamoon
df03207eef
Fix: correct doc version filename handling (#12223) 2026-03-04 23:28:07 +00:00
dependabot[bot]
fa998ecd49
Bump django from 5.2.11 to 5.2.12 (#12249)
Bumps [django](https://github.com/django/django) from 5.2.11 to 5.2.12.
- [Commits](https://github.com/django/django/compare/5.2.11...5.2.12)

---
updated-dependencies:
- dependency-name: django
  dependency-version: 5.2.12
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-04 15:16:25 -08:00
Trenton H
1e21bcd26e
Breaking: Drop support for Python 3.10 (#12234) 2026-03-04 15:03:33 -08:00
Trenton H
a9cb89c633
Enhancement: Improve exporter memory efficiency (#12236)
Phase 1 -- Eliminate JSON round-trip in document exporter

Replace json.loads(serializers.serialize("json", qs)) with
serializers.serialize("python", qs) to skip the intermediate
JSON string allocation and parse step. Use DjangoJSONEncoder
in check_and_write_json() to handle native Python types
(datetime, Decimal, UUID) the Python serializer returns.

Phase 2 -- Batched QuerySet serialization in document exporter

Add serialize_queryset_batched() helper that uses QuerySet.iterator()
and itertools.islice to stream records in configurable chunks, bounding
peak memory during serialization to batch_size * avg_record_size rather
than loading the entire QuerySet at once.
2026-03-04 14:54:20 -08:00
GitHub Actions
a37e24c1ad Auto translate strings 2026-03-04 22:17:32 +00:00
shamoon
85a18e5911
Enhancement: saved view sharing (#12142) 2026-03-04 14:15:43 -08:00
GitHub Actions
ae182c459b Auto translate strings 2026-03-04 21:34:02 +00:00
shamoon
d51a118aac
Merge branch 'main' into dev 2026-03-04 13:31:20 -08:00
github-actions[bot]
d6a316b1df
Changelog v2.20.10 - GHA (#12247)
Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
2026-03-04 11:25:44 -08:00
shamoon
8f311c4b6b
Bump version to 2.20.10 v2.20.10 2026-03-04 10:38:14 -08:00
shamoon
f25322600d
Merge branch 'release/v2.20.x' 2026-03-04 10:09:01 -08:00
shamoon
615f27e6fb
Fix: support string coercion in filepath jinja templates (#12244) 2026-03-04 08:32:34 -08:00
Andreas Schneider
190fc70288
Fix: use maxsplit=1 in Redis URL parsing to handle URLs with multiple colons (#12239) 2026-03-04 01:06:51 -08:00
shamoon
5b809122b5
Fix: apply ordering after annotating tag document count (#12238) 2026-03-04 00:33:13 -08:00
GitHub Actions
c623234769 Auto translate strings 2026-03-04 00:29:21 +00:00
shamoon
299dac21ee
Enhancement: “live” document updates (#12141) 2026-03-04 00:27:07 +00:00
Trenton H
5498503d60
Chore: Improve user migration path (#12232)
Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
2026-03-03 15:51:48 -08:00
shamoon
8b8307571a
Fix: enforce path limit for db filename fields (#12235) 2026-03-03 13:19:56 -08:00
GitHub Actions
16b58c2de5 Auto translate strings 2026-03-03 19:25:03 +00:00
shamoon
c724fbb5d9
Clarify bulk edit wording with versions 2026-03-03 11:22:22 -08:00
dependabot[bot]
9c0f112e94
docker(deps): Bump astral-sh/uv (#12191)
Bumps [astral-sh/uv](https://github.com/astral-sh/uv) from 0.10.5-python3.12-trixie-slim to 0.10.7-python3.12-trixie-slim.
- [Release notes](https://github.com/astral-sh/uv/releases)
- [Changelog](https://github.com/astral-sh/uv/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/uv/compare/0.10.5...0.10.7)

---
updated-dependencies:
- dependency-name: astral-sh/uv
  dependency-version: 0.10.7-python3.12-trixie-slim
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-03 07:56:35 -08:00
Trenton H
43406f44f2
Feature: Improve the retagger output using rich (#12194) 2026-03-03 07:14:59 -08:00