
rewrite 2.0.0: real process — extract the algorithm into DMN

The 1.x package was a single ai.extract call wrapped in three BPMN
service tasks. No decision logic, no dmn cornerstone, no weights — the
risk/routing/validation algorithm lived invisibly in host code. There
was nothing for a runtime to actually execute.

2.0.0 makes it a real process:

- dmn cornerstone added with three decision tables:
  * assess-personal-data-risk  — PII regex signals -> risk level
  * gdpr-processing-route      — risk x centralisation -> CENTRAL/LOCAL,
                                  anonymisation, redaction level
  * human-validation-gate      — confidence thresholds + PII re-scan
                                  -> REJECTED/PENDING_REVIEW/APPROVED_AUTO
- BPMN expanded 3 -> 6 nodes (3 serviceTask + 3 businessRuleTask),
  with horizontal DI.
- Task ids, mappings, docs, manifest (dmn:true), uapf.yaml, lifecycle
  and eval-set updated; added a PII-bearing fixture.

Only the semantic extraction remains a model step. Risk classification,
GDPR routing and validation gating are now explicit ranked DMN rules —
inspectable, versioned, portable. Breaking change: structure + outputs.
UAPF Steward
2026-05-17 20:00:36 +00:00
parent 3f1d62c748
commit dd69a04355
15 changed files with 496 additions and 120 deletions


@@ -1,32 +1,46 @@
# dev.uapf.semantic-document-analysis — Overview
**UAPF v1.1 SSOT-conformant** Level 4 process package providing
reusable semantic document analysis.
**UAPF v1.1 SSOT-conformant** Level 4 process package for semantic
document analysis.
## What
A 3-step BPMN process that, given free-text document content:
A six-node BPMN process that, given free-text document content:
1. Redacts PII via `ai.redact@1`
2. Extracts VDVC v1.1 structured semantic metadata via `ai.extract@1`
3. Emits `document.semantic-analysis.completed.v1` CloudEvent via `event.emit@1`
1. **Detect and redact PII** (`ai.redact@1`) — masks PII and returns the
deterministic regex signal set (personas kods / IBAN / contact data /
category count).
2. **Assess personal-data risk** (DMN `assess-personal-data-risk`) —
ranked rules map the signal set to `personalDataRisk`.
3. **Decide GDPR processing route** (DMN `gdpr-processing-route`) —
`personalDataRisk` x `allowCentralization` -> CENTRAL/LOCAL,
anonymisation and redaction level.
4. **Extract semantic metadata** (`ai.extract@1`) — the one model step;
produces VDVC v1.1 structured metadata.
5. **Determine validation status** (DMN `human-validation-gate`) —
confidence thresholds + PII re-scan -> REJECTED / PENDING_REVIEW /
APPROVED_AUTO.
6. **Emit** `document.semantic-analysis.completed.v1` (`event.emit@1`).
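
To illustrate what "ranked rules" means for steps 2, 3 and 5, here is a minimal Python sketch of a first-match decision table, using the validation gate as the example. The `Rule` and `evaluate` helpers, the input names `confidence` and `pii_rescan_hit`, the thresholds 0.50 and 0.85, and the choice to map a PII re-scan hit to REJECTED are all illustrative assumptions. The actual rows live in the package's DMN tables and may differ.

```python
from dataclasses import dataclass
from typing import Any, Callable, Mapping, Sequence

Inputs = Mapping[str, Any]


@dataclass(frozen=True)
class Rule:
    """One row of a decision table: a condition and the output it yields."""
    when: Callable[[Inputs], bool]
    then: str


def evaluate(rules: Sequence[Rule], inputs: Inputs) -> str:
    """Return the output of the first rule whose condition holds.

    Rules are ranked: earlier rows win over later ones.
    """
    for rule in rules:
        if rule.when(inputs):
            return rule.then
    raise ValueError("no rule matched; end the table with a catch-all row")


# Hypothetical rows and thresholds, NOT copied from the package's DMN file.
HUMAN_VALIDATION_GATE = [
    Rule(lambda i: i["pii_rescan_hit"], "REJECTED"),
    Rule(lambda i: i["confidence"] < 0.50, "REJECTED"),
    Rule(lambda i: i["confidence"] < 0.85, "PENDING_REVIEW"),
    Rule(lambda i: True, "APPROVED_AUTO"),  # catch-all row
]


def validation_status(confidence: float, pii_rescan_hit: bool) -> str:
    """Evaluate the sketch table for one extraction result."""
    return evaluate(
        HUMAN_VALIDATION_GATE,
        {"confidence": confidence, "pii_rescan_hit": pii_rescan_hit},
    )
```

Because the rows are plain data evaluated top to bottom, reordering or editing a row shows up as a one-line diff, which is the inspectability the DMN tables are meant to provide.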
## Why this shape
The previous 1.x package was a single `ai.extract` call wrapped in
BPMN. The decision logic — risk, routing, validation gating — lived
invisibly in host code. Version 2.0 extracts that logic into three
versioned DMN decision tables. The algorithm is now in the package:
inspectable, diff-able, portable. The host supplies inference for one
bounded step only.
## What's portable
The package ships:
- The BPMN flow (the algorithm shape)
- The VDVC output JSON Schema (the output contract)
- The resource mapping (input/output contracts, timeouts, retries)
- The guardrails policy (GDPR + EU AI Act constraints)
The host system supplies the actual AI agent that fulfils the three
capabilities. Multiple hosts can implement the same capabilities;
multiple packages can require the same capabilities.
- The BPMN flow (the process shape)
- Three DMN decision tables (the algorithm and its weights)
- The VDVC output JSON Schema (the extraction contract)
- The resource mapping and the guardrails policy
## How to consume
Drop this `.uapf` into any UAPF-conformant runtime. The runtime
exposes `uapf.run_process` (per UAPF-specification §6.3.1) targeting
`Process_SemanticDocumentAnalysis`. The runtime resolves the resource
mapping to find a target with the three required capabilities and
invokes them in order per the BPMN flow.
Drop this `.uapf` into any UAPF-conformant runtime and run
`Process_SemanticDocumentAnalysis`. The runtime evaluates the DMN
decisions itself and resolves the resource mapping for the three
capability-backed service tasks.
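
As a rough sketch of that split, the Python below walks the six nodes in order, sending business rule tasks to locally evaluated decisions and service tasks to host-supplied capabilities. The `FLOW` list, the `run_flow` function and the handler signature are hypothetical and are not the `uapf.run_process` interface or any real runtime API. Only the capability and decision names are taken from the overview.

```python
from typing import Any, Callable, Dict, List, Tuple

Context = Dict[str, Any]
Handler = Callable[[Context], Context]

# (node kind, reference) in flow order; the references come from the overview.
FLOW: List[Tuple[str, str]] = [
    ("serviceTask", "ai.redact@1"),
    ("businessRuleTask", "assess-personal-data-risk"),
    ("businessRuleTask", "gdpr-processing-route"),
    ("serviceTask", "ai.extract@1"),
    ("businessRuleTask", "human-validation-gate"),
    ("serviceTask", "event.emit@1"),
]


def run_flow(
    ctx: Context,
    capabilities: Dict[str, Handler],  # supplied by the host
    decisions: Dict[str, Handler],     # evaluated by the runtime itself
) -> Context:
    """Run each node in order, merging its outputs into the context."""
    ctx = dict(ctx)  # do not mutate the caller's dict
    for kind, ref in FLOW:
        table = capabilities if kind == "serviceTask" else decisions
        if ref not in table:
            raise LookupError(f"no handler bound for {kind} {ref!r}")
        ctx.update(table[ref](ctx))
    return ctx
```

In this sketch a host only has to bind the three capability handlers; the three decisions travel with the package.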