UAPF Steward dd69a04355 rewrite 2.0.0: real process — extract the algorithm into DMN
The 1.x package was a single ai.extract call wrapped in three BPMN
service tasks. No decision logic, no dmn cornerstone, no weights — the
risk/routing/validation algorithm lived invisibly in host code. There
was nothing for a runtime to actually execute.

2.0.0 makes it a real process:

- dmn cornerstone added with three decision tables:
  * assess-personal-data-risk  — PII regex signals -> risk level
  * gdpr-processing-route      — risk x centralisation -> CENTRAL/LOCAL,
                                  anonymisation, redaction level
  * human-validation-gate      — confidence thresholds + PII re-scan
                                  -> REJECTED/PENDING_REVIEW/APPROVED_AUTO
- BPMN expanded 3 -> 6 nodes (3 serviceTask + 3 businessRuleTask),
  with horizontal DI.
- Task ids, mappings, docs, manifest (dmn:true), uapf.yaml, lifecycle
  and eval-set updated; added a PII-bearing fixture.

Only the semantic extraction remains a model step. Risk classification,
GDPR routing and validation gating are now explicit ranked DMN rules —
inspectable, versioned, portable. Breaking change: structure + outputs.
2026-05-17 20:00:36 +00:00

Semantic Document Analysis

A UAPF Level-4 process package for extracting VDVC-conformant semantic metadata from free-text documents.

What this package is

A real, inspectable process — not a single AI call in BPMN costume. The flow has six executable nodes; three of them are DMN decision tables that carry the actual algorithm, with explicit ranked rules and weights.

Start
  -> [service]  Detect and redact PII          ai.redact@1
  -> [decision] Assess personal-data risk      DMN assess-personal-data-risk
  -> [decision] Decide GDPR processing route   DMN gdpr-processing-route
  -> [service]  Extract semantic metadata      ai.extract@1
  -> [decision] Determine validation status    DMN human-validation-gate
  -> [service]  Emit completed event           event.emit@1
End

Only one node performs model inference (semantic extraction). PII detection, risk classification, GDPR routing and the human-validation gate are deterministic: their rules are written down in the package, so the host cannot improvise them.

The decision tables (dmn/)

assess-personal-data-risk

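PII regex signals -> personalDataRisk. A personas kods (Latvian personal code) or an IBAN forces HIGH; two or more PII categories, or any contact data, give MEDIUM; a single category gives LOW; no signals give NONE. Hit policy is FIRST: rules are evaluated in ranked order and the first match wins.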
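The ranked evaluation above can be sketched in Python. This is only an illustration of hit policy FIRST, not the package's DMN table: the category names (personas_kods, iban, email, phone) and the function name are assumptions, since the real signal names are not shown in this README.

```python
from typing import Callable, List, Set, Tuple

# Hypothetical category names; the package's real signal names may differ.
HIGH_RISK_CATEGORIES = {"personas_kods", "iban"}
CONTACT_CATEGORIES = {"email", "phone"}

Rule = Tuple[Callable[[Set[str]], bool], str]

# Ordered rules. Under hit policy FIRST the first matching row wins,
# so the order of this list is part of the algorithm.
RULES: List[Rule] = [
    (lambda cats: bool(cats & HIGH_RISK_CATEGORIES), "HIGH"),
    (lambda cats: len(cats) >= 2 or bool(cats & CONTACT_CATEGORIES), "MEDIUM"),
    (lambda cats: len(cats) == 1, "LOW"),
    (lambda cats: True, "NONE"),  # catch-all row: no PII signals
]


def assess_personal_data_risk(categories: Set[str]) -> str:
    """Return the risk level of the first rule that matches."""
    for condition, risk in RULES:
        if condition(categories):
            return risk
    raise AssertionError("unreachable: the last rule always matches")
```

Because the rows are ranked, a document with both an IBAN and an e-mail address resolves to HIGH even though the MEDIUM row would also match.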

gdpr-processing-route

personalDataRisk x allowCentralization -> processingRoute (CENTRAL | LOCAL), anonymizationRequired, redactionLevel. A sensitive document whose owner has not permitted centralisation stays LOCAL with full redaction. This is the routing rule lifted out of the host's generate_semantic_metadata.
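A rough Python sketch of this decision follows. Only the first row comes from the text above, and even there two things are assumed: that "sensitive" means HIGH risk, and that full redaction is spelled "FULL". Every other row, the "PARTIAL" and "NONE" redaction values, and the anonymization flag are placeholders invented for illustration; the authoritative rows live in the DMN table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    processing_route: str         # "CENTRAL" or "LOCAL"
    anonymization_required: bool
    redaction_level: str          # value names are assumptions


def gdpr_processing_route(personal_data_risk: str,
                          allow_centralization: bool) -> Route:
    has_pii = personal_data_risk != "NONE"

    # Row described in the README: a sensitive document without
    # centralisation permission stays LOCAL with full redaction.
    # Assumption: "sensitive" is read as HIGH risk.
    if personal_data_risk == "HIGH" and not allow_centralization:
        return Route("LOCAL", True, "FULL")

    # Placeholder rows (NOT from the package): keep anything without
    # permission local, send the rest to central processing.
    level = "PARTIAL" if has_pii else "NONE"
    if not allow_centralization:
        return Route("LOCAL", has_pii, level)
    return Route("CENTRAL", has_pii, level)
```

For example, gdpr_processing_route("HIGH", False) returns a LOCAL route with redaction level "FULL", which is the one case the README spells out.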

human-validation-gate

outputPiiErrorCount, aiConfidenceScore, personalDataRisk -> humanValidationStatus (REJECTED | PENDING_REVIEW | APPROVED_AUTO) and requiresHumanReview. Any leaked PII or confidence below 0.3 -> REJECTED; below 0.7 or HIGH risk -> PENDING_REVIEW; 0.7+ with clean output -> APPROVED_AUTO. The thresholds 0.3 / 0.7 are the weights.
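The gate can be read as three ranked checks. The sketch below mirrors the thresholds stated above; the function and constant names are illustrative assumptions, and requiresHumanReview is left out because its mapping from the status is not stated here.

```python
REJECT_BELOW = 0.3   # confidence below this is rejected outright
REVIEW_BELOW = 0.7   # confidence below this needs a human reviewer


def human_validation_status(output_pii_error_count: int,
                            ai_confidence_score: float,
                            personal_data_risk: str) -> str:
    """Evaluate the gate rules in ranked order; the first match wins."""
    # Any PII leaked into the output, or very low confidence.
    if output_pii_error_count > 0 or ai_confidence_score < REJECT_BELOW:
        return "REJECTED"
    # Middling confidence, or a HIGH-risk document at any confidence.
    if ai_confidence_score < REVIEW_BELOW or personal_data_risk == "HIGH":
        return "PENDING_REVIEW"
    # Confidence of 0.7 or more with clean output.
    return "APPROVED_AUTO"
```

Note that a HIGH-risk document is never auto-approved, however confident the extraction is, and that leaked PII rejects the result regardless of confidence.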

Capabilities required of the host

Capability     Used by                 Purpose
ai.redact@1    Task_DetectRedactPii    Mask PII + return regex signals
ai.extract@1   Task_ExtractSemantics   VDVC semantic extraction
event.emit@1   Task_EmitResult         Publish completion CloudEvent

DMN decisions need no host capability — the runtime evaluates them.

Output contract

resources/schemas/vdvc-semantic-summary.schema.json — the ai.extract@1 output. The process additionally yields the DMN-decided fields (personalDataRisk, processingRoute, redactionLevel, humanValidationStatus, requiresHumanReview).

Compliance

EU AI Act 2024/1689 Annex III high-risk; GDPR 2016/679 data minimisation. See resources/guardrails.yaml and docs/.
