Import UAPF package
The 1.x package was a single ai.extract call wrapped in three BPMN
service tasks. No decision logic, no dmn cornerstone, no weights — the
risk/routing/validation algorithm lived invisibly in host code. There
was nothing for a runtime to actually execute.
2.0.0 makes it a real process:
- dmn cornerstone added with three decision tables:
  * assess-personal-data-risk: PII regex signals -> risk level
  * gdpr-processing-route: risk x centralisation -> CENTRAL/LOCAL,
    anonymisation, redaction level
  * human-validation-gate: confidence thresholds + PII re-scan
    -> REJECTED/PENDING_REVIEW/APPROVED_AUTO
- BPMN expanded 3 -> 6 nodes (3 serviceTask + 3 businessRuleTask),
  with horizontal DI.
- Task ids, mappings, docs, manifest (dmn:true), uapf.yaml, lifecycle
  and eval-set updated; added a PII-bearing fixture.
Only the semantic extraction remains a model step. Risk classification,
GDPR routing and validation gating are now explicit ranked DMN rules —
inspectable, versioned, portable. Breaking change: structure + outputs.
# dev.uapf.semantic-document-analysis — Overview

**UAPF v1.1 SSOT-conformant** Level 4 process package for semantic
document analysis.

## What

A six-node BPMN process that, given free-text document content, runs the
following steps:

1. **Detect and redact PII** (`ai.redact@1`): masks PII and returns the
   deterministic regex signal set (personas kods, i.e. the Latvian
   personal identity code / IBAN / contact data / category count).
2. **Assess personal-data risk** (DMN `assess-personal-data-risk`):
   ranked rules map the signal set to `personalDataRisk`.
3. **Decide GDPR processing route** (DMN `gdpr-processing-route`):
   `personalDataRisk` x `allowCentralization` -> CENTRAL/LOCAL,
   anonymisation and redaction level.
4. **Extract semantic metadata** (`ai.extract@1`): the one model step;
   produces VDVC v1.1 structured metadata.
5. **Determine validation status** (DMN `human-validation-gate`):
   confidence thresholds + PII re-scan -> REJECTED / PENDING_REVIEW /
   APPROVED_AUTO.
6. **Emit** `document.semantic-analysis.completed.v1` (`event.emit@1`).

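
The three DMN steps in the list above can be pictured as ranked,
first-match rule tables. The Python sketch below is an illustration only,
not the package's actual tables: the signal keys (`hasPersonalCode`,
`hasIban`, `hasContactData`, `categoryCount`), the risk levels, the output
keys (`route`, `anonymise`, `redactionLevel`, `validationStatus`), the
input keys `piiFoundOnRescan` and `confidence`, and every threshold and
rule row are hypothetical placeholders. Only `personalDataRisk`,
`allowCentralization` and the CENTRAL/LOCAL and
REJECTED/PENDING_REVIEW/APPROVED_AUTO values come from the overview.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Rule:
    """One ranked row: a condition over the inputs and the output it yields."""
    when: Callable[[dict], bool]
    then: dict


def first_match(rules: Sequence[Rule], inputs: dict) -> dict:
    """Return the output of the first rule whose condition holds."""
    for rule in rules:
        if rule.when(inputs):
            return rule.then
    raise LookupError("no rule matched; add a catch-all row")


# Hypothetical rows for assess-personal-data-risk.
RISK_RULES = [
    Rule(lambda s: s["hasPersonalCode"] or s["hasIban"], {"personalDataRisk": "HIGH"}),
    Rule(lambda s: s["categoryCount"] >= 2, {"personalDataRisk": "MEDIUM"}),
    Rule(lambda s: s["hasContactData"], {"personalDataRisk": "LOW"}),
    Rule(lambda s: True, {"personalDataRisk": "NONE"}),
]

# Hypothetical rows for gdpr-processing-route.
ROUTE_RULES = [
    Rule(lambda i: i["personalDataRisk"] == "HIGH",
         {"route": "LOCAL", "anonymise": True, "redactionLevel": "FULL"}),
    Rule(lambda i: not i["allowCentralization"],
         {"route": "LOCAL", "anonymise": False, "redactionLevel": "PARTIAL"}),
    Rule(lambda i: i["personalDataRisk"] in ("MEDIUM", "LOW"),
         {"route": "CENTRAL", "anonymise": True, "redactionLevel": "PARTIAL"}),
    Rule(lambda i: True,
         {"route": "CENTRAL", "anonymise": False, "redactionLevel": "NONE"}),
]

# Hypothetical rows for human-validation-gate (thresholds are made up).
GATE_RULES = [
    Rule(lambda i: i["piiFoundOnRescan"], {"validationStatus": "REJECTED"}),
    Rule(lambda i: i["confidence"] < 0.5, {"validationStatus": "REJECTED"}),
    Rule(lambda i: i["confidence"] < 0.85, {"validationStatus": "PENDING_REVIEW"}),
    Rule(lambda i: True, {"validationStatus": "APPROVED_AUTO"}),
]

# Chain the tables the way steps 2, 3 and 5 feed each other.
signals = {"hasPersonalCode": False, "hasIban": True,
           "hasContactData": True, "categoryCount": 2}
risk = first_match(RISK_RULES, signals)
route = first_match(ROUTE_RULES, {**risk, "allowCentralization": True})
gate = first_match(GATE_RULES, {"piiFoundOnRescan": False, "confidence": 0.9})
print(risk, route, gate)
```

Because rule order decides the outcome in this sketch, moving a row changes behaviour. That ordering is what a ranked table lets a reviewer see in a diff.
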
## Why this shape

The previous 1.x package was a single `ai.extract` call wrapped in
BPMN. The decision logic (risk, routing, validation gating) lived
invisibly in host code. Version 2.0 extracts that logic into three
versioned DMN decision tables. The algorithm is now in the package:
inspectable, diff-able, portable. The host supplies inference for one
bounded step only.

## What's portable

- The BPMN flow (the process shape)
- Three DMN decision tables (the algorithm and its weights)
- The VDVC output JSON Schema (the extraction contract)
- The resource mapping and the guardrails policy

## How to consume

Drop this `.uapf` into any UAPF-conformant runtime and run
`Process_SemanticDocumentAnalysis`. The runtime evaluates the DMN
decisions itself and resolves the resource mapping for the three
capability-backed service tasks.

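
As a rough picture of what resolving the resource mapping could involve,
the Python sketch below binds each capability-backed task to a
host-provided handler. It is not the UAPF runtime API. The task ids, the
`parse_capability` and `resolve` helpers, the registry shape and the stub
handlers are all hypothetical, and reading the `@1` suffix as a major
version is an assumption. Only the three capability references are taken
from the overview.

```python
from typing import Callable, Dict, Tuple

Handler = Callable[[dict], dict]


def parse_capability(ref: str) -> Tuple[str, int]:
    """Split a reference such as 'ai.redact@1' into ('ai.redact', 1)."""
    name, sep, version = ref.partition("@")
    if not sep or not name or not version.isdigit():
        raise ValueError(f"malformed capability reference: {ref!r}")
    return name, int(version)


# Hypothetical task ids; the capability references appear in the overview.
RESOURCE_MAPPING: Dict[str, str] = {
    "Task_DetectAndRedactPii": "ai.redact@1",
    "Task_ExtractSemanticMetadata": "ai.extract@1",
    "Task_EmitCompleted": "event.emit@1",
}


def resolve(mapping: Dict[str, str],
            registry: Dict[Tuple[str, int], Handler]) -> Dict[str, Handler]:
    """Bind every mapped task to a host handler, failing fast on gaps."""
    bound: Dict[str, Handler] = {}
    for task_id, ref in mapping.items():
        key = parse_capability(ref)
        if key not in registry:
            raise LookupError(f"host provides no handler for {ref} (task {task_id})")
        bound[task_id] = registry[key]
    return bound


# Stub handlers standing in for whatever the host actually supplies.
registry: Dict[Tuple[str, int], Handler] = {
    ("ai.redact", 1): lambda payload: {**payload, "redacted": True},
    ("ai.extract", 1): lambda payload: {**payload, "metadata": {}},
    ("event.emit", 1): lambda payload: payload,
}

bound = resolve(RESOURCE_MAPPING, registry)
print(sorted(bound))
```

Failing at bind time rather than mid-run is a design choice of this sketch: a host that lacks one of the three capabilities is rejected before any document is processed.
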