5 Commits

Author SHA1 Message Date
7e9cb63a4b Merge pull request 'feat(3.2.0): align with UAPF v2.5.0 — embed algorithm card tests, drop sidecar' (#3) from v3.2.0-embedded-tests into main
Reviewed-on: #3
2026-05-21 08:27:54 +00:00
9b3790c1fa feat(3.2.0): align with UAPF v2.5.0 — embed algorithm card tests, drop sidecar
Per UAPF v2.5.0, tests move from sidecar files
(tests/algorithms/<card-id>.test.yaml — removed in v2.5.0) into a
top-level tests array on each algorithm card. Minimum two entries per
card; the Algorithm Card viewer (UAPF chapter 13.16, ProcessGit
Preview tab) consumes these as its primary interaction surface.

This package's three cards now carry embedded tests:

- algo.semantic_document_analysis.pii_redactor (deterministic redactor)
  — 3 cases: Latvian personas kods inline (positive — three entity
  types detected), plain administrative text (negative — no PII
  signals), financial figures with IBAN (mixed — financial yes,
  personas_kods no).

- algo.semantic_document_analysis.vdvc_semantic_extractor (stochastic
  LLM extractor, EU AI Act high-risk + mandatory oversight) — 2
  cases: regulatory construction-permit appeal (in-domain, expected
  topic + applicable_regulations), non-regulatory thank-you note
  (out-of-domain, low confidence). Both carry ai_confidence_score
  tolerance bands appropriate for a stochastic output.

- algo.semantic_document_analysis.completion_event_emitter
  (deterministic CloudEvents emitter) — 2 cases: successful
  completion event, failure completion event. The emitter does not
  gate on payload contents, so both succeed.

Other changes:
- uapf.yaml + manifest.json: version 3.1.0 -> 3.2.0
- README.md: v3.2.0 section added describing embedded tests and the
  removed sidecar location

BPMN file unchanged from v3.1.0: each service task keeps its
uapf24:algorithmCardRef attribute (per UAPF v2.4.0) and its
synthesised ioSpecification. Mappings unchanged. DMN tables unchanged.

uapf-cli validate against v2.5.0 schemas passes cleanly.
2026-05-21 08:02:26 +00:00
e97b9d7d40 Merge pull request 'feat(3.1.0): align with UAPF v2.4.0 — algorithm card refs move to BPMN task' (#2) from v3.1.0-bpmn-algorithm-task into main
Reviewed-on: #2
2026-05-20 14:51:21 +00:00
59c87ee9a4 feat(3.1.0): align with UAPF v2.4.0 — algorithm card refs move to BPMN task
UAPF v2.4.0 reverses the v2.3.0 decision to place algorithm card
references on resource targets. The card belongs on the BPMN task
itself, where it is visible as a first-class process element and its
inputs/outputs render as visible data objects on the diagram.

Changes from v3.0.0:
- bpmn/semantic-document-analysis.bpmn: each of 3 service tasks now
  carries xmlns:uapf24=https://uapf.dev/bpmn/v2.4 + the
  uapf24:algorithmCardRef attribute pointing at the governing card,
  plus a <bpmn:ioSpecification> synthesised from the card's io block
  so inputs/outputs render as visible data objects
- resources/mappings.yaml: algorithm_card dropped from each of the
  3 targets (they go back to being just dispatch endpoints)
- uapf.yaml + manifest.json: version 3.0.0 -> 3.1.0
- README rewritten with v3.1.0 delta + audit-question table

Cards themselves are unchanged. DMN files are unchanged.
2026-05-20 14:23:16 +00:00
0a65c7ea5f Merge pull request 'feat(3.0.0): Algorithm Cards per UAPF v2.3.0 chapter 13' (#1) from v3.0.0-algorithm-cards into main
Reviewed-on: #1
2026-05-20 13:34:48 +00:00
8 changed files with 374 additions and 229 deletions

@@ -1,17 +1,21 @@
# Semantic Document Analysis
UAPF Level-4 process for semantic analysis of free-text documents,
governed by **UAPF v2.3.0** (Algorithm Cards).
governed by **UAPF v2.4.0** (Algorithm Cards visible on BPMN tasks).
## What this package does
Three BPMN service tasks invoke three UAPF-IP host capabilities:
Three BPMN service tasks invoke three UAPF-IP host capabilities. Each
service task carries `uapf24:algorithmCardRef` pointing at the
Algorithm Card that governs the algorithm being invoked, and a
`<bpmn:ioSpecification>` synthesised from the card's `io` block so
inputs and outputs render as visible data objects.
| Task | Capability | Algorithm Card |
|-----------------------|----------------|---------------------------------------------------------------------|
| `Task_DetectRedactPii`| `ai.redact@1` | [`algorithms/pii_redactor.card.yaml`](algorithms/pii_redactor.card.yaml) |
| `Task_ExtractSemantics`| `ai.extract@1`| [`algorithms/vdvc_semantic_extractor.card.yaml`](algorithms/vdvc_semantic_extractor.card.yaml) |
| `Task_EmitResult` | `event.emit@1` | [`algorithms/completion_event_emitter.card.yaml`](algorithms/completion_event_emitter.card.yaml) |
| Task | Capability | Algorithm Card | Risk class |
|-----------------------|----------------|---------------------------------------------------------------------|------------|
| `Task_DetectRedactPii`| `ai.redact@1` | [`pii_redactor.card.yaml`](algorithms/pii_redactor.card.yaml) | limited |
| `Task_ExtractSemantics`| `ai.extract@1`| [`vdvc_semantic_extractor.card.yaml`](algorithms/vdvc_semantic_extractor.card.yaml) | high |
| `Task_EmitResult` | `event.emit@1` | [`completion_event_emitter.card.yaml`](algorithms/completion_event_emitter.card.yaml) | minimal |
Three DMN decision tables encode the deterministic policy:
@@ -25,42 +29,59 @@ Only `Task_ExtractSemantics` is a model-inference step (governed by the
high-risk `vdvc_semantic_extractor` Card). Everything else is
deterministic.
## v3.0.0 — Algorithm Cards
## v3.1.0 — Algorithm Cards visible on BPMN
The three opaque host capabilities are now wrapped in Algorithm Cards
under `algorithms/`. Each Card supplies, per UAPF v2.3.0 chapter 13:
intent, IO contract, ownership, validation history, risk class, audit
configuration, and (where relevant) `privacy` and `risk` extensions.
In v3.1.0, the Algorithm Card references move from `resources/mappings.yaml`
targets onto the BPMN service tasks themselves, per UAPF v2.4.0. This
matters because:
- A reader of the BPMN diagram now sees *which algorithm* runs at each
step, by inspecting the rendered task.
- The card's IO contract is synthesised into the task's
`<bpmn:ioSpecification>`, so downstream gateway conditions branching
on outputs like `ai_confidence_score` or `personas_koda_present`
are visually traceable to their source.
- A renderer that supports `uapf24:algorithmCardRef` (e.g., ProcessGit
preview, OpenDMS visualiser) draws the algorithm-card icon, name,
version, and risk-class dot directly on the task.
Audit question → answer-location:
| Auditor asks | Read this |
|-----------------------------------------------|------------------------------------------------|
| What does the redactor detect? | `algorithms/pii_redactor.card.yaml` § io |
| What's the AI Act risk class of the extractor?| `vdvc_semantic_extractor.card.yaml` § risk |
| Who owns each algorithm? | each Card § owners |
| When was each algorithm last validated? | each Card § validation |
| What gets logged, with what retention? | each Card § audit |
| Why is human oversight needed? | `vdvc_semantic_extractor.card.yaml` § confidence |
| Which algorithm runs at task X? | the BPMN itself: `uapf24:algorithmCardRef` attr |
| What inputs/outputs does it have? | the BPMN task's `<bpmn:ioSpecification>` block |
| What is the algorithm's risk class? | the Card's `risk.aiActRiskClass` field |
| When was the algorithm last validated? | the Card's `validation.last_validated` |
| What gets logged, with what retention? | the Card's `audit` block |
| Why is human oversight needed? | the Card's `confidence` + `risk` blocks |
### Delta from v2.0.0
### Delta from v3.0.0
- **+** `algorithms/` folder with three Cards (one per opaque host capability).
- **+** `algorithm_cards: true` and `paths.algorithms` in `uapf.yaml` / `manifest.json`.
- **~** `resources/mappings.yaml`: single `agent.semantic-extractor` target split into three algorithm-specific targets (`agent.pii_redactor`, `agent.vdvc_semantic_extractor`, `agent.completion_event_emitter`), each carrying its `algorithm_card` reference. Binding shape unchanged.
- **~** `bpmn/semantic-document-analysis.bpmn`: **unchanged**. Algorithm Cards live on resource targets, not in the BPMN — no extension elements required.
- **-** `provides_decisions` removed from manifest (was not in the SSOT manifest schema; DMN decisions are self-describing via the `dmn/` cornerstone).
- **~** `bpmn/semantic-document-analysis.bpmn`: each of the 3 service tasks now carries the `xmlns:uapf24="https://uapf.dev/bpmn/v2.4"` namespace declaration and the `uapf24:algorithmCardRef` attribute, plus a `<bpmn:ioSpecification>` synthesised from the card's `io` block.
- **~** `resources/mappings.yaml`: `algorithm_card:` removed from each of the 3 targets. They go back to being just dispatch endpoints, per UAPF v2.4.0.
- **~** `uapf.yaml` / `manifest.json`: version `3.0.0` → `3.1.0`.
- **=** `algorithms/*.card.yaml`: unchanged.
- **=** `dmn/*.dmn`: unchanged.
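The synthesis of a `<bpmn:ioSpecification>` from a card's `io` block can be sketched roughly as follows. This is a minimal illustration, not the tooling used for this package: the function name `synthesise_io_specification` and the plain-dict input are assumptions, and the `name="id : type"` convention is copied from the BPMN diff on this page.

```python
import xml.etree.ElementTree as ET

BPMN_NS = "http://www.omg.org/spec/BPMN/20100524/MODEL"
ET.register_namespace("bpmn", BPMN_NS)


def _q(tag):
    """Qualify a tag name with the BPMN model namespace."""
    return f"{{{BPMN_NS}}}{tag}"


def synthesise_io_specification(io_block):
    """Hypothetical helper: build an ioSpecification element from a card io block."""
    inputs = io_block.get("inputs", [])
    outputs = io_block.get("outputs", [])
    spec = ET.Element(_q("ioSpecification"))
    # Declare one dataInput / dataOutput per card entry.
    for item in inputs:
        ET.SubElement(spec, _q("dataInput"), id=item["id"],
                      name=f'{item["id"]} : {item["type"]}')
    for item in outputs:
        ET.SubElement(spec, _q("dataOutput"), id=item["id"],
                      name=f'{item["id"]} : {item["type"]}')
    # Reference the declared items from the input and output sets.
    input_set = ET.SubElement(spec, _q("inputSet"))
    for item in inputs:
        ET.SubElement(input_set, _q("dataInputRefs")).text = item["id"]
    output_set = ET.SubElement(spec, _q("outputSet"))
    for item in outputs:
        ET.SubElement(output_set, _q("dataOutputRefs")).text = item["id"]
    return spec


# io block of the completion event emitter card, written as a plain dict.
emitter_io = {
    "inputs": [
        {"id": "event_type", "type": "string"},
        {"id": "payload", "type": "object"},
    ],
    "outputs": [{"id": "published", "type": "boolean"}],
}
spec = synthesise_io_specification(emitter_io)
xml_text = ET.tostring(spec, encoding="unicode")
print(xml_text)
```

Keeping the card as the single source and regenerating the BPMN block from it is one way to avoid the two drifting apart; whether the package's own tooling works this way is not stated here.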
### Why the v3.0.0 → v3.1.0 churn
v3.0.0 followed UAPF v2.3.0, which placed the algorithm card on the
resource target. That hid the algorithm from the BPMN diagram. UAPF
v2.4.0 reverses that decision and moves the reference onto the BPMN
task. v3.1.0 of this package follows the corrected spec. Algorithm
Cards themselves are unchanged across both revisions.
## Structure
```
.
├── uapf.yaml + manifest.json # Package manifest (UAPF v2.3.0)
├── bpmn/ # 1 BPMN process (unchanged from v2.0.0)
├── dmn/ # 3 DMN decision tables (unchanged from v2.0.0)
├── algorithms/ # 3 Algorithm Cards (NEW in v3.0.0)
├── uapf.yaml + manifest.json # Package manifest (UAPF v2.4.0)
├── bpmn/ # 1 BPMN process (algorithm refs + ioSpecification)
├── dmn/ # 3 DMN decision tables
├── algorithms/ # 3 Algorithm Cards (introduced in v3.0.0)
├── resources/
│ ├── mappings.yaml # Resource targets w/ algorithm_card refs (REFACTORED)
│ ├── mappings.yaml # Resource targets (dispatch endpoints only)
│ ├── guardrails.yaml
│ └── schemas/ # Output JSON Schemas
├── metadata/ # ownership + lifecycle
@@ -71,9 +92,21 @@ Audit question → answer-location:
## Validation
Validates against UAPF v2.3.0 schemas at
Validates against UAPF v2.4.0 schemas at
`github.com/UAPFormat/UAPF-specification`:
```bash
python tools/uapf-cli/uapf.py validate /path/to/dokumenta-semantiska-analize
```
## v3.2.0 (UAPF v2.5.0 alignment)
Tests are now **embedded in each algorithm card** under a top-level `tests:` array (minimum 2 entries per card). The old sidecar location `tests/algorithms/<card-id>.test.yaml` is **removed** per UAPF v2.5.0 — that location no longer applies to algorithm cards.
Embedded tests for this package:
- `algo.semantic_document_analysis.pii_redactor` — 3 cases (Latvian personas kods inline, plain text with no PII, financial figures + IBAN)
- `algo.semantic_document_analysis.vdvc_semantic_extractor` — 2 cases (regulatory complaint, non-regulatory thank-you), both with `ai_confidence_score` tolerance bands appropriate for a stochastic LLM extractor
- `algo.semantic_document_analysis.completion_event_emitter` — 2 cases (success completion, failure completion)
The Algorithm Card viewer (UAPF v2.5.0 chapter 13.16, ProcessGit Preview tab) consumes these embedded tests as its primary interaction surface — sample browser for `external` cards, regex/FEEL/source-display for `inline` cards.
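A rough sketch of how a consumer might check these embedded tests, assuming the card has already been parsed into a plain dict. The helper names `check_embedded_tests` and `outputs_match` are hypothetical, and reading `tolerance` as a per-output absolute band is an assumption about the format, not something the UAPF schema text on this page confirms.

```python
MIN_TESTS_PER_CARD = 2  # minimum stated for UAPF v2.5.0 algorithm cards


def check_embedded_tests(card):
    """Return a list of problems with the card's top-level tests array."""
    tests = card.get("tests", [])
    problems = []
    if len(tests) < MIN_TESTS_PER_CARD:
        problems.append(
            f"{card.get('id', '?')}: {len(tests)} test(s), "
            f"need at least {MIN_TESTS_PER_CARD}"
        )
    return problems


def outputs_match(expected, actual, tolerance=None):
    """Compare outputs; numeric fields with a tolerance entry get a band."""
    tolerance = tolerance or {}
    for key, want in expected.items():
        got = actual.get(key)
        band = tolerance.get(key)
        if band is not None and isinstance(got, (int, float)):
            # Assumed semantics: absolute difference must stay within the band.
            if abs(got - want) > band:
                return False
        elif got != want:
            return False
    return True


expected = {"ai_confidence_score": 0.87, "output_pii_error_count": 0}
bands = {"ai_confidence_score": 0.1, "output_pii_error_count": 0}

in_band = outputs_match(
    expected, {"ai_confidence_score": 0.80, "output_pii_error_count": 0}, bands)
out_of_band = outputs_match(
    expected, {"ai_confidence_score": 0.70, "output_pii_error_count": 0}, bands)
too_few = check_embedded_tests({"id": "algo.example", "tests": [{}]})
print(in_band, out_of_band, too_few)
```

Exact equality stays the default so that deterministic cards such as the redactor and the emitter are still compared strictly; only outputs that declare a band are compared loosely.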

@@ -1,64 +1,70 @@
kind: uapf.algorithm.card
id: algo.semantic_document_analysis.completion_event_emitter
version: "1.0.0"
name: "Process completion event emitter"
intent: >
Publishes a CloudEvents 1.0-conformant event marking the completion
of one semantic analysis cycle, with the DMN-decided fields
(personal data risk, processing route, redaction level, human
validation status) attached. Personal data is NEVER included in
the emitted payload — only the deterministic classification fields.
version: 1.0.0
name: Process completion event emitter
intent: |
Publishes a CloudEvents 1.0-conformant event marking the completion of one semantic analysis cycle, with the DMN-decided fields (personal data risk, processing route, redaction level, human validation status) attached. Personal data is NEVER included in the emitted payload — only the deterministic classification fields.
algorithm_kind: emitter
io:
inputs:
- id: event_type
type: string
cardinality: single
- id: payload
type: object
cardinality: single
- id: event_type
type: string
cardinality: single
- id: payload
type: object
cardinality: single
outputs:
- id: published
type: boolean
- id: published
type: boolean
implementation:
type: external
medium: mcp_tool
uri: "uapf-ip://capability/event.emit@1"
hash: "sha256:0000000000000000000000000000000000000000000000000000000000000000"
uri: uapf-ip://capability/event.emit@1
hash: sha256:0000000000000000000000000000000000000000000000000000000000000000
runtime:
capability: "event.emit@1"
cloud_events_spec: "1.0"
capability: event.emit@1
cloud_events_spec: '1.0'
determinism: deterministic
side_effects: writes_state
confidence:
type: none
complexity:
typical_latency_ms: 25
max_latency_ms: 1000
failure_mode: "throw — process must complete reliably or fail loudly."
failure_mode: throw — process must complete reliably or fail loudly.
reference:
standard: "CloudEvents 1.0"
url: "https://github.com/cloudevents/spec/blob/v1.0/spec.md"
standard: CloudEvents 1.0
url: https://github.com/cloudevents/spec/blob/v1.0/spec.md
owners:
- type: team
id: uapf-stewards
contact: stewards@uapf.dev
- type: team
id: uapf-stewards
contact: stewards@uapf.dev
lifecycle:
status: draft
since: "2026-05-20"
since: '2026-05-20'
audit:
log_inputs: full
log_outputs: full
retention: "1y"
retention: 1y
tests:
- name: Successful analysis completion
  description: Standard happy-path completion event with full payload.
  inputs:
    event_type: dev.dokumenta.semantic_analysis.completed
    payload:
      document_id: doc-2026-05-21-001
      outcome: ok
      confidence: 0.87
  expected_outputs:
    published: true
- name: Analysis failure completion
  description: Failure-path completion event still emits successfully (the emitter
    does not gate on payload contents).
  inputs:
    event_type: dev.dokumenta.semantic_analysis.failed
    payload:
      document_id: doc-2026-05-21-002
      outcome: extraction_failed
      reason: low_confidence
  expected_outputs:
    published: true

@@ -1,87 +1,117 @@
kind: uapf.algorithm.card
id: algo.semantic_document_analysis.pii_redactor
version: "1.0.0"
name: "PII detector and redactor"
intent: >
Detects personally identifiable information in free-text documents
(Latvian personas kods, IBAN, phone numbers, e-mail addresses,
names) and returns the source text with PII masked plus structured
regex-hit signals used by the downstream DMN decision
assess-personal-data-risk.
version: 1.0.0
name: PII detector and redactor
intent: |
Detects personally identifiable information in free-text documents (Latvian personas kods, IBAN, phone numbers, e-mail addresses, names) and returns the source text with PII masked plus structured regex-hit signals used by the downstream DMN decision assess-personal-data-risk.
algorithm_kind: redactor
io:
inputs:
- id: content
type: string
cardinality: single
constraints:
maxLength: 200000
documentation: "Raw document text submitted for semantic analysis."
- id: content
type: string
cardinality: single
constraints:
maxLength: 200000
documentation: Raw document text submitted for semantic analysis.
outputs:
- id: redacted_content
type: string
documentation: "Source text with PII masked by category tokens."
- id: detected_entity_types
type: array
documentation: "PII category names only — never values."
- id: personas_koda_present
type: boolean
- id: financial_data_present
type: boolean
- id: contact_data_present
type: boolean
- id: pii_category_count
type: integer
constraints: { minimum: 0 }
- id: redacted_content
type: string
documentation: Source text with PII masked by category tokens.
- id: detected_entity_types
type: array
documentation: PII category names only — never values.
- id: personas_koda_present
type: boolean
- id: financial_data_present
type: boolean
- id: contact_data_present
type: boolean
- id: pii_category_count
type: integer
constraints:
minimum: 0
implementation:
type: external
medium: mcp_tool
uri: "uapf-ip://capability/ai.redact@1"
hash: "sha256:0000000000000000000000000000000000000000000000000000000000000000"
uri: uapf-ip://capability/ai.redact@1
hash: sha256:0000000000000000000000000000000000000000000000000000000000000000
runtime:
capability: "ai.redact@1"
note: "Host-fulfilled UAPF-IP capability. Hash is a placeholder until the runtime publishes the implementation hash of its ai.redact handler."
capability: ai.redact@1
note: Host-fulfilled UAPF-IP capability. Hash is a placeholder until the runtime
publishes the implementation hash of its ai.redact handler.
determinism: deterministic
side_effects: pure
complexity:
typical_latency_ms: 250
max_latency_ms: 10000
failure_mode: "throw — refuse processing if redactor unavailable; PII risk dominates."
failure_mode: throw — refuse processing if redactor unavailable; PII risk dominates.
limitations:
- "Latviešu valodas personu vārdi atpazīstami ~92% gadījumu"
- "Pieņem, ka teksts jau ir digitāls — OCR nav iekļauta"
- Latviešu valodas personu vārdi atpazīstami ~92% gadījumu
- Pieņem, ka teksts jau ir digitāls — OCR nav iekļauta
reference:
legal: "GDPR 2016/679 5. pants (datu minimizēšana); Fizisko personu datu apstrādes likums."
standard: "NIST SP 800-188 — De-Identification of Personal Information."
legal: GDPR 2016/679 5. pants (datu minimizēšana); Fizisko personu datu apstrādes
likums.
standard: NIST SP 800-188 — De-Identification of Personal Information.
owners:
- type: role
id: data_protection_officer
contact: stewards@uapf.dev
- type: role
id: data_protection_officer
contact: stewards@uapf.dev
lifecycle:
status: draft
since: "2026-05-20"
since: '2026-05-20'
audit:
log_inputs: redacted
log_outputs: full
retention: "7y"
retention: 7y
privacy:
processesPII: true
technique: pseudonymization
reidentificationRisk: low
risk:
aiActRiskClass: limited
humanOversight: advisory
tests:
- name: Latvian personas kods inline in text
  description: Standard 11-digit Latvian personal identity code (NNNNNN-NNNNN)
    should be detected and redacted.
  inputs:
    content: 'Lūgums izskatīt iesniegumu. Iesniedzējs: Jānis Bērziņš, personas kods:
      010101-12345. Adrese: Brīvības iela 1, Rīga.'
  expected_outputs:
    redacted_content: 'Lūgums izskatīt iesniegumu. Iesniedzējs: [NAME], personas kods:
      [REDACTED]. Adrese: [ADDRESS].'
    detected_entity_types:
    - PERSONAS_KODS
    - PERSON
    - ADDRESS
    personas_koda_present: true
    financial_data_present: false
    contact_data_present: true
    pii_category_count: 3
- name: Plain administrative text with no PII
  description: Generic administrative paragraph; nothing to redact. Verifies the redactor
    doesn't false-positive on plain text.
  inputs:
    content: Iesniegums tiek izskatīts atbilstoši normatīvajiem aktiem. Lēmums tiks
      paziņots noteiktajā kārtībā.
  expected_outputs:
    redacted_content: Iesniegums tiek izskatīts atbilstoši normatīvajiem aktiem. Lēmums
      tiks paziņots noteiktajā kārtībā.
    detected_entity_types: []
    personas_koda_present: false
    financial_data_present: false
    contact_data_present: false
    pii_category_count: 0
- name: Financial figures and account numbers
  description: EUR amounts and IBAN — both detected as financial PII; no personas_kods.
  inputs:
    content: Maksājums EUR 1250.00 pārskaitīts uz kontu LV80BANK0000435195001.
  expected_outputs:
    redacted_content: Maksājums EUR [AMOUNT] pārskaitīts uz kontu [IBAN].
    detected_entity_types:
    - AMOUNT
    - IBAN
    personas_koda_present: false
    financial_data_present: true
    contact_data_present: false
    pii_category_count: 2
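For orientation, the pattern-detection idea behind two of these cases can be sketched in a few lines. This is an illustrative assumption, not the host's `ai.redact@1` handler: the two regular expressions and the function `detect_and_mask` are hypothetical, and the sketch deliberately ignores names, addresses, amounts, phone numbers and e-mail addresses. Only the mask tokens `[REDACTED]` and `[IBAN]` are taken from the tests above.

```python
import re

# Assumed patterns: six digits, hyphen, five digits for the personas kods;
# "LV" + 2 check digits + 4-letter bank code + 13 alphanumerics for a Latvian IBAN.
PERSONAS_KODS = re.compile(r"\b\d{6}-\d{5}\b")
LV_IBAN = re.compile(r"\bLV\d{2}[A-Z]{4}[A-Z0-9]{13}\b")


def detect_and_mask(content):
    """Mask the two patterns and report which signals fired."""
    redacted, kods_hits = PERSONAS_KODS.subn("[REDACTED]", content)
    redacted, iban_hits = LV_IBAN.subn("[IBAN]", redacted)
    return {
        "redacted_content": redacted,
        "personas_koda_present": kods_hits > 0,
        "financial_data_present": iban_hits > 0,
    }


kods_result = detect_and_mask("Iesniedzējs, personas kods: 010101-12345.")
iban_result = detect_and_mask("Pārskaitīts uz kontu LV80BANK0000435195001.")
plain_result = detect_and_mask("Lēmums tiks paziņots noteiktajā kārtībā.")
print(kods_result["redacted_content"])
print(iban_result["redacted_content"])
```

Because the signals are derived purely from pattern hits, the same input always yields the same flags, which is what lets the card declare `determinism: deterministic`.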

@@ -1,88 +1,119 @@
kind: uapf.algorithm.card
id: algo.semantic_document_analysis.vdvc_semantic_extractor
version: "1.0.0"
name: "VDVC semantic metadata extractor"
intent: >
Extracts a VDVC v1.1-conformant structured semantic summary from
the redacted document text — primary topic, keywords,
classification, summary, sensitivity signals. Output validates
against resources/schemas/vdvc-semantic-summary.schema.json. This
is the sole model-inference step in the process; everything else
in the package is deterministic.
version: 1.0.0
name: VDVC semantic metadata extractor
intent: |
Extracts a VDVC v1.1-conformant structured semantic summary from the redacted document text — primary topic, keywords, classification, summary, sensitivity signals. Output validates against resources/schemas/vdvc-semantic-summary.schema.json. This is the sole model-inference step in the process; everything else in the package is deterministic.
algorithm_kind: extractor
io:
inputs:
- id: redacted_content
type: string
cardinality: single
constraints:
maxLength: 200000
documentation: "Output of the upstream PII redactor."
- id: schema_ref
type: string
documentation: "Path to the JSON Schema the output must validate against."
- id: redacted_content
type: string
cardinality: single
constraints:
maxLength: 200000
documentation: Output of the upstream PII redactor.
- id: schema_ref
type: string
documentation: Path to the JSON Schema the output must validate against.
outputs:
- id: semantic_summary
type: object
schema: "../resources/schemas/vdvc-semantic-summary.schema.json"
- id: sensitivity_control
type: object
- id: ai_confidence_score
type: probability
- id: output_pii_error_count
type: integer
constraints: { minimum: 0 }
- id: semantic_summary
type: object
schema: ../resources/schemas/vdvc-semantic-summary.schema.json
- id: sensitivity_control
type: object
- id: ai_confidence_score
type: probability
- id: output_pii_error_count
type: integer
constraints:
minimum: 0
implementation:
type: external
medium: llm_prompt
uri: "uapf-ip://capability/ai.extract@1"
hash: "sha256:0000000000000000000000000000000000000000000000000000000000000000"
uri: uapf-ip://capability/ai.extract@1
hash: sha256:0000000000000000000000000000000000000000000000000000000000000000
runtime:
capability: "ai.extract@1"
note: "Host-fulfilled UAPF-IP capability. Specific model identity and prompt hash are runtime concerns of the host; the Card declares the contract, not the implementation choice."
capability: ai.extract@1
note: Host-fulfilled UAPF-IP capability. Specific model identity and prompt hash
are runtime concerns of the host; the Card declares the contract, not the implementation
choice.
determinism: stochastic
side_effects: external_call
confidence:
type: probability
threshold: 0.70
below_threshold: "route-to:human.legal_reviewer (enforced by DMN human-validation-gate)"
threshold: 0.7
below_threshold: route-to:human.legal_reviewer (enforced by DMN human-validation-gate)
complexity:
typical_latency_ms: 8000
max_latency_ms: 60000
failure_mode: "default:null + flag — DMN human-validation-gate routes low-confidence outputs to PENDING_REVIEW."
failure_mode: default:null + flag — DMN human-validation-gate routes low-confidence
outputs to PENDING_REVIEW.
limitations:
- "Garie dokumenti (>50 000 znaki) tiek apgriezti — pirmie 50K + pēdējie 5K"
- "Nav juridisks vērtējums — tikai semantiska klasifikācija"
- "Latviešu valodas juridiskā retorika var samazināt recall"
- Garie dokumenti (>50 000 znaki) tiek apgriezti — pirmie 50K + pēdējie 5K
- Nav juridisks vērtējums — tikai semantiska klasifikācija
- Latviešu valodas juridiskā retorika var samazināt recall
reference:
legal: "EU AI Act 2024/1689, Pielikums III (augstā riska MI sistēmas), 13. pants (caurspīdība)."
url: "https://eur-lex.europa.eu/eli/reg/2024/1689/oj"
legal: EU AI Act 2024/1689, Pielikums III (augstā riska MI sistēmas), 13. pants
(caurspīdība).
url: https://eur-lex.europa.eu/eli/reg/2024/1689/oj
owners:
- type: team
id: uapf-stewards
contact: stewards@uapf.dev
- type: team
id: uapf-stewards
contact: stewards@uapf.dev
lifecycle:
status: draft
since: "2026-05-20"
since: '2026-05-20'
audit:
log_inputs: redacted
log_outputs: full
retention: "7y"
retention: 7y
risk:
aiActRiskClass: high
humanOversight: mandatory
transparencyTier: tier-3-full
tests:
- name: Regulatory iesniegums about administrative decision
  description: Typical Latvian administrative complaint with redacted PII. The extractor
    should identify topic + risk + applicable regulation.
  inputs:
    redacted_content: Iesniedzējs [NAME] iesniedza sūdzību par būvvaldes lēmumu Nr.
      12345 atteikt būvatļauju adresē [ADDRESS]. Tiek lūgts pārskatīt lēmumu.
    schema_ref: schemas/iesniegums/v1
  expected_outputs:
    semantic_summary:
      topic: construction-permit-appeal
      subject_area: administrative-law
      applicable_regulations:
      - BL
      - APL
      language: lv
    sensitivity_control:
      contains_decision_reference: true
      external_communication_recommended: false
    ai_confidence_score: 0.87
    output_pii_error_count: 0
  tolerance:
    ai_confidence_score: 0.1
    output_pii_error_count: 0
- name: Non-regulatory thank-you note
  description: Out-of-domain input. Extractor should yield low-confidence summary
    and a sensitivity flag that no decision is referenced.
  inputs:
    redacted_content: Paldies par jūsu pakalpojumu! Bija ļoti patīkami sadarboties
      ar [NAME] no jūsu komandas.
    schema_ref: schemas/iesniegums/v1
  expected_outputs:
    semantic_summary:
      topic: non-actionable-correspondence
      subject_area: feedback
      applicable_regulations: []
      language: lv
    sensitivity_control:
      contains_decision_reference: false
      external_communication_recommended: false
    ai_confidence_score: 0.62
    output_pii_error_count: 0
  tolerance:
    ai_confidence_score: 0.15
    output_pii_error_count: 0

@@ -2,6 +2,7 @@
<bpmn:definitions
xmlns:bpmn="http://www.omg.org/spec/BPMN/20100524/MODEL"
xmlns:uapf="https://uapf.dev/bpmn-ext/v1"
xmlns:uapf24="https://uapf.dev/bpmn/v2.4"
xmlns:bpmndi="http://www.omg.org/spec/BPMN/20100524/DI"
xmlns:dc="http://www.omg.org/spec/DD/20100524/DC"
xmlns:di="http://www.omg.org/spec/DD/20100524/DI"
@@ -16,15 +17,36 @@
<bpmn:serviceTask id="Task_DetectRedactPii"
name="Detect and redact PII"
uapf:capability="ai.redact@1">
uapf:capability="ai.redact@1"
uapf24:algorithmCardRef="algo.semantic_document_analysis.pii_redactor">
<bpmn:documentation>
Calls ai.redact@1 over the source text. Beyond masking, the host
Calls ai.redact@1 over the source text. Governed by Algorithm
Card algo.semantic_document_analysis.pii_redactor (see
algorithms/pii_redactor.card.yaml). Beyond masking, the host
runs the four Latvian PII regex detectors (personas kods, IBAN,
e-mail, phone) and returns the deterministic signal set the risk
decision consumes: personasKodaPresent, financialDataPresent,
contactDataPresent, piiCategoryCount, detectedEntityTypes, plus
redactedContent. No model inference — pure pattern detection.
decision consumes.
</bpmn:documentation>
<bpmn:ioSpecification>
<bpmn:dataInput id="content" name="content : string"/>
<bpmn:dataOutput id="redacted_content" name="redacted_content : string"/>
<bpmn:dataOutput id="detected_entity_types" name="detected_entity_types : array"/>
<bpmn:dataOutput id="personas_koda_present" name="personas_koda_present : boolean"/>
<bpmn:dataOutput id="financial_data_present" name="financial_data_present : boolean"/>
<bpmn:dataOutput id="contact_data_present" name="contact_data_present : boolean"/>
<bpmn:dataOutput id="pii_category_count" name="pii_category_count : integer"/>
<bpmn:inputSet>
<bpmn:dataInputRefs>content</bpmn:dataInputRefs>
</bpmn:inputSet>
<bpmn:outputSet>
<bpmn:dataOutputRefs>redacted_content</bpmn:dataOutputRefs>
<bpmn:dataOutputRefs>detected_entity_types</bpmn:dataOutputRefs>
<bpmn:dataOutputRefs>personas_koda_present</bpmn:dataOutputRefs>
<bpmn:dataOutputRefs>financial_data_present</bpmn:dataOutputRefs>
<bpmn:dataOutputRefs>contact_data_present</bpmn:dataOutputRefs>
<bpmn:dataOutputRefs>pii_category_count</bpmn:dataOutputRefs>
</bpmn:outputSet>
</bpmn:ioSpecification>
</bpmn:serviceTask>
<bpmn:businessRuleTask id="Decision_AssessRisk"
@@ -32,9 +54,7 @@
uapf:decision="assess-personal-data-risk">
<bpmn:documentation>
DMN dmn/assess-personal-data-risk.dmn. Maps the PII signal set to
personalDataRisk (NONE | LOW | MEDIUM | HIGH) by explicit ranked
rules. Personas kods or IBAN forces HIGH; two or more categories
or contact data gives MEDIUM. Deterministic and auditable.
personalDataRisk (NONE | LOW | MEDIUM | HIGH).
</bpmn:documentation>
</bpmn:businessRuleTask>
@@ -44,48 +64,75 @@
<bpmn:documentation>
DMN dmn/gdpr-processing-route.dmn. From personalDataRisk and
allowCentralization decides processingRoute (CENTRAL | LOCAL),
anonymizationRequired and redactionLevel. This is the routing
rule extracted from the host's generate_semantic_metadata: a
sensitive document where centralisation is not permitted stays
LOCAL with full redaction.
anonymizationRequired and redactionLevel.
</bpmn:documentation>
</bpmn:businessRuleTask>
<bpmn:serviceTask id="Task_ExtractSemantics"
name="Extract semantic metadata"
uapf:capability="ai.extract@1"
uapf:schemaRef="resources/schemas/vdvc-semantic-summary.schema.json">
uapf:schemaRef="resources/schemas/vdvc-semantic-summary.schema.json"
uapf24:algorithmCardRef="algo.semantic_document_analysis.vdvc_semantic_extractor">
<bpmn:documentation>
Calls ai.extract@1 on redactedContent with the VDVC v1.1 output
schema. This is the single bounded model step: it produces the
semanticSummary (topic, summary, keywords, urgency, risk) and
must validate against resources/schemas/vdvc-semantic-summary.
The host also returns flat aiConfidenceScore and the result of
the post-extraction PII re-scan as outputPiiErrorCount.
schema. Governed by Algorithm Card
algo.semantic_document_analysis.vdvc_semantic_extractor (see
algorithms/vdvc_semantic_extractor.card.yaml). EU AI Act
Annex III high-risk; human oversight is mandatory and is
enforced downstream by the human-validation-gate DMN.
</bpmn:documentation>
<bpmn:ioSpecification>
<bpmn:dataInput id="redacted_content" name="redacted_content : string"/>
<bpmn:dataInput id="schema_ref" name="schema_ref : string"/>
<bpmn:dataOutput id="semantic_summary" name="semantic_summary : object"/>
<bpmn:dataOutput id="sensitivity_control" name="sensitivity_control : object"/>
<bpmn:dataOutput id="ai_confidence_score" name="ai_confidence_score : probability"/>
<bpmn:dataOutput id="output_pii_error_count" name="output_pii_error_count : integer"/>
<bpmn:inputSet>
<bpmn:dataInputRefs>redacted_content</bpmn:dataInputRefs>
<bpmn:dataInputRefs>schema_ref</bpmn:dataInputRefs>
</bpmn:inputSet>
<bpmn:outputSet>
<bpmn:dataOutputRefs>semantic_summary</bpmn:dataOutputRefs>
<bpmn:dataOutputRefs>sensitivity_control</bpmn:dataOutputRefs>
<bpmn:dataOutputRefs>ai_confidence_score</bpmn:dataOutputRefs>
<bpmn:dataOutputRefs>output_pii_error_count</bpmn:dataOutputRefs>
</bpmn:outputSet>
</bpmn:ioSpecification>
</bpmn:serviceTask>
<bpmn:businessRuleTask id="Decision_ValidationGate"
name="Determine human-validation status"
uapf:decision="human-validation-gate">
<bpmn:documentation>
DMN dmn/human-validation-gate.dmn. From outputPiiErrorCount,
aiConfidenceScore and personalDataRisk decides
humanValidationStatus (REJECTED | PENDING_REVIEW | APPROVED_AUTO)
and requiresHumanReview. Any leaked PII or confidence below 0.3
rejects; below 0.7, or HIGH risk, forces review; 0.7 and above
with clean output auto-approves. The thresholds are the weights.
DMN dmn/human-validation-gate.dmn. From output_pii_error_count,
ai_confidence_score and personalDataRisk decides
humanValidationStatus (REJECTED | PENDING_REVIEW | APPROVED_AUTO).
</bpmn:documentation>
</bpmn:businessRuleTask>
<bpmn:serviceTask id="Task_EmitResult"
name="Emit semantic-analysis-completed event"
uapf:capability="event.emit@1"
uapf:eventType="document.semantic-analysis.completed.v1">
uapf:eventType="document.semantic-analysis.completed.v1"
uapf24:algorithmCardRef="algo.semantic_document_analysis.completion_event_emitter">
<bpmn:documentation>
Calls event.emit@1 to publish a CloudEvent carrying the semantic
summary, the routing decision and the validation status.
Calls event.emit@1 to publish a CloudEvent. Governed by
Algorithm Card algo.semantic_document_analysis.completion_event_emitter
(see algorithms/completion_event_emitter.card.yaml).
</bpmn:documentation>
<bpmn:ioSpecification>
<bpmn:dataInput id="event_type" name="event_type : string"/>
<bpmn:dataInput id="payload" name="payload : object"/>
<bpmn:dataOutput id="published" name="published : boolean"/>
<bpmn:inputSet>
<bpmn:dataInputRefs>event_type</bpmn:dataInputRefs>
<bpmn:dataInputRefs>payload</bpmn:dataInputRefs>
</bpmn:inputSet>
<bpmn:outputSet>
<bpmn:dataOutputRefs>published</bpmn:dataOutputRefs>
</bpmn:outputSet>
</bpmn:ioSpecification>
</bpmn:serviceTask>
<bpmn:endEvent id="End" name="Semantic analysis complete"/>
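
Note on Decision_ValidationGate: the older documentation text in this hunk spells out the thresholds that dmn/human-validation-gate.dmn applies. The sketch below restates them as ordinary code so the rule order is easy to check. It is not the DMN table itself. It assumes first-match evaluation in the order the documentation lists the rules, that the only risk label which matters is the string "HIGH", and the function and constant names are made up for illustration.

```python
# Hypothetical restatement of the human-validation-gate rules described in
# the Decision_ValidationGate documentation. Names are illustrative only;
# the authoritative logic lives in dmn/human-validation-gate.dmn.

REJECT_BELOW = 0.3  # confidence strictly below this rejects
REVIEW_BELOW = 0.7  # confidence strictly below this forces review


def human_validation_status(output_pii_error_count: int,
                            ai_confidence_score: float,
                            personal_data_risk: str) -> str:
    """Return REJECTED, PENDING_REVIEW or APPROVED_AUTO (assumed first-match order)."""
    # Rule 1: any leaked PII, or very low confidence, rejects outright.
    if output_pii_error_count > 0 or ai_confidence_score < REJECT_BELOW:
        return "REJECTED"
    # Rule 2: middling confidence, or HIGH personal-data risk, forces review.
    if ai_confidence_score < REVIEW_BELOW or personal_data_risk == "HIGH":
        return "PENDING_REVIEW"
    # Rule 3: confidence of 0.7 or above with clean output auto-approves.
    return "APPROVED_AUTO"
```

Under these assumptions a score of exactly 0.7 with zero PII errors and non-HIGH risk auto-approves, while a score of exactly 0.3 goes to review rather than rejection. How requiresHumanReview is derived is not stated in the new documentation text, so the sketch leaves it out.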


@@ -2,9 +2,9 @@
"kind": "uapf.package",
"id": "dev.uapf.semantic-document-analysis",
"name": "Semantic Document Analysis",
"description": "Level-4 UAPF process for semantic analysis of free-text documents.\n\nThree BPMN service tasks invoke the UAPF-IP capabilities ai.redact@1,\nai.extract@1 and event.emit@1. Three DMN decision tables encode the\ndeterministic algorithm the host previously hid inside application\ncode: assess-personal-data-risk maps PII regex signals to a risk\nlevel; gdpr-processing-route selects CENTRAL vs LOCAL processing,\nanonymisation and redaction level; human-validation-gate applies the\nconfidence thresholds that decide REJECTED / PENDING_REVIEW /\nAPPROVED_AUTO.\n\nOnly the semantic extraction is a model step. Risk classification,\nGDPR routing and the validation gate are explicit ranked rules in\nversioned DMN \u2014 inspectable, auditable, portable. Extraction output\nvalidates against the VDVC v1.1 semantic-summary JSON Schema.\n\nv3.0.0: the three opaque host capabilities (ai.redact@1,\nai.extract@1, event.emit@1) are now governed by Algorithm Cards\nin algorithms/ per UAPF v2.3.0 chapter 13. Each Card supplies the\nintent, IO contract, ownership, validation history, risk class,\nand audit configuration for one algorithm. Cards are referenced\nfrom resource targets in resources/mappings.yaml.\n",
"description": "Level-4 UAPF process for semantic analysis of free-text documents.\n\nThree BPMN service tasks invoke the UAPF-IP capabilities ai.redact@1,\nai.extract@1 and event.emit@1. Three DMN decision tables encode the\ndeterministic algorithm the host previously hid inside application\ncode: assess-personal-data-risk maps PII regex signals to a risk\nlevel; gdpr-processing-route selects CENTRAL vs LOCAL processing,\nanonymisation and redaction level; human-validation-gate applies the\nconfidence thresholds that decide REJECTED / PENDING_REVIEW /\nAPPROVED_AUTO.\n\nOnly the semantic extraction is a model step. Risk classification,\nGDPR routing and the validation gate are explicit ranked rules in\nversioned DMN \u2014 inspectable, auditable, portable. Extraction output\nvalidates against the VDVC v1.1 semantic-summary JSON Schema.\n\nv3.1.0: aligned with UAPF v2.4.0 \u2014 Algorithm Card references move\nfrom resource targets to the BPMN service tasks themselves (via\nuapf24:algorithmCardRef attribute). Each card's io block is also\ndenormalised into a <bpmn:ioSpecification> on the task so inputs\nand outputs render as visible data objects on the diagram. The\ncards themselves and the DMN decisions are unchanged from v3.0.0.\n",
"level": 4,
"version": "3.0.0",
"version": "3.2.0",
"requires_capabilities": [
"ai.redact@1+",
"ai.extract@1+",


@@ -2,10 +2,11 @@ kind: uapf.resources.mapping
# Host-readable contract for the capability-backed service tasks.
#
# v3.0.0 change: the single agent.semantic-extractor target has been
# split into three algorithm-specific targets, each referencing an
# Algorithm Card under algorithms/ (UAPF v2.3.0, chapter 13). The
# binding shape is unchanged. The BPMN file is unchanged.
# v3.1.0 change: the algorithm_card reference (added in v3.0.0 on each
# target) has been removed per UAPF v2.4.0 — the Algorithm Card reference
# now lives on the BPMN serviceTask itself via the
# uapf24:algorithmCardRef attribute (see bpmn/semantic-document-analysis.bpmn).
# Targets here keep their role as dispatch endpoints only.
#
# The three DMN decisions (assess-personal-data-risk,
# gdpr-processing-route, human-validation-gate) remain self-describing
@@ -21,7 +22,6 @@ targets:
pii_redactor Algorithm Card.
capabilities:
- capability.ai.redact
algorithm_card: algo.semantic_document_analysis.pii_redactor
- id: agent.vdvc_semantic_extractor
type: ai_agent
@@ -33,7 +33,6 @@ targets:
enforced downstream by the human-validation-gate DMN.
capabilities:
- capability.ai.extract
algorithm_card: algo.semantic_document_analysis.vdvc_semantic_extractor
- id: agent.completion_event_emitter
type: ai_agent
@@ -43,7 +42,6 @@ targets:
completion_event_emitter Algorithm Card.
capabilities:
- capability.event.emit
algorithm_card: algo.semantic_document_analysis.completion_event_emitter
bindings:
- source: { type: bpmn.serviceTask, ref: Task_DetectRedactPii }
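
Note on the v3.1.0 change: with algorithm_card removed from the targets, a host that wants to know which Algorithm Card governs a task has to read it from the BPMN serviceTask. A minimal sketch of that lookup follows, using only the Python standard library. The uapf and uapf24 namespace URIs in the sample are placeholders (the real URIs are not shown in this diff), so the code matches the attribute by local name only. The function name and the Task_NoCard element are hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard BPMN 2.0 model namespace.
BPMN_NS = "http://www.omg.org/spec/BPMN/20100524/MODEL"

# Sample document. The uapf / uapf24 namespace URIs are placeholders.
SAMPLE = """<bpmn:definitions
    xmlns:bpmn="http://www.omg.org/spec/BPMN/20100524/MODEL"
    xmlns:uapf="urn:example:uapf"
    xmlns:uapf24="urn:example:uapf24">
  <bpmn:process id="P">
    <bpmn:serviceTask id="Task_EmitResult"
        uapf:capability="event.emit@1"
        uapf24:algorithmCardRef="algo.semantic_document_analysis.completion_event_emitter"/>
    <bpmn:serviceTask id="Task_NoCard" uapf:capability="event.emit@1"/>
  </bpmn:process>
</bpmn:definitions>"""


def card_refs(bpmn_xml: str) -> dict:
    """Map each serviceTask id to its algorithmCardRef value, or None if absent."""
    root = ET.fromstring(bpmn_xml)
    refs = {}
    for task in root.iter(f"{{{BPMN_NS}}}serviceTask"):
        ref = None
        for key, value in task.attrib.items():
            # ElementTree expands prefixed attributes to "{namespace-uri}local";
            # compare the local part so the placeholder URI does not matter.
            if key.rsplit("}", 1)[-1] == "algorithmCardRef":
                ref = value
        refs[task.get("id")] = ref
    return refs
```

Calling card_refs(SAMPLE) returns the card id for Task_EmitResult and None for Task_NoCard, which makes a missing reference easy to flag before dispatching to a target.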


@@ -18,15 +18,15 @@ description: |
versioned DMN — inspectable, auditable, portable. Extraction output
validates against the VDVC v1.1 semantic-summary JSON Schema.
v3.0.0: the three opaque host capabilities (ai.redact@1,
ai.extract@1, event.emit@1) are now governed by Algorithm Cards
in algorithms/ per UAPF v2.3.0 chapter 13. Each Card supplies the
intent, IO contract, ownership, validation history, risk class,
and audit configuration for one algorithm. Cards are referenced
from resource targets in resources/mappings.yaml.
v3.1.0: aligned with UAPF v2.4.0 — Algorithm Card references move
from resource targets to the BPMN service tasks themselves (via
uapf24:algorithmCardRef attribute). Each card's io block is also
denormalised into a <bpmn:ioSpecification> on the task so inputs
and outputs render as visible data objects on the diagram. The
cards themselves and the DMN decisions are unchanged from v3.0.0.
level: 4
version: "3.0.0"
version: "3.2.0"
# ── UAPF-IP integration (capability needs + profile + guardrails) ──
requires_capabilities: