2026-05-20 14:51:22 +00:00
5 changed files with 139 additions and 73 deletions
--- a/README.md
+++ b/README.md
@@ -1,17 +1,21 @@
 # Semantic Document Analysis

 UAPF Level-4 process for semantic analysis of free-text documents,
-governed by **UAPF v2.3.0** (Algorithm Cards).
+governed by **UAPF v2.4.0** (Algorithm Cards visible on BPMN tasks).

 ## What this package does

-Three BPMN service tasks invoke three UAPF-IP host capabilities:
+Three BPMN service tasks invoke three UAPF-IP host capabilities. Each
+service task carries `uapf24:algorithmCardRef` pointing at the
+Algorithm Card that governs the algorithm being invoked, and a
+`<bpmn:ioSpecification>` synthesised from the card's `io` block so
+inputs and outputs render as visible data objects.

-| Task                  | Capability     | Algorithm Card                                                      |
-|-----------------------|----------------|---------------------------------------------------------------------|
-| `Task_DetectRedactPii`| `ai.redact@1`  | [`algorithms/pii_redactor.card.yaml`](algorithms/pii_redactor.card.yaml) |
-| `Task_ExtractSemantics`| `ai.extract@1`| [`algorithms/vdvc_semantic_extractor.card.yaml`](algorithms/vdvc_semantic_extractor.card.yaml) |
-| `Task_EmitResult`     | `event.emit@1` | [`algorithms/completion_event_emitter.card.yaml`](algorithms/completion_event_emitter.card.yaml) |
+| Task                  | Capability     | Algorithm Card                                                      | Risk class |
+|-----------------------|----------------|---------------------------------------------------------------------|------------|
+| `Task_DetectRedactPii`| `ai.redact@1`  | [`pii_redactor.card.yaml`](algorithms/pii_redactor.card.yaml)       | limited |
+| `Task_ExtractSemantics`| `ai.extract@1`| [`vdvc_semantic_extractor.card.yaml`](algorithms/vdvc_semantic_extractor.card.yaml) | high |
+| `Task_EmitResult`     | `event.emit@1` | [`completion_event_emitter.card.yaml`](algorithms/completion_event_emitter.card.yaml) | minimal |

 Three DMN decision tables encode the deterministic policy:

@@ -25,42 +29,59 @@ Only `Task_ExtractSemantics` is a model-inference step (governed by the
 high-risk `vdvc_semantic_extractor` Card). Everything else is
 deterministic.

-## v3.0.0 — Algorithm Cards
+## v3.1.0 — Algorithm Cards visible on BPMN

-The three opaque host capabilities are now wrapped in Algorithm Cards
-under `algorithms/`. Each Card supplies, per UAPF v2.3.0 chapter 13:
-intent, IO contract, ownership, validation history, risk class, audit
-configuration, and (where relevant) `privacy` and `risk` extensions.
+In v3.1.0, the Algorithm Card references move from `resources/mappings.yaml`
+targets onto the BPMN service tasks themselves, per UAPF v2.4.0. This
+matters because:
+
+- A reader of the BPMN diagram now sees *which algorithm* runs at each
+  step, by inspecting the rendered task.
+- The card's IO contract is synthesised into the task's
+  `<bpmn:ioSpecification>`, so downstream gateway conditions branching
+  on outputs like `ai_confidence_score` or `personas_koda_present`
+  are visually traceable to their source.
+- A renderer that supports `uapf24:algorithmCardRef` (e.g., ProcessGit
+  preview, OpenDMS visualiser) draws the algorithm-card icon, name,
+  version, and risk-class dot directly on the task.

 Audit question → answer-location:

 | Auditor asks                                  | Read this                                      |
 |-----------------------------------------------|------------------------------------------------|
-| What does the redactor detect?                | `algorithms/pii_redactor.card.yaml` § io       |
-| What's the AI Act risk class of the extractor?| `vdvc_semantic_extractor.card.yaml` § risk     |
-| Who owns each algorithm?                      | each Card § owners                             |
-| When was each algorithm last validated?       | each Card § validation                         |
-| What gets logged, with what retention?        | each Card § audit                              |
-| Why is human oversight needed?                | `vdvc_semantic_extractor.card.yaml` § confidence |
+| Which algorithm runs at task X?               | the BPMN task's `uapf24:algorithmCardRef` attribute |
+| What inputs/outputs does it have?             | the BPMN task's `<bpmn:ioSpecification>` block  |
+| What is the algorithm's risk class?           | the Card's `risk.aiActRiskClass` field          |
+| When was the algorithm last validated?        | the Card's `validation.last_validated`          |
+| What gets logged, with what retention?        | the Card's `audit` block                        |
+| Why is human oversight needed?                | the Card's `confidence` + `risk` blocks         |

-### Delta from v2.0.0
+### Delta from v3.0.0

- **+** `algorithms/` folder with three Cards (one per opaque host capability).
- **+** `algorithm_cards: true` and `paths.algorithms` in `uapf.yaml` / `manifest.json`.
- **~** `resources/mappings.yaml`: single `agent.semantic-extractor` target split into three algorithm-specific targets (`agent.pii_redactor`, `agent.vdvc_semantic_extractor`, `agent.completion_event_emitter`), each carrying its `algorithm_card` reference. Binding shape unchanged.
- **~** `bpmn/semantic-document-analysis.bpmn`: **unchanged**. Algorithm Cards live on resource targets, not in the BPMN — no extension elements required.
- **−** `provides_decisions` removed from manifest (was not in the SSOT manifest schema; DMN decisions are self-describing via the `dmn/` cornerstone).
+- **~** `bpmn/semantic-document-analysis.bpmn`: each of the 3 service tasks now carries a `uapf24:algorithmCardRef` attribute (the `xmlns:uapf24="https://uapf.dev/bpmn/v2.4"` namespace is declared once on `<bpmn:definitions>`), plus a `<bpmn:ioSpecification>` synthesised from the card's `io` block.
+- **~** `resources/mappings.yaml`: `algorithm_card:` removed from each of the 3 targets, which revert to plain dispatch endpoints per UAPF v2.4.0.
+- **~** `uapf.yaml` / `manifest.json`: version `3.0.0` → `3.1.0`.
+- **=** `algorithms/*.card.yaml`: unchanged.
+- **=** `dmn/*.dmn`: unchanged.
+
+### Why the v3.0.0 → v3.1.0 churn
+
+v3.0.0 followed UAPF v2.3.0, which placed the algorithm card on the
+resource target. That hid the algorithm from the BPMN diagram. UAPF
+v2.4.0 reverses that decision and moves the reference onto the BPMN
+task. v3.1.0 of this package follows the corrected spec. Algorithm
+Cards themselves are unchanged across both revisions.

 ## Structure

 ```
 .
-├── uapf.yaml + manifest.json     # Package manifest (UAPF v2.3.0)
-├── bpmn/                          # 1 BPMN process (unchanged from v2.0.0)
-├── dmn/                           # 3 DMN decision tables (unchanged from v2.0.0)
-├── algorithms/                    # 3 Algorithm Cards (NEW in v3.0.0)
+├── uapf.yaml + manifest.json     # Package manifest (UAPF v2.4.0)
+├── bpmn/                          # 1 BPMN process (algorithm refs + ioSpecification)
+├── dmn/                           # 3 DMN decision tables
+├── algorithms/                    # 3 Algorithm Cards (introduced in v3.0.0)
 ├── resources/
-│   ├── mappings.yaml              # Resource targets w/ algorithm_card refs (REFACTORED)
+│   ├── mappings.yaml              # Resource targets (dispatch endpoints only)
 │   ├── guardrails.yaml
 │   └── schemas/                   # Output JSON Schemas
 ├── metadata/                      # ownership + lifecycle
@@ -71,7 +92,7 @@ Audit question → answer-location:

 ## Validation

-Validates against UAPF v2.3.0 schemas at
+Validates against UAPF v2.4.0 schemas at
 `github.com/UAPFormat/UAPF-specification`:

 ```bash
--- a/bpmn/semantic-document-analysis.bpmn
+++ b/bpmn/semantic-document-analysis.bpmn
@@ -2,6 +2,7 @@
 <bpmn:definitions
    xmlns:bpmn="http://www.omg.org/spec/BPMN/20100524/MODEL"
    xmlns:uapf="https://uapf.dev/bpmn-ext/v1"
+    xmlns:uapf24="https://uapf.dev/bpmn/v2.4"
    xmlns:bpmndi="http://www.omg.org/spec/BPMN/20100524/DI"
    xmlns:dc="http://www.omg.org/spec/DD/20100524/DC"
    xmlns:di="http://www.omg.org/spec/DD/20100524/DI"
@@ -16,15 +17,36 @@

    <bpmn:serviceTask id="Task_DetectRedactPii"
                      name="Detect and redact PII"
-                      uapf:capability="ai.redact@1">
+                      uapf:capability="ai.redact@1"
+                      uapf24:algorithmCardRef="algo.semantic_document_analysis.pii_redactor">
      <bpmn:documentation>
-        Calls ai.redact@1 over the source text. Beyond masking, the host
+        Calls ai.redact@1 over the source text. Governed by Algorithm
+        Card algo.semantic_document_analysis.pii_redactor (see
+        algorithms/pii_redactor.card.yaml). Beyond masking, the host
        runs the four Latvian PII regex detectors (personas kods, IBAN,
        e-mail, phone) and returns the deterministic signal set the risk
-        decision consumes: personasKodaPresent, financialDataPresent,
-        contactDataPresent, piiCategoryCount, detectedEntityTypes, plus
-        redactedContent. No model inference — pure pattern detection.
+        decision consumes.
      </bpmn:documentation>
+      <bpmn:ioSpecification>
+        <bpmn:dataInput  id="content"                name="content : string"/>
+        <bpmn:dataOutput id="redacted_content"       name="redacted_content : string"/>
+        <bpmn:dataOutput id="detected_entity_types"  name="detected_entity_types : array"/>
+        <bpmn:dataOutput id="personas_koda_present"  name="personas_koda_present : boolean"/>
+        <bpmn:dataOutput id="financial_data_present" name="financial_data_present : boolean"/>
+        <bpmn:dataOutput id="contact_data_present"   name="contact_data_present : boolean"/>
+        <bpmn:dataOutput id="pii_category_count"     name="pii_category_count : integer"/>
+        <bpmn:inputSet>
+          <bpmn:dataInputRefs>content</bpmn:dataInputRefs>
+        </bpmn:inputSet>
+        <bpmn:outputSet>
+          <bpmn:dataOutputRefs>redacted_content</bpmn:dataOutputRefs>
+          <bpmn:dataOutputRefs>detected_entity_types</bpmn:dataOutputRefs>
+          <bpmn:dataOutputRefs>personas_koda_present</bpmn:dataOutputRefs>
+          <bpmn:dataOutputRefs>financial_data_present</bpmn:dataOutputRefs>
+          <bpmn:dataOutputRefs>contact_data_present</bpmn:dataOutputRefs>
+          <bpmn:dataOutputRefs>pii_category_count</bpmn:dataOutputRefs>
+        </bpmn:outputSet>
+      </bpmn:ioSpecification>
    </bpmn:serviceTask>

    <bpmn:businessRuleTask id="Decision_AssessRisk"
@@ -32,9 +54,7 @@
                           uapf:decision="assess-personal-data-risk">
      <bpmn:documentation>
        DMN dmn/assess-personal-data-risk.dmn. Maps the PII signal set to
-        personalDataRisk (NONE | LOW | MEDIUM | HIGH) by explicit ranked
-        rules. Personas kods or IBAN forces HIGH; two or more categories
-        or contact data gives MEDIUM. Deterministic and auditable.
+        personalDataRisk (NONE | LOW | MEDIUM | HIGH).
      </bpmn:documentation>
    </bpmn:businessRuleTask>

@@ -44,48 +64,75 @@
      <bpmn:documentation>
        DMN dmn/gdpr-processing-route.dmn. From personalDataRisk and
        allowCentralization decides processingRoute (CENTRAL | LOCAL),
-        anonymizationRequired and redactionLevel. This is the routing
-        rule extracted from the host's generate_semantic_metadata: a
-        sensitive document where centralisation is not permitted stays
-        LOCAL with full redaction.
+        anonymizationRequired and redactionLevel.
      </bpmn:documentation>
    </bpmn:businessRuleTask>

    <bpmn:serviceTask id="Task_ExtractSemantics"
                      name="Extract semantic metadata"
                      uapf:capability="ai.extract@1"
-                      uapf:schemaRef="resources/schemas/vdvc-semantic-summary.schema.json">
+                      uapf:schemaRef="resources/schemas/vdvc-semantic-summary.schema.json"
+                      uapf24:algorithmCardRef="algo.semantic_document_analysis.vdvc_semantic_extractor">
      <bpmn:documentation>
        Calls ai.extract@1 on redactedContent with the VDVC v1.1 output
-        schema. This is the single bounded model step: it produces the
-        semanticSummary (topic, summary, keywords, urgency, risk) and
-        must validate against resources/schemas/vdvc-semantic-summary.
-        The host also returns flat aiConfidenceScore and the result of
-        the post-extraction PII re-scan as outputPiiErrorCount.
+        schema. Governed by Algorithm Card
+        algo.semantic_document_analysis.vdvc_semantic_extractor (see
+        algorithms/vdvc_semantic_extractor.card.yaml). EU AI Act
+        Annex III high-risk; human oversight is mandatory and is
+        enforced downstream by the human-validation-gate DMN.
      </bpmn:documentation>
+      <bpmn:ioSpecification>
+        <bpmn:dataInput  id="in_redacted_content"   name="redacted_content : string"/> <!-- "in_" prefix keeps the xsd:ID unique; Task_DetectRedactPii already uses redacted_content -->
+        <bpmn:dataInput  id="schema_ref"            name="schema_ref : string"/>
+        <bpmn:dataOutput id="semantic_summary"      name="semantic_summary : object"/>
+        <bpmn:dataOutput id="sensitivity_control"   name="sensitivity_control : object"/>
+        <bpmn:dataOutput id="ai_confidence_score"   name="ai_confidence_score : probability"/>
+        <bpmn:dataOutput id="output_pii_error_count" name="output_pii_error_count : integer"/>
+        <bpmn:inputSet>
+          <bpmn:dataInputRefs>in_redacted_content</bpmn:dataInputRefs>
+          <bpmn:dataInputRefs>schema_ref</bpmn:dataInputRefs>
+        </bpmn:inputSet>
+        <bpmn:outputSet>
+          <bpmn:dataOutputRefs>semantic_summary</bpmn:dataOutputRefs>
+          <bpmn:dataOutputRefs>sensitivity_control</bpmn:dataOutputRefs>
+          <bpmn:dataOutputRefs>ai_confidence_score</bpmn:dataOutputRefs>
+          <bpmn:dataOutputRefs>output_pii_error_count</bpmn:dataOutputRefs>
+        </bpmn:outputSet>
+      </bpmn:ioSpecification>
    </bpmn:serviceTask>

    <bpmn:businessRuleTask id="Decision_ValidationGate"
                           name="Determine human-validation status"
                           uapf:decision="human-validation-gate">
      <bpmn:documentation>
-        DMN dmn/human-validation-gate.dmn. From outputPiiErrorCount,
-        aiConfidenceScore and personalDataRisk decides
-        humanValidationStatus (REJECTED | PENDING_REVIEW | APPROVED_AUTO)
-        and requiresHumanReview. Any leaked PII or confidence below 0.3
-        rejects; below 0.7, or HIGH risk, forces review; 0.7 and above
-        with clean output auto-approves. The thresholds are the weights.
+        DMN dmn/human-validation-gate.dmn. From output_pii_error_count,
+        ai_confidence_score and personalDataRisk decides
+        humanValidationStatus (REJECTED | PENDING_REVIEW | APPROVED_AUTO).
      </bpmn:documentation>
    </bpmn:businessRuleTask>

    <bpmn:serviceTask id="Task_EmitResult"
                      name="Emit semantic-analysis-completed event"
                      uapf:capability="event.emit@1"
-                      uapf:eventType="document.semantic-analysis.completed.v1">
+                      uapf:eventType="document.semantic-analysis.completed.v1"
+                      uapf24:algorithmCardRef="algo.semantic_document_analysis.completion_event_emitter">
      <bpmn:documentation>
-        Calls event.emit@1 to publish a CloudEvent carrying the semantic
-        summary, the routing decision and the validation status.
+        Calls event.emit@1 to publish a CloudEvent. Governed by
+        Algorithm Card algo.semantic_document_analysis.completion_event_emitter
+        (see algorithms/completion_event_emitter.card.yaml).
      </bpmn:documentation>
+      <bpmn:ioSpecification>
+        <bpmn:dataInput  id="event_type" name="event_type : string"/>
+        <bpmn:dataInput  id="payload"    name="payload : object"/>
+        <bpmn:dataOutput id="published"  name="published : boolean"/>
+        <bpmn:inputSet>
+          <bpmn:dataInputRefs>event_type</bpmn:dataInputRefs>
+          <bpmn:dataInputRefs>payload</bpmn:dataInputRefs>
+        </bpmn:inputSet>
+        <bpmn:outputSet>
+          <bpmn:dataOutputRefs>published</bpmn:dataOutputRefs>
+        </bpmn:outputSet>
+      </bpmn:ioSpecification>
    </bpmn:serviceTask>

    <bpmn:endEvent id="End" name="Semantic analysis complete"/>
--- a/manifest.json
+++ b/manifest.json
@@ -2,9 +2,9 @@
  "kind": "uapf.package",
  "id": "dev.uapf.semantic-document-analysis",
  "name": "Semantic Document Analysis",
-  "description": "Level-4 UAPF process for semantic analysis of free-text documents.\n\nThree BPMN service tasks invoke the UAPF-IP capabilities ai.redact@1,\nai.extract@1 and event.emit@1. Three DMN decision tables encode the\ndeterministic algorithm the host previously hid inside application\ncode: assess-personal-data-risk maps PII regex signals to a risk\nlevel; gdpr-processing-route selects CENTRAL vs LOCAL processing,\nanonymisation and redaction level; human-validation-gate applies the\nconfidence thresholds that decide REJECTED / PENDING_REVIEW /\nAPPROVED_AUTO.\n\nOnly the semantic extraction is a model step. Risk classification,\nGDPR routing and the validation gate are explicit ranked rules in\nversioned DMN \u2014 inspectable, auditable, portable. Extraction output\nvalidates against the VDVC v1.1 semantic-summary JSON Schema.\n\nv3.0.0: the three opaque host capabilities (ai.redact@1,\nai.extract@1, event.emit@1) are now governed by Algorithm Cards\nin algorithms/ per UAPF v2.3.0 chapter 13. Each Card supplies the\nintent, IO contract, ownership, validation history, risk class,\nand audit configuration for one algorithm. Cards are referenced\nfrom resource targets in resources/mappings.yaml.\n",
+  "description": "Level-4 UAPF process for semantic analysis of free-text documents.\n\nThree BPMN service tasks invoke the UAPF-IP capabilities ai.redact@1,\nai.extract@1 and event.emit@1. Three DMN decision tables encode the\ndeterministic algorithm the host previously hid inside application\ncode: assess-personal-data-risk maps PII regex signals to a risk\nlevel; gdpr-processing-route selects CENTRAL vs LOCAL processing,\nanonymisation and redaction level; human-validation-gate applies the\nconfidence thresholds that decide REJECTED / PENDING_REVIEW /\nAPPROVED_AUTO.\n\nOnly the semantic extraction is a model step. Risk classification,\nGDPR routing and the validation gate are explicit ranked rules in\nversioned DMN \u2014 inspectable, auditable, portable. Extraction output\nvalidates against the VDVC v1.1 semantic-summary JSON Schema.\n\nv3.1.0: aligned with UAPF v2.4.0 \u2014 Algorithm Card references move\nfrom resource targets to the BPMN service tasks themselves (via\nuapf24:algorithmCardRef attribute). Each card's io block is also\ndenormalised into a <bpmn:ioSpecification> on the task so inputs\nand outputs render as visible data objects on the diagram. The\ncards themselves and the DMN decisions are unchanged from v3.0.0.\n",
  "level": 4,
-  "version": "3.0.0",
+  "version": "3.1.0",
  "requires_capabilities": [
    "ai.redact@1+",
    "ai.extract@1+",
--- a/resources/mappings.yaml
+++ b/resources/mappings.yaml
@@ -2,10 +2,11 @@ kind: uapf.resources.mapping

 # Host-readable contract for the capability-backed service tasks.
 #
-# v3.0.0 change: the single agent.semantic-extractor target has been
-# split into three algorithm-specific targets, each referencing an
-# Algorithm Card under algorithms/ (UAPF v2.3.0, chapter 13). The
-# binding shape is unchanged. The BPMN file is unchanged.
+# v3.1.0 change: the algorithm_card reference (added in v3.0.0 on each
+# target) has been removed per UAPF v2.4.0 — the Algorithm Card reference
+# now lives on the BPMN serviceTask itself via the
+# uapf24:algorithmCardRef attribute (see bpmn/semantic-document-analysis.bpmn).
+# Targets here keep their role as dispatch endpoints only.
 #
 # The three DMN decisions (assess-personal-data-risk,
 # gdpr-processing-route, human-validation-gate) remain self-describing
@@ -21,7 +22,6 @@ targets:
      pii_redactor Algorithm Card.
    capabilities:
      - capability.ai.redact
-    algorithm_card: algo.semantic_document_analysis.pii_redactor

  - id: agent.vdvc_semantic_extractor
    type: ai_agent
@@ -33,7 +33,6 @@ targets:
      enforced downstream by the human-validation-gate DMN.
    capabilities:
      - capability.ai.extract
-    algorithm_card: algo.semantic_document_analysis.vdvc_semantic_extractor

  - id: agent.completion_event_emitter
    type: ai_agent
@@ -43,7 +42,6 @@ targets:
      completion_event_emitter Algorithm Card.
    capabilities:
      - capability.event.emit
-    algorithm_card: algo.semantic_document_analysis.completion_event_emitter

 bindings:
  - source: { type: bpmn.serviceTask, ref: Task_DetectRedactPii }
--- a/uapf.yaml
+++ b/uapf.yaml
@@ -18,15 +18,15 @@ description: |
  versioned DMN — inspectable, auditable, portable. Extraction output
  validates against the VDVC v1.1 semantic-summary JSON Schema.

-  v3.0.0: the three opaque host capabilities (ai.redact@1,
-  ai.extract@1, event.emit@1) are now governed by Algorithm Cards
-  in algorithms/ per UAPF v2.3.0 chapter 13. Each Card supplies the
-  intent, IO contract, ownership, validation history, risk class,
-  and audit configuration for one algorithm. Cards are referenced
-  from resource targets in resources/mappings.yaml.
+  v3.1.0: aligned with UAPF v2.4.0 — Algorithm Card references move
+  from resource targets to the BPMN service tasks themselves (via
+  uapf24:algorithmCardRef attribute). Each card's io block is also
+  denormalised into a <bpmn:ioSpecification> on the task so inputs
+  and outputs render as visible data objects on the diagram. The
+  cards themselves and the DMN decisions are unchanged from v3.0.0.

 level: 4
-version: "3.0.0"
+version: "3.1.0"

 # ── UAPF-IP integration (capability needs + profile + guardrails) ──
 requires_capabilities: