Step 7: register-to-BPMN transcoder tool
Adds tools/register-transcoder — a Python tool that reads a published Valsts Kase accounting-process register (.xlsx/.xlsm) and emits BPMN process skeletons. For a given sub-process it produces one userTask per register step, swimlanes from the RACI columns (placing each step in its Responsible actor's lane), sequence flows reconstructed from the register's own predecessor/successor step references, and synthesised start/end events per entry and exit step. Output is an isExecutable=false skeleton — the deterministic first pass of the transcription pipeline; refinement into a Level 4 executable package is the human/AI-assisted second pass that produced the curated FG3-1/FG3-4/FG3-5 packages. Includes a README and sample-output skeletons emitted from the FG3 register for sub-processes 3.5.2 and 3.5.3.
94
tools/register-transcoder/README.md
Normal file
@@ -0,0 +1,94 @@
# register-transcoder

Transcodes a published Valsts Kase accounting-process register
(`.xlsx` / `.xlsm`) into BPMN process skeletons, as one deterministic step in
the `vk-gramatvediba` transcription pipeline.

The Valsts Kase / VPC *Grāmatvedības uzskaites procesu apraksts* is published
as a set of function-group spreadsheets (FG1–FG6). Each row of a register is a
process step with explicit predecessor and successor step references, a RACI
split across the responsible actors, the IT system used, an SLA, and the data
the step produces. That structure is already a process graph; this tool reads
it and emits the corresponding BPMN.

## What it produces

For a given sub-process the tool emits one `.bpmn` file containing a single
`bpmn:process` with `isExecutable="false"`:

- one `bpmn:userTask` per register step, named from the register and carrying
  the step's description, system, SLA, RACI and cross-references in
  `bpmn:documentation`;
- `bpmn:lane`s derived from the RACI columns: a step is placed in the lane of
  its **Responsible** actor (Nodarbinātais / Iestāde / VPC);
- `bpmn:sequenceFlow`s reconstructed from the register's own
  *No procesa darbības soļa* (predecessor) and *Uz procesa darbības soli*
  (successor) columns, restricted to links whose endpoints are both inside the
  emitted sub-process;
- synthesised `bpmn:startEvent` / `bpmn:endEvent` nodes, one per entry step
  (no in-group predecessor) and one per exit step (no in-group successor), so
  the fragment's real boundary is visible rather than hidden.
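The output shape listed above can be sketched with the standard library's `xml.etree.ElementTree`. This is a minimal illustration, not the tool's actual code: the input dict shape (`no`, `name`, `pred`, `succ`), the `Task_` / `Start_` / `End_` / `Flow_` id scheme, the placeholder `targetNamespace`, the sample step numbers, and the choice to take the union of both reference columns are all assumptions. Lanes and `bpmn:documentation` are left out for brevity.

```python
import xml.etree.ElementTree as ET

BPMN_NS = "http://www.omg.org/spec/BPMN/20100524/MODEL"
ET.register_namespace("bpmn", BPMN_NS)


def q(tag):
    """Qualify a tag name with the BPMN model namespace."""
    return f"{{{BPMN_NS}}}{tag}"


def task_id(step_no):
    """XML ids cannot start with a digit, so prefix and replace dots (hypothetical scheme)."""
    return "Task_" + step_no.replace(".", "_")


def emit_skeleton(process_id, steps):
    """Return BPMN XML for one sub-process.

    steps: list of dicts with 'no', 'name', 'pred', 'succ' (assumed shape).
    """
    in_group = {s["no"] for s in steps}
    flows = set()
    for s in steps:
        # Assumption: both columns contribute edges; links leaving the group are dropped.
        flows |= {(task_id(p), task_id(s["no"])) for p in s["pred"] if p in in_group}
        flows |= {(task_id(s["no"]), task_id(n)) for n in s["succ"] if n in in_group}
    has_incoming = {dst for _, dst in flows}
    has_outgoing = {src for src, _ in flows}

    defs = ET.Element(q("definitions"), {"targetNamespace": "http://example.org/skeleton"})
    proc = ET.SubElement(defs, q("process"), {"id": process_id, "isExecutable": "false"})
    for s in steps:
        tid = task_id(s["no"])
        ET.SubElement(proc, q("userTask"), {"id": tid, "name": s["name"]})
        if tid not in has_incoming:  # entry step: synthesise a start event
            start = tid.replace("Task_", "Start_")
            ET.SubElement(proc, q("startEvent"), {"id": start})
            flows.add((start, tid))
        if tid not in has_outgoing:  # exit step: synthesise an end event
            end = tid.replace("Task_", "End_")
            ET.SubElement(proc, q("endEvent"), {"id": end})
            flows.add((tid, end))
    for i, (src, dst) in enumerate(sorted(flows), start=1):
        ET.SubElement(proc, q("sequenceFlow"),
                      {"id": f"Flow_{i}", "sourceRef": src, "targetRef": dst})
    return ET.tostring(defs, encoding="unicode")


# Hypothetical three-step fragment; 3.1.4 and 4.1.1 lie outside the sub-process.
SAMPLE_STEPS = [
    {"no": "3.5.2.1", "name": "Step A", "pred": ["3.1.4"], "succ": ["3.5.2.2"]},
    {"no": "3.5.2.2", "name": "Step B", "pred": ["3.5.2.1"], "succ": ["3.5.2.3"]},
    {"no": "3.5.2.3", "name": "Step C", "pred": ["3.5.2.2"], "succ": ["4.1.1"]},
]
xml_text = emit_skeleton("Process_3_5_2", SAMPLE_STEPS)
```

For this sample the result holds three user tasks, one start event, one end event and four sequence flows; the two cross-group references are not emitted as flows.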

## Register format expected

The parser locates the worksheet and header row by content, not by position,
so it tolerates the leading title rows the registers carry. It expects a
header row containing `Nr.p.k.` and the columns *No procesa darbības soļa*
(predecessor, with the FG group and step number in adjacent cells),
*Process, apakšprocess* (process / sub-process name),
*Atbildības sadalījums (RACI)* (a three-column block for
Nodarbinātais / Iestāde / VPC), *Darbību apraksts* (step description),
*Izmantotā IS* (IT system used), *Izpildes termiņš* (SLA),
*Sagatavotie dati* (data produced) and *Uz procesa darbības soli*
(successor). Rows that carry a number and a name but no description and no
RACI are treated as sub-process headers; rows with a description or any RACI
entry are treated as steps. Steps are grouped under the most recent header.
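A minimal sketch of the rules above (header lookup by content, header/step classification, grouping under the most recent header), operating on plain row tuples such as those yielded by openpyxl's `ws.iter_rows(values_only=True)`. The simplified six-column layout, the function names and the sample rows are assumptions for illustration; the real registers carry more columns.

```python
HEADER_MARKER = "Nr.p.k."


def text(cell):
    """Normalise a cell value to a stripped string ('' for empty cells)."""
    return "" if cell is None else str(cell).strip()


def find_header_row(rows):
    """Index of the first row that contains the header marker, or None."""
    for i, row in enumerate(rows):
        if any(text(c) == HEADER_MARKER for c in row):
            return i
    return None


def classify(row):
    """Return 'step', 'header' or 'skip' for a (number, name, r1, r2, r3, description) row."""
    number, name, *raci, description = (text(c) for c in row)
    if description or any(raci):
        return "step"
    if number and name:
        return "header"
    return "skip"


def group_steps(rows):
    """Group step numbers under the most recent sub-process header."""
    start = find_header_row(rows)
    if start is None:
        raise ValueError("no header row containing 'Nr.p.k.' found")
    groups, current = {}, None
    for row in rows[start + 1:]:
        kind = classify(row)
        if kind == "header":
            current = text(row[0])
            groups[current] = []
        elif kind == "step" and current is not None:
            groups[current].append(text(row[0]))
    return groups


# Hypothetical rows in a simplified layout: two title rows, the header row, then data.
ROWS = [
    ("Grāmatvedības uzskaites procesu apraksts", None, None, None, None, None),
    (None, None, None, None, None, None),
    ("Nr.p.k.", "Process, apakšprocess", "Nodarbinātais", "Iestāde", "VPC", "Darbību apraksts"),
    ("3.5.2", "Saimnieciskie norēķini", None, None, None, None),
    ("3.5.2.1", "Step A", "R", None, "I", "description text"),
    ("3.5.2.2", "Step B", None, "R", None, None),
    ("3.5.3", "Komandējuma norēķini", None, None, None, None),
    ("3.5.3.1", "Step C", None, None, "R", "description text"),
]
groups = group_steps(ROWS)
```

Note that the second step has no description but still counts as a step because one RACI cell is filled, matching the "description or any RACI entry" rule.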

## Usage

```
transcode.py list <register.xlsx>
transcode.py emit <register.xlsx> <subprocess> [-o <output.bpmn>]
```

`list` reports the sub-processes that contain steps, with step counts. `emit`
writes (or, without `-o`, prints) the BPMN skeleton for one sub-process.

```
python3 transcode.py list fg3_process.xlsm
python3 transcode.py emit fg3_process.xlsm 3.5.2 -o 3.5.2.skeleton.bpmn
```

The only dependency is `openpyxl`.
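The `list` / `emit` interface shown in this section maps naturally onto `argparse` sub-commands. The sketch below is one way such a parser could look; the attribute names (`command`, `register`, `subprocess`, `output`) and the help strings are assumptions, not taken from `transcode.py`.

```python
import argparse


def build_parser():
    """Build a parser with 'list' and 'emit' sub-commands (illustrative only)."""
    parser = argparse.ArgumentParser(prog="transcode.py")
    sub = parser.add_subparsers(dest="command", required=True)

    p_list = sub.add_parser("list", help="report sub-processes that contain steps")
    p_list.add_argument("register", help="path to the .xlsx/.xlsm register")

    p_emit = sub.add_parser("emit", help="emit the BPMN skeleton for one sub-process")
    p_emit.add_argument("register", help="path to the .xlsx/.xlsm register")
    p_emit.add_argument("subprocess", help="sub-process number, e.g. 3.5.2")
    # Without -o the skeleton would be printed to standard output.
    p_emit.add_argument("-o", dest="output", default=None, help="output .bpmn file")
    return parser


args = build_parser().parse_args(
    ["emit", "fg3_process.xlsm", "3.5.2", "-o", "3.5.2.skeleton.bpmn"]
)
```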

## Limitations: a skeleton, not an executable

The output is deliberately a faithful mechanical transcription, not a finished
package. It does **not**:

- detect decisions: every step becomes a `userTask`; branching points are not
  promoted to `exclusiveGateway`s and no DMN is extracted;
- repair the register: where the register's predecessor and successor columns
  disagree, the skeleton reproduces the result as-is (this can surface as a
  reciprocal edge / short cycle, or as a step that reaches the rest of its
  sub-process only through a cross-FG excursion);
- carry BPMN diagram interchange (`bpmndi`): the output is a logical model,
  laid out by an editor on import;
- emit a UAPF package: there is no `uapf.yaml`, no resources and no metadata.
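Because the register is not repaired, it can help to check where the two reference columns disagree before reading a skeleton. The following is a hypothetical review aid, not part of the tool: it assumes steps are held as a mapping from step number to `pred` / `succ` lists, and it reports links declared by only one column plus reciprocal pairs that would surface as short cycles. The step numbers are invented.

```python
def column_disagreements(steps):
    """Compare in-group links implied by the successor and predecessor columns.

    steps: mapping of step number -> {"pred": [...], "succ": [...]} (assumed shape).
    Returns (one_sided, reciprocal) as sorted lists of (source, target) pairs.
    """
    in_group = set(steps)
    from_succ = {(s, n) for s, d in steps.items() for n in d["succ"] if n in in_group}
    from_pred = {(p, s) for s, d in steps.items() for p in d["pred"] if p in in_group}
    # A link declared by only one of the two columns is a disagreement.
    one_sided = sorted(from_succ ^ from_pred)
    # A pair linked in both directions shows up as a reciprocal edge in the skeleton.
    union = from_succ | from_pred
    reciprocal = sorted((a, b) for a, b in union if a < b and (b, a) in union)
    return one_sided, reciprocal


# Hypothetical data: step 3.5.2.3 names 3.5.2.2 as its successor, but 3.5.2.2
# does not list 3.5.2.3 as a predecessor.
STEPS = {
    "3.5.2.1": {"pred": [], "succ": ["3.5.2.2"]},
    "3.5.2.2": {"pred": ["3.5.2.1"], "succ": ["3.5.2.3"]},
    "3.5.2.3": {"pred": ["3.5.2.2"], "succ": ["3.5.2.2"]},
}
one_sided, reciprocal = column_disagreements(STEPS)
```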

The RACI-to-lane rule is a heuristic: the lane is the first actor whose RACI
cell contains `R`. The full RACI is preserved verbatim in each task's
documentation so the heuristic can be checked and corrected.
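The lane rule stated above fits in a few lines. This is a sketch, assuming the three RACI cells arrive in the column order Nodarbinātais / Iestāde / VPC and that a step with no `R` anywhere gets no lane (the README does not say what the tool does in that case); `lane_for` is a hypothetical name.

```python
ACTORS = ("Nodarbinātais", "Iestāde", "VPC")


def lane_for(raci_cells, actors=ACTORS):
    """Return the first actor whose RACI cell contains 'R', or None if there is none."""
    for actor, cell in zip(actors, raci_cells):
        if cell is not None and "R" in str(cell):
            return actor
    return None


single = lane_for(["I", "R", "A"])      # only Iestāde is Responsible
shared = lane_for(["R", "R/A", None])   # two actors carry R; the first one wins
combined = lane_for([None, "C", "A/R"]) # R appears inside a combined entry
missing = lane_for(["I", None, "C"])    # no R at all
```

When two actors both carry an `R`, only the first column is used for lane placement, which is the case where the verbatim RACI in the task documentation is worth checking.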

## Position in the pipeline

A skeleton is the deterministic first pass. Refining one into a Level 4
executable (introducing explicit gateways, extracting decision logic into
DMN, writing resource roles/agents/mappings and the package manifest) is the
human / AI-assisted second pass. The curated `processes/fg3-1`, `fg3-4` and
`fg3-5` packages are what that second pass yields; `docs/methodology.md`
discusses the transcoder skeleton against the curated executable for the same
sub-process.

`sample-output/` holds skeletons emitted from the FG3 register for
sub-processes 3.5.2 (*Saimnieciskie norēķini*) and 3.5.3 (*Komandējuma
norēķini*), the two that have curated executable counterparts in this
workspace.