This branch contains an extension module for the SIESTA framework that builds on its incremental mining procedure for Declare constraints. The module adds support for branching on both source and target events during constraint discovery.
The primary objective of this module is to enhance standard Declare constraint mining by enabling branched relationships:
- Source Branching: Allowing multiple source events to jointly activate a constraint.
- Target Branching: Allowing multiple target events to be linked under a single activation event.
This is achieved by building on top of SIESTA’s original incremental mining strategy, maintaining scalability while supporting more expressive process models.
- Implementation of AND, OR, and XOR branching policies for both sources and targets.
- Compatible with batch event log processing.
- Optimized to maintain the efficiency of the incremental mining procedure, suitable for big data scenarios.
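
To make the three policies concrete, here is a minimal Python sketch of how a target-branched Response constraint could be checked on a single trace. It assumes a common reading of the policies (AND: every target follows the activation, OR: at least one does, XOR: exactly one does). The function name `branched_response_holds` and these semantics are illustrative assumptions, not the module's actual implementation.

```python
from typing import Iterable, Sequence


def branched_response_holds(
    trace: Sequence[str],
    source: str,
    targets: Iterable[str],
    policy: str,
) -> bool:
    """Check a target-branched Response constraint on one trace.

    Assumed semantics (hypothetical, for illustration only):
      AND -> every target activity occurs after each activation
      OR  -> at least one target activity occurs after each activation
      XOR -> exactly one distinct target activity occurs after each activation
    """
    target_set = set(targets)
    for position, event in enumerate(trace):
        if event != source:
            continue
        # Distinct target activities observed after this activation.
        seen = target_set.intersection(trace[position + 1:])
        if policy == "AND":
            satisfied = seen == target_set
        elif policy == "OR":
            satisfied = len(seen) >= 1
        elif policy == "XOR":
            satisfied = len(seen) == 1
        else:
            raise ValueError(f"Unknown branching policy: {policy!r}")
        if not satisfied:
            return False
    # Vacuously satisfied when the source activity never occurs.
    return True
```

Under these assumptions, the trace `["a", "b"]` with source `a` and targets `{b, c}` satisfies the OR and XOR variants but not the AND variant.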
- From docker-compose.yml, run the following S3 service:

      docker compose up minio

- Preprocess a new log file (if not already done) saved locally in input/:

      docker compose up preprocess

      curl -X 'POST' 'http://localhost:8000/preprocess' \
        -H 'Content-Type: application/json' \
        -d '{
          "spark_master": "local[*]",
          "file": "test.xes",
          "logname": "test"
        }'

- Configure the CBDeclare service by modifying it in the docker-compose.yml file:
  - Replace test in the command section with the logname of your log file.
  - Optionally, add any arguments in the command section as needed, separated by commas. See the argument list below for all available arguments.
  - If your RAM is less than 10g, consider changing the entrypoint driver memory configuration as well.

  E.g.

      command: ["-l", "test", "--support", "0.5", "-p", "AND"]

- Run the CBDeclare service:

      docker compose up cbdeclare

  Note: if network issues arise, consider uncommenting the external: true configuration in the yml file.
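
For scripted setups, the preprocessing request from the steps above can also be built in Python. The sketch below mirrors the curl call using only the standard library; the helper name `build_preprocess_request` and the `base_url` parameter are hypothetical, and the endpoint and payload fields are copied from the curl example rather than from any API specification.

```python
import json
import urllib.request


def build_preprocess_request(
    file: str,
    logname: str,
    spark_master: str = "local[*]",
    base_url: str = "http://localhost:8000",
) -> urllib.request.Request:
    """Build the same POST request as the curl example (nothing is sent yet)."""
    payload = {
        "spark_master": spark_master,
        "file": file,        # log file located in input/
        "logname": logname,  # name later passed to CBDeclare via -l
    }
    return urllib.request.Request(
        url=f"{base_url}/preprocess",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# To actually send it (requires the preprocess service to be running):
#   request = build_preprocess_request("test.xes", "test")
#   with urllib.request.urlopen(request) as response:
#       print(response.status, response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it possible to inspect the payload without a running service.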
- -l, --logname <logname>
  Type: String
  Required: Yes
  Description: Name of the log file. This refers to the corresponding log stored in S3.

- -s, --support <value>
  Type: Double
  Default: 0
  Description: Minimum support threshold for constraint discovery. Only constraints with support greater than this value will be considered. If set to 0, all constraints are considered.

- -p, --branchingPolicy <AND|OR|XOR>
  Type: String
  Default: null
  Description: Branching policy to apply during constraint discovery.

- -t, --branchingType <SOURCE|TARGET>
  Type: String
  Default: "TARGET"
  Description: Specifies the direction of the branching. Use TARGET to group multiple targets under a single source.

- -b, --branchingBound <value>
  Type: Int
  Default: 0
  Description: Maximum number of activities allowed in a branched group. If 0, no branching bound is enforced.

- -d, --dropFactor <value>
  Type: Double
  Default: 1.5
  Description: Drop factor used to reduce candidate branching sets. A lower value increases the likelihood of early stopping during greedy expansion.

- -r, --filterRare <true|false>
  Type: Boolean
  Default: false
  Description: If set to true, filters out events that are considered rare (e.g., based on frequency thresholds).

- -u, --filterUnderBound <true|false>
  Type: Boolean
  Default: false
  Description: If enabled, removes templates that do not satisfy the required branching bound.

- -h, --hardRediscover <true|false>
  Type: Boolean
  Default: false
  Description: Forces a full rediscovery from scratch, ignoring any previously cached intermediate results.
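
As a convenience, the value of the command entry in docker-compose.yml can be generated from a few options instead of being written by hand. The following is a minimal sketch, assuming only the flags and allowed values documented above; the helper `build_command` is hypothetical and not part of the module.

```python
import json
from typing import List, Optional

# Allowed values as documented in the argument list above.
ALLOWED_POLICIES = {"AND", "OR", "XOR"}
ALLOWED_TYPES = {"SOURCE", "TARGET"}


def build_command(
    logname: str,
    support: Optional[float] = None,
    policy: Optional[str] = None,
    branching_type: Optional[str] = None,
    bound: Optional[int] = None,
) -> List[str]:
    """Assemble the argument list for the CBDeclare `command` entry."""
    if not logname:
        raise ValueError("logname is required")
    command = ["-l", logname]
    if support is not None:
        command += ["--support", str(support)]
    if policy is not None:
        if policy not in ALLOWED_POLICIES:
            raise ValueError(f"policy must be one of {sorted(ALLOWED_POLICIES)}")
        command += ["-p", policy]
    if branching_type is not None:
        if branching_type not in ALLOWED_TYPES:
            raise ValueError(f"branching type must be one of {sorted(ALLOWED_TYPES)}")
        command += ["-t", branching_type]
    if bound is not None:
        command += ["-b", str(bound)]
    return command


# json.dumps yields the bracketed, comma-separated form used in the compose file.
print(json.dumps(build_command("test", support=0.5, policy="AND")))
```

Options left as `None` are omitted, so the service falls back to the defaults listed above.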