Case Switching is a dynamic optimization and is only applied when it is compatible with the optimization goal.
Case expressions are normally compiled to a sequence of conditional jumps: for each when branch the entry condition(s) of that clause is evaluated; when it is false, the control is transferred to the next when branch, and eventually to the else branch or end of the expression. This means the case expression evaluates—on average—half of all existing conditions, assuming even distribution of the case expression input values. (If some input values of the case expressions are more frequent, it is possible to achieve better average execution times by placing those values first.)
The Case Switching optimization improves case expressions which branch on integer values of the expression, or on Mindustry content objects.
Preconditions:
The following conditions must be met for a case expression to be processed by this optimization:
- All values used in
whenclauses must be effectively constant. - All values used in
whenclauses must be integers, or must be convertible to integers (see Mindustry content conversion). - Values used in
whenclauses must be unique; when ranges are used, they must not overlap with other ranges or standalone values.
Warning
It is assumed that a case expression branching exclusively on integer values always gets an integer value on input as well. Similarly, a case expression branching exclusively on a Mindustry content of a given type is expected to always get a Mindustry content of the same type on input. If an unexpected input value may appear on input (e.g., a non-integer, or a Mindustry content of a different type), this optimization will produce incorrect results. At this moment Mindcode isn't able to recognize such a situation; if this is the case, you need to disable the Case Switching optimization manually (possibly for just the problematic case expression using #setlocal).
Depending on the target version and other compiler options, three different optimization techniques are available:
All optimization techniques fully support null values in the when clauses. When the null value is explicitly handled (i.e., there is a when null branch present), the corresponding branch is executed for null input values. In case the when null branch is missing, the behavior depends on the type of the case expression:
- Integers:
nullinput values are handled by the0branch, or skipped altogether if there is no0branch. - Mindustry content:
nullinput values are handled by theelsebranch, or skipped altogether if there is no else branch.
This behavior is identical to the behavior of an unoptimized case expression.
Mindcode arranges the code to only perform checks distinguishing between null and the zero value where both of these values can occur. When a code path is known to not possibly handle both null and 0, these checks are eliminated. As a result, an optimized case expressions checking for null in when branches is typically more efficient than letting the null value go into the else branch and checking it explicitly there, or checking for them prior to the case expression itself.
When all when branches in the case expression contain built-in constants representing Mindustry content of the same type (items, liquids, unit types, or block types) and the optimization level is set to advanced, this optimization converts these built-in constants to logic IDs, adds an instruction to convert the input value to a logic ID (using the sensor instruction with the @id property) and attempts to build a jump table over the resulting numeric values.
The following preconditions need to be met to apply content conversion:
- The optimization level is set to
advanced. - The
builtin-evaluationoption is set tocompatibleorfull. - All values in
whenbranches are eithernull, or built-in variables referencing Mindustry content of the same type (items, blocks, and so on). - Values used in
whenclauses are unique. - The logic ID is known to Mindcode for all
whenvalues. - All logic IDs corresponding to the
whenvalues are stable, orbuiltin-evaluationmode is set tofull.
When all possible input values in case expression are known to be handled by one of the when branches, it is not necessary to handle other values, which may save some instructions. Mindcode is generally incapable of determining this is the case and keeps these jumps in place by default. By setting the unsafe-case-optimization compiler directive to true, you inform Mindcode that all input values are handled by case expressions. This prevents the out-of-range handling instructions from being generated, making the optimized case expression faster by two instructions per execution, and leads to the optimization being considered for case expressions with four branches or more.
Putting an else branch into a case expression indicates not all input values are handled, and doing so disables the unsafe case optimization: the basic optimization may still happen, but the out-of-range checks will remain.
If you activate the unsafe-case-optimization option, and an unhandled input value is encountered, the behavior of the generated code is undefined.
Note
When the input value originates in the game (e.g., item selected in a sorter), keep in mind the value obtained this way might be null, and do include the when null branch in the case expression.
The range check is also partially or fully removed when the following conditions are met:
- There is a
whenbranch corresponding to the Mindustry content with a zero ID: in this case, Mindcode knows when the minimum possible numerical value of the ID (that is, zero) is handled by the case expression and may omit the check for values less than zero. - There is a
whenbranch corresponding to the Mindustry content with a maximum ID, andbuiltin-evaluationis set tofull: in this case, Mindcode knows when the maximum possible numerical value of the ID (that is, zero) is handled by the case expression and may omit the check for values greater than that.
Mindcode may use Jump table padding to be able to remove the range checks in the above two cases.
In Mindustry 8, it is possible to read character values from a string at a given index in a single operation. When the case expression assigns a single new value to a variable in each branch, and the values can be represented as integers in a reasonable range, the entire expression can be encoded as a read from a string containing the encoded values. In the basic case, the entire case expression can be replaced by a single read instruction. In more complex case expressions, additional instructions may be needed to handle some more specific cases.
Assuming the case expression conforms to the requirements described above, the following prerequisites need to be met for this optimization to be applied:
- The target must be set to version
8.1or higher. - The use-text-translations option must be active.
The basic case is:
#set target = 8m;
volatile output = case input
when 0 then 'A';
when 1 then 'B';
when 2 then 'C';
end;
which produces
read .output "ABC" :input
Note that input values other than 0, 1, or 2 result in a null, which is what the original case expression does too.
Further on, we call the input values explicitly handled by the case expression "keys" and the possible values assigned to output "output values."
Assuming the case statement can be implemented using value translation, several factors may complicate things:
- Input different from an integer range starting at 0:
inputis a Mindustry content: additional mapping fom content to logic ID is needed,inputis an integer, andmin(keys)is negative:inputis offset by-min(keys),inputis an integer, andmin(keys)is above zero:- the translation table is padded by
elsevalues up tomin(keys), if possible, or inputis offset by- min(keys).
- the translation table is padded by
- Output values cannot be mapped to characters valid for a string literal: an offset is added to output values which must then be subtracted. (Unfortunately, this subtraction damages null values naturally produced by the
readinstruction, may need to be compensated for later on.) - Keys contain
nulland thenullkey maps to a different value than the zero key (meaning a null value ofinputneeds to be specifically handled):nullmaps to a non-null value:null-handlingselectwill be added after translation.nullmaps to a branch which doesn't assign a new value: aselectis used to restore the original value of the variable.
- Some output values need special handling:
- some keys produce
null: the null-producing keys are mapped to an unused output value (a null placeholder), and aselectconverts it into anull. - some keys map to branches which don't assign a new value: the corresponding key is mapped to an unused output value (a void placeholder), and a
selectis later used to convert it back to the original variable.
- some keys produce
- Handling of values outside the mapping string (these values are always handed by the
elsebranch):- the
elsebranch doesn't outputnull, or it does outputnull, but this value is damaged by subtracting offset: aselectinstruction replaces the originalnullvalue produced by translation to the actual output value, - the
elsebranch doesn't change the variable value: aselectinstruction handles thenullproduced by translation to restore the original variable.
- the
- When the minimal output value in the translation map is greater than
1(either naturally or because of the applied offset), it is possible to map theelsevalues within the map to a value lower than all other output values. Theelsebranch then produces values lower than the minimal output value (asnullequals to0in non-strict comparisons), and just a singleselectinstruction is needed to convert them to the actual output values. Two possible cases are handled this way:- The else branch produces
null(case 6.1), and an additional branch producesnullor there is a position in the translation table mapping the key to the else branch (case 4.2): thenullvalue is represented by a placeholder lower than all other values in the map, and case 4.2 and 6.1 are replaced by a single instruction. - The else branch doesn't modify the output variable (case 6.2), and there are positions in the map that also do not modify the output variable (case 5): the original variable value is represented by a placeholder lower than all other values in the map, and case 5 and 6.2 are replaced by a single instruction.
- The else branch produces
- Output values are a Mindustry content: additional mapping from logic ID to content is added.
As has been mentioned in case 1.iii.a, the translation table can be padded on the lower end to avoid the need to offset the input value. Similarly, when the Mindustry content conversion is applied and the builtin-evaluation option is set to full, the translation table may be padded on the high end up to the largest ID of the respective Mindustry content, possibly avoiding the need to handle case 6.
Instructions corresponding to the above cases (in the order in which they're generated):
| Case | Instruction | Note |
|---|---|---|
| 1 | sensor tmp input @id or op sub tmp input minKeys |
|
| - | read origOutput "translation" input |
|
| 2 | op sub output origOutput offset |
|
| 3.1 | select output strictEqual input null nullOutput prevOutput |
|
| 4.1 | select output equal origOutput nullPlaceholder null prevOutput |
|
| 5.1 | select output strictEqual origOutput null elseValue prevOutput |
|
| 6.1 | select output lessThanEq origOutput nullPlaceholder elseValue prevOutput |
Combines 4.1 and 5.1 |
| 7 | lookup <contentType> output prevOutput |
|
| 3.2 | select output strictEqual input null initialOutput prevOutput |
|
| 4.2 | select output equal origOutput voidPlaceholder initialOutput prevOutput |
|
| 5.2 | select output strictEqual origOutput null initialOutput prevOutput |
|
| 6.2 | select output lessThanEq origOutput voidPlaceholder initialOutput prevOutput |
Combines 4.2 and 5.2 |
Description of the variable names used in the table above:
initialOutput: the original value of the output variable,prevOutput: holds the output value from the previous step,origOutputoroutputis the output from the current step.
The last output value produced by the code is the resulting value of the translation. Instructions which restore the initial value of the output variable are placed after the lookup instruction to avoid the need to convert the input value to logic id first.
When unsafe-case-optimization is set to true, the handling of case 6 and sometimes case 4 may be omitted, as it is supposed not to occur.
The most complex value translations consist of up to eight instructions. When the optimization goal is speed, another solution will probably perform better than the more complex cases. When optimizing for size, though, even the largest value translations may be smaller than any of the alternatives:
#set target = 8m;
#set goal = size;
volatile var value;
case value
when @copper then value = @silicon;
when null then value = @lead; // causes case 2
when @coal then value = @copper; // causes case 3.1
when @lead then value = null; // causes case 4.1
when @silicon then null; // causes case 4.2
else value = @scrap; // causes case 5.1
end;
produces
sensor *tmp2 .value @id
read *tmp3 "9:8880888;" *tmp2
op sub *tmp4 *tmp3 48
select *tmp5 strictEqual *tmp2 null 1 *tmp4
select *tmp6 equal *tmp3 58 null *tmp5
select *tmp7 strictEqual *tmp3 null 8 *tmp6
lookup item *tmp8 *tmp7
select .value equal *tmp3 59 .value *tmp8
Multiple value translations
When a single case expression assigns new values to several distinct variables, the optimization may be applied to each of them separately. Handling of the input value (e.g., translating it to logic id or applying an offset) is performed just once and shared by all translated variables. Example:
#set target = 8m;
#set goal = size;
a = '0';
while true do
case a
when '0' then a = '1'; b = 'a';
when '1' then a = '2'; b = 'b';
when '2' then a = '0'; b = 'c';
end;
printchar(a);
printchar(b);
end;
compiles to:
set :a 48
read *tmp2 "!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!abc" :a
select :b lessThanEq *tmp2 33 :b *tmp2
read *tmp4 "!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!120" :a
select :a lessThanEq *tmp4 33 :a *tmp4
printchar :a
printchar :b
jump 1 always 0 0
Note
The text translation optimization works best when using readable characters as values produced by the case expression. This can be easily achieved by using character literals Mindcode provides, as demonstrated in the last example.
In Mindustry 8, character values can be read from a string at a given index in a single operation directly into the @counter variable. This allows encoding target instruction addresses into strings, similarly to encoding the actual values in value translations. Furthermore, since assigning @null to @counter is ignored by the processor, values outside the range covered by the string jump table cause the next instruction to be executed. This can be leveraged by placing the else branch just after the read instruction, ensuring that the control is transferred to the proper branch by just this one instruction.
Additionally, the branch handling the highest number of input values is moved to the end of the case expression. As all branches except the last one need to execute additional jump to the end of the case expression; this ensures that the most frequented branch will use this benefit.
The following prerequisites need to be met for this optimization to be applied:
- The target must be set to version
8or higher. - The symbolic labels option must be inactive.
- The null-counter-is-noop option must be active.
- The use-text-jump-tables option must be active.
Since the out-of-range values are fully handled by fast dispatch, the case-optimization-strength compiler directive is ignored when this optimization is performed. Even unspecified unhandled values are correctly handled by the single-jump dispatch.
This optimization typically only gets performed when neither value translation nor fast dispatch is possible, and in some rare cases where value translation produces less optimal (typically slower) code according to the optimization goal. This optimization is available in all targets.
A jump table replaces the sequence of conditional jumps by direct jumps to the corresponding when branch. The actual instructions used to build a jump table are
jump <else branch address> lessThan value minimal_when_value
jump <else branch address> greaterThan value maximal_when_value
op add @counter <start_of_jump_table - minimal_when_value> value
start_of_jump_table:
jump <when branch for minimal when value address>
jump <when branch for minimal when value + 1 address>
jump <when branch for minimal when value + 2 address>
...
jump <when branch for maximal when value address>
The jump table is put in front of the when branches. Original conditions in front of each processed when branch are removed. Each when branch jumps to the end of the case expression as usual. The bodies of when branches are moved into correct places inside the case expression when possible, to avoid unnecessary jumps. On experimental level, the bodies of when branches may be duplicated to several suitable places to avoid even more jumps at the cost of additional code size increase. This optimization usually only kicks in for small branch bodies, since for larger code increases, a better performing solution can be achieved by a different segment arrangement.
To build the jump table, the minimum and maximum value of existing when branches are determined first. Values outside this range are handled by the else branch (if there isn't an explicit else branch in the case statement, the else branch just jumps to the end of the case expression). Values inside this range are mapped to a particular when branch, or, if the value doesn't correspond to any of the when branches, to the else branch.
The first two instructions in the example above (jump lessThan, jump greaterThan) handle the cases where the input value lies outside the range supported by the jump table. The op add @counter instruction then transfers the control to the corresponding specific jump in the jump table and consequently to the proper when branch.
A basic jump table executes at most four instructions on each case expression execution (less if the input value lies outside the supported range). We've mentioned above that the original case statement executes half of the conditional jumps on average. This means that converting the case expression to a jump table only makes sense when there are at least eight conditional jumps in the case expression.
Notes:
- When evaluating execution speed, the optimizer computes and averages execution costs of each value present in a
whenclause. All of these values are deemed equally probable to occur, and values leading to anelsebranch are not considered at all. In an unoptimizedcaseexpression, values handled by theelsebranch take the longest time to handle, while in the optimized case expression, values completely outside the range ofwhenvalues are executed faster than any other values. This is a side effect of the optimization. - As a consequence, if you put the more frequent values first in the case expression, and the value distribution is very skewed, converting the case expression to the jump table might actually worsen the average execution time. Mindcode has no way to figure this on its own; if you encounter this situation, you might need to disable the Case Switching optimization for your program.
- For smaller case expressions, a full jump table might provide worse average performance than the original case expression. Mindcode might still optimize the case expression by applying the bisection search used in Jump table compression, providing both better average execution time of the entire case expression and more balanced execution time of individual branches.
Building a single jump table for the entire case expression often produces the fastest code, but the jump table might become huge. The optimizer therefore tries to break the table into smaller segments, handling these segments specifically. The resulting code uses a bisection search to locate the segment handling a particular input value. Some segments might contain a single value, or a single value with a few exceptions, and can be handled by only a few jump instructions. More diverse segments may be encoded as separate, smaller jump tables. The optimizer considers a number of such arrangements and selects the best one according to the current optimization goal.
The total number of possible segment arrangements can be quite large. The more arrangements are considered, the better code may be generated. However, generating and evaluating these arrangements can take a long time. The case-optimization-strength compiler directive can be used to control the number of considered arrangements. Setting this option to 0 disables jump table compression entirely.
Typically, compressing the jump table produces smaller, but slightly slower code. For more complex case expressions, it is possible that the optimized code will be both smaller and significantly faster than the unoptimized case expression.
Jump table compression is, for example, particularly useful when using blocks in case expressions: given the large dispersion of block type IDs, full jump tables tend to get quite large.
Notes:
- Jump table compression is not performed when range checks for the given case expression are eliminated via the
unsafe-case-optimizationoption, or when thecase-optimization-strengthcompiler directive option has been set to 0. - The optimizer will sometimes choose a smaller implementation version over a larger one, even when the optimization goal is speed and there is plenty of instruction space. This happens when the smaller implementation is actually faster than the larger one, considering the performance of the values ending up at the
elsebranch. - Since the bisection search provides better execution time than a linear search, it may be applied even to case expressions too small for a full jump table optimization.
When the jump table starts at zero value, it is possible to generate a faster code due to these effects:
- When the Mindustry content conversion is applied, the optimizer knows the logic IDs cannot be less than zero. A jump instruction handling values smaller than the start of the jump table can therefore be omitted.
- When the
symbolic-labelsdirective is set totrue, an additional operation handling the non-zero offset can be omitted.
Similarly, when the Mindustry content conversion is applied, the builtin-evaluation option is set to full and the jump table ends at the largest ID of the respective Mindustry content, a jump instruction handling values larger than the end of the table can be omitted, as the optimizer knows no larger values may occur.
When the jump table doesn't start or end at these values naturally, Mindcode may pad the table at either end with additional jumps to the else branch. The optimizer considers the possibility of padding the table at the low end, high end, or both, and chooses the option that gives the best performance given the instruction space limit.
The ability to read character values from a string at a given index into the @counter variable, as described in Fast dispatch, can be also used to replace the actual jumps in a jump table with the read instruction. In most cases, when this is possible, fast dispatch will be used instead, but if that optimization is not possible for some reason, text-based jump tables may be used in this optimization too. The following prerequisites need to be met for text-based jump tables to be used:
- The target must be set to version
8or higher. - The symbolic labels option must be inactive.
- The use-text-jump-tables option must be active.
The text-based jump table may be also used for individual segments produced during jump table compression.
The example illustrates the following optimization aspects:
- Case switching optimization in general
- Mindustry content conversion
- Handling of
nullvalues - Jump table compression
- Jump table padding
- Moving bodies of
whenbranches
The sample has been artificially constructed to demonstrate the above effects.
#set target = 7m;
#set builtin-evaluation = full;
#set symbolic-labels = true;
#set instruction-limit = 150;
#set case-optimization-strength = 4;
text = case getlink(0).@type
when null then "none";
when @kiln, @phase-weaver, @pyratite-mixer, @melter, @disassembler then "A";
when @plastanium-compressor, @cryofluid-mixer, @blast-mixer, @separator, @spore-press then "B";
when @unit-repair-tower, @prime-refabricator, @mech-refabricator, @slag-heater, @scathe then "C";
when @diffuse, @tank-refabricator, @ship-refabricator, @lustre then "D";
else "E";
end;
print(text);
The above case expression is transformed to this:
# Mlog code compiled with support for symbolic labels
# You can safely add/remove instructions, in most parts of the program
# Pay closer attention to sections of the program manipulating @counter
getlink *tmp1 0
sensor *tmp2 *tmp1 @type
sensor *tmp4 *tmp2 @id
jump label_21 greaterThanEq *tmp4 230
jump label_45 greaterThanEq *tmp4 14
op add @counter @counter *tmp4
jump label_44 always 0 0
jump label_45 always 0 0
jump label_45 always 0 0
jump label_45 always 0 0
jump label_42 always 0 0
jump label_19 always 0 0
jump label_42 always 0 0
jump label_19 always 0 0
jump label_42 always 0 0
jump label_19 always 0 0
jump label_42 always 0 0
jump label_19 always 0 0
jump label_42 always 0 0
label_19:
set *tmp0 "B"
jump label_46 always 0 0
label_21:
jump label_45 greaterThanEq *tmp4 243
jump label_26 greaterThanEq *tmp4 232
jump label_38 lessThan *tmp4 231
label_24:
set *tmp0 "D"
jump label_46 always 0 0
label_26:
op sub *tmp5 *tmp4 232
op add @counter @counter *tmp5
jump label_38 always 0 0
jump label_45 always 0 0
jump label_45 always 0 0
jump label_24 always 0 0
jump label_38 always 0 0
jump label_24 always 0 0
jump label_38 always 0 0
jump label_45 always 0 0
jump label_45 always 0 0
jump label_24 always 0 0
label_38:
set *tmp0 "C"
jump label_46 always 0 0
label_40:
set *tmp0 "none"
jump label_46 always 0 0
label_42:
set *tmp0 "A"
jump label_46 always 0 0
label_44:
jump label_40 strictEqual *tmp4 null
label_45:
set *tmp0 "E"
label_46:
print *tmp0
« Previous: Case Expression Optimization | Up: Code optimization | Next: Condition Optimization »