The `hot_offset` and `cold_offset` are absolute byte offsets from the start of the FST section.
+Implementations MUST reject FST data where magic bytes do not match or version is unsupported.
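
To illustrate the rejection rule, here is a minimal sketch of a header check. The magic (`BFST`), version (1), and header size (24 bytes) come from the updated Constants table in this diff. The position and width of the version field are not shown in this diff, so the sketch assumes a little-endian `u32` immediately after the magic. The names `validate_fst_header` and `FstHeaderError` are hypothetical.

```rust
/// Values from the updated Constants table.
const FST_MAGIC: &[u8] = b"BFST";
const FST_VERSION: u32 = 1;
const FST_HEADER_SIZE: usize = 24;

#[derive(Debug, PartialEq)]
enum FstHeaderError {
    TooShort,
    BadMagic,
    UnsupportedVersion(u32),
}

/// Rejects data whose magic does not match or whose version is unsupported.
fn validate_fst_header(data: &[u8]) -> Result<(), FstHeaderError> {
    if data.len() < FST_HEADER_SIZE {
        return Err(FstHeaderError::TooShort);
    }
    if &data[0..4] != FST_MAGIC {
        return Err(FstHeaderError::BadMagic);
    }
    // ASSUMPTION: the version is a little-endian u32 at bytes 4..8.
    // The real field layout is defined elsewhere in the spec.
    let version = u32::from_le_bytes([data[4], data[5], data[6], data[7]]);
    if version != FST_VERSION {
        return Err(FstHeaderError::UnsupportedVersion(version));
    }
    Ok(())
}
```

Checking the length first keeps the later slice accesses from panicking on truncated input.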

-### 14.5 Node Index
+### 14.4 Node Index

The Node Index is an array of `node_count` entries, with each entry being 8 bytes.
@@ -799,7 +789,7 @@ The Node Index is an array of `node_count` entries, with each entry being 8 byte
Offsets are relative to the start of their respective sections. Node IDs are indices into this array (node 0 = first entry, node 1 = second entry, etc.).
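
The indexing rule can be sketched as follows. The diff does not show how an 8-byte entry is laid out, so this sketch assumes two little-endian `u32` fields (hot offset, then cold offset). It also assumes that the section-relative offsets are resolved against the header's `hot_offset` and `cold_offset`, modeled here as `u64`. All function and type names are hypothetical.

```rust
/// Size of one Node Index entry, per the spec text.
const NODE_INDEX_ENTRY_SIZE: usize = 8;

#[derive(Debug, Clone, Copy, PartialEq)]
struct NodeIndexEntry {
    hot_rel: u32,  // offset relative to the start of the Hot Section
    cold_rel: u32, // offset relative to the start of the Cold Section
}

/// Reads the entry for `node_id`; node IDs are plain array indices.
/// ASSUMPTION: each entry is two little-endian u32 values.
fn read_index_entry(index: &[u8], node_id: usize) -> Option<NodeIndexEntry> {
    let start = node_id.checked_mul(NODE_INDEX_ENTRY_SIZE)?;
    let end = start.checked_add(NODE_INDEX_ENTRY_SIZE)?;
    let raw = index.get(start..end)?; // None if node_id is out of range
    Some(NodeIndexEntry {
        hot_rel: u32::from_le_bytes([raw[0], raw[1], raw[2], raw[3]]),
        cold_rel: u32::from_le_bytes([raw[4], raw[5], raw[6], raw[7]]),
    })
}

/// Converts section-relative offsets into offsets from the start of the
/// FST section, given the header's `hot_offset` and `cold_offset`.
fn resolve_entry(entry: NodeIndexEntry, hot_offset: u64, cold_offset: u64) -> Option<(u64, u64)> {
    let hot = hot_offset.checked_add(u64::from(entry.hot_rel))?;
    let cold = cold_offset.checked_add(u64::from(entry.cold_rel))?;
    Some((hot, cold))
}
```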

-### 14.6 Hot Section
+### 14.5 Hot Section

The Hot Section contains compact node headers optimized for cache efficiency. Each node's hot data has the following structure:
@@ -830,7 +820,7 @@ The lookup data format depends on the `INDEXED` flag:
Array of `edge_count` entries, each a `u16`. Each offset points to the corresponding edge's data within this node's cold section data block.
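
As a rough sketch, reading one entry of this offset array could look like the following. The byte order is not stated in this diff, so little-endian is assumed, and `edge_data_offset` is a hypothetical name.

```rust
/// Returns the offset of edge `edge` within the node's cold data block.
/// `offsets` is the raw byte view of the `edge_count`-entry u16 array.
/// ASSUMPTION: entries are little-endian.
fn edge_data_offset(offsets: &[u8], edge_count: usize, edge: usize) -> Option<u16> {
    if edge >= edge_count {
        return None;
    }
    let start = edge.checked_mul(2)?;
    let end = start.checked_add(2)?;
    let raw = offsets.get(start..end)?; // None if the array is truncated
    Some(u16::from_le_bytes([raw[0], raw[1]]))
}
```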

-### 14.7 Cold Section
+### 14.6 Cold Section

The Cold Section contains edge data and final output values. Each node's cold data has the following structure:
@@ -858,7 +848,7 @@ If IS_FINAL flag is set:
If the `IS_FINAL` flag is set, the final output value is stored at the end of the node's cold data block. This value is added to the accumulated output when the traversal terminates at this node.

-### 14.8 Path Encoding in FST
+### 14.7 Path Encoding in FST

Paths stored in the FST use the same encoding as BoxPath (Section 9):
@@ -872,7 +862,7 @@ Paths stored in the FST use the same encoding as BoxPath (Section 9):
- Duplicate keys are not permitted
- Values are `u64` record indices (as returned by `RecordIndex.get()`)

-### 14.9 Output Value Accumulation
+### 14.8 Output Value Accumulation

The FST stores output values along edges and at final nodes. The final lookup value is computed by:
@@ -882,7 +872,7 @@ The FST stores output values along edges and at final nodes. The final lookup va
The result is the `RecordIndex` value for the path.
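
The accumulation can be illustrated with a toy in-memory FST. This is only a sketch: the `ToyNode` and `ToyEdge` types are hypothetical stand-ins for the on-disk hot and cold sections, and the exact steps elided from this hunk may differ. It sums the outputs of the edges followed and adds the final output when the traversal ends on a final node.

```rust
/// Hypothetical in-memory stand-ins for the on-disk node format.
struct ToyEdge {
    label: u8,
    output: u64,
    target: usize, // node ID of the destination
}

struct ToyNode {
    edges: Vec<ToyEdge>,
    final_output: Option<u64>, // Some(..) models the IS_FINAL flag
}

/// Walks `key` from node 0, summing edge outputs along the way.
/// Returns None if the key is absent or the sum overflows u64.
fn toy_lookup(nodes: &[ToyNode], key: &[u8]) -> Option<u64> {
    let mut node = nodes.get(0)?;
    let mut acc: u64 = 0;
    for &byte in key {
        let edge = node.edges.iter().find(|e| e.label == byte)?;
        acc = acc.checked_add(edge.output)?;
        node = nodes.get(edge.target)?;
    }
    // The final output is only added if traversal stops on a final node.
    node.final_output.and_then(|f| acc.checked_add(f))
}
```

With keys `ab` and `ac` sharing the edge `a`, the shared edge can carry the common part of both values while the diverging edges carry the remainder.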

-### 14.10 Implementation Notes
+### 14.9 Implementation Notes

**Adaptive Node Format:**
@@ -903,13 +893,12 @@ Implementations MAY use SIMD instructions for compact node lookups:
- x86_64: SSE2 `_mm_cmpeq_epi8` for 16-byte parallel comparison
- aarch64: NEON `vceqq_u8` for similar parallel comparison
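
A sketch of the x86_64 variant is shown below, with a scalar fallback for other targets. It assumes a compact node stores its edge labels in a 16-byte, zero-padded array, which is not confirmed by this diff. The function names are hypothetical. The intrinsics used are from Rust's `std::arch::x86_64`.

```rust
/// Returns the position of `needle` among the first `len` labels, if any.
/// ASSUMPTION: compact nodes keep up to 16 labels in a padded 16-byte array.
fn find_label(labels: &[u8; 16], len: usize, needle: u8) -> Option<usize> {
    // Only the low `len` bits of the mask refer to real labels.
    let valid = (1u32 << len.min(16)) - 1;
    let mask = match_mask(labels, needle) & valid;
    if mask == 0 {
        None
    } else {
        Some(mask.trailing_zeros() as usize)
    }
}

/// Bit i of the result is set when labels[i] == needle.
#[cfg(target_arch = "x86_64")]
fn match_mask(labels: &[u8; 16], needle: u8) -> u32 {
    use std::arch::x86_64::{
        __m128i, _mm_cmpeq_epi8, _mm_loadu_si128, _mm_movemask_epi8, _mm_set1_epi8,
    };
    // SAFETY: SSE2 is part of the x86_64 baseline, `labels` is exactly
    // 16 readable bytes, and the unaligned load has no alignment demand.
    unsafe {
        let hay = _mm_loadu_si128(labels.as_ptr() as *const __m128i);
        let pat = _mm_set1_epi8(needle as i8);
        _mm_movemask_epi8(_mm_cmpeq_epi8(hay, pat)) as u32
    }
}

/// Scalar fallback producing the same mask on other architectures.
#[cfg(not(target_arch = "x86_64"))]
fn match_mask(labels: &[u8; 16], needle: u8) -> u32 {
    let mut mask = 0u32;
    for (i, &b) in labels.iter().enumerate() {
        if b == needle {
            mask |= 1 << i;
        }
    }
    mask
}
```

Masking with `valid` prevents padding bytes from producing false matches, for example when the needle is `0`.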
905
895
906
-
### 14.11 Constants
896
+
### 14.10 Constants
907
897
908
898
| Constant | Value | Description |
|----------|-------|-------------|
| Magic | `BFST` | FST section identifier |
-| Version | 3 | Current format version |
-| Header size | 16 bytes | Fixed header size |
-| Footer size | 16 bytes | Fixed footer size |
+| Version | 1 | Current format version |
+| Header size | 24 bytes | Fixed header size |
| Index entry size | 8 bytes | Per-node index entry |
| Indexed threshold | 17 | Minimum edges for indexed format |