Skip to content

Add optional :transaction-order argument to rw_register/check (addresses #30)#31

Open
will62794 wants to merge 1 commit intojepsen-io:mainfrom
will62794:txn-order
Open

Add optional :transaction-order argument to rw_register/check (addresses #30)#31
will62794 wants to merge 1 commit intojepsen-io:mainfrom
will62794:txn-order

Conversation

@will62794
Copy link

As discussed in #30, this change allows optional transaction version order information to be passed in to the rw_register test checker, enabling inference of a wider range of dependency edge types (e.g. ww and rw). This version info is passed in as a map from completed transaction event indices to the version for that transaction, which for now we assume is some totally ordered numeric value.

The implementation is straightforward, though I am no Clojure expert so open to comments on making things more idiomatic on that front. Basically, we just (1) augment the given history with version tags based on the transaction map that was passed in (2) sort all writes to each key by their version and then (3) construct the version graph for each key based on this list of sorted writes per key. Happy to add some more thorough test cases if you'd like, but putting this up as a first draft.

Also, for reference, a basic prototype of a system making use of this idea can be seen here, in RocksDB. Essentially, we instrument transaction writes upon commit so that the value written to each key is logged with its associated "sequence number", which should be a unique numeric counter assigned to each transaction, and is thus adequate to serve as the "version" ordering information as discussed here, as it corresponds to visibility/commit ordering.

This change allows optional transaction version order information to be passed
in to the rw_register checker, enabling inference of a wider range of
dependency edge types (e.g. ww/rw) than by default. It is expected to be
used with a database system under test that can provide this type of whitebox
version order information e.g. an explicit total order over committed
transactions, which can then serve as the explicit version order for values
written in the transaction history.
Copy link
Contributor

@aphyr aphyr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hell yeah. Great job on this, Will--Elle is not an easy codebase and I know Clojure isn't your main language, but you did a fantastic job. I've got a handful of minor things that should change before merge, just for performance/clarity, but the general structure here looks great. Also left a bunch of advisory comments which you're free to ignore, but I figure might be helpful for you down the road. :-)

I'm excited to use this!

"Takes a history and a mapping from transaction completion indices to transaction orders
and annotates each history event with this info as a :version field."
[history transaction-order]
(map (fn [op]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahhh, OK, so now I have the unfortunate pleasure of introducing you to https://github.com/jepsen-io/history :-)

Long story short, Jepsen and Elle are trying to work with histories that are larger than RAM. To do that, Jepsen.history provides a lazy datatype that looks like a Clojure Vector, but isn't. Instead, it lazily deserializes and caches chunks of the history from disk. However, using map here returns a built-in Clojure lazy sequence which will cache its entire contents (once realized) in memory.

As I'll suggest later down, it might be better to avoid transforming the history altogether, and just looking things up in the transaction-order index on the fly. There's also h/map, but I think that's probably both slower and longer to write.

versions) written to that key, sorted ascendingly by version."
[history]
(->
(reduce (fn [m op]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given this is only a single pass over the history, I suggest passing transaction-order into key-writes-version-sorted as an argument, and looking it up directly in the reducer fn. That way you won't have to materialize any intermediate data structures.

Just as a note for later, we can parallelize this using Tesser and jepsen.history's built-in support for parallel reductions. Don't worry about that for this PR, just a headsup if you come back and go "what happened here?" :-)

(reduce (fn [m [k v]] (update m k (fn [vs] (conj vs {:value v :version version})))) m writes))))
{}
history)
(update-vals ,,, (fn [vs] (sort-by :version vs)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's up with the ,,, here? I think this gets parsed as whitespace, so you probably don't need it unless you want to signal something explicitly.

(let [txn (:value op)
writes (txn/ext-writes txn)
version (:version op)]
(reduce (fn [m [k v]] (update m k (fn [vs] (conj vs {:value v :version version})))) m writes))))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be misreading this--little tough for me to parse--but the docstring says this function returns a map of keys to vectors of all values. It looks like instead it returns a map of keys vectors like [{:value _, :version _} ...]. Maybe change the docstring to match?

history)
(update-vals ,,, (fn [vs] (sort-by :version vs)))
)
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not critical--I'd be fine merging this--but just as a style note for Clojure, you'll typically put closing parens at the end of the last line of code, rather than on separate lines. :-)

]
;; Now, for each key in the map of keys, reduce over its list of version graph edges,
;; building the version graph for that key.
(update-vals versionEdgeMap (fn [edges] (reduce (fn [vg [ei ej]] (g/link vg (:value ei) (:value ej))) (g/digraph) edges)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also not mandatory, just advisory: This is exactly the right kind of transformation, and you can be proud of this. It does get a little gnarly with the triple-nested fns though! You might be able to write this a little shorter and clearer by having a single call to update-vals, and pulling (reduce ....) into the previous update-vals fn. That lets you get rid of the let binding too. It'll be faster to boot, since you only build the map once, and won't have versions drop out of CPU cache between the two steps. :-)

which should be merged with our own dependency
graph.

:transaction-order A map from completed transaction operation indices in the history
Copy link
Contributor

@aphyr aphyr Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hell yeah, look at these docs. Specification AND context! The only suggestion I have is that you say "completion operation" instead of "completed operation", because an invocation is "completed" if it has a corresponding completion. Just makes it clear we mean specifically completion ops, and that will help folks do the right thing.

to a numeric, total transaction order. This is used to allow external,
'whitebox' dependency order information from a database system to be passed in
for automatic inference of additional dependency edges. Assumes that events
in the given history can be uniquely identified by their :index fields.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And yes, history events always have unique long :index fields for exactly this reason. See jepsen.history for more, if you're curious. :-)

:value' 2,
:a-mop-index 1,
:b-mop-index 1}],
:type :G-single-item}]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice job establishing that the augmented version order not only works, but catches something new. :-)

3 3}}
h)))
))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, yes, yes, very good!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants