How is binary serialisation done so fast? #1774
Replies: 4 comments 3 replies
-
This is surprising and suspicious, but I was surprised by a performance test last week myself, so I won't immediately rule out something being wrong in the test.
If you could share a simple example of the kind of test you are doing, I could run some benchmarks as well. Make sure you are profiling time and not MB/s, because the BEVE output requires more memory.
-
@stephenberry Ahh, never mind. It took me ages, but I figured out that it was a warm-cache issue. It's really hard to get benchmarking right.
-
@stephenberry Hi. These are the results I got (reflect-cpp uses yyjson):

- Time for reflect-cpp to output minified JSON: 256.851 microseconds
- Time for reflect-cpp to read minified JSON: 124.319 microseconds
- Time to do first memcpy: 2.671 microseconds
- Time to do second memcpy: 2.256 microseconds
- Time to serialise to Glaze minified JSON format: 39.754 microseconds
- Time to deserialise Glaze minified JSON format: 48.574 microseconds
- Time to serialise to BEVE untagged binary format: 10.179 microseconds
- Time to deserialise BEVE untagged binary format: 11.854 microseconds

It should be noted that the yyjson test isn't directly comparable, because the reflect-cpp API returns a new std::string buffer, so there's no way to warm the caches. My binary serialisation method is the same as Glaze's; there's no way to get faster. I noticed that when serialising/deserialising a vector/array whose element type is trivially copyable, you can go four times faster by memcpying the entire array, but outside of that there are no performance improvements. I actually think it's risky, because for a number of reasons the type might not be trivially copyable, or might become non-trivially-copyable at a later stage. Or you've written out little-endian data and the machine is big-endian (though I'm not sure that's even worth worrying about nowadays). I had two other questions.
Then it's the deserialiser's job to destroy any already-existing objects, right? Is there a way to deserialise without default-constructing first? I think there is, but it gets complicated and I haven't seriously looked into it.
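The trivially-copyable fast path described above can be sketched as follows. `write_vector` and `write_element` are hypothetical names of my own, not Glaze's API, and the sketch assumes writer and reader agree on endianness:

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical helper (not Glaze's actual implementation): bulk-copy the
// whole vector when the element type is trivially copyable, otherwise fall
// back to per-element serialisation.
template <class T>
void write_vector(const std::vector<T>& in, std::string& out) {
    if constexpr (std::is_trivially_copyable_v<T>) {
        const std::size_t bytes = in.size() * sizeof(T);
        const std::size_t offset = out.size();
        out.resize(offset + bytes);
        std::memcpy(out.data() + offset, in.data(), bytes);
    } else {
        for (const auto& element : in) {
            (void)element; // write_element(element, out); // per-element path
        }
    }
}
```

Because the `if constexpr` branch is resolved at compile time, a type that later stops being trivially copyable silently falls back to the per-element path rather than producing a wrong bulk copy, which addresses the "becomes non-trivially-copyable at a later stage" risk.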
-
@stephenberry Just another question. If I have a std::vector that already has objects in it and I pass it to glz::read_json as the destination, what does it do with it? It has to destroy each existing object before it deserialises into it. Does it call .clear() to destroy them and then default-construct each element with push_back, or something else?
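I haven't verified what Glaze actually does internally, but a common pattern for reusing a destination vector is to `resize()` it to the incoming element count and then deserialise into each slot in place: `resize()` destroys surplus elements (while keeping the allocated capacity) or value-initialises missing ones, so no `clear()`/`push_back` cycle is needed. A hypothetical sketch, with made-up names:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch of in-place container reuse (not Glaze's actual code).
// resize() destroys surplus elements or value-initialises new ones; the
// surviving elements are simply overwritten, so the allocation is reused.
template <class T, class Reader>
void read_vector(std::vector<T>& out, std::size_t count, Reader read_element) {
    out.resize(count); // shrinking destroys extras but keeps capacity
    for (auto& element : out) {
        read_element(element); // overwrite in place
    }
}
```

The appeal of this pattern is that repeated deserialisation into the same vector stops allocating once its capacity has grown to the steady-state size.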
-
I tried serialising a std::vector<MyType> where MyType is a trivially copyable type, so each element can be serialised with memcpy, or indeed the entire vector can be memcpied at once with sizeof(MyType) * my_vector.size(). I compared glz::write_beve_untagged with a single memcpy of the entire vector, and my timings consistently show that glz::write_beve_untagged is faster, by between 3 and 5 times. This makes absolutely no sense, and I have spent a long time trying to figure out why, including reversing the order of the tests in case the difference came down to the caches being warm, but glz::write_beve_untagged is always faster than a plain memcpy.
Furthermore, Glaze doesn't compress the types, and I'm pretty sure it doesn't memcpy the entire vector itself, but loops through the vector and serialises each element individually. This makes absolutely no sense. Please explain what might be happening.