Nimbus http response latency has high variance, slow upper quantiles #7981

@jshufro

Describe the bug

I'm running Nimbus as part of a Vouch cluster. Vouch observes very high variance in Nimbus's HTTP response time across all endpoints. Using /eth/v1/validator/attestation_data as an example, this can be corroborated with a simple curl:

$ time curl "nimbus:5052/eth/v1/validator/attestation_data?slot=2406257&committee_index=0"
{"code":400,"message":"Invalid slot value","stacktraces":["Slot already finalized"]}
real    0m0.015s
user    0m0.004s
sys     0m0.001s
$ time curl "nimbus:5052/eth/v1/validator/attestation_data?slot=2406257&committee_index=0"
{"code":400,"message":"Invalid slot value","stacktraces":["Slot already finalized"]}
real    0m0.382s
user    0m0.003s
sys     0m0.002s

Note that this variance is visible even with a light application workload (i.e., producing the error in the result above should not use any meaningful amount of resources itself), which suggests a more systemic latency issue.
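
To go beyond two curl samples, the same request can be repeated many times and summarized as quantiles. The following is only a rough sketch: the helper names `time_request` and `summarize`, the sample count of 200, and the 5 second timeout are my own choices for illustration, and the URL is simply the one from the curl example above.

```python
import statistics
import time
import urllib.error
import urllib.request


def time_request(url: str, timeout: float = 5.0) -> float:
    """Return wall-clock seconds for one GET, counting HTTP error responses too."""
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
    except urllib.error.HTTPError as err:
        # A 400 such as "Slot already finalized" is still a complete response.
        err.read()
    return time.perf_counter() - start


def summarize(samples: list[float]) -> dict[str, float]:
    """Summarize latency samples (seconds) as min, p50, p90, p99 and max."""
    # n=100 yields 99 cut points; index k-1 is the k-th percentile.
    cuts = statistics.quantiles(samples, n=100, method="inclusive")
    return {
        "min": min(samples),
        "p50": cuts[49],
        "p90": cuts[89],
        "p99": cuts[98],
        "max": max(samples),
    }


if __name__ == "__main__":
    # Host, port, and slot are copied from the curl example; adjust as needed.
    url = ("http://nimbus:5052/eth/v1/validator/attestation_data"
           "?slot=2406257&committee_index=0")
    samples = [time_request(url) for _ in range(200)]  # 200 is arbitrary
    for name, value in summarize(samples).items():
        print(f"{name}: {value * 1000:.1f} ms")
```

Like the curl invocations, each call here opens a fresh connection, so the timings include TCP setup as well as the server's response time.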

This Discord conversation has a bit more context:
https://discord.com/channels/613988663034118151/771033358363918347/1472642064788492441

To Reproduce
Steps to reproduce the behavior:

  1. Platform details (OS, architecture): linux/amd x64
  2. Branch/commit used: Nimbus beacon node v26.2.0-fa7a87-stateofus
  3. Commands being executed:
  4. Relevant log lines:

Screenshots
Response time quantiles on the attestation_data uri:

Image

Compared with lodestar:

Image

Additional context
Observed on Hoodi, on a PeerDAS supernode subscribed to all attestation subnets.
Observed with Besu and Nethermind as well.
