[BUG] Workload group (Query Group) creation fails when request hits data node due to updated_at validation against CM clock

### Describe the bug

**Summary**
Creating a WLM workload group (`PUT /_wlm/workload_group`) intermittently fails with `WorkloadGroup.updatedAtInMillis is not a valid epoch` when the REST request is served by a data node. The error occurs more often when traffic is routed to data nodes and does not occur when the request is sent directly to the cluster-manager node.

**Environment**
* OpenSearch: 3.4.0
* Feature: Workload Management (WLM) workload groups
* TLS enabled
* Cluster with separate data and cluster-manager nodes
* NTP appears healthy (sub‑50ms jitter, no obvious clock skew in node stats)

**Current Behavior** 
Request sometimes returns:

```json
{
  "error": {
    "root_cause": [
      { "type": "illegal_argument_exception", "reason": "WorkloadGroup.updatedAtInMillis is not a valid epoch" }
    ],
    "type": "illegal_argument_exception",
    "reason": "WorkloadGroup.updatedAtInMillis is not a valid epoch"
  },
  "status": 400
}
```

It happens more often when the REST request is served by a data node; sending the same request directly to the cluster-manager node avoids the error.

### Related component

Plugins

### To Reproduce

1. Check the node timestamps:

```
GET _nodes/stats?filter_path=nodes.*.name,nodes.*.timestamp

{
  "nodes": {
    "RIYgIe82SJKB6_v0tFhrHw": {
      "timestamp": 1769418930198,
      "name": "cm-0"
    },
    "uRuwSYQYQOCVumlqnvp9AQ": {
      "timestamp": 1769418930193,
      "name": "data-0"
    },
    "lbuwkeheQg6e8CYkUrmhBQ": {
      "timestamp": 1769418930202,
      "name": "cm-1"
    },
    "QnvaRRBlRv6kg2DGoXKmiw": {
      "timestamp": 1769418930215,
      "name": "cm-2"
    },
    "88Hk51bYRailzRngI33wmg": {
      "timestamp": 1769418930211,
      "name": "data-1"
    }
  }
}
```

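2. Send the request directly to the `data-1` node (timestamp 1769418930211), which forwards it to the `cm-0` node (timestamp 1769418930198). In the output above, `data-1` appears to be 13 ms ahead of `cm-0`.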

3. The API call fails when sent to the `data-1` node:
```
curl -XPUT https://localhost:9200/_wlm/workload_group
{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "WorkloadGroup.updatedAtInMillis is not a valid epoch"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "WorkloadGroup.updatedAtInMillis is not a valid epoch"
  },
  "status": 400
}
```
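To illustrate why a few milliseconds of skew could be enough, here is a minimal sketch of the suspected mechanism. It assumes (not confirmed from the source code in this report) that the receiving node stamps `updated_at` with its own clock and that the cluster-manager rejects any timestamp later than its own current time. The function name `is_valid_epoch` is hypothetical; the two timestamps are taken from the `_nodes/stats` output in step 1. Note that those timestamps are sampled when each node builds its stats response, so they only approximate the actual clock difference.

```python
# Timestamps (epoch millis) copied from the _nodes/stats output in step 1.
DATA_1_MS = 1769418930211  # node that receives the REST request
CM_0_MS = 1769418930198    # cluster-manager node that validates the group


def is_valid_epoch(updated_at_ms: int, now_ms: int) -> bool:
    """Hypothetical strict check: accept only 0 <= updated_at <= now.

    This is an assumption about the validation, not the actual
    OpenSearch implementation.
    """
    return 0 <= updated_at_ms <= now_ms


# Apparent skew between the two nodes.
skew_ms = DATA_1_MS - CM_0_MS

# updated_at is stamped on data-1 but checked against cm-0's clock.
accepted = is_valid_epoch(updated_at_ms=DATA_1_MS, now_ms=CM_0_MS)

print(f"data-1 is ahead of cm-0 by {skew_ms} ms; accepted={accepted}")
```

Under these assumptions, any request stamped on a node whose clock is even slightly ahead of the cluster-manager would be rejected, which matches the intermittent behavior described above.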



### Expected behavior

Workload group (query group) creation should not fail because of minor clock skew between the node that handles the REST request and the cluster-manager node.
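One possible way to get this behavior is a validation that tolerates a small amount of skew. The sketch below is only an illustration under assumptions: the function name `is_valid_epoch_with_tolerance` and the 1000 ms default tolerance are hypothetical, and this is not necessarily the fix the maintainers would choose (stamping `updated_at` on the cluster-manager itself would be another option). The timestamps come from the node stats output in the reproduction steps.

```python
def is_valid_epoch_with_tolerance(
    updated_at_ms: int, now_ms: int, tolerance_ms: int = 1000
) -> bool:
    """Hypothetical tolerant check.

    Accepts timestamps up to `tolerance_ms` ahead of the validating
    node's clock, while still rejecting negative values and timestamps
    far in the future.
    """
    return 0 <= updated_at_ms <= now_ms + tolerance_ms


# Timestamps (epoch millis) from the reproduction steps.
data_1_ms = 1769418930211  # node that stamped updated_at
cm_0_ms = 1769418930198    # node that validates it

# 13 ms of skew falls inside the tolerance window.
small_skew_ok = is_valid_epoch_with_tolerance(data_1_ms, cm_0_ms)

# A timestamp one hour ahead is still rejected.
far_future_ok = is_valid_epoch_with_tolerance(cm_0_ms + 3_600_000, cm_0_ms)

print(f"small skew accepted={small_skew_ok}, far future accepted={far_future_ok}")
```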

### Additional Details

**Plugins**
workload management

**Host/Environment (please complete the following information):**
 - OS: Ubuntu
 - Version: 22


