[awslogsencodingextension] Improve CloudTrail unmarshaling performance #45180

@Kavindu-Dodan

Component(s)

extension/encoding/awslogsencoding

Describe the issue you're reporting

Overview

The current implementation of the CloudTrail unmarshaling logic buffers the input data twice:

  • First, the full payload is read into memory [1]
  • Then, those bytes are unmarshaled into records [2] [3]

This is inefficient and can create unwanted memory pressure.
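
To make the pattern concrete, here is a simplified Go sketch of read-then-unmarshal. It is an illustration only, not the extension's actual code: the `record` and `payload` types, `unmarshalBuffered`, and `exampleInput` are hypothetical names, and the `{"Records": [...]}` envelope is assumed to be the input shape.

```go
package main

import (
	"encoding/json"
	"io"
	"strings"
)

// record is a hypothetical, trimmed-down CloudTrail record (illustration only).
type record struct {
	EventName string `json:"eventName"`
}

// payload mirrors an assumed top-level {"Records": [...]} envelope.
type payload struct {
	Records []record `json:"Records"`
}

// unmarshalBuffered reads the whole stream into memory and then decodes it,
// so the raw bytes and the decoded records are held at the same time.
func unmarshalBuffered(r io.Reader) ([]record, error) {
	raw, err := io.ReadAll(r) // first copy: the full raw payload
	if err != nil {
		return nil, err
	}
	var p payload
	if err := json.Unmarshal(raw, &p); err != nil { // second copy: decoded records
		return nil, err
	}
	return p.Records, nil
}

// exampleInput returns a tiny made-up payload for trying the function out.
func exampleInput() io.Reader {
	return strings.NewReader(`{"Records":[{"eventName":"ConsoleLogin"},{"eventName":"AssumeRole"}]}`)
}
```

With large inputs, the `raw` slice stays reachable until decoding finishes, which is where the extra memory pressure comes from.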

Solution

I propose the following:

  • Detect the CloudWatch log type by peeking into the input stream
  • Detect the unmarshaling type based on the JSON key
  • Unmarshal directly from the input stream

I have tested this in a local setup and observed a ~45% reduction in memory usage with the current benchmark test (using 1000 logs).
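
The proposed steps could look roughly like the following Go sketch. It is not the tested implementation mentioned above, only a minimal illustration under stated assumptions: the input is assumed to be a `{"Records": [...]}` envelope whose key appears within the first 32 bytes, and `record`, `startsWithRecords`, `streamRecords`, `unmarshalStream`, and `exampleInput` are hypothetical names. It uses `bufio.Reader.Peek` to inspect the stream without consuming it and `json.Decoder` tokens to decode one record at a time.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"strings"
)

// record is a hypothetical, trimmed-down CloudTrail record (illustration only).
type record struct {
	EventName string `json:"eventName"`
}

var errUnsupported = errors.New("input does not start with a Records envelope")

// startsWithRecords peeks at the first bytes without consuming them.
// Assumption: the "Records" key appears within the first 32 bytes.
func startsWithRecords(br *bufio.Reader) bool {
	head, _ := br.Peek(32) // a short read still returns the bytes available
	return bytes.Contains(head, []byte(`"Records"`))
}

// expectDelim consumes one token and checks that it is the wanted delimiter.
func expectDelim(dec *json.Decoder, want json.Delim) error {
	tok, err := dec.Token()
	if err != nil {
		return err
	}
	if d, ok := tok.(json.Delim); !ok || d != want {
		return fmt.Errorf("expected %v, got %v", want, tok)
	}
	return nil
}

// streamRecords walks the top-level object and decodes the "Records" array
// element by element, passing each record to handle.
func streamRecords(r io.Reader, handle func(record) error) error {
	dec := json.NewDecoder(r)
	if err := expectDelim(dec, '{'); err != nil {
		return err
	}
	for dec.More() {
		tok, err := dec.Token() // the next object key
		if err != nil {
			return err
		}
		if key, _ := tok.(string); key != "Records" {
			var skip json.RawMessage // discard values of other keys
			if err := dec.Decode(&skip); err != nil {
				return err
			}
			continue
		}
		if err := expectDelim(dec, '['); err != nil {
			return err
		}
		for dec.More() {
			var rec record
			if err := dec.Decode(&rec); err != nil {
				return err
			}
			if err := handle(rec); err != nil {
				return err
			}
		}
		if err := expectDelim(dec, ']'); err != nil {
			return err
		}
	}
	return expectDelim(dec, '}')
}

// unmarshalStream detects the envelope by peeking, then decodes from the stream.
func unmarshalStream(r io.Reader, handle func(record) error) error {
	br := bufio.NewReader(r)
	if !startsWithRecords(br) {
		return errUnsupported
	}
	return streamRecords(br, handle)
}

// exampleInput returns a tiny made-up payload for trying the function out.
func exampleInput() io.Reader {
	return strings.NewReader(`{"Records":[{"eventName":"ConsoleLogin"},{"eventName":"AssumeRole"}]}`)
}
```

Because `json.JSON` key order is not guaranteed, a real implementation would likely need a fallback when the peeked bytes are inconclusive; the sketch simply returns an error in that case.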

Footnotes

  1. https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/6719e7af8de36f2b5e07ce602eb13996d2933017/extension/encoding/awslogsencodingextension/internal/unmarshaler/cloudtraillog/unmarshaler.go#L144

  2. https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/6719e7af8de36f2b5e07ce602eb13996d2933017/extension/encoding/awslogsencodingextension/internal/unmarshaler/cloudtraillog/unmarshaler.go#L149-L151

  3. https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/6719e7af8de36f2b5e07ce602eb13996d2933017/extension/encoding/awslogsencodingextension/internal/unmarshaler/cloudtraillog/unmarshaler.go#L159-L161
