Add std::future async overloads (Phase 1) and C++20 coroutine support…#229
Conversation
43211e4 to
3a54647
Compare
|
Warning Review limit reached
More reviews will be available in 19 minutes and 53 seconds. Learn how PR review limits work. To continue reviewing without waiting, enable usage-based billing in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits. 🚦 How do rate limits work?CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability. For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Plus Run ID: 📒 Files selected for processing (5)
📝 WalkthroughWalkthroughThis PR adds ChangesAsync S3 API surface
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/baseclient.cc`:
- Around line 2057-2058: The async wrapper for AbortMultipartUpload is still
copying the moved-in args because the lambda is not mutable and passes args as a
const lvalue into AbortMultipartUpload. Update the std::async lambda in the
AbortMultipartUpload wrapper to match the mutable + std::move pattern used by
GetPresignedPostFormDataAsync, so args is forwarded into the synchronous call
without the redundant copy.
- Around line 2325-2351: The Async wrappers in BaseClient are capturing Args by
value with std::move, but those Args still contain reference members that can
dangle once the caller returns. Fix the lifetime bug by making the async path
own its inputs: either refactor the relevant Args types to store owned data (not
raw references) or copy the underlying request/config data into local owned
variables before calling std::async in SelectObjectContentAsync,
SetBucketEncryptionAsync, SetBucketLifecycleAsync, and
SetBucketNotificationAsync.
In `@tests/tests.cc`:
- Around line 785-1241: The async test coverage in TestAsyncOperations is
missing the failure and concurrency cases called out in the review. Extend this
test to cover at least one error-path that asserts failures surface through
future.get(), and add a genuine concurrent-overlap scenario by starting multiple
Async calls before waiting on any of them. Also include the
reference-holding-arg overloads mentioned in the comment, such as
SelectObjectContentAsync and the
SetBucketNotification/Encryption/Replication/LifecycleAsync paths, so the
lifetime-sensitive code in client async APIs is exercised.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 58243c70-017c-4628-91e9-4759d542c25b
📒 Files selected for processing (5)
include/miniocpp/baseclient.hinclude/miniocpp/client.hsrc/baseclient.ccsrc/client.cctests/tests.cc
apply suggestion
There was a problem hiding this comment.
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/baseclient.cc (1)
2057-2534: 🩺 Stability & Availability | 🟠 MajorLaunch policy uses
asyncinstead of the stateddeferred.All async wrapper implementations explicitly use
std::launch::async. The stated design intent (presumablydeferredto avoid threading issues) conflicts with the implementation.Update the design documentation or change
std::launch::asyncto the intended default orstd::launch::deferredif immediate thread execution is not required.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/baseclient.cc` around lines 2057 - 2534, The async wrapper methods in BaseClient currently hardcode std::launch::async, which conflicts with the stated deferred behavior. Review the BaseClient::*Async methods in this diff and align the implementation with the intended launch policy by either switching the std::async calls to std::launch::deferred or updating the design/docs if eager execution is actually desired. Keep the behavior consistent across the wrapper methods such as BucketExistsAsync, ListBucketsAsync, and UploadPartAsync.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/baseclient.cc`:
- Around line 2322-2328: `MakeBucketAsync` can now invoke `MakeBucket`
concurrently, but `MakeBucket` still updates `region_map_` without
synchronization while the rest of the map access uses `region_map_mutex_`.
Update the `MakeBucket` implementation so the `region_map_[args.bucket] =
region` write is protected by the same mutex used for other `region_map_`
reads/writes, keeping the locking consistent across the code path.
- Around line 2366-2383: `SelectObjectContentAsync` is copying `SelectRequest`
into `owned_request`, but `SelectRequest` still shallow-copies raw serialization
pointers, which can leave `SelectRequest::ToXML` reading dangling stack memory.
Update `SelectRequest` (and any related ownership in `SelectObjectContentArgs`)
so serialization members like the CSV/JSON inputs are owned via deep-copyable or
move-owning types rather than raw pointers. Ensure the async path in
`SelectObjectContentAsync` transfers a safe owned copy into the lambda before
calling `SelectObjectContent`.
In `@src/client.cc`:
- Around line 1471-1475: `Client::PutObjectAsync` currently captures
`PutObjectArgs` by move into the async task, but that does not guarantee the
caller’s stream stays alive until `future.get()` runs. Update the async path
around `PutObjectAsync` and `Client::PutObject` to make stream ownership
explicit: either add an overload that takes an owning stream handle/shared
ownership, or enforce and document that `args.stream` must outlive the returned
future. Ensure the implementation no longer dereferences a potentially destroyed
`args.stream` inside `PutObject`.
In `@tests/tests.cc`:
- Around line 1249-1255: Remove the obvious narration comments in the affected
test cases and keep only comments that add meaningful context. In the test code
around the StatObjectAsync and related checks, delete the comments that merely
restate the following if/cleanup statements, and leave the actual assertions and
cleanup logic unchanged. Use the surrounding test functions in tests.cc to find
the repeated comment patterns flagged by the review.
- Around line 1273-1310: The concurrent bucket test currently throws before the
cleanup block, which can leave successfully created buckets behind. Update the
create/verify flow in the test around MakeBucketAsync, BucketExistsAsync, and
the local clean lambda so that any failure from get() or existence checks
triggers cleanup for all buckets that were created successfully. Wrap the whole
sequence in a try/catch and ensure the catch path removes b1, b2, and b3 before
rethrowing or failing the test.
- Around line 1313-1479: The async lifetime tests in the
SetBucketEncryptionAsync, SetBucketLifecycleAsync, SetBucketNotificationAsync,
SetBucketReplicationAsync, and SelectObjectContentAsync blocks are still
awaiting the futures while the referenced stack objects remain alive, so they do
not verify ownership correctly. Refactor each case so the future is created
inside an inner scope, let the stack-owned inputs like SseConfig,
LifecycleConfig, NotificationConfig, ReplicationConfig, SelectRequest, and the
CSV serializers go out of scope, and only then call get() on the stored future
to validate the async wrappers copy their inputs.
- Around line 1478-1488: The async S3 Select test in SelectObjectContentAsync
currently only checks the response status and can miss a broken callback path;
update the test to validate the returned records content as well. In the
SelectObjectContentAsync flow, capture the callback output/aggregated records
from sel_resp (or the existing callback state) and assert that it matches the
expected data payload before RemoveObject is called, so the test fails if
records are not delivered correctly.
---
Outside diff comments:
In `@src/baseclient.cc`:
- Around line 2057-2534: The async wrapper methods in BaseClient currently
hardcode std::launch::async, which conflicts with the stated deferred behavior.
Review the BaseClient::*Async methods in this diff and align the implementation
with the intended launch policy by either switching the std::async calls to
std::launch::deferred or updating the design/docs if eager execution is actually
desired. Keep the behavior consistent across the wrapper methods such as
BucketExistsAsync, ListBucketsAsync, and UploadPartAsync.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 048d4f36-b062-4149-a420-93bd9527a716
📒 Files selected for processing (3)
src/baseclient.ccsrc/client.cctests/tests.cc
apply suggestion
apply suggestion
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/tests.cc (1)
790-829: 🩺 Stability & Availability | 🟡 Minor | ⚡ Quick winClean up the bucket if async verification fails.
After
MakeBucketAsync()succeeds, any failure inBucketExistsAsync()orRemoveBucketAsync()throws before cleanup, leaving the temporary bucket behind.Proposed cleanup guard
{ std::string bucket_name = RandBucketName(); + bool bucket_created = false; + auto cleanup = [this, &bucket_name]() { + try { + minio::s3::RemoveBucketArgs args; + args.bucket = bucket_name; + client_.RemoveBucket(args); + } catch (...) { + } + }; + try { { minio::s3::MakeBucketArgs args; args.bucket = bucket_name; auto make_fut = client_.MakeBucketAsync(args); minio::s3::MakeBucketResponse make_resp = make_fut.get(); @@ if (!make_resp) { throw std::runtime_error("MakeBucketAsync(): " + make_resp.Error().String()); } + bucket_created = true; } { minio::s3::BucketExistsArgs args; @@ } } { minio::s3::RemoveBucketArgs args; @@ if (!rm_resp) { throw std::runtime_error("RemoveBucketAsync(): " + rm_resp.Error().String()); } + bucket_created = false; } + } catch (...) { + if (bucket_created) cleanup(); + throw; + } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/tests.cc` around lines 790 - 829, The async bucket test leaves temporary buckets behind if verification or teardown fails after MakeBucketAsync succeeds. Update the bucket lifecycle block in tests.cc around MakeBucketAsync, BucketExistsAsync, and RemoveBucketAsync to ensure cleanup always runs, even when BucketExistsAsync or the final removal throws. Use the existing bucket_name and async response variables to add a cleanup guard or equivalent try/finally-style pattern so the bucket is removed on every exit path.
♻️ Duplicate comments (1)
src/baseclient.cc (1)
2363-2387: 🩺 Stability & Availability | 🔴 Critical | 🏗️ Heavy liftCritical:
SelectObjectContentAsyncstill copies dangling raw pointers — UAF reintroduced.The comment at Lines 2364-2365 claims the
shared_ptrchain makes a singlemake_shared<SelectRequest>(args.request)sufficient, butSelectRequestitself holds raw pointers to the serialization structs (csv_input,json_input,csv_output, ...).make_shared<SelectRequest>(args.request)shallow-copies those raw pointers; theshared_ptrmembers live one level deeper (insideCsvInputSerializationetc.), so they do not protect the serialization objects themselves.In the test/example pattern,
CsvInputSerialization/CsvOutputSerializationare stack-allocated and go out of scope right after the call (// csv_input, csv_output, request, etc. go out of scope here.), soowned_request->csv_inputdangles andSelectRequest::ToXML()dereferences freed memory on the worker thread. This is the same UAF flagged previously; the simplification reverted the earlier deep-copy fix.Either deep-copy the pointed-to serialization objects into owned storage before the task, or change
SelectRequestto own them (e.g.shared_ptr/value members).#!/bin/bash # Confirm SelectRequest member ownership types. rg -nP -A18 'struct SelectRequest' include/miniocpp/types.h🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/baseclient.cc` around lines 2363 - 2387, SelectObjectContentAsync still reintroduces a UAF because SelectRequest is only shallow-copied and its raw serialization pointers (for csv_input, json_input, csv_output, etc.) can dangle after the caller’s stack objects go out of scope. Fix BaseClient::SelectObjectContentAsync by either deep-copying the serialization objects into owned storage before launching the async task or changing SelectRequest to own those members directly, and ensure the worker lambda in SelectObjectContentAsync passes safe owned data into SelectObjectContent/SelectRequest::ToXML.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@examples/SelectObjectContent.cc`:
- Around line 35-39: The enum/template arguments used in the SelectObjectContent
example are unqualified, which can cause name lookup issues. Update the
`csv_input.file_header_info` and `csv_output.quote_fields` assignments in
`SelectObjectContent.cc` to use the fully qualified `minio::s3::FileHeaderInfo`
and `minio::s3::QuoteFields` types inside `std::make_shared`, keeping the rest
of the example unchanged.
---
Outside diff comments:
In `@tests/tests.cc`:
- Around line 790-829: The async bucket test leaves temporary buckets behind if
verification or teardown fails after MakeBucketAsync succeeds. Update the bucket
lifecycle block in tests.cc around MakeBucketAsync, BucketExistsAsync, and
RemoveBucketAsync to ensure cleanup always runs, even when BucketExistsAsync or
the final removal throws. Use the existing bucket_name and async response
variables to add a cleanup guard or equivalent try/finally-style pattern so the
bucket is removed on every exit path.
---
Duplicate comments:
In `@src/baseclient.cc`:
- Around line 2363-2387: SelectObjectContentAsync still reintroduces a UAF
because SelectRequest is only shallow-copied and its raw serialization pointers
(for csv_input, json_input, csv_output, etc.) can dangle after the caller’s
stack objects go out of scope. Fix BaseClient::SelectObjectContentAsync by
either deep-copying the serialization objects into owned storage before
launching the async task or changing SelectRequest to own those members
directly, and ensure the worker lambda in SelectObjectContentAsync passes safe
owned data into SelectObjectContent/SelectRequest::ToXML.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: b6ba44c7-3407-4448-8024-a2d83e21423e
📒 Files selected for processing (6)
examples/SelectObjectContent.ccinclude/miniocpp/args.hinclude/miniocpp/client.hinclude/miniocpp/types.hsrc/baseclient.cctests/tests.cc
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/baseclient.cc (1)
2058-2060: 🩺 Stability & Availability | 🟠 MajorSwitch the
BaseClient::*Asyncwrappers tostd::launch::deferred.src/baseclient.ccstill hardcodesstd::launch::asyncacross the async overloads, so they won’t follow the intended deferred execution path for the CURL thread-safety fix.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/baseclient.cc` around lines 2058 - 2060, The BaseClient async wrapper currently hardcodes eager execution, so update the std::async calls in BaseClient::*Async helpers to use std::launch::deferred instead of std::launch::async. Make this change in the async overloads around AbortMultipartUpload and the other matching wrappers in BaseClient so they follow the deferred execution path required for the CURL thread-safety fix.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/args.cc`:
- Around line 465-475: Guard SelectObjectContentArgs::request before any
dereference in the validator and async path. In the request validation logic in
args.cc, add an explicit null check for request before accessing expr or the
input/output members and return an error::Error instead of crashing. In the
async wrapper in baseclient.cc, avoid copying *args.request until the pointer
has been validated, and either validate first or short-circuit with the same
error path so async callers fail cleanly.
- Around line 469-472: The validation in the request parsing logic uses a
three-way XOR on csv_input, json_input, and parquet_input, which allows all
three to be set at once; update the check in the argument handling code so it
enforces exactly one input serialization is present. Keep the existing error
path and message, but change the condition in the request validation block to an
exact-count style check tied to those three fields.
In `@src/baseclient.cc`:
- Around line 2378-2383: The async wrapper in the SetBucketEncryption path is
storing owned_config.get() into args.config and then launching work without
preserving owned_config, so the worker can later dereference a dangling pointer.
Update the async lambdas in the SetBucketEncryption, lifecycle, notification,
and replication wrapper helpers to capture owned_config by value alongside args
so the shared config stays alive for the duration of the async task, while still
passing the same args object through to the underlying Set* methods.
- Around line 2363-2364: Remove the stale reference-layout comments in the
SelectObjectContent argument handling code, since they repeat implementation
details and some are now inaccurate about the stored config/request pointers.
Clean up the comments around the SelectObjectContentArgs and related
SelectRequest flow so the code relies on the types themselves rather than
explanatory notes that are no longer correct.
---
Outside diff comments:
In `@src/baseclient.cc`:
- Around line 2058-2060: The BaseClient async wrapper currently hardcodes eager
execution, so update the std::async calls in BaseClient::*Async helpers to use
std::launch::deferred instead of std::launch::async. Make this change in the
async overloads around AbortMultipartUpload and the other matching wrappers in
BaseClient so they follow the deferred execution path required for the CURL
thread-safety fix.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 616d195d-0272-4d34-8c65-ad7e2dfbf827
📒 Files selected for processing (3)
include/miniocpp/args.hsrc/args.ccsrc/baseclient.cc
… (Phase 2)
Add std::future overloads for all ~61 public methods across BaseClient (53) and Client (8), using std::launch::deferred to avoid CURL thread-safety issues that caused SIGSEGV with std::launch::async.
Add C++20 coroutine support via coro.h with FutureAwaitable, CoroTask, WhenAll, and WhenAny helpers.
Fix ListObjectsResponse and RemoveObjectsResponse by adding explicit copy/move constructors (user-declared ~X() = default suppressed implicit move ctors), which caused dangling iterator when ListObjectsResult was moved through std::future shared state.
Add async test methods for all async overloads and enable ListObjectsAsync test.
fix #217
Summary by CodeRabbit
*Asyncvariants across the S3 base and client APIs, returningstd::futurefor common bucket/object operations, multipart workflows, listing, tags/policies/retention/locks, notifications, S3 Select, and presigned URL generation.PutObjectAsyncstream lifetime requirements; updated SelectObjectContent example and serialization/config handling to usestd::shared_ptr.