feat: add bulk delete (DeleteObjects) support#448
feat: add bulk delete (DeleteObjects) support#448michaelludwig1 wants to merge 2 commits intodurch:masterfrom
Conversation
Implement the S3 Multi-Object Delete API, allowing up to 1000 objects to be deleted in a single request. Requests with more than 1000 objects are automatically batched. - Add Command::DeleteObjects variant with XML body, Content-MD5 header - Add ObjectIdentifier, DeleteObjectsRequest/Result serde types - Add Bucket::delete_objects() with auto-chunking for >1000 keys - Add unit tests for serialization, escaping, and deserialization - Add integration tests for AWS and Minio - Update README with delete_objects documentation Closes durch#420
📝 WalkthroughWalkthroughAdds S3 bulk delete support: new serde types for DeleteObjects, a DeleteObjects command and request handling, a Bucket::delete_objects method that batches up to 1000 items, tests, and a README link. No other public signatures beyond the new types/variant. Changes
Sequence DiagramsequenceDiagram
participant Client
participant Bucket
participant Command
participant S3
Client->>Bucket: delete_objects(objects: Vec)
activate Bucket
Bucket->>Bucket: split into chunks (<=1000)
loop for each chunk
Bucket->>Command: Create DeleteObjects command with XML body
activate Command
Command->>Command: Serialize XML & compute hashes (sha256, content-md5)
deactivate Command
Command->>S3: POST /?delete with body
activate S3
S3-->>Command: DeleteObjectsResult (XML)
deactivate S3
Command-->>Bucket: parsed result
end
Bucket->>Client: aggregated DeleteObjectsResult
deactivate Bucket
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Brownian Motion (Brass)Recommendation: Proceed Summary: PR effectively implements bulk delete for S3, aligning with stated goals. Highlights
Unknowns
Next actions
Reflection questions
|
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
s3/src/bucket.rs (1)
1600-1628:⚠️ Potential issue | 🟡 MinorDoc comment range doesn’t match the clamp logic.
The comment says “between 2 and 10” but the code clamps to 100. Either adjust the clamp or update the doc to match.
📝 Suggested doc alignment
- /// Returns a value between 2 and 10, defaulting to 3 if memory detection fails. + /// Returns a value between 2 and 100, defaulting to 3 if memory detection fails.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@s3/src/bucket.rs` around lines 1600 - 1628, The doc comment for calculate_max_concurrent_chunks() claims the returned value is "between 2 and 10" but the implementation clamps to 100; update either the comment or the clamp to match intent. Locate the function calculate_max_concurrent_chunks (which uses CHUNK_SIZE, safety_factor, memory_per_chunk, and calculated_chunks) and either change the clamp call from .clamp(2, 100) to .clamp(2, 10) if the intended max is 10, or modify the doc comment to state "between 2 and 100" (or a more accurate range) and adjust the explanatory comments (safety_factor and CHUNK_SIZE rationale) accordingly so docs and code are consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@s3/src/bucket.rs`:
- Around line 2120-2191: The delete_objects method is sending raw
ObjectIdentifier keys (including leading '/') in the DeleteObjectsRequest,
causing mismatches with other APIs that strip leading slashes; fix by
normalizing keys before building the XML: in delete_objects (and when
constructing the DeleteObjectsRequest.objects), map each ObjectIdentifier to one
with leading '/' characters removed (or replace its key string with stripped
version) so the DeleteObjectsRequest contains normalized keys consistent with
put_object and other bucket APIs.
In `@s3/src/serde_types.rs`:
- Around line 407-501: The DeleteObjectsRequest serialization and public docs
need fixes: in the fmt impl for DeleteObjectsRequest update the root element to
include xmlns="http://s3.amazonaws.com/doc/2006-03-01/" (i.e. write "<Delete
xmlns=\"...\">"), XML-escape the VersionId the same way you escape Key (use
quick_xml::escape::escape on obj.version_id when writing VersionId), change
is_empty() to return self.objects.is_empty() (leave len() if you want serialized
length), and add documentation examples for the public types ObjectIdentifier,
DeleteObjectsRequest, DeletedObject, DeleteError, and DeleteObjectsResult
showing typical construction/serialization and deserialization usage; reference
the functions/impls: ObjectIdentifier::new/with_version, impl fmt::Display for
DeleteObjectsRequest, DeleteObjectsRequest::is_empty, and the
DeletedObject/DeleteError/DeleteObjectsResult structs when adding the examples.
---
Outside diff comments:
In `@s3/src/bucket.rs`:
- Around line 1600-1628: The doc comment for calculate_max_concurrent_chunks()
claims the returned value is "between 2 and 10" but the implementation clamps to
100; update either the comment or the clamp to match intent. Locate the function
calculate_max_concurrent_chunks (which uses CHUNK_SIZE, safety_factor,
memory_per_chunk, and calculated_chunks) and either change the clamp call from
.clamp(2, 100) to .clamp(2, 10) if the intended max is 10, or modify the doc
comment to state "between 2 and 100" (or a more accurate range) and adjust the
explanatory comments (safety_factor and CHUNK_SIZE rationale) accordingly so
docs and code are consistent.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
README.mds3/src/bucket.rss3/src/command.rss3/src/request/request_trait.rss3/src/serde_types.rs
| /// Represents a single object identifier for the bulk delete request. | ||
| #[derive(Debug, Clone)] | ||
| pub struct ObjectIdentifier { | ||
| /// The key of the object to delete. | ||
| pub key: String, | ||
| /// The version ID of the object to delete (optional). | ||
| pub version_id: Option<String>, | ||
| } | ||
|
|
||
| impl ObjectIdentifier { | ||
| pub fn new(key: impl Into<String>) -> Self { | ||
| Self { | ||
| key: key.into(), | ||
| version_id: None, | ||
| } | ||
| } | ||
|
|
||
| pub fn with_version(key: impl Into<String>, version_id: impl Into<String>) -> Self { | ||
| Self { | ||
| key: key.into(), | ||
| version_id: Some(version_id.into()), | ||
| } | ||
| } | ||
| } | ||
|
|
||
| /// Request body for the DeleteObjects (bulk delete) API. | ||
| #[derive(Debug, Clone)] | ||
| pub struct DeleteObjectsRequest { | ||
| pub objects: Vec<ObjectIdentifier>, | ||
| /// If true, the response only includes errors (Quiet mode). | ||
| pub quiet: bool, | ||
| } | ||
|
|
||
| impl fmt::Display for DeleteObjectsRequest { | ||
| fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { | ||
| write!(f, "<Delete>").expect("Can't fail"); | ||
| if self.quiet { | ||
| write!(f, "<Quiet>true</Quiet>").expect("Can't fail"); | ||
| } | ||
| for obj in &self.objects { | ||
| let escaped_key = quick_xml::escape::escape(&obj.key); | ||
| write!(f, "<Object><Key>{}</Key>", escaped_key).expect("Can't fail"); | ||
| if let Some(ref vid) = obj.version_id { | ||
| write!(f, "<VersionId>{}</VersionId>", vid).expect("Can't fail"); | ||
| } | ||
| write!(f, "</Object>").expect("Can't fail"); | ||
| } | ||
| write!(f, "</Delete>") | ||
| } | ||
| } | ||
|
|
||
| impl DeleteObjectsRequest { | ||
| pub fn len(&self) -> usize { | ||
| self.to_string().len() | ||
| } | ||
|
|
||
| pub fn is_empty(&self) -> bool { | ||
| self.to_string().len() == 0 | ||
| } | ||
| } | ||
|
|
||
| /// A single deleted object in the DeleteObjects response. | ||
| #[derive(Deserialize, Debug, Clone)] | ||
| pub struct DeletedObject { | ||
| #[serde(rename = "Key")] | ||
| pub key: String, | ||
| #[serde(rename = "VersionId")] | ||
| pub version_id: Option<String>, | ||
| #[serde(rename = "DeleteMarker")] | ||
| pub delete_marker: Option<bool>, | ||
| #[serde(rename = "DeleteMarkerVersionId")] | ||
| pub delete_marker_version_id: Option<String>, | ||
| } | ||
|
|
||
| /// A single error entry in the DeleteObjects response. | ||
| #[derive(Deserialize, Debug, Clone)] | ||
| pub struct DeleteError { | ||
| #[serde(rename = "Key")] | ||
| pub key: String, | ||
| #[serde(rename = "Code")] | ||
| pub code: String, | ||
| #[serde(rename = "Message")] | ||
| pub message: String, | ||
| #[serde(rename = "VersionId")] | ||
| pub version_id: Option<String>, | ||
| } | ||
|
|
||
| /// The parsed result of a DeleteObjects (bulk delete) response. | ||
| #[derive(Deserialize, Debug, Clone)] | ||
| pub struct DeleteObjectsResult { | ||
| #[serde(rename = "Deleted", default)] | ||
| pub deleted: Vec<DeletedObject>, | ||
| #[serde(rename = "Error", default)] | ||
| pub errors: Vec<DeleteError>, | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's check the actual code in the file to confirm the state
cd s3
head -n 510 src/serde_types.rs | tail -n 110Repository: durch/rust-s3
Length of output: 3239
🌐 Web query:
AWS S3 DeleteObjects XML request format namespace xmlns requirement
💡 Result:
For the S3 DeleteObjects (Multi-Object Delete) REST request body, AWS expects the root element to be in the S3 XML namespace, i.e. include the default namespace on <Delete>:
<?xml version="1.0" encoding="UTF-8"?>
<Delete xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<Object><Key>key1</Key></Object>
<Object><Key>key2</Key></Object>
<Quiet>false</Quiet>
</Delete>That xmlns="http://s3.amazonaws.com/doc/2006-03-01/" is the documented request format for DeleteObjects. [1]
Sources
[1] AWS S3 API Reference – DeleteObjects (request syntax / XML namespace) (docs.aws.amazon.com)
Citations:
🏁 Script executed:
# Check if there are any tests for DeleteObjectsRequest serialization
rg -A 10 "DeleteObjectsRequest" src/serde_types.rs | head -n 50Repository: durch/rust-s3
Length of output: 115
🏁 Script executed:
# Check if documentation examples exist elsewhere for similar structs
rg -B 3 "/// Example" src/serde_types.rs | head -n 30Repository: durch/rust-s3
Length of output: 115
Tighten DeleteObjectsRequest serialization and add documentation examples for public APIs.
Three functional issues and one documentation gap:
- XML escaping inconsistency:
VersionIdshould be XML-escaped likeKeyto prevent malformed XML with special characters. - is_empty() logic error: Currently checks
serialized XML length == 0, which is always false (minimum XML is<Delete></Delete>). Should beself.objects.is_empty(). - Missing XML namespace: AWS S3 DeleteObjects requires
xmlns="http://s3.amazonaws.com/doc/2006-03-01/"on the<Delete>root element; currently absent. - Missing documentation examples: Per coding guidelines, all public APIs must include documentation examples. Add examples for
ObjectIdentifier,DeleteObjectsRequest,DeletedObject,DeleteError, andDeleteObjectsResult.
🔧 Suggested fixes
if let Some(ref vid) = obj.version_id {
- write!(f, "<VersionId>{}</VersionId>", vid).expect("Can't fail");
+ let escaped_vid = quick_xml::escape::escape(vid);
+ write!(f, "<VersionId>{}</VersionId>", escaped_vid).expect("Can't fail");
}
@@
pub fn is_empty(&self) -> bool {
- self.to_string().len() == 0
+ self.objects.is_empty()
}
@@
impl fmt::Display for DeleteObjectsRequest {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
- write!(f, "<Delete>").expect("Can't fail");
+ write!(f, r#"<Delete xmlns="http://s3.amazonaws.com/doc/2006-03-01/">"#).expect("Can't fail");🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@s3/src/serde_types.rs` around lines 407 - 501, The DeleteObjectsRequest
serialization and public docs need fixes: in the fmt impl for
DeleteObjectsRequest update the root element to include
xmlns="http://s3.amazonaws.com/doc/2006-03-01/" (i.e. write "<Delete
xmlns=\"...\">"), XML-escape the VersionId the same way you escape Key (use
quick_xml::escape::escape on obj.version_id when writing VersionId), change
is_empty() to return self.objects.is_empty() (leave len() if you want serialized
length), and add documentation examples for the public types ObjectIdentifier,
DeleteObjectsRequest, DeletedObject, DeleteError, and DeleteObjectsResult
showing typical construction/serialization and deserialization usage; reference
the functions/impls: ObjectIdentifier::new/with_version, impl fmt::Display for
DeleteObjectsRequest, DeleteObjectsRequest::is_empty, and the
DeletedObject/DeleteError/DeleteObjectsResult structs when adding the examples.
- Add xmlns namespace to Delete XML element - XML-escape VersionId values (same as Key) - Fix is_empty() to check objects.is_empty() instead of serialized length - Strip leading '/' from object keys for consistency with other Bucket methods - Add doc comments and examples to all public types
Brownian Motion (Brass)Recommendation: Refactor Summary: PR effectively implements bulk delete for S3, but risks complexity. Highlights
Unknowns
Next actions
Reflection questions
|
There was a problem hiding this comment.
♻️ Duplicate comments (1)
s3/src/serde_types.rs (1)
407-506:⚠️ Potential issue | 🟡 MinorAdd rustdoc examples for the new public delete types.
ObjectIdentifier,DeleteObjectsRequest,DeletedObject,DeleteError, andDeleteObjectsResultare public but lack documentation examples.As per coding guidelines: “Provide documentation examples for all public APIs.”
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@s3/src/serde_types.rs` around lines 407 - 506, Public types ObjectIdentifier, DeleteObjectsRequest, DeletedObject, DeleteError, and DeleteObjectsResult lack rustdoc examples; add brief /// doc examples for each struct showing typical usage: for ObjectIdentifier show constructing with new() and with_version(); for DeleteObjectsRequest show building a request, calling to_string() (or len/is_empty) to demonstrate the XML serialization; for DeletedObject/DeleteError show example deserialization from representative XML (or constructing instances) and for DeleteObjectsResult show how deleted/errors vectors are accessed. Place these examples directly above the corresponding struct definitions (ObjectIdentifier, DeleteObjectsRequest, DeletedObject, DeleteError, DeleteObjectsResult) using standard rustdoc code blocks so they compile and run in doc tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@s3/src/serde_types.rs`:
- Around line 407-506: Public types ObjectIdentifier, DeleteObjectsRequest,
DeletedObject, DeleteError, and DeleteObjectsResult lack rustdoc examples; add
brief /// doc examples for each struct showing typical usage: for
ObjectIdentifier show constructing with new() and with_version(); for
DeleteObjectsRequest show building a request, calling to_string() (or
len/is_empty) to demonstrate the XML serialization; for
DeletedObject/DeleteError show example deserialization from representative XML
(or constructing instances) and for DeleteObjectsResult show how deleted/errors
vectors are accessed. Place these examples directly above the corresponding
struct definitions (ObjectIdentifier, DeleteObjectsRequest, DeletedObject,
DeleteError, DeleteObjectsResult) using standard rustdoc code blocks so they
compile and run in doc tests.
Implement the S3 Multi-Object Delete API, allowing up to 1000 objects to be deleted in a single request. Requests with more than 1000 objects are automatically batched.
Closes #420
This change is
Summary by CodeRabbit
New Features
Documentation