This actor merges, cleans, and transforms large datasets efficiently.
- Speed: Parallelized workloads process data up to 20x faster than standard methods.
- Efficiency: Read from multiple datasets simultaneously, which is ideal for merging data after scraping.
- Reliability: Survives actor migrations. Steps are persisted, so no work is repeated and no data is duplicated.
- Memory Management: 'Dedup as loading' mode allows for efficient memory usage, even with huge datasets (10M+ items).
- Flexibility: Remove duplicates using multiple fields and nested objects/arrays with deep equality checks.
- Storage Options: Store results in a dataset or in key-value store records.
- Fast Blank Runs: Quickly identify duplicates without processing data.
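
The general idea behind deduplicating while loading can be sketched as follows. This is a minimal illustration, not the actor's actual implementation: `loadBatches`, `dedupAsLoading`, and the `emit` callback are hypothetical names, and an in-memory generator stands in for paginated dataset reads. Only the deduplication keys stay in memory, which is why this approach can suit very large inputs, and the returned counter shows how duplicates could be counted without keeping any items.

```javascript
// Hypothetical stand-in for reading a dataset page by page.
function* loadBatches() {
    yield [{ url: 'a' }, { url: 'b' }];
    yield [{ url: 'b' }, { url: 'c' }];
    yield [{ url: 'a' }, { url: 'd' }];
}

// Deduplicate each batch as it arrives. Only the keys are retained,
// so whole items never accumulate in memory.
function dedupAsLoading(batches, field, emit) {
    const seenKeys = new Set();
    let duplicateCount = 0;
    for (const batch of batches) {
        const fresh = [];
        for (const item of batch) {
            const key = JSON.stringify(item[field]);
            if (seenKeys.has(key)) {
                duplicateCount += 1;
            } else {
                seenKeys.add(key);
                fresh.push(item);
            }
        }
        // Hand the unique items over immediately; the batch can then be discarded.
        emit(fresh);
    }
    return duplicateCount;
}

const output = [];
const duplicates = dedupAsLoading(loadBatches(), 'url', (batch) => output.push(...batch));
console.log(output.map((item) => item.url), duplicates);
```

Passing a no-op `emit` would give a run that only counts duplicates, similar in spirit to a blank run.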
Combine items from multiple datasets into a single dataset or key-value store output. In 'Dedup after load' mode, items keep the order of the provided datasets.
Specify fields for deduplication to remove duplicate items based on field values. Combine multiple fields for more precise deduplication. Deep comparison is used for objects and arrays.
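
Deduplication on several fields with deep comparison might look like the sketch below. It assumes that fields are top-level property names and that two values count as equal when they serialize identically after sorting object keys. `canonical` and `dedupByFields` are hypothetical helpers written for illustration, not part of the actor.

```javascript
// Serialize a value with sorted object keys so that deeply equal
// values always produce the same string.
function canonical(value) {
    if (Array.isArray(value)) {
        return `[${value.map(canonical).join(',')}]`;
    }
    if (value !== null && typeof value === 'object') {
        const entries = Object.keys(value)
            .sort()
            .map((key) => `${JSON.stringify(key)}:${canonical(value[key])}`);
        return `{${entries.join(',')}}`;
    }
    // Primitives; a missing field (undefined) becomes the string "undefined".
    return String(JSON.stringify(value));
}

// Keep the first item for every distinct combination of the given fields.
function dedupByFields(items, fields) {
    const seen = new Set();
    const unique = [];
    for (const item of items) {
        const key = canonical(fields.map((field) => item[field]));
        if (!seen.has(key)) {
            seen.add(key);
            unique.push(item);
        }
    }
    return unique;
}

const items = [
    { id: 1, tags: ['a', 'b'], meta: { x: 1, y: 2 } },
    { id: 1, tags: ['a', 'b'], meta: { y: 2, x: 1 } }, // same values, different key order
    { id: 1, tags: ['b', 'a'], meta: { x: 1, y: 2 } }, // array order differs, so not a duplicate
];
const deduped = dedupByFields(items, ['id', 'tags', 'meta']);
console.log(deduped.length);
```

In this sketch object key order is ignored but array order matters, which matches the usual meaning of deep equality.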
Perform custom data transformations before and after deduplication with preDedupTransformFunction and postDedupTransformFunction. These functions take an array of items and return a modified array.
Helper variables and a reference to the Apify SDK are available inside the transformation functions. Use the functions to filter, add, or modify items as needed.
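
As a rough illustration, the two functions could be written as below. The source only states that each function takes an array of items and returns a modified array, so the helper argument is omitted here because its exact shape is not specified. The item fields (`title`, `price`, `position`) and the local `dedupByTitle` step are assumptions used to simulate the pipeline outside the actor.

```javascript
// Runs before deduplication: drop items without a price and normalize
// titles so that near-identical items become exact duplicates.
const preDedupTransformFunction = (items) => items
    .filter((item) => item.price != null)
    .map((item) => ({ ...item, title: item.title.trim().toLowerCase() }));

// Runs after deduplication: add a 1-based position to each remaining item.
const postDedupTransformFunction = (items) => items
    .map((item, index) => ({ ...item, position: index + 1 }));

// Hypothetical local stand-in for the actor's deduplication step.
const dedupByTitle = (items) => {
    const seen = new Set();
    return items.filter((item) => {
        if (seen.has(item.title)) return false;
        seen.add(item.title);
        return true;
    });
};

const input = [
    { title: ' Widget ', price: 10 },
    { title: 'widget', price: 10 },
    { title: 'Gadget', price: null },
    { title: 'Gizmo', price: 5 },
];

const result = postDedupTransformFunction(dedupByTitle(preDedupTransformFunction(input)));
console.log(result);
```

Normalizing in the pre-dedup step matters here: without it, ' Widget ' and 'widget' would not be recognized as duplicates.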
Start enhancing your data processing today with QuickLifeSolutions' Easy Data Processor!