Closed
Description
- [Feature] Add segment range in tablet metadata #67082
- [Feature] Add tablet range proto and find split points rpc interface #65389
- [Refactor] Refactor materialized index meta id for multi-version materialized indexes #66775
- [Refactor] Refactor materialized index meta id for multi-version materialized indexes (part 2) #67415
- [Feature] Support query range distribution table #67056
- [Feature] Add tablet range pruning for range-distribution Lake tables for shared segments and sstables. #66743
Feature request
Although StarRocks provides strong capabilities for large-scale analytical workloads, its current design introduces usability challenges, lacks adaptive mechanisms for handling data skew, and may deliver suboptimal performance for small tenants in multi-tenant scenarios.
StarRocks needs a new multi-tenant data management design with the following goals:
Fewer concepts and simpler usage
- In most cases, table creation should require only column definitions and an ORDER BY clause.
- Users should not need to define DISTRIBUTED BY clauses.
Ability to handle data skew
- Tablets support automatic splitting and merging, enabling the system to rebalance data dynamically and mitigate skew.
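To make the split/merge idea concrete, here is a minimal Python sketch of range-style rebalancing. It is only an illustration under stated assumptions, not StarRocks code: the `Tablet` class, the `split_oversized` and `merge_undersized` helpers, and the row-count thresholds are all hypothetical, and a real system would likely decide based on data size and pick split points from storage metadata.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Tablet:
    """Hypothetical tablet: holds a contiguous, sorted run of keys."""
    keys: List[int] = field(default_factory=list)


def split_oversized(tablets: List[Tablet], max_rows: int) -> List[Tablet]:
    """Split any tablet above max_rows at its middle key, repeatedly."""
    out: List[Tablet] = []
    for tablet in tablets:
        pending = [tablet]
        while pending:
            cur = pending.pop()
            if len(cur.keys) > max_rows:
                mid = len(cur.keys) // 2
                # Push the right half first so the left half is handled first,
                # which keeps the output in key order.
                pending.append(Tablet(cur.keys[mid:]))
                pending.append(Tablet(cur.keys[:mid]))
            else:
                out.append(cur)
    return out


def merge_undersized(tablets: List[Tablet], min_rows: int, max_rows: int) -> List[Tablet]:
    """Merge a tablet into its left neighbor when either one is below
    min_rows and the combined tablet would not exceed max_rows."""
    out: List[Tablet] = []
    for tablet in tablets:
        if out:
            prev = out[-1]
            too_small = len(prev.keys) < min_rows or len(tablet.keys) < min_rows
            if too_small and len(prev.keys) + len(tablet.keys) <= max_rows:
                out[-1] = Tablet(prev.keys + tablet.keys)
                continue
        out.append(tablet)
    return out


def rebalance(tablets: List[Tablet], min_rows: int, max_rows: int) -> List[Tablet]:
    """One rebalancing pass: split hot ranges, then merge tiny neighbors."""
    return merge_undersized(split_oversized(tablets, max_rows), min_rows, max_rows)
```

With one skewed tablet of 10 keys and two single-key tablets, `rebalance(..., min_rows=2, max_rows=4)` breaks up the large range and folds one of the tiny tablets into a neighbor, while the overall key order is preserved.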
Balanced data locality and distribution
- Data from a small tenant can be located on a single compute node for optimal locality.
- Data from a large tenant can be distributed across multiple compute nodes to maintain scalability and performance.
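The locality goal above can be sketched as a simple placement policy. This is a hypothetical Python illustration, not the proposed StarRocks scheduler: the `place_tablets` function, the `small_limit` threshold, and the node names are assumptions made for the example.

```python
from typing import Dict, List


def place_tablets(
    tenant_tablets: Dict[str, int],
    nodes: List[str],
    small_limit: int,
) -> Dict[str, List[str]]:
    """Assign each tenant's tablets to compute nodes.

    Tenants with at most small_limit tablets are pinned to a single node
    (locality); larger tenants are spread tablet by tablet onto the
    least-loaded node (scalability).
    """
    load = {node: 0 for node in nodes}
    placement: Dict[str, List[str]] = {}
    # Place the largest tenants first so small tenants fill the gaps.
    for tenant, count in sorted(tenant_tablets.items(), key=lambda kv: -kv[1]):
        if count <= small_limit:
            node = min(nodes, key=lambda n: load[n])
            placement[tenant] = [node] * count
            load[node] += count
        else:
            assigned = []
            for _ in range(count):
                node = min(nodes, key=lambda n: load[n])
                assigned.append(node)
                load[node] += 1
            placement[tenant] = assigned
    return placement
```

For example, with three nodes and `small_limit=2`, a tenant with six tablets ends up on all three nodes, while a tenant with two tablets stays on one node.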
No dependency on time-based partitioning
- The system no longer needs to rely on time-based partitioning for data management.
- A single partition can store large data volumes, and the amount of data can vary greatly between partitions, so users can choose partitioning strategies based on their specific needs.
yangrong688