feat: Add vector value type support to expression system #893
Open
AlexFilipImproving wants to merge 4 commits into valkey-io:main from
This commit adds comprehensive vector value type support to the expression evaluation system, enabling vectors to be used in FT.AGGREGATE operations.

Key changes:
- Implement AsVector() accessor method for Value class
- Add Vector variant to expr::Value with std::vector<double> storage
- Implement vector-specific functions (VECLEN, VECGET, VECSET, etc.)
- Add vector support to mathematical and string functions
- Implement vector serialization to RESP format
- Add comprehensive error handling for vector operations
- Add vector comparison operators and nested vector support
- Refactor Value::Nil to use std::string instead of const char*

Testing:
- Add comprehensive unit tests for vector operations
- Add nested vector tests
- Add vector comparison tests

This also includes upstream changes from main branch merged during development, including text search optimizations, compatibility tests, and various bug fixes.

Signed-off-by: Alexandru Filip <alexandru.filip@improving.com>
I think what you call a vector should be called an Array. In the context of vector search, a "vector" field data type should be a scalar value when seen in the expression language. So we're really introducing two new data types: a vector (a scalar consisting of some number of floating-point numbers) and an Array, which is a one-dimensional list of Values. The broadcast pattern applies to Arrays, in and out. For vector types, we need some new distance functions (the same set as the vector search functions).
Signed-off-by: Alexandru Filip <alexandru.filip@improving.com>
Overview
This PR adds native vector support to the `Value` type in valkey-search's expression system, enabling vector operations in FT.AGGREGATE expressions. This represents a significant improvement over Redis's limited vector handling, which treats vectors as opaque output-only values.

Motivation
Currently, Redis FT.AGGREGATE has minimal vector operation support:
`nil`

Valkey-Search now diverges from Redis by supporting intuitive vector operations that enable more powerful data transformations.
Key Features
1. Vector as First-Class Value Type
- Adds a `std::shared_ptr<std::vector<Value>>` variant to the `Value` type

2. Vector-Scalar Operations
Apply scalar functions element-wise to vectors:
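A minimal sketch of this element-wise behavior, including recursion into nested vectors, might look like the following. It assumes a simplified stand-in for the real `Value` type that holds only a number or a shared vector; `Value`, `Vector`, `MakeVector`, and `MapElementwise` here are hypothetical names for illustration, not the PR's actual code.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <initializer_list>
#include <memory>
#include <variant>
#include <vector>

// Hypothetical, simplified stand-in for expr::Value (an assumption for
// illustration): either a number or a shared vector of nested values.
struct Value;
using Vector = std::shared_ptr<std::vector<Value>>;

struct Value {
  std::variant<double, Vector> data;
};

// Convenience helper (hypothetical) for building a vector value.
Value MakeVector(std::initializer_list<Value> elems) {
  return Value{std::make_shared<std::vector<Value>>(elems)};
}

// Applies a scalar function to a number, or recursively to every element
// of a vector, preserving the nesting structure of the input.
Value MapElementwise(const Value& v, const std::function<double(double)>& fn) {
  if (const double* d = std::get_if<double>(&v.data)) {
    return Value{fn(*d)};
  }
  const Vector& in = std::get<Vector>(v.data);
  auto out = std::make_shared<std::vector<Value>>();
  out->reserve(in->size());
  for (const Value& elem : *in) {
    out->push_back(MapElementwise(elem, fn));
  }
  return Value{out};
}
```

Under these assumptions, mapping `std::floor` over a value shaped like `[1.5, [-2.5]]` yields a value with the same shape whose numbers have been floored.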
Supported functions: `lower`, `upper`, `strlen`, `floor`, `ceil`, `abs`, `log`, `sqrt`, and more.

3. Vector Arithmetic with Broadcasting
Perform arithmetic operations with automatic scalar broadcasting:
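One way to picture scalar broadcasting is the sketch below. It assumes plain `std::vector<double>` operands rather than the PR's `Value` type, and `Op`, `ApplyOp`, and `Broadcast` are hypothetical names chosen for illustration. Error handling, such as the division-by-zero case, is deliberately left out here.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical operator tags for the five arithmetic operations.
enum class Op { kAdd, kSub, kMul, kDiv, kPow };

// Applies one binary operation to a pair of scalars.
double ApplyOp(Op op, double a, double b) {
  switch (op) {
    case Op::kAdd: return a + b;
    case Op::kSub: return a - b;
    case Op::kMul: return a * b;
    case Op::kDiv: return a / b;
    case Op::kPow: return std::pow(a, b);
  }
  return 0.0;  // Not reached for valid Op values.
}

// vector (op) scalar: the scalar is paired with every element on the right.
std::vector<double> Broadcast(Op op, const std::vector<double>& lhs, double rhs) {
  std::vector<double> out;
  out.reserve(lhs.size());
  for (double x : lhs) out.push_back(ApplyOp(op, x, rhs));
  return out;
}

// scalar (op) vector: the scalar is paired with every element on the left.
std::vector<double> Broadcast(Op op, double lhs, const std::vector<double>& rhs) {
  std::vector<double> out;
  out.reserve(rhs.size());
  for (double x : rhs) out.push_back(ApplyOp(op, lhs, x));
  return out;
}
```

Keeping separate overloads for the two operand orders matters for non-commutative operations: in this sketch, `10 - [1, 2, 3]` and `[1, 2, 3] - 10` give different results.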
Supported operations: `+`, `-`, `*`, `/`, `^`

4. Vector-Specific Functions
- `vectorlen(@vec)` - Get vector length
- `vectorat(@vec, index)` - Access element at index
- `isvector(@val)` - Check if value is a vector
- `makevector(...)` - Create vector from elements
- `flatten(@vec, depth)` - Flatten nested vectors

5. Comprehensive Error Handling
Clear, actionable error messages:
- "Type error: cannot add vector to string"
- "Length mismatch: vectors have lengths 3 and 5"
- "Element error at index 2: division by zero"
- "Index out of bounds: index 5, vector length 3"

Implementation Details
Architecture
- `shared_ptr<vector<Value>>` for efficient copying and passing

Modified Components
- `src/expr/value.h` / `value.cc` - Core Value type extension
- `src/expr/functions.cc` - Vector operation support in all functions
- `src/expr/comparison.cc` - Lexicographic vector comparison
- `src/expr/serialization.cc` - RESP array serialization for vectors

Testing
Examples
Element-wise String Operations
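As an illustration of what element-wise string operations could compute, here is a small sketch over a plain `std::vector<std::string>`. The helper names `Lower`, `LowerEach`, and `StrlenEach` are hypothetical, and the sketch assumes ASCII input; it mirrors the intent of `lower` and `strlen` applied to a vector and is not the PR's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Lowercases a single string (ASCII only in this sketch).
std::string Lower(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return s;
}

// Element-wise lower(): one lowercased output string per input string.
std::vector<std::string> LowerEach(const std::vector<std::string>& in) {
  std::vector<std::string> out;
  out.reserve(in.size());
  for (const std::string& s : in) out.push_back(Lower(s));
  return out;
}

// Element-wise strlen(): one length per input string.
std::vector<std::size_t> StrlenEach(const std::vector<std::string>& in) {
  std::vector<std::size_t> out;
  out.reserve(in.size());
  for (const std::string& s : in) out.push_back(s.size());
  return out;
}
```

In both helpers the result has the same length as the input, which is the property that distinguishes an element-wise operation from a reduction.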
Vector Arithmetic
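The sketch below shows one plausible shape for vector-to-vector arithmetic with the error cases listed under Comprehensive Error Handling. It assumes plain `std::vector<double>` operands and reports failures as strings; `Result` and `DivideVectors` are hypothetical names, and only the message wording is taken from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <variant>
#include <vector>

// Either the computed vector or an error message (hypothetical result type).
using Result = std::variant<std::vector<double>, std::string>;

// Divides two vectors element by element, checking lengths first and
// reporting the index of the first zero divisor.
Result DivideVectors(const std::vector<double>& a, const std::vector<double>& b) {
  if (a.size() != b.size()) {
    return "Length mismatch: vectors have lengths " + std::to_string(a.size()) +
           " and " + std::to_string(b.size());
  }
  std::vector<double> out;
  out.reserve(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (b[i] == 0.0) {
      return "Element error at index " + std::to_string(i) +
             ": division by zero";
    }
    out.push_back(a[i] / b[i]);
  }
  return out;
}
```

Checking the lengths before touching any element means a mismatched pair never produces a partial result in this sketch.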
Vector Metadata
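To make the metadata functions concrete, the following sketch models `isvector`, `vectorlen`, `vectorat`, and `flatten` on a simplified stand-in for `Value`. All type and function names are hypothetical. The meaning of `depth` (the number of nesting levels removed) is an assumption, since the description does not spell it out.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <memory>
#include <optional>
#include <string>
#include <variant>
#include <vector>

// Hypothetical, simplified stand-in for expr::Value.
struct Value;
using Vector = std::shared_ptr<std::vector<Value>>;
struct Value {
  std::variant<double, Vector> data;
};

Value MakeVector(std::initializer_list<Value> elems) {
  return Value{std::make_shared<std::vector<Value>>(elems)};
}

bool IsVector(const Value& v) { return std::holds_alternative<Vector>(v.data); }

// Precondition for the functions below: IsVector(v) is true.
std::size_t VectorLen(const Value& v) { return std::get<Vector>(v.data)->size(); }

struct AtResult {
  std::optional<Value> value;  // Set on success.
  std::string error;           // Set on failure.
};

AtResult VectorAt(const Value& v, std::size_t index) {
  const std::vector<Value>& elems = *std::get<Vector>(v.data);
  if (index >= elems.size()) {
    return AtResult{std::nullopt,
                    "Index out of bounds: index " + std::to_string(index) +
                        ", vector length " + std::to_string(elems.size())};
  }
  return AtResult{elems[index], ""};
}

// Appends the elements of v to out, unwrapping nested vectors while
// depth > 0 (assumed semantics for the depth argument).
void FlattenInto(const Value& v, int depth, std::vector<Value>& out) {
  for (const Value& elem : *std::get<Vector>(v.data)) {
    if (depth > 0 && IsVector(elem)) {
      FlattenInto(elem, depth - 1, out);
    } else {
      out.push_back(elem);
    }
  }
}

Value Flatten(const Value& v, int depth) {
  auto out = std::make_shared<std::vector<Value>>();
  FlattenInto(v, depth, *out);
  return Value{out};
}
```

With these assumed semantics, flattening `[1, [2, [3]]]` with depth 1 gives `[1, 2, [3]]`: one level of nesting is removed and the innermost vector is left intact.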
Breaking Changes
None. This is a purely additive feature that maintains full backward compatibility with existing scalar operations.
Performance Considerations
Related Issues
Addresses the need for better vector operation support in FT.AGGREGATE expressions, enabling more powerful data transformations than Redis currently provides.