fix: accept array of categories in classifier initialization #111

Merged
cardmagic merged 2 commits into master from fix/array-categories-initialization
Dec 29, 2025

@cardmagic
Owner

Summary

  • Bayes.new and LogisticRegression.new now accept array arguments
  • Bayes.new(['Spam', 'Ham']) is now equivalent to Bayes.new('Spam', 'Ham')
  • Added .flatten to category processing in both initializers
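
A minimal sketch of why `.flatten` makes the two call styles equivalent. The helper methods below are hypothetical stand-ins written for illustration, not the library's code; they only assume that the initializers collect categories through a splat parameter, as the `Bayes.new('Spam', 'Ham')` form suggests.

```ruby
# Hypothetical stand-ins for an initializer that takes *categories.
# These are NOT the library's methods, only an illustration of the idiom.

# Before the fix: the splat wraps an array argument in another array.
def without_flatten(*categories)
  categories
end

# After the fix: flatten unwraps the nested array, so both call styles match.
def with_flatten(*categories)
  categories.flatten
end

nested          = without_flatten(['Spam', 'Ham']) # => [["Spam", "Ham"]]
flat_from_array = with_flatten(['Spam', 'Ham'])    # => ["Spam", "Ham"]
flat_from_splat = with_flatten('Spam', 'Ham')      # => ["Spam", "Ham"]
```

Without `.flatten`, the whole array arrives as a single element, which matches the reported behavior of an array being treated as one category name.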

Test plan

  • Added test for Bayes array initialization
  • Added test for LogisticRegression array initialization
  • Added test for a single-element array in LogisticRegression (should raise ArgumentError)
  • All 593 tests passing

Fixes #110

Bayes.new and LogisticRegression.new now accept either:
- Splat arguments: Bayes.new('Spam', 'Ham')
- Array argument: Bayes.new(['Spam', 'Ham'])

Both forms are now equivalent, fixing the issue where array
arguments were treated as a single category name.

Fixes #110
@greptile-apps
Contributor

greptile-apps bot commented Dec 29, 2025

Greptile Summary

This PR adds support for array-based category initialization in Classifier::Bayes and Classifier::LogisticRegression.

Changes:

  • Bayes.new(['Spam', 'Ham']) now works equivalently to Bayes.new('Spam', 'Ham')
  • LogisticRegression.new(['Positive', 'Negative']) now works equivalently to LogisticRegression.new('Positive', 'Negative')
  • Added .flatten to category processing in both initializers
  • Updated @rbs type annotations to reflect support for Array[String | Symbol]
  • Added comprehensive test coverage for array initialization
  • Tests include edge case for single-element arrays (correctly raises ArgumentError in LogisticRegression)

Implementation:

  • Bayes: Changed categories.each to categories.flatten.each (line 34)
  • LogisticRegression: Added categories = categories.flatten before validation (line 64)
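
The two changes described above might look roughly like the following sketch. The class names, the error message, and the `to_s.to_sym` normalization are assumptions made for illustration (the real code calls `prepare_category_name`, whose implementation is not shown in this PR); only the placement of `.flatten` and the two-category check come from the description.

```ruby
# Hypothetical stand-ins for Classifier::Bayes and
# Classifier::LogisticRegression; names and details are assumptions.

class SketchBayes
  attr_reader :categories

  def initialize(*categories)
    @categories = {}
    # flatten lets new('Spam', 'Ham') and new(['Spam', 'Ham']) behave the same
    categories.flatten.each do |name|
      # to_s.to_sym stands in for the library's prepare_category_name
      @categories[name.to_s.to_sym] = {}
    end
  end
end

class SketchLogisticRegression
  attr_reader :categories

  def initialize(*categories)
    # flatten BEFORE validating, so a two-element array counts as two categories
    categories = categories.flatten
    # the error message here is hypothetical
    raise ArgumentError, 'at least two categories are required' if categories.size < 2

    @categories = categories.map { |name| name.to_s.to_sym }
  end
end

bayes = SketchBayes.new(['Spam', 'Ham'])
logreg = SketchLogisticRegression.new(['Positive', 'Negative', 'Neutral'])
```

Flattening before the size check matters: if validation ran first, `new(['Positive', 'Negative'])` would be seen as one argument and rejected.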

The fix is minimal, clean, and consistent with Ruby's splat argument handling. Tests confirm functionality works as expected.

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • The changes are minimal and well-tested. Using .flatten is the standard Ruby approach for handling both splat and array arguments. The fix directly addresses the reported issue, maintains backward compatibility, includes comprehensive test coverage, and follows existing code patterns.
  • No files require special attention

Important Files Changed

| Filename | Overview |
|----------|----------|
| lib/classifier/bayes.rb | Added .flatten to handle array category initialization cleanly |
| lib/classifier/logistic_regression.rb | Added categories.flatten to handle array category initialization cleanly |
| test/bayes/bayesian_test.rb | Added test for array initialization that validates the fix works correctly |
| test/logistic_regression/logistic_regression_test.rb | Added tests for array initialization including edge case with single element |

Sequence Diagram

sequenceDiagram
    participant User
    participant Bayes
    participant LogisticRegression
    participant prepare_category_name
    
    User->>Bayes: new(['Spam', 'Ham'])
    Bayes->>Bayes: categories.flatten
    Note over Bayes: [['Spam', 'Ham']] flattens to ['Spam', 'Ham']
    loop each category
        Bayes->>prepare_category_name: 'Spam'.prepare_category_name
        prepare_category_name-->>Bayes: :Spam
        Bayes->>Bayes: @categories[:Spam] = {}
    end
    Bayes-->>User: classifier instance
    
    User->>LogisticRegression: new(['Positive', 'Negative', 'Neutral'])
    LogisticRegression->>LogisticRegression: categories.flatten
    Note over LogisticRegression: [['Positive', 'Negative', 'Neutral']] flattens to ['Positive', 'Negative', 'Neutral']
    LogisticRegression->>LogisticRegression: Check categories.size >= 2
    loop each category
        LogisticRegression->>prepare_category_name: 'Positive'.prepare_category_name
        prepare_category_name-->>LogisticRegression: :Positive
    end
    LogisticRegression-->>User: classifier instance

Add checkmark and X emoji to the Why This Library comparison table
for better visual scanning. Fix Logistic Regression guide URL to
use correct path (logisticregression without hyphen).
cardmagic merged commit 7e1b38e into master on Dec 29, 2025
5 checks passed
Development

Successfully merging this pull request may close these issues.

all classifiers that need a list of categories on initialization should accept array of categories
