Commit 8da21dc
feat(lsi): expose tuning parameters with validation and introspection API (#92)
* feat(lsi): expose tuning parameters with validation and introspection API
The LSI classifier previously used undocumented magic numbers for critical
cutoff parameters with no validation or introspection capabilities. Users
had no guidance on tuning for different corpus sizes.
This change adds:
- Parameter validation for cutoff (must be between 0 and 1 exclusive)
- `singular_values` attr_reader exposing SVD singular values after build
- `singular_value_spectrum` method for analyzing variance distribution
- Documentation with tuning guides for different use cases
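
To make the validation rule concrete, here is a minimal sketch of an exclusive (0, 1) range check. It is not the code from lib/classifier/lsi.rb: the helper name `validate_cutoff!` and the choice of `ArgumentError` are assumptions made for illustration.

```ruby
# Hypothetical helper: the name and the error class are assumptions,
# not taken from the commit itself.
def validate_cutoff!(cutoff)
  # Accept only values strictly inside the open interval (0, 1).
  return cutoff if cutoff.is_a?(Numeric) && cutoff.positive? && cutoff < 1

  raise ArgumentError, "cutoff must be between 0 and 1 (exclusive), got #{cutoff.inspect}"
end

validate_cutoff!(0.75) # returns 0.75
```

Rejecting both endpoints keeps the check in line with the "exclusive" wording above: in this sketch, 0 and 1 both raise.
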
The introspection API enables users to make informed decisions about cutoff
tuning by examining how much variance each semantic dimension captures.
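
As a rough illustration of that kind of analysis, the standalone sketch below computes each dimension's share of the total and a running cumulative share. The function name `spectrum`, the hash keys, and the sample values are all assumptions; the message does not state the actual return shape of `singular_value_spectrum`, nor whether the library normalizes raw or squared singular values (this sketch uses raw values).

```ruby
# Standalone sketch; the hash keys below are assumptions for illustration.
def spectrum(singular_values)
  total = singular_values.sum.to_f
  return [] if total.zero?

  cumulative = 0.0
  singular_values.each_with_index.map do |value, dimension|
    share = value / total
    cumulative += share
    { dimension: dimension, value: value, share: share, cumulative: cumulative }
  end
end

values = [4.0, 3.0, 2.0, 1.0] # made-up singular values, largest first
rows = spectrum(values)

# Smallest number of dimensions whose cumulative share reaches 75%.
# find_index returns nil when no row qualifies, so fall back to all rows.
index = rows.find_index { |row| row[:cumulative] >= 0.75 }
dims_for_threshold = index ? index + 1 : rows.size
```

With these made-up values the threshold is first reached at the third dimension, so `dims_for_threshold` is 3.
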
Fixes #67
* fix(lsi): clamp cutoff index to prevent negative array access
Addresses review feedback:
- Clamp s_cutoff_index to 0 minimum to prevent negative indices with
very small cutoffs (e.g., cutoff=0.01 with size=3 would give -1)
- Fix documentation example to handle nil from find_index
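
The clamp can be sketched as follows. The helper name `cutoff_index` is hypothetical, and the index formula `(size * cutoff).round - 1` is an assumption chosen to match the example above (size=3, cutoff=0.01 gives -1). Only the clamp to 0 is described in this message.

```ruby
# Hypothetical helper; the formula is an assumption consistent with the
# example in the commit message, not a copy of the library code.
def cutoff_index(size, cutoff)
  raw = (size * cutoff).round - 1 # -1 when size * cutoff rounds to 0
  [raw, 0].max                    # clamp so the index is never negative
end

cutoff_index(3, 0.01)  # => 0 (unclamped this would be -1)
cutoff_index(10, 0.75) # => 7
```

The clamp matters because a negative index in Ruby counts from the end of an array, so an unclamped -1 would silently select the last element instead of raising an error.
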
* style: fix RuboCop offenses
- Use cutoff.positive? instead of cutoff > 0
- Parenthesize block param in assert
- Use assert_predicate and assert_operator
- Add empty line before assertion
- Rename dims_for_75 to dims_for_threshold
- Exclude lsi.rb from ClassLength check (inherently complex)
* style: reduce verbose documentation per review feedback
Simplified comments that restated obvious code behavior.
* Update .rubocop.yml
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* Update lib/classifier/lsi.rb
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* Update lib/classifier/lsi.rb
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* Update lib/classifier/lsi.rb
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* fix: remove duplicate lines from merge
---------
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
1 parent 661411d commit 8da21dc
3 files changed, +232 -7 lines changed