(JS) Option to group checkboxes by proximity #183

@athewsey

Description

In real-world forms, checkboxes / selection elements are usually grouped, as in the example below (from the sample document in the Textract try-it-out console):

image

In Textract Key-Value Forms results, these items generally appear as un-grouped K-V pairs like:

  • (Key) VA -> (Value) NOT_SELECTED
  • (Key) Conventional -> (Value) SELECTED
  • ...
  • (Key) Fixed Rate -> (Value) SELECTED
  • ...
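For anyone less familiar with the raw response, here is a minimal sketch of where those pairs come from. It works on plain Textract-style block JSON rather than the TRP classes, and the `blocks` array is a hand-written, hypothetical fragment (the "Name" field and all IDs are made up for illustration). The helper names `relatedBlocks` and `listSelectionFields` are also mine, not part of TRP.

```javascript
// Hypothetical, hand-written fragment shaped like Textract FORMS output.
const blocks = [
  { Id: "k1", BlockType: "KEY_VALUE_SET", EntityTypes: ["KEY"],
    Relationships: [{ Type: "VALUE", Ids: ["v1"] }, { Type: "CHILD", Ids: ["w1"] }] },
  { Id: "v1", BlockType: "KEY_VALUE_SET", EntityTypes: ["VALUE"],
    Relationships: [{ Type: "CHILD", Ids: ["s1"] }] },
  { Id: "w1", BlockType: "WORD", Text: "VA" },
  { Id: "s1", BlockType: "SELECTION_ELEMENT", SelectionStatus: "NOT_SELECTED" },

  { Id: "k2", BlockType: "KEY_VALUE_SET", EntityTypes: ["KEY"],
    Relationships: [{ Type: "VALUE", Ids: ["v2"] }, { Type: "CHILD", Ids: ["w2"] }] },
  { Id: "v2", BlockType: "KEY_VALUE_SET", EntityTypes: ["VALUE"],
    Relationships: [{ Type: "CHILD", Ids: ["s2"] }] },
  { Id: "w2", BlockType: "WORD", Text: "Conventional" },
  { Id: "s2", BlockType: "SELECTION_ELEMENT", SelectionStatus: "SELECTED" },

  // A plain text field (made up), which should be ignored below.
  { Id: "k3", BlockType: "KEY_VALUE_SET", EntityTypes: ["KEY"],
    Relationships: [{ Type: "VALUE", Ids: ["v3"] }, { Type: "CHILD", Ids: ["w3"] }] },
  { Id: "v3", BlockType: "KEY_VALUE_SET", EntityTypes: ["VALUE"],
    Relationships: [{ Type: "CHILD", Ids: ["w4"] }] },
  { Id: "w3", BlockType: "WORD", Text: "Name" },
  { Id: "w4", BlockType: "WORD", Text: "Example" },
];

// Follow one relationship type from a block to the blocks it points at.
function relatedBlocks(block, type, byId) {
  return (block.Relationships || [])
    .filter((rel) => rel.Type === type)
    .flatMap((rel) => rel.Ids.map((id) => byId.get(id)))
    .filter(Boolean);
}

// Return { name, status } for every K-V pair whose value holds a selection element.
function listSelectionFields(allBlocks) {
  const byId = new Map(allBlocks.map((b) => [b.Id, b]));
  const results = [];
  for (const key of allBlocks) {
    const isKey =
      key.BlockType === "KEY_VALUE_SET" && (key.EntityTypes || []).includes("KEY");
    if (!isKey) continue;
    const name = relatedBlocks(key, "CHILD", byId)
      .filter((b) => b.BlockType === "WORD")
      .map((b) => b.Text)
      .join(" ");
    for (const value of relatedBlocks(key, "VALUE", byId)) {
      const selection = relatedBlocks(value, "CHILD", byId).find(
        (b) => b.BlockType === "SELECTION_ELEMENT"
      );
      if (selection) results.push({ name, status: selection.SelectionStatus });
    }
  }
  return results;
}

const selectionFields = listSelectionFields(blocks);
console.log(selectionFields);
```

Note that nothing in these blocks ties "VA" and "Conventional" to each other, which is the gap this issue is about.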

As of today, Textract doesn't group these selection element fields, and doesn't give us any mapping to a predicted overall group label (e.g. Mortgage Applied for: versus Authorization Type:). I received a request from a customer for TRP (JS) to try to help more with this.

Since we don't really do ML within TRP itself, we can't get too fancy here. But I think it should be feasible to provide a way to access and iterate over "selection groups" of form fields whose values are selection elements, using basic proximity heuristics.

Something along the lines of:

for (const group of page.form.iterSelectionGroups({
  // Whatever *optional* heuristic grouping parameters make sense:
  vDistTol: 0.6,
  hDistTol: 2.4,
})) {
  // Can loop through the Form Fields:
  group.listFields();
  // Maybe some other convenience methods?
  group.listSelectedNames(); // e.g. ["Conventional"]
  group.listUnselectedNames(); // e.g. ["VA", "Other (explain):", "FHA", "USDA/Rural Housing Service"]

  // This will *not* be feasible:
  // group.name == "Mortgage Applied For:"
}
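To make the heuristic a bit more concrete, here is a rough sketch of one way the grouping could work. It is not an existing TRP API. It assumes each field is a plain object with a `box` in normalized page coordinates, and that `vDistTol` / `hDistTol` are multiples of the median field height (my interpretation, the proposal above doesn't pin down units). The function name `groupSelectionFields` and all coordinates are made up for illustration. Fields whose boxes are within tolerance of each other are linked, and connected fields form a group.

```javascript
// Hypothetical input: field names from the example above, made-up coordinates.
const fields = [
  { name: "VA", selected: false, box: { left: 0.10, top: 0.20, width: 0.05, height: 0.02 } },
  { name: "Conventional", selected: true, box: { left: 0.18, top: 0.20, width: 0.12, height: 0.02 } },
  { name: "FHA", selected: false, box: { left: 0.10, top: 0.225, width: 0.06, height: 0.02 } },
  { name: "Fixed Rate", selected: true, box: { left: 0.10, top: 0.40, width: 0.10, height: 0.02 } },
];

function groupSelectionFields(items, { vDistTol = 0.6, hDistTol = 2.4 } = {}) {
  if (items.length === 0) return [];

  // Assumption: tolerances are expressed in units of the median field height.
  const heights = items.map((f) => f.box.height).sort((a, b) => a - b);
  const unit = heights[Math.floor(heights.length / 2)];

  // Gap between two 1-D intervals (0 if they overlap).
  const gap = (aStart, aEnd, bStart, bEnd) =>
    Math.max(0, Math.max(aStart, bStart) - Math.min(aEnd, bEnd));

  const near = (a, b) =>
    gap(a.box.top, a.box.top + a.box.height, b.box.top, b.box.top + b.box.height) <=
      vDistTol * unit &&
    gap(a.box.left, a.box.left + a.box.width, b.box.left, b.box.left + b.box.width) <=
      hDistTol * unit;

  // Union-find over field indexes: link every pair within tolerance.
  const parent = items.map((_, i) => i);
  const find = (i) => {
    while (parent[i] !== i) {
      parent[i] = parent[parent[i]]; // path halving
      i = parent[i];
    }
    return i;
  };
  for (let i = 0; i < items.length; i++) {
    for (let j = i + 1; j < items.length; j++) {
      if (near(items[i], items[j])) parent[find(i)] = find(j);
    }
  }

  // Collect fields by root, preserving input order within each group.
  const groups = new Map();
  items.forEach((f, i) => {
    const root = find(i);
    if (!groups.has(root)) groups.set(root, []);
    groups.get(root).push(f);
  });
  return [...groups.values()];
}

const groups = groupSelectionFields(fields);
for (const group of groups) {
  console.log(
    group.map((f) => f.name),
    "selected:",
    group.filter((f) => f.selected).map((f) => f.name)
  );
}
```

The pairwise comparison is O(n²), which seems acceptable for the number of checkboxes on a typical page. Linking by connectivity means a long row or column of checkboxes still ends up in one group even when its first and last items are far apart, at the cost of occasionally merging two groups that sit close together.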

Tagging the label/name of the group wouldn't really be possible without an ML model, which I don't think we're looking to introduce in TRP at this time. While I think we could get okay performance on grouping the checkboxes from heuristics alone, identifying the label would be much less likely to work well.

Interested to hear feedback from others on what kind of API & accessors you'd find most helpful for this feature.

Labels: enhancement (New feature or request), javascript (Relates to the JavaScript/TypeScript version of TRP)