
Abandon lib in favour of training neural modules #3

Description

@cassiotbatista

It's past time to give up on these rule-based taggers and train some models based on NMT architectures or something similar.

  • The Java lib is more than 10 years old. The papers on which these systems are based, OTOH, are almost 20 years old
  • The G2P phone set is not really friendly to anyone and some phonemes are mysterious, so IMHO the whole thing should be based on IPA
  • Multitask prediction of phones, syllables and syllphones would be nice
  • Data from Dicio and Priberam should do. The seed lexicon is not discarded
  • Good exercise on ML fundamentals :)
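
To make the multitask idea a bit more concrete, here is a rough sketch of how one training example could carry all three targets at once. It assumes "syllphones" means IPA phones grouped per syllable, which is my reading and not something defined here. The names `MultitaskExample` and `build_example` are hypothetical, and the "casa" entry is a hand-written illustration, not data taken from Dicio, Priberam or the seed lexicon.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MultitaskExample:
    """One word with a target per task (hypothetical layout, not the lib's format)."""
    word: str
    phones: List[str]            # flat IPA phone sequence (G2P target)
    syllables: List[str]         # orthographic syllables (syllabification target)
    syllphones: List[List[str]]  # IPA phones grouped per syllable (assumed meaning)


def build_example(syllables: List[str], syllphones: List[List[str]]) -> MultitaskExample:
    """Derive the word and the flat phone sequence from the syllable-level annotation."""
    if len(syllables) != len(syllphones):
        raise ValueError("expected one phone group per orthographic syllable")
    word = "".join(syllables)
    # Flatten the per-syllable groups to get the plain G2P target.
    phones = [phone for group in syllphones for phone in group]
    return MultitaskExample(word, phones, syllables, syllphones)


# Illustrative entry only: "casa" split as ca-sa, transcribed with IPA symbols.
example = build_example(["ca", "sa"], [["k", "a"], ["z", "ɐ"]])
print(example)
```

Storing only the syllable-level annotation and deriving the rest keeps the three targets consistent with each other, so a single lexicon entry could feed all three heads of a multitask model.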
