Help deciding parameter levels

Hi,
I am interested in producing a clustering similar to the one you did with arXiv. I'm working with a set of web pages from Common Crawl (~6M URLs), which I have reduced to embeddings using [this](https://arxiv.org/pdf/2402.03216). How did you decide on the values of `node_embedding_dim`, `neighbor_scale`, and `n_neighbors` for the arXiv project? Or, at least, what are reasonable ranges for them, so that I can search in those areas? Currently I end up with ~65% of points labeled as noise (in no cluster), even when using `noise_level=0`.
Thanks
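
For context, the search over those areas could look roughly like the sketch below. It is only an illustration under stated assumptions: `cluster_fn` is a hypothetical wrapper around whatever clusterer is being tuned, noise points are assumed to be labeled `-1`, and the grid values are placeholders, not recommended ranges.

```python
from itertools import product

import numpy as np


def noise_fraction(labels):
    """Fraction of points labeled -1 (assumed convention for noise)."""
    labels = np.asarray(labels)
    return float(np.mean(labels == -1))


def sweep(embeddings, cluster_fn, grid):
    """Run cluster_fn on every parameter combination in grid.

    cluster_fn is a hypothetical callable taking (embeddings, **params)
    and returning one integer label per point. Returns a list of
    (noise_fraction, params) tuples, lowest noise fraction first.
    """
    names = list(grid)
    results = []
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        labels = cluster_fn(embeddings, **params)
        results.append((noise_fraction(labels), params))
    results.sort(key=lambda item: item[0])
    return results


# Placeholder values only; the reasonable ranges are exactly what is being asked.
grid = {
    "node_embedding_dim": [8, 16],
    "neighbor_scale": [1.0, 2.0],
    "n_neighbors": [15, 30],
}
```

With 6M points it would probably make sense to run a sweep like this on a random subsample first and only re-run the most promising settings on the full set.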

Help deciding parameter levels #17

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Help deciding parameter levels #17

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions