refactor NN graph building (included in #43)#21
refactor NN graph building (included in #43)#21naspert wants to merge 33 commits intoepfl-lts2:masterfrom
Conversation
mdeff
left a comment
There was a problem hiding this comment.
Thanks for your contribution @naspert :)
This refactoring was much needed and I like it. Note that we are still in early development and so we don't care too much about breaking the API (e.g. for the distances' names).
Some general concerns:
- Shouldn't
'scipy-pdist'be the default? For sure the default should be from scipy (so users won't have to install any additional library), but I wonder how good the scipy KDTree is. I tried the LSHForest once and it wasn't great... - If you have some intuitions about which backend to use when, it would be nice to share it in the documentation. ;)
- Travis is failing, please fix it.
- For now the tests are simply running the code, without any check. (As it was before I know...) It would be great to check the approximation error in the tests, with
'scipy-pdist'as the ground truth. - While refactoring, I would use the opportunity to rename
NNtypetotype,dist_typetometric,symmetrize_typetosymmetrizationfor consistency. - The
knnandradiusbranches should certainly share more code. At least the computation of the Gaussian kernel should be factored out.
Thanks again, and feel free to disagree with the above. ;)
pygsp/graphs/nngraphs/nngraph.py
Outdated
|
|
||
| self._nn_functions = { | ||
| 'knn': { | ||
| 'scipy-kdtree':_knn_sp_kdtree, |
There was a problem hiding this comment.
you miss a space after the colon
- do not use underscores in functions args
…nce -> fix fix radius nn graph for spdist
try to install via before_install
# Conflicts: # pygsp/graphs/nngraphs/nngraph.py # pygsp/tests/test_graphs.py
…efactor # Conflicts: # README.rst
|
Thanks for merging the knn and radius matrix build 👍 |
|
This might be an interesting alternative to FLANN: |
|
Thanks @bricaud :) There's actually a whole lot of libraries, see e.g. this benchmark. We currently support the following backends (thanks to @naspert work):
@bricaud: which kNN libraries do you have experience with? Would you recommend any? |
|
cKDTree is the same as kdtree, with a C backend. It generates the same results, much faster so no good reason to prefer kdtree over ckdtree. If you are looking into having a dataset with many dimensions, nmslib will be faster. |
No description provided.