It would be helpful, especially when caching datasets, to have a corresponding token for each dataset in a collection. These tokens could be auto-generated with `dask.base.tokenize`.
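
As a rough sketch of what this might look like: the snippet below builds a name-to-token mapping with `dask.base.tokenize`. The `collection` dict and the `collection_tokens` helper are hypothetical stand-ins for illustration, not part of any existing API; a real collection would hold actual dataset objects rather than plain lists.

```python
from dask.base import tokenize

# Hypothetical stand-in for a dataset collection: dataset name -> data.
collection = {
    "train": [1, 2, 3],
    "test": [4, 5, 6],
}


def collection_tokens(datasets):
    """Return a mapping of dataset name -> deterministic token string."""
    return {name: tokenize(data) for name, data in datasets.items()}


tokens = collection_tokens(collection)

# Tokens could then serve as cache keys: the same content yields the same
# token, so an unchanged dataset can be looked up instead of recomputed.
for name, token in tokens.items():
    print(name, token)
```

Because the token is derived from the dataset's content, a changed dataset would get a new token and naturally miss the cache.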