Implement PFOCR-based results annotation in BTE #24

Pending the benchmarking study on PFOCR as a resource for enrichment analysis, we will optimize a procedure for calculating an enrichment score as part of a BTE query, implementing an API (or Python library) to perform the calculation.

#### Background
Andrew Su: "We want to use PFOCR as a bioentity set against which we do enrichment analysis. So, for example, someone does a BTE query that returns 1000 paths -- we could rank those paths based on how enriched the entities in that path are in all the PFOCR sets. As Kevin noted, that would be similar to the Normalized Google Distance functionality (demoed in [this notebook](https://github.com/biothings/biothings_explorer/blob/master/jupyter%20notebooks/drug_response_predict.ipynb)). To implement that functionality, we'd need an API that would compute an enrichment score based on multiple inputs. This would be similar in operation to the mrcoc API (which provides the NGD results) which can take two inputs (e.g., https://biothings.ncats.io/mrcoc/query?q=combo:C0008203-C0969679). But since a PFOCR enrichment tool would have to take an arbitrary number of inputs, we'd have to dynamically compute enrichment (rather than precomputing and indexing all pairwise scores as we did in mrcoc)."
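
To make the "dynamically compute enrichment" idea concrete, here is a minimal sketch of a per-query enrichment calculation. It assumes a one-sided hypergeometric test, which is a common choice for over-representation analysis but is not confirmed anywhere in this issue as the method BTE will use. The function names, the toy figure IDs, and the universe size are all hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, K, n, N):
    """Return P(X >= k) for a hypergeometric draw.

    N: size of the entity universe
    K: number of entities in the reference set (e.g. one PFOCR figure)
    n: number of query entities (assumed to all lie in the universe)
    k: observed overlap between query and reference set
    """
    upper = min(K, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def enrich(query, gene_sets, universe_size):
    """Score every reference set that overlaps the query, best first.

    gene_sets maps a set name to a Python set of entity IDs. This is a
    stand-in for PFOCR content, not the real PFOCR data model.
    """
    query = set(query)
    results = []
    for name, members in gene_sets.items():
        overlap = len(query & members)
        if overlap == 0:
            continue  # no shared entities, nothing to score
        p = hypergeom_pvalue(overlap, len(members), len(query), universe_size)
        results.append((name, overlap, p))
    # Smaller p-value means stronger enrichment
    return sorted(results, key=lambda r: r[2])


# Hypothetical toy data: two made-up "figures" and an arbitrary universe size
toy_sets = {
    "fig_A": {"TP53", "MDM2", "CDKN1A"},
    "fig_B": {"EGFR", "KRAS", "BRAF", "MAP2K1"},
}
ranked = enrich(["TP53", "MDM2", "EGFR"], toy_sets, universe_size=100)
```

Because the query can be any size, nothing here is precomputed per pair as in mrcoc; the cost is one pass over the reference sets per query. A real implementation would also need multiple-testing correction and an agreed definition of the entity universe.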

Alex: "We are in the middle of a benchmarking study to assess PFOCR content as a resource for enrichment analysis. We are comparing with GO and WikiPathways, and we’ve defined a few benchmarking tests. This study should inform any enrichment analysis use cases. We should have enough results by our mid-Jan meeting to make a clear, detailed plan. An example detail to consider: PFOCR is larger than the Biological Process branch of GO, so standard enrichment algos take a while to run; too long for dynamic queries. So, part of our study is to identify subsets and utilize clustering to make it more efficient."
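
One generic way to cut per-query cost, shown here only as an illustration, is to pre-filter reference sets with an inverted index so that the statistical test runs only on sets sharing enough entities with the query. This is not the subset or clustering approach from the benchmarking study. The function names, the `min_overlap` threshold, and the toy data below are hypothetical.

```python
from collections import defaultdict


def build_index(gene_sets):
    """Map each entity ID to the names of the reference sets containing it."""
    index = defaultdict(set)
    for name, members in gene_sets.items():
        for member in members:
            index[member].add(name)
    return index


def candidate_sets(query, index, min_overlap=2):
    """Return {set name: overlap count} for sets sharing >= min_overlap entities."""
    counts = defaultdict(int)
    for entity in set(query):
        # .get avoids creating empty entries for unknown entities
        for name in index.get(entity, ()):
            counts[name] += 1
    return {name: c for name, c in counts.items() if c >= min_overlap}


# Hypothetical toy data, not real PFOCR content
toy_sets = {
    "fig_A": {"TP53", "MDM2", "CDKN1A"},
    "fig_B": {"EGFR", "KRAS", "BRAF", "MAP2K1"},
}
toy_index = build_index(toy_sets)
candidates = candidate_sets(["TP53", "MDM2", "EGFR"], toy_index)
```

The index is built once, so each query touches only the sets that contain at least one query entity rather than the whole collection. The threshold trades recall for speed and would need to be checked against the benchmarking results.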

