Changing the DUG Data Model #393
Merged
83 commits
1b5a3c5
Changing DugElement, DugConcept, adding DugVariable
4feeb97
Changing the HEAL parser
d8559cd
Changes to crawler
7650e93
Adding study data type
5d679cf
Testing parser HEAL parser
d7270c1
ENH: Changing indexing, annotator
66de6be
TEST: Fixing loader test
988aa28
Making indexing work on one file
d8b34b7
Correcting a few errors and cleanup
2319c7c
Make indices configurable through .env
1f1ae76
Merge branch 'DugModel2.0' into data-model-update
6b7fd8a
Merge DugDataModel2.0
72415de
Updating indices to a dict
3f76749
Cleaning print statements
4de2efd
Adjusting tests to code changes
dd08019
FEAT: Adding a parser for HEAL studies to get data from MDS
687bf0b
Changes for studies annotation and index
00b2a50
Added /variables and /concepts endpoints that use new indexes
vladimir2217 99c5f30
Adding DugSection element
2058131
Merge pull request #400 from helxplatform/data-model-update-api
hina-shah 565eec8
Updating HEAL Parser to get DugSections
6218a68
Merge branch 'data-model-update' of github.com:helxplatform/dug into …
6dbf8fb
Adding Sections Index to the pipeline
8260af8
CLEANUP
5056c82
Fixed missing new model fields
vladimir2217 715bde9
Changing search to use index dictionary
811a8a8
Added new Studues API
vladimir2217 ca1a6e7
added import
vladimir2217 9510344
Added comments to new APIs
vladimir2217 3440ef0
Added cde endpoint. Added study_sources endpoint
vladimir2217 f6de20c
Added SearchQuery parameter. Changed API to handle post. Added get_va…
vladimir2217 8e39beb
Merge pull request #401 from helxplatform/data-model-update-api
vladimir2217 2a4f93c
Merge pull request #402 from helxplatform/data-model-update-studies-api
YaphetKG ffab250
Standardiszing response types, some minor edits to search
YaphetKG 713bd80
Standardiszing response types, some minor edits to search
YaphetKG 752ea78
Merge pull request #404 from helxplatform/openapi-docs
hina-shah 6371653
ENH: Updating the schema for keyword data types
b511400
ENH: Changing request/response types for Sections/CDEs
8956abe
Merge pull request #405 from helxplatform/add-section-response
hina-shah 5cef684
Studies API reuse variable ES query
vladimir2217 d4f7bf6
updated studies API search
vladimir2217 3ad2467
fix for cde endpoint
YaphetKG d8ccae1
Merge pull request #407 from helxplatform/patch-cde-api
vladimir2217 04ef64d
Merge pull request #406 from helxplatform/data-model-update-es-query
YaphetKG 62e9359
Adding HEAL DDM2 parser, and its test
50a97ef
Merge branch 'data-model-update' of github.com:helxplatform/dug into …
f0d3ee1
Adding HEAL DDM2 parser, and its test
e7c7090
First stab at a JSON Schema export for the Dug Data Model.
gaurav 8754448
Fixed overall JSON schema.
gaurav a402da2
Documented that you need a PYTHONPATH=src to run this.
gaurav 23bde0b
response request model revamps
YaphetKG 97aaf78
Updating Dug Data Model for efficient import
330b2f8
Merge pull request #409 from helxplatform/data-model-update-9-2
hina-shah d6efdac
pushing search edits
YaphetKG fcf7a1d
add identfiers
YaphetKG b8f200d
Merge pull request #410 from helxplatform/data-model-update-search-re…
hina-shah 6b6073a
Merge pull request #408 from helxplatform/dug-data-model-json-schema
hina-shah 5ccaa75
ENH: Change input element type to program name
e557769
Merge pull request #411 from helxplatform/add-program-name
YaphetKG d631dc6
revert identifier changes
ae7e840
Merging with remote
cefa880
add minimum should to make sure search returns relevant results
YaphetKG 8ca118f
fixing api endpoints and some idnexing bug
YaphetKG 53b172a
Ignoring elements that don't have an id
6eaafc1
Adding parents to concepts
3e7242d
Removing is_standardized to match search API
13ccba6
Merge branch 'data-model-update' into fix-search-filter
hina-shah 4617328
add conditions for query being present or not. make sure that empty s…
YaphetKG 0e34ea0
Documentation and cleanup
c716ed7
Merge pull request #412 from helxplatform/fix-search-filter
hina-shah 9468c36
remove filter
YaphetKG 012c839
Merge pull request #414 from helxplatform/remove-concept-filter
hina-shah 61df653
Update Dockerfile and Makefile
123cf45
Put back requirements for descriptions for concepts
f88f17a
Updating programs to enable endpoints
cd5ae78
Update pydantic, and fix documentation page
5a60861
Adding concept URLs to the new data model
6c7aded
CLEANUP andn FIX description population
de2aab9
CLEANUP, TESTS and BUG: Correcting index name retrieval
dbfd8f7
Merge branch 'data-model-update' of github.com:helxplatform/dug into …
e4a7b24
FIX: return a string purl and not none
8ab011a
FIX index passing
60d0709
Correcting tests
`bin/export_ddm_as_json_schema.py` (new file, `@@ -0,0 +1,48 @@`)

    #!/usr/bin/env python
    #
    # export_ddm_as_json_schema.py - Export Dug Data Model as JSON Schema
    #
    # SYNOPSIS
    #   PYTHONPATH=src python bin/export_ddm_as_json_schema.py
    #

    import click
    import json
    import logging

    from dug.core.parsers._base import DugStudy, DugSection, DugVariable

    logging.basicConfig(level=logging.INFO)

    @click.command()
    def export_ddm_as_json_schema():
        """
        Print the Dug Data Model as a JSON Schema document on standard output.
        :return: None
        """
        logging.info("Exporting Dug Data Model as JSON Schema")

        json_schema = {
            '$schema': 'https://json-schema.org/draft/2020-12/schema',
            # This is what Pydantic supports: https://docs.pydantic.dev/latest/api/json_schema/#pydantic.json_schema.GenerateJsonSchema
            'definitions': {
                'DugSection': DugSection.model_json_schema(),
                'DugVariable': DugVariable.model_json_schema(),
                'DugStudy': DugStudy.model_json_schema()
            },
            # We want to validate a list of heterogeneous objects: each item in the list may be any of the Dug objects above.
            'type': 'array',
            'items': {
                'oneOf': [
                    {'$ref': '#/definitions/DugSection'},
                    {'$ref': '#/definitions/DugVariable'},
                    {'$ref': '#/definitions/DugStudy'}
                ]
            }
        }

        print(json.dumps(json_schema, indent=2))


    if __name__ == '__main__':
        export_ddm_as_json_schema()
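
One thing worth checking in a combined schema like the one above: Pydantic's `model_json_schema()` places nested models under a top-level `$defs` key and refers to them as `#/$defs/<Name>`. When such a schema is embedded under `definitions`, those inner pointers are still resolved from the document root. The sketch below is standard library only and is not part of the PR. It walks a schema and lists local `$ref` pointers that do not resolve. The `inner` and `combined` dictionaries and the `Tag` definition are hypothetical stand-ins, not the real Dug models, and whether the real models contain nested definitions is an assumption to verify.

```python
from typing import Any, Iterator, List


def iter_refs(node: Any) -> Iterator[str]:
    """Yield every '$ref' string found anywhere inside a schema."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "$ref" and isinstance(value, str):
                yield value
            else:
                yield from iter_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_refs(item)


def resolves(root: dict, ref: str) -> bool:
    """Return True if a local JSON Pointer ref ('#/a/b') exists under root."""
    if not ref.startswith("#/"):
        return False  # this sketch only handles local pointers
    node: Any = root
    for part in ref[2:].split("/"):
        # Undo JSON Pointer escaping (~1 is '/', ~0 is '~').
        part = part.replace("~1", "/").replace("~0", "~")
        if isinstance(node, dict) and part in node:
            node = node[part]
        else:
            return False
    return True


def dangling_refs(schema: dict) -> List[str]:
    """List the local refs in a schema that point at nothing."""
    return sorted({ref for ref in iter_refs(schema) if not resolves(schema, ref)})


# Hypothetical stand-in for a model schema that contains a nested model.
inner = {
    "$defs": {"Tag": {"type": "object"}},
    "type": "object",
    "properties": {"tags": {"type": "array", "items": {"$ref": "#/$defs/Tag"}}},
}

# Same shape as the export script: the model schema nested under 'definitions'.
combined = {
    "definitions": {"DugVariable": inner},
    "type": "array",
    "items": {"oneOf": [{"$ref": "#/definitions/DugVariable"}]},
}

print(dangling_refs(inner))     # []
print(dangling_refs(combined))  # ['#/$defs/Tag']
```

If a check like this reports dangling pointers on the real output, one option is to hoist each model's `$defs` entries to the root of the combined document.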

    @@ -6,4 +6,5 @@ markers =
         api: mark a test as an api test
         cli: mark a test as a cli test
     testpaths =
    -    tests
    +    tests
    +pythonpath = src

    @@ -12,7 +12,7 @@ MarkupSafe
     ormar
     mistune
     pluggy
    -pydantic==2.9.2
    +pydantic==2.12.3
     pyrsistent
     pytest
     pytest-asyncio

Empty file.

New file (`@@ -0,0 +1,50 @@`)

    from pydantic import BaseModel, field_validator
    from typing import List, Optional, Any

    class GetFromIndex(BaseModel):
        size: int = 0

    class SearchConceptQuery(BaseModel):
        query: str
        offset: int = 0
        size: int = 20
        concept_types: Optional[list] = None  # default is None, so the type must allow it

    class SearchVariablesQuery(BaseModel):
        query: str
        concept: str = ""
        offset: int = 0
        size: int = 1000

    class FilterGrouped(BaseModel):
        key: str
        value: List[Any]

    class SearchVariablesQueryFiltered(SearchVariablesQuery):
        filter: List[FilterGrouped] = []

    class SearchKgQuery(BaseModel):
        query: str
        unique_id: str
        index: str = "kg_index"
        size: int = 100

    class SearchElementQuery(BaseModel):
        query: Optional[str] = None  # default is None, so the type must allow it
        parent_ids: Optional[List] = None
        element_ids: Optional[List] = None
        concept: Optional[str] = None
        size: Optional[int] = 100
        offset: Optional[int] = 0

        # Runs before type validation: strip "" and None entries from the ID lists.
        @field_validator("parent_ids", "element_ids", mode="before")
        @classmethod
        def drop_empty_strings(cls, v):
            if v is None:
                return v
            return [item for item in v if item not in ("", None)]

    class VariableIds(BaseModel):
        """
        List of variable IDs
        """
        ids: Optional[List[str]] = []
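
To make the effect of the `drop_empty_strings` validator concrete, here is its body lifted into a plain function so it can be run without Pydantic. This is an illustrative copy, not code from the PR. In the model the function runs in `mode="before"`, so it receives the raw input value before any type coercion.

```python
from typing import Any, List, Optional


def drop_empty_strings(v: Optional[Any]) -> Optional[List[Any]]:
    """Plain-function copy of the validator body shown in the diff."""
    if v is None:
        return v
    return [item for item in v if item not in ("", None)]


print(drop_empty_strings(None))             # None
print(drop_empty_strings(["", "a", None]))  # ['a']
print(drop_empty_strings([""]))             # []
# Caveat: a bare string is iterable, so it is split into characters.
print(drop_empty_strings("ab"))             # ['a', 'b']
```

The last call shows a side effect of validating before type checks: a client that sends a single string instead of a list would end up with a list of characters. If that matters, a guard such as `isinstance(v, str)` could be added at the top of the validator.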

New file (`@@ -0,0 +1,69 @@`)

    from dug.core.parsers._base import *  # DugConcept, DugVariable, DugStudy, DugSection
    from pydantic import BaseModel, Field, model_serializer
    from typing import Any, List, Optional


    class ElasticResultMetaData(BaseModel):
        total_count: int
        offset: int
        size: int


    class ElasticDugElementResult(BaseModel):
        # Class for all entities from elastic search, we are going to have score... optionally explanation
        score: float = Field(default=999)
        explanation: dict = Field(default_factory=dict)
        # we are going to ignore concepts...
        concepts: None = Field(default=None, exclude=True)


    class DugAPIResponse(BaseModel):
        results: List[ElasticDugElementResult]
        metadata: Optional[ElasticResultMetaData] = Field(default_factory=dict)


    class ConceptResponse(ElasticDugElementResult, DugConcept):
        identifiers: List[Any]
        concepts: None = Field(default=None, exclude=True)


    class ConceptsAPIResponse(BaseModel):
        metadata: ElasticResultMetaData
        results: List[ConceptResponse]
        concept_types: dict = Field(default_factory=dict)  # was default="", which is not a dict


    class VariableResponse(ElasticDugElementResult, DugVariable):
        @model_serializer
        def serialize(self):
            response = self.get_response_dict()
            return response


    class VariablesAPIResponse(DugAPIResponse):
        results: List[VariableResponse]


    class StudyResponse(ElasticDugElementResult, DugStudy):
        @model_serializer
        def serialize(self):
            response = self.get_response_dict()
            response.pop('abstract', None)  # tolerate a missing key instead of raising KeyError
            return response


    class StudyAPIResponse(DugAPIResponse):
        results: List[StudyResponse]


    class SectionResponse(ElasticDugElementResult, DugSection):
        @model_serializer
        def serialize(self):
            response = self.get_response_dict()
            return response


    class SectionAPIResponse(DugAPIResponse):
        results: List[SectionResponse]
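
The three fields of `ElasticResultMetaData` look like paging information, matching the `offset` and `size` parameters on the request models. Assuming that reading is correct (`total_count` is the total number of hits, `offset` is the zero-based index of the first returned hit, and `size` is the page size), a client could work out the next page as sketched below. The `ResultMetaData` dataclass and the `next_offset` helper are hypothetical stand-ins and not part of Dug.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResultMetaData:
    """Stand-in with the same three fields as ElasticResultMetaData."""
    total_count: int
    offset: int
    size: int


def next_offset(meta: ResultMetaData) -> Optional[int]:
    """Return the offset of the next page, or None when there is none."""
    if meta.size <= 0:
        return None  # a non-positive page size cannot advance
    candidate = meta.offset + meta.size
    return candidate if candidate < meta.total_count else None


print(next_offset(ResultMetaData(total_count=45, offset=0, size=20)))   # 20
print(next_offset(ResultMetaData(total_count=45, offset=40, size=20)))  # None
```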