Implement HG-DT Visual Interpretation & Causal Dashboard #8
- Added `app.py` Streamlit dashboard with tabs for Specification, Genome Tracks, 3D Organization, Protein Structure, Trajectory Animation, and Mechanistic Attribution.
- Created `src/hg_dt/viz/tracks_plotter.py` for 1D track comparison.
- Created `src/hg_dt/viz/hic_plotter.py` for 3D contact map comparison.
- Created `src/hg_dt/viz/protein_viz.py` for 3D protein structure visualization using py3Dmol/stmol.
- Created `src/hg_dt/analyze/attribution.py` for generating mechanistic insights.
- Addressed follow-up to support multiple gene selections and temporal visualization of gene accessibility.

Co-authored-by: AkeBoss-tech <69588353+AkeBoss-tech@users.noreply.github.com>
Pull request overview
Adds an HG-DT “Visual Interpretation & Causal Dashboard” Streamlit UI plus supporting visualization/attribution helpers to explore ref-vs-mutant deltas across 1D tracks, Hi-C contact maps, protein structure, and a simple mechanistic summary.
Changes:
- Introduces a new Streamlit dashboard entrypoint (`app.py`) with multi-tab views and mock-data generators.
- Adds lightweight matplotlib-based plotting utilities for 1D tracks and Hi-C delta maps.
- Adds a protein 3D viewer component and a mechanistic attribution text generator.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
| `app.py` | New Streamlit dashboard wiring together mock data, plots, and attribution output. |
| `src/hg_dt/viz/tracks_plotter.py` | Matplotlib helper to render ref/mut/Δ genome tracks. |
| `src/hg_dt/viz/hic_plotter.py` | Matplotlib helper to render ref/mut/Δ Hi-C heatmaps (upper-triangular). |
| `src/hg_dt/viz/protein_viz.py` | Streamlit component for 3D protein visualization/comparison using py3Dmol/stmol. |
| `src/hg_dt/analyze/attribution.py` | Mechanistic insight string builder from delta summary stats. |

    @@ -0,0 +1,39 @@
    import os

`os` is imported but never used in this module; consider removing it to avoid lint noise and keep dependencies minimal.

Suggested change:

    -import os


    # Mut track
    axes[1].fill_between(range(len(mut_track)), mut_track, color="orange", alpha=0.7)
    axes[1].set_ylabel("Mutant")

    # Delta track
    delta = mut_track - ref_track
    axes[2].fill_between(range(len(delta)), delta, where=(delta >= 0), color="red", alpha=0.7, label="Gain")
    axes[2].fill_between(range(len(delta)), delta, where=(delta < 0), color="green", alpha=0.7, label="Loss")

`delta = mut_track - ref_track` assumes both inputs are 1D arrays of the same length; if the shapes differ, this will raise a broadcasting error (or silently misbehave if one is length-1). Add explicit validation (`ndim == 1`, same length) and raise a clear `ValueError` early.

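One way the requested check could look is sketched below. The helper name `validate_tracks` is hypothetical and not part of the PR; the sketch only assumes that both tracks can be converted to NumPy arrays.

```python
import numpy as np


def validate_tracks(ref_track, mut_track):
    """Return both tracks as 1D float arrays, or raise ValueError.

    Hypothetical helper (not in the PR): it rejects non-1D input and
    length mismatches before any subtraction can broadcast silently.
    """
    ref = np.asarray(ref_track, dtype=float)
    mut = np.asarray(mut_track, dtype=float)
    if ref.ndim != 1 or mut.ndim != 1:
        raise ValueError(
            f"tracks must be 1D, got ndim={ref.ndim} (ref) and ndim={mut.ndim} (mut)"
        )
    if ref.shape != mut.shape:
        raise ValueError(
            f"track lengths differ: {ref.shape[0]} (ref) vs {mut.shape[0]} (mut)"
        )
    return ref, mut


# Usage: validate first, then compute the delta as the plotter does.
ref, mut = validate_tracks([0.1, 0.5, 0.9], [0.2, 0.4, 0.9])
delta = mut - ref
```

Calling such a helper at the top of the plotting function would turn a length-1 input, which NumPy would otherwise broadcast without complaint, into an explicit error.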

    import os
    import matplotlib.pyplot as plt
    import numpy as np
    from matplotlib.colors import LinearSegmentedColormap

`os` and `LinearSegmentedColormap` are imported but unused in this file; please remove them to keep the module clean and avoid unused-import warnings.

Suggested change:

    -import os
    -import matplotlib.pyplot as plt
    -import numpy as np
    -from matplotlib.colors import LinearSegmentedColormap
    +import matplotlib.pyplot as plt
    +import numpy as np


    # We rotate the matrix to make it triangular for standard Hi-C view.
    # To keep it simple, we just plot the 2D matrix directly but use upper triangle.
    ref_tri = np.triu(ref_map)
    mut_tri = np.triu(mut_map)
    delta_tri = mut_tri - ref_tri

`np.triu(ref_map)`/`np.triu(mut_map)` and the subsequent subtraction assume both maps are 2D and the same shape (and typically square for Hi-C). Add input validation (`ndim == 2`, same shape, optionally square) and raise a clear error if not.

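A minimal sketch of that validation follows. `validate_hic_maps` and its `require_square` flag are hypothetical names chosen for illustration; the PR does not define them.

```python
import numpy as np


def validate_hic_maps(ref_map, mut_map, require_square=True):
    """Return both contact maps as 2D float arrays, or raise ValueError.

    Hypothetical helper (not in the PR). `require_square` is an assumed
    option for the "optionally square" part of the review comment.
    """
    ref = np.asarray(ref_map, dtype=float)
    mut = np.asarray(mut_map, dtype=float)
    if ref.ndim != 2 or mut.ndim != 2:
        raise ValueError(
            f"contact maps must be 2D, got ndim={ref.ndim} (ref) and ndim={mut.ndim} (mut)"
        )
    if ref.shape != mut.shape:
        raise ValueError(f"contact map shapes differ: {ref.shape} vs {mut.shape}")
    if require_square and ref.shape[0] != ref.shape[1]:
        raise ValueError(f"contact maps must be square, got shape {ref.shape}")
    return ref, mut


# Usage: validate, then take the upper triangles as the plotter does.
ref, mut = validate_hic_maps(np.eye(4), np.eye(4) * 2.0)
delta_tri = np.triu(mut) - np.triu(ref)
```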

    plt.colorbar(im1, ax=axes[1], fraction=0.046, pad=0.04)

    vmax = np.max(np.abs(delta_tri))
    im2 = axes[2].imshow(delta_tri, cmap=cmap_delta, vmin=-vmax, vmax=vmax, interpolation='nearest')

`vmax = np.max(np.abs(delta_tri))` can be 0 when the maps are identical, which makes `vmin=-vmax, vmax=vmax` invalid (matplotlib warns about identical limits and the colormap scaling becomes meaningless). Guard for `vmax == 0` (e.g., skip setting `vmin`/`vmax` or use a small epsilon).

Suggested change:

    -im2 = axes[2].imshow(delta_tri, cmap=cmap_delta, vmin=-vmax, vmax=vmax, interpolation='nearest')
    +if vmax > 0:
    +    im2 = axes[2].imshow(delta_tri, cmap=cmap_delta, vmin=-vmax, vmax=vmax, interpolation='nearest')
    +else:
    +    im2 = axes[2].imshow(delta_tri, cmap=cmap_delta, interpolation='nearest')


    with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp:
        track_img_path = plot_tracks(ref_track, mut_track, tmp.name, title=f"1D Tracks: {st.session_state.gene} {st.session_state.mod_type}")
        st.image(Image.open(track_img_path), use_column_width=True)

The Streamlit app writes plot images to `NamedTemporaryFile(..., delete=False)` but never removes them, which will leak temp files over time on repeated runs. Prefer rendering figures directly to memory (e.g., `BytesIO`) or explicitly `os.unlink(...)` after `st.image`; also ensure any `Image.open(...)` objects are closed to avoid file-handle leaks.

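The in-memory option could be sketched roughly as follows. `figure_to_png_bytes` is a hypothetical helper, and the sketch assumes the plotting helpers are changed to return a matplotlib figure instead of writing to a path, which is not how the PR's `plot_tracks` currently works.

```python
import io

import matplotlib

matplotlib.use("Agg")  # headless backend; no display is needed on a server
import matplotlib.pyplot as plt
import numpy as np


def figure_to_png_bytes(fig, dpi=150):
    """Render a figure to PNG bytes in memory and close it.

    Hypothetical helper (not in the PR): nothing is written to disk,
    so there is no temp file or open file handle to clean up.
    """
    buf = io.BytesIO()
    try:
        fig.savefig(buf, format="png", dpi=dpi)
    finally:
        plt.close(fig)  # release the figure even if savefig fails
    return buf.getvalue()


# Usage with a throwaway figure standing in for the dashboard plots.
fig, ax = plt.subplots(figsize=(4, 2))
ax.plot(np.arange(10))
png_bytes = figure_to_png_bytes(fig)
# In the Streamlit app the bytes could be passed straight to st.image(png_bytes).
```

Since `st.image` accepts raw image bytes, this would drop the `tempfile` and `PIL.Image.open` calls from the display path entirely.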

    with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp:
        hic_img_path = plot_hic_triangle(ref_hic, mut_hic, tmp.name, title=f"3D Contact Map: {st.session_state.gene} {st.session_state.mod_type}")
        st.image(Image.open(hic_img_path), use_column_width=True)

Same temp-file leak issue as above: this creates another `delete=False` image file and leaves it behind after rendering. Please clean up the temp file (or switch to in-memory rendering) to avoid accumulating files in the temp directory.


    with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp:
        fig.tight_layout()
        fig.savefig(tmp.name, dpi=150)
        plt.close(fig)
        st.image(Image.open(tmp.name), use_column_width=True)

Same temp-file leak issue again for the trajectory plot: `delete=False` plus no cleanup will accumulate files across reruns. Consider using an in-memory buffer or deleting the temp file after `st.image`.


    def generate_mechanistic_insight(mod_details: Dict[str, Any], delta_stats: Dict[str, Any]) -> str:
        """
        Generate a 'Mechanistic Insight' text string summarizing the multi-scale delta.

        Args:
            mod_details: Dictionary containing modification details like 'type', 'target', 'locus'.
            delta_stats: Dictionary containing computed deltas, e.g.,
                {'accessibility_drop': 0.28, 'expression_drop': 0.35, 'loop_weakened': True}

        Returns:
            A human-readable string attributing the structural consequence.
        """
        mod_type = mod_details.get("type", "modification")
        target = mod_details.get("target", "element")

        insight_parts = [f"This {mod_type} affects {target}"]

        if delta_stats.get("loop_weakened"):
            insight_parts.append("weakens enhancer-promoter loop")
        elif delta_stats.get("loop_strengthened"):
            insight_parts.append("strengthens enhancer-promoter loop")

`generate_mechanistic_insight` is new behavior but has no unit tests. Given this repo already tests similar analysis helpers under `src/hg_dt/analyze` (e.g., `tests/test_alphagenome_integration.py` covers `src/hg_dt/analyze/deltas.py`), please add focused tests that assert the generated string for key cases (loop weakened/strengthened, positive vs negative deltas, missing keys/defaults).

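A rough sketch of such tests is given below. So that it runs on its own, it defines a stand-in that reproduces only the lines visible in the diff above; the final `"; "` join is an assumption, because the rest of the real function is not shown. In the repo, the stand-in would be replaced by an import of the real function from the attribution module, and the positive/negative delta cases would be added once that logic is visible.

```python
from typing import Any, Dict


def generate_mechanistic_insight(mod_details: Dict[str, Any], delta_stats: Dict[str, Any]) -> str:
    """Stand-in mirroring only the lines shown in the diff above.

    ASSUMPTION: the real function's remaining branches and its final
    formatting are not visible; the "; " join here is illustrative only.
    """
    mod_type = mod_details.get("type", "modification")
    target = mod_details.get("target", "element")
    insight_parts = [f"This {mod_type} affects {target}"]
    if delta_stats.get("loop_weakened"):
        insight_parts.append("weakens enhancer-promoter loop")
    elif delta_stats.get("loop_strengthened"):
        insight_parts.append("strengthens enhancer-promoter loop")
    return "; ".join(insight_parts)


# Tests assert on substrings so they do not depend on the assumed join.
def test_loop_weakened():
    text = generate_mechanistic_insight(
        {"type": "deletion", "target": "enhancer"}, {"loop_weakened": True}
    )
    assert "This deletion affects enhancer" in text
    assert "weakens enhancer-promoter loop" in text


def test_loop_strengthened():
    text = generate_mechanistic_insight(
        {"type": "insertion", "target": "promoter"}, {"loop_strengthened": True}
    )
    assert "strengthens enhancer-promoter loop" in text
    assert "weakens" not in text


def test_weakened_takes_precedence_over_strengthened():
    # The if/elif in the diff means only the first matching flag is reported.
    text = generate_mechanistic_insight(
        {}, {"loop_weakened": True, "loop_strengthened": True}
    )
    assert "weakens" in text
    assert "strengthens" not in text


def test_missing_keys_use_defaults():
    text = generate_mechanistic_insight({}, {})
    assert text.startswith("This modification affects element")
    assert "loop" not in text
```

The functions follow the `test_*` naming convention, so pytest would collect them as written if the file were placed under `tests/`.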

    @@ -0,0 +1,176 @@
    import streamlit as st
    import numpy as np
    import os

`os` is imported but unused in this file; remove it to avoid unused-import warnings.

Suggested change:

    -import os

Implemented the Work Order 04 (Visual Interpretation & Causal Dashboard). This includes building an interactive Streamlit app and creating the necessary visualization and analysis modules. Addressed user's follow-up request to include gene selection and time-course accessibility plotting during transitions.
PR created automatically by Jules for task 382463070580210348 started by @AkeBoss-tech