Not necessarily a Haven issue (could be done in Tiled or tiled-export), but putting this here for discussion.
The xspress EPICS driver saves several additional NDAttributes to the HDF5 file, one of which is a deadtime factor that needs to be applied to the data. The data keys coming from our ophyd-async implementation look something like:
- `vortex` (100, 4, 4096)
- `vortex-dt_corrected` (100, 4, 4096)
- `vortex-element0-dt_factor` (100,)
- `vortex-element1-dt_factor` (100,)
- `vortex-element2-dt_factor` (100,)
- `vortex-element3-dt_factor` (100,)
- `vortex-element0-dt_percent` (100,)
- …
The question is when, where, and how do we apply this correction.
Relatedly, is deadtime correction fundamentally different from other kinds of data correction, like correcting for acquisition time or I0?
## Requirements & Constraints
- Be able to access both corrected and uncorrected data
- Corrected data by default
  - Does this need to be true for exported HDF5 files?
- Aggregated data (e.g. summed ROIs) should use corrected data
- Minimal (no?) data processing during acquisition
## Candidate Solutions

### Run Browser Knows about Xspress Detectors
We could write a `read_xspress_xrf_data(scan_path: str, with_dt_correction: bool = True) -> NDArray` function that does the correction. To call this function, the run browser (or Tiled) needs to identify that a given dataset comes from an xspress detector rather than some other area detector, so we would need a heuristic both to identify xspress-based area detector data and to identify the associated datasets containing the deadtime data.
The difficult part here is knowing that this is an xspress detector, along with related questions like what happens if the data keys change.
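Whatever identifies the detector, the correction itself is simple array math. A minimal numpy sketch, with synthetic arrays standing in for the HDF5 datasets listed above (the function name and array layout are assumptions, not an existing Haven API):

```python
import numpy as np

def apply_dt_correction(raw: np.ndarray, dt_factors: np.ndarray) -> np.ndarray:
    """Multiply raw MCA counts by per-frame, per-element deadtime factors.

    ``raw`` has shape (frames, elements, bins), e.g. (100, 4, 4096);
    ``dt_factors`` has shape (frames, elements), stacked from the four
    ``vortex-elementN-dt_factor`` (100,) datasets.
    """
    # Broadcast (frames, elements) -> (frames, elements, 1) so each
    # factor applies across every energy bin of its element.
    return raw * dt_factors[:, :, np.newaxis]

# Synthetic stand-ins for the datasets listed above
raw = np.ones((100, 4, 4096))
dt_factors = np.stack([np.full(100, 1.0 + 0.01 * i) for i in range(4)], axis=1)

corrected = apply_dt_correction(raw, dt_factors)  # shape (100, 4, 4096)
```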
### Run Browser Gets Info about Xspress Detectors from Metadata
The `XspressDetector()` ophyd device could add a piece of metadata during acquisition describing how to apply a deadtime correction and which other datasets it requires. The run browser would look for this metadata and, if present, use it to apply the deadtime correction.
This solution requires the analysis tools to make fewer guesses about the data, but it also requires each kind of area detector to declare during acquisition what the analysis steps should be. Kind of a middle ground IMO.
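One hedged sketch of what such acquisition-time metadata might look like, and how an analysis tool could consume it. The `"corrections"` key and its fields are invented for illustration here; they are not an existing Haven or ophyd-async convention:

```python
# Hypothetical metadata an XspressDetector could attach to its "vortex"
# data key; the "corrections" field and its layout are assumptions.
data_key_metadata = {
    "vortex": {
        "shape": [100, 4, 4096],
        "corrections": [
            {
                "kind": "multiply",
                "stack_axis": 1,  # element axis in the primary array
                "sources": [
                    "vortex-element0-dt_factor",
                    "vortex-element1-dt_factor",
                    "vortex-element2-dt_factor",
                    "vortex-element3-dt_factor",
                ],
            }
        ],
    }
}

def correction_sources(metadata: dict, key: str) -> list[str]:
    """List the dataset names needed to correct *key*, if any are declared."""
    return [
        src
        for correction in metadata.get(key, {}).get("corrections", [])
        for src in correction["sources"]
    ]
```

A run browser that finds no `"corrections"` entry would simply fall back to showing uncorrected data, which keeps non-xspress detectors working unchanged.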
### Run Browser Doesn't Know About the Xspress
The run browser could provide a widget to select a second dataset by name and use it to correct the main AD array: `read_array_data(scan_path: str, correction_path: str) -> NDArray`.
This solution requires no knowledge of what kind of dataset it is, and it adds no steps to the acquisition side of the experiment. That means we could apply other corrections with the same mechanism, like converting counts to count rates when the acquisition time varies (fly scans?), or whatever other requests our users have. We already do this for applying an I0 reference when plotting, so we could potentially get two birds stoned at once.
The trade-off is that this only works if the primary data array and the correction data have compatible shapes, which is currently not possible in EPICS. We could offload this to the HDFAdapter, but that puts us in the same position, just one step removed.
We also need to answer the question of how sophisticated this should get before we just say users need to do their bespoke correction steps on their own in Jupyter or something.
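The array math behind such a `read_array_data()` could be fully detector-agnostic. A sketch (names assumed) that right-pads the correction with singleton axes and lets numpy broadcasting decide whether the shapes are compatible:

```python
import numpy as np

def correct_array(primary: np.ndarray, correction: np.ndarray) -> np.ndarray:
    """Apply a generic elementwise correction, with no detector knowledge.

    The correction is right-padded with singleton axes until it has the
    same rank as the primary array, so a (100,) per-frame factor or a
    (100, 4) per-element factor can both apply to (100, 4, 4096) data.
    """
    while correction.ndim < primary.ndim:
        correction = correction[..., np.newaxis]
    try:
        # Raises ValueError when the shapes cannot broadcast together.
        np.broadcast_shapes(primary.shape, correction.shape)
    except ValueError as exc:
        raise ValueError(
            f"correction {correction.shape} does not broadcast "
            f"against data {primary.shape}"
        ) from exc
    return primary * correction
```

Incompatible shapes then surface as an explicit error in the widget rather than a silent wrong answer, which is about as far as a detector-agnostic tool can go.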
## Tiled vs Run-Browser/Tiled-Export
As an aside, we could also customize the HDFAdapter used by Tiled to read the arrays from disk to implement any of these options. As far as I can tell, this doesn't fundamentally change the steps involved, so "in the run browser" or "in export-runs" can be replaced with "in Tiled" and the landscape looks similar (unless I'm missing some other idea). It's not clear to me how we would make both the corrected and uncorrected data available from Tiled.