Not enough RAM for (md.Data(name=system.name)[-1]).to_xarray() #90

@DivineMassacre

Hello,

I run out of RAM when using something like:

system = func(parameters)
drive = md.Data(name=system.name)[-1]
xarray = drive.to_xarray()

where func() is a function that initialises the system and runs a TimeDrive with particular parameters.

For a small system, converting the drive data to an xarray.DataArray for further analysis or export to other formats works without problems. For a large system, however, RAM overflows and the conversion to an xarray.DataArray is impossible, even with chunking and Dask.

For example, I have 64 GB of RAM. My system is 30x30 um laterally (XY) and 20 nm thick (Z), with cell sizes of 25x25x20 nm (XYZ). The TimeDrive is 5 ns long with a 2 ps time step (2500 files).
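As a rough check of these numbers, the size of the fully loaded array can be estimated from the grid and the number of time steps. This is only a back-of-the-envelope sketch: it assumes the values are stored as float64 (8 bytes) and that the whole drive is held in memory at once, neither of which is confirmed here. `dense_size_bytes` is a hypothetical helper, not part of any library.

```python
def dense_size_bytes(extent_nm, cell_nm, n_t, n_z=1, n_components=3,
                     bytes_per_value=8):
    """Bytes needed for a dense (t, x, y, z, component) array.

    Assumes a square lateral extent and float64 storage (8 bytes per value).
    """
    n_xy = extent_nm // cell_nm  # cells along x (and along y)
    return n_t * n_xy * n_xy * n_z * n_components * bytes_per_value


N_T = 2500  # 5 ns at a 2 ps time step

fine = dense_size_bytes(extent_nm=30_000, cell_nm=25, n_t=N_T)     # 1200x1200 cells
coarse = dense_size_bytes(extent_nm=30_000, cell_nm=250, n_t=N_T)  # 120x120 cells

print(f"25 nm cells:  {fine / 1e9:.1f} GB")    # 86.4 GB
print(f"250 nm cells: {coarse / 1e9:.3f} GB")  # 0.864 GB
```

Under these assumptions the 25 nm grid needs about 86 GB, which is more than the 64 GB available, while the 250 nm grid needs less than 1 GB.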

The main reason to use xarray for analysis in my case is to implement a spatially weighted mean of the magnetisation components over the lateral dimensions, instead of the spatially uniform mean provided by drive.table.data[].values. In particular, I tried xarray chunking, with a Dask virtual cluster client initialised, to obtain a spatially weighted mean with a 2D Gaussian function:

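import numpy as np
import micromagneticdata as md

# Chunk sizes per dimension.
CHUNK_SIZES = {
    't': 20,
    'x': 750,
    'y': 750,
    'z': 1,
    'vdims': 3
}

system = func(parameters)
# Note: chunk() is applied only after to_xarray() has returned.
ds = (md.Data(name=system.name)[-1]).to_xarray().chunk(CHUNK_SIZES)

# 2D Gaussian weights centred at (x0, y0) with widths sigmax and sigmay
# (x0, y0, sigmax, sigmay are defined elsewhere).
weights_2d = np.exp(
    -0.5 * (
        ((ds.x - x0) / sigmax)**2 +
        ((ds.y - y0) / sigmay)**2
    )
).chunk({'x': 750, 'y': 750})
# Weighted mean over the lateral dimensions.
result = ds.weighted(weights_2d).mean(dim=['x', 'y'])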

I checked that my code works when the cell sizes are increased to 250x250x20 nm. With 25x25x20 nm, however, RAM overflows, most likely at the step ds = (md.Data(name=system.name)[-1]).to_xarray().chunk(CHUNK_SIZES), which would mean the full array is built before chunking is applied. So I think chunking cannot solve this problem.
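One possible workaround, sketched here under assumptions, is to avoid building the full array at all and instead reduce one time step at a time, so that only a single frame is in memory. The sketch below uses plain NumPy and computes the same quantity as the weighted mean above, sum(w * m) / sum(w) over x and y. `gaussian_weights` and `streamed_weighted_mean` are hypothetical helper names, and the frames are assumed to be arrays of shape (nx, ny, nz, 3). How the frames are produced from the drive (for example by iterating over it and taking each field's array) is an assumption to check against the installed micromagneticdata version.

```python
import numpy as np


def gaussian_weights(x, y, x0, y0, sigmax, sigmay):
    """2D Gaussian weights on the lateral grid, shape (len(x), len(y))."""
    gx = np.exp(-0.5 * ((x - x0) / sigmax) ** 2)
    gy = np.exp(-0.5 * ((y - y0) / sigmay) ** 2)
    return gx[:, None] * gy[None, :]


def streamed_weighted_mean(frames, weights):
    """Weighted lateral mean for each time step.

    frames:  iterable of arrays with shape (nx, ny, nz, 3), one per time step.
    weights: array with shape (nx, ny).
    Returns an array with shape (nt, nz, 3).
    """
    norm = weights.sum()
    rows = []
    for frame in frames:
        # Contract x and y against the weights; only this frame is in memory.
        rows.append(np.einsum("xy,xyzc->zc", weights, np.asarray(frame)) / norm)
    return np.stack(rows)


# Synthetic usage: 3 time steps on a small 4x5x1 grid with 3 components.
x = np.linspace(0.0, 1.0, 4)
y = np.linspace(0.0, 1.0, 5)
w = gaussian_weights(x, y, x0=0.5, y0=0.5, sigmax=0.2, sigmay=0.3)
frames = (np.full((4, 5, 1, 3), float(t)) for t in range(3))
result = streamed_weighted_mean(frames, w)  # shape (3, 1, 3)
```

With a 1200x1200x1 grid and float64 values, a single frame is about 35 MB, so peak memory would be set by one frame plus the small result array rather than by the full time series.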
