Description
We are experiencing issues with Seurat's RunPCA function, which we traced to the underlying irlba call. As the number of requested PCs increases (which translates to larger nv and work values), we start to get inconsistent PCs. We found that even with the same nv value, larger work values lead to this behavior. Sample code to reproduce the problem is below, along with an input matrix.
A = as.matrix(read.csv("/path/to/input_matrix.csv"))
log = sapply(seq(15, 25, by = 1), function(x) {
  set.seed(42)
  tmp = irlba::irlba(A = A, nv = 10, work = x)
  cat(paste0("Current work value: ", x, " - d values: [", paste0(tmp$d, collapse = ", "), "]\n"))
})
which gives the following output:
Current work value: 15 - d values: [364.515240273598, 230.851108439113, 198.782360784733, 192.484410005171, 162.868907525397, 130.710484581722, 106.207022225521, 99.8805706613859, 94.8473720662265, 94.2177890885707]
Current work value: 16 - d values: [364.515240273598, 230.851108439113, 198.782360784733, 192.48441000517, 162.868907525397, 130.710484581722, 106.207022225521, 99.8805706613858, 94.8473720655719, 94.2177890013402]
Current work value: 17 - d values: [364.515240273598, 230.851108439113, 198.782360784733, 192.48441000517, 162.868907525396, 130.710484581722, 106.207022225521, 99.8805706613859, 94.8473720663312, 94.2177891453379]
Current work value: 18 - d values: [364.515240273597, 230.851108439113, 198.782360784733, 192.48441000517, 162.868907525397, 130.710484581721, 106.207022225521, 99.8805706613861, 94.8473720663872, 94.2177891570991]
Current work value: 19 - d values: [364.515240273598, 230.851108439113, 198.782360784733, 192.48441000517, 162.868907525396, 130.710484581722, 106.20702222552, 99.8805706613862, 94.8473720662433, 94.2177891383719]
Current work value: 20 - d values: [364.515240273597, 230.851108439113, 198.782360784733, 192.48441000517, 162.868907525397, 130.710484581721, 106.20702222552, 99.8805706613856, 94.8473720661785, 94.2177891316728]
Current work value: 21 - d values: [364.515240273598, 230.851108439112, 198.782360784733, 192.48441000517, 162.868907525397, 130.710484581722, 106.20702222552, 99.8805706613861, 94.847372066395, 94.2177891611964]
Current work value: 22 - d values: [416.015144115659, 414.710365588711, 389.658535862618, 364.515240273598, 319.68042641293, 260.294111746797, 230.851108439114, 205.688142252177, 198.782360784734, 192.484410005169]
Current work value: 23 - d values: [364.515240273598, 230.851108439113, 198.782360784733, 192.769425656996, 192.48441000517, 162.868907525397, 162.1301500304, 153.485449556037, 130.710484581723, 129.960818263765]
Current work value: 24 - d values: [364.515240273597, 268.676822519523, 230.851108439113, 209.914356761244, 198.782360784733, 197.549569951056, 194.709944290627, 192.48441000517, 189.414109298803, 162.868907525397]
Current work value: 25 - d values: [364.515240273598, 261.850940373684, 259.812079766646, 230.851108439112, 222.914990119514, 198.782360784733, 192.48441000517, 188.158295229676, 186.037081366921, 162.868907525396]
As you can see, once the work value reaches 22, the results become inconsistent. What could be the problem here?
We are testing this in a Singularity container that has irlba installed. We only see this behaviour on one of our compute nodes and not on the others, even though they use the same container and both nodes are built from the same image/kernel. The sessionInfo() output is the following and is the same for both nodes:
sessionInfo()
R version 4.4.0 (2024-04-24)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 22.04.4 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
time zone: Etc/UTC
tzcode source: system (glibc)
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_4.4.0 Matrix_1.7-0 grid_4.4.0 irlba_2.3.5.1 lattice_0.22-6
You can find the link to the input CSV file here: input_matrix.csv. Please let me know if you need any other information.
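In case it helps with triage, here is a minimal sketch of how the irlba output could be compared against base R's full `svd()` as a reference for each work value. The helper name `check_against_svd`, the tolerance of 1e-6, and the random stand-in matrix are assumptions for illustration only; on the affected node, `A` would be the real input matrix instead.

```r
# Hypothetical helper (not part of irlba): compare a vector of truncated
# singular values `d` against the leading values from base R's full SVD.
check_against_svd <- function(A, d, tol = 1e-6) {
  # Singular values only; skipping the vectors keeps the reference cheap.
  ref <- svd(A, nu = 0, nv = 0)$d[seq_along(d)]
  rel_err <- abs(d - ref) / ref
  data.frame(irlba = d, reference = ref, rel_err = rel_err, ok = rel_err < tol)
}

# Usage sketch; only runs when irlba is installed.
if (requireNamespace("irlba", quietly = TRUE)) {
  set.seed(1)
  A <- matrix(rnorm(200 * 50), nrow = 200)  # stand-in for the real input matrix
  for (w in 15:25) {
    set.seed(42)  # same seed per call, as in the reproduction above
    d <- irlba::irlba(A = A, nv = 10, work = w)$d
    res <- check_against_svd(A, d)
    cat(sprintf("work = %d: all within tolerance = %s\n", w, all(res$ok)))
  }
}
```

Running the same script on a healthy node and on the affected node should show at which work value the two start to diverge from the reference, and by how much.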