Description
Hello! Would you consider implementing the G statistic described in Magwene et al. 2011 (attached)? It's widely used on pool-seq data to find loci underlying trait variation in both laboratory crosses and natural accessions (e.g. Gould et al., 2017, attached). It also incorporates read depth, which accounts for variation across the genome in the uncertainty of allele frequency estimates. I'm not an expert in these statistics, and I don't know whether the pool-sequencing corrected Fst from grenedalf accomplishes the same thing.
To calculate the G statistic for my current work, I've been doing variant calling with SNAPE-pooled, then running Billie Gould's script from her paper (https://bitbucket.org/billiegould/genomics_tools/src/master/SNAPEtools/G_calcSNAPE.py), and then subsetting in R to filter by read depth criteria. SNAPE-pooled isn't being maintained anymore and has to be run on one chromosome at a time, so I'd love to have fewer steps to string together when calculating the G statistic.
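To make the request concrete, here is a minimal sketch of the per-site calculation: the standard G-test (likelihood-ratio) statistic on a 2x2 table of allele counts from two pools, followed by a simple depth filter. It is not Gould's script and not grenedalf code. The function names, the count layout, and the depth thresholds are all assumptions chosen for illustration, and any smoothing across neighboring sites is left out.

```python
import math


def g_statistic(a_ref, a_alt, b_ref, b_alt):
    """G-test statistic for a 2x2 table of allele counts.

    Rows are the two pools (a, b); columns are the two alleles (ref, alt).
    G = 2 * sum(O * ln(O / E)), with E taken from the table margins.
    Cells with an observed count of zero contribute nothing to the sum.
    """
    observed = [[a_ref, a_alt], [b_ref, b_alt]]
    total = a_ref + a_alt + b_ref + b_alt
    if total == 0:
        return 0.0
    row_sums = [a_ref + a_alt, b_ref + b_alt]
    col_sums = [a_ref + b_ref, a_alt + b_alt]

    g = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:
                expected = row_sums[i] * col_sums[j] / total
                g += o * math.log(o / expected)
    return 2.0 * g


def passes_depth(a_ref, a_alt, b_ref, b_alt, min_depth, max_depth):
    """True if both pools have read depth within [min_depth, max_depth].

    The thresholds are hypothetical; choose them for your own data.
    """
    depth_a = a_ref + a_alt
    depth_b = b_ref + b_alt
    return (min_depth <= depth_a <= max_depth
            and min_depth <= depth_b <= max_depth)


# Hypothetical per-site counts: (pool A ref, pool A alt, pool B ref, pool B alt)
sites = [(30, 10, 12, 28), (3, 1, 2, 2), (25, 25, 24, 26)]
for counts in sites:
    if passes_depth(*counts, min_depth=20, max_depth=200):
        print(counts, round(g_statistic(*counts), 3))
```

Because the statistic is computed from raw counts rather than frequencies, sites with deeper coverage can reach larger G values for the same frequency difference, which is how read depth enters the calculation.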
Magwene 2011.pdf
Gould - Molecular Ecology - 2016 - Pooled ecotype sequencing reveals candidate genetic mechanisms for adaptive.pdf
Thanks so much for your time! Of course I completely understand if you can't prioritize the G statistic, but I appreciate your consideration. Please let me know if there's any other information I can provide.
Best,
Madison