What is ASAFE?
The ASAFE EM algorithm provides maximum likelihood estimates of ancestry-specific allele frequencies at a bi-allelic marker, given local ancestries and genotypes at the marker. It accounts for uncertainty in the phase (i.e., order) of a local ancestry pair relative to a genotype.
ASAFE was motivated by genome-wide association studies of Hispanics performed by the University of Washington Genetic Analysis Center as part of the Hispanic Community Health Study. It applies to any 3-way admixed diploid population, not just humans.
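To make the estimation problem concrete, here is a minimal Python sketch of an EM iteration of this kind. It is an illustration under stated assumptions, not the ASAFE package's R code. The function name `em_ancestry_freqs`, its arguments, the integer coding of ancestries (0, 1, 2) and genotypes (count of allele 1), the starting value of 0.5, and the assumption that an individual's two alleles are independent given their ancestries are all choices made for this example. Phase ambiguity arises only for a heterozygous genotype with two different local ancestries, where the E-step weights the two possible assignments of allele 1.

```python
from typing import List, Sequence, Tuple


def em_ancestry_freqs(
    ancestry_pairs: Sequence[Tuple[int, int]],
    genotypes: Sequence[int],
    n_ancestries: int = 3,
    n_iter: int = 200,
    tol: float = 1e-10,
) -> List[float]:
    """Sketch: estimate P(allele 1 | ancestry) at one bi-allelic marker.

    ancestry_pairs: unordered local ancestry pair per individual (hypothetical coding 0..n-1)
    genotypes: number of copies of allele 1 per individual (0, 1, or 2)
    """
    p = [0.5] * n_ancestries  # assumed starting values

    # The number of haplotypes of each ancestry is fixed by the data.
    totals = [0.0] * n_ancestries
    for a1, a2 in ancestry_pairs:
        totals[a1] += 1
        totals[a2] += 1

    for _ in range(n_iter):
        # E-step: expected number of allele-1 copies on haplotypes of each ancestry.
        counts = [0.0] * n_ancestries
        for (a1, a2), g in zip(ancestry_pairs, genotypes):
            if g not in (0, 1, 2):
                raise ValueError("genotype must be 0, 1, or 2")
            if g == 0:
                continue  # neither haplotype carries allele 1
            if g == 2:
                counts[a1] += 1  # both haplotypes carry allele 1
                counts[a2] += 1
            elif a1 == a2:
                counts[a1] += 1  # heterozygote, same ancestry: no ambiguity
            else:
                # Heterozygote with different ancestries: which ancestry carries allele 1?
                num = p[a1] * (1 - p[a2])
                den = num + (1 - p[a1]) * p[a2]
                w = num / den if den > 0 else 0.5
                counts[a1] += w
                counts[a2] += 1 - w

        # M-step: expected allele-1 count divided by haplotype count, per ancestry.
        # An ancestry with no haplotypes in the data keeps its current value.
        new_p = [
            counts[a] / totals[a] if totals[a] > 0 else p[a]
            for a in range(n_ancestries)
        ]
        converged = max(abs(x - y) for x, y in zip(new_p, p)) < tol
        p = new_p
        if converged:
            break
    return p
```

For example, `em_ancestry_freqs([(0, 0), (0, 0), (1, 1), (2, 2)], [2, 1, 0, 1])` involves no phase ambiguity, so the estimates reduce to simple proportions: `[0.75, 0.0, 0.5]`.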
15 min talk (with commentary), 15 min talk (no commentary): I presented these at the XXVIIIth International Biometric Conference/WNAR combined conference 2016. They include some introduction to genetics.
How does ASAFE fit into a genetic analysis workflow? Show me the picture.
In the top folder of the ASAFE R package is a file, ASAFE_Visual.pdf, illustrating how ASAFE fits into a genetic analysis workflow.
Sharon Browning has some code that performs the steps in the diagram involving phasing of genotypes and running RFMIX. The relevant code is the part of the following script that precedes the line "Apply masking": http://faculty.washington.edu/sguy/local_ancestry_pipeline/rfmix_mds_pipeline. I've extracted the relevant part of the script here.
How do I use the ASAFE R Package?
To download the package, click either of the two "Download" blue links on the upper right of this webpage, or go to https://github.com/BiostatQian/ASAFE and click the green "Clone or download" link. In the main ASAFE package folder, see the vignette inst/doc/ASAFE.pdf for instructions on how to use the package.
A smaller version of this package (which has the same R code and unit tests, but not the information needed to reproduce the paper) is on Bioconductor: https://bioconductor.org/packages/ASAFE/.
Changes and Feedback
If your question is not answered on this page or in the vignette, feel free to contact me with package-related questions.
[Changes made June 4, 2016]
(1) Removed function em() from vignette, because a user would likely only be interested in using function algorithm_1snp(), which calls em().
(2) Changed variable names in estep.R, mstep.R, and em.R to match the supplement.
There's a typo in the supplement. The tables are labeled "Table 1", "Table 2", "Supplementary Table 1", and "Supplementary Table 2" instead of Tables 1, 2, 3, and 4. I've changed the comments in the code to make the distinction among the tables clear.
Qian Sophia Zhang (firstname.lastname@example.org)