KING Tutorial: Gene Mapping
KING is a toolset to explore genotype data from a genome-wide association study (GWAS) or a sequencing project.
KING can be used to map genes for diseases and complex traits.
GENERAL INPUT FILES
The input files need to be in PLINK binary format.
In addition to the standard PLINK binary files (ex.fam, ex.bim, ex.bed), two optional files can be supplied:
ex.phe for phenotypes and ex.cov for covariates.
KING searches for all five files automatically, even though only one file (ex.bed) needs to be specified on the command line.
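The file naming convention above can be illustrated with a short Python sketch. It is not KING's own lookup code; the helper names companion_files and missing_required are hypothetical, and the only assumption taken from the text is that all five files share the prefix of the .bed file and that the .phe and .cov files are optional.

```python
from pathlib import Path

# Suffixes named in the tutorial: three PLINK binary files plus two optional files.
REQUIRED = (".bed", ".bim", ".fam")
OPTIONAL = (".phe", ".cov")

def companion_files(bed_path):
    """Return the five paths that share the prefix of bed_path (hypothetical helper)."""
    bed = Path(bed_path)
    return [bed.with_suffix(suffix) for suffix in REQUIRED + OPTIONAL]

def missing_required(bed_path):
    """Return the required PLINK files (.bed/.bim/.fam) that are absent on disk."""
    bed = Path(bed_path)
    candidates = [bed.with_suffix(suffix) for suffix in REQUIRED]
    return [path for path in candidates if not path.exists()]

# List the files that would be looked for when ex.bed is given.
for path in companion_files("ex.bed"):
    print(path, "found" if path.exists() else "not found")
```

Running a check like this before a long analysis makes a missing .bim or .fam file visible early.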
GENOME-WIDE ASSOCIATION SCAN
Two association analyses are currently available in KING: the TDT and a multi-trait score test (mtscore). For example:
prompt> king -b ex.bed --tdt
prompt> king -b ex.bed --cov, --mtscore --maxP 5E-8 --invnorm
--tdt implements the well-known Transmission/Disequilibrium Test for family data that consist of parent-affected child trios.
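For readers unfamiliar with the TDT, the classical statistic at a single SNP can be sketched in Python as below. This is the textbook McNemar-type form, not KING's implementation; the function name tdt_statistic and the example counts are made up for illustration. The inputs are the numbers of times heterozygous parents transmitted, or did not transmit, a given allele to an affected child.

```python
import math

def tdt_statistic(transmitted, untransmitted):
    """Classical TDT chi-square (1 df) and P value from transmission counts
    of one allele in heterozygous parents of affected children."""
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative transmissions")
    chi2 = (transmitted - untransmitted) ** 2 / total
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Illustrative counts only: 60 transmissions versus 40 non-transmissions.
chi2, p = tdt_statistic(60, 40)
print(f"chi2 = {chi2:.2f}, P = {p:.4f}")
```

Only heterozygous parents contribute, because a homozygous parent transmits the same allele regardless of association.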
--mtscore implements a score test for association between a SNP and a quantitative trait.
Although the test statistic is standard, the computation is very efficient when there are many traits, e.g., in eQTL/pQTL/meQTL/mQTL analysis,
where association needs to be examined exhaustively between each of 10,000s of traits and each of 100,000s of SNPs.
Carrying out 10,000s of GWAS scans (on a few hundred samples) usually takes only a few minutes.
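As a rough illustration of what a score test between one SNP and one quantitative trait computes, here is a minimal Python sketch. It assumes a simple linear model without covariates and is not KING's optimized implementation; the function name score_test is hypothetical. Under these assumptions the statistic equals n times the squared correlation between genotype and trait.

```python
import math

def score_test(genotypes, trait):
    """Score test of no association in trait = a + b * genotype + error.
    Returns (chi-square with 1 df, P value). Covariates are not handled."""
    n = len(genotypes)
    g_mean = sum(genotypes) / n
    y_mean = sum(trait) / n
    # Score: covariation between centered genotype and centered trait.
    u = sum((g - g_mean) * (y - y_mean) for g, y in zip(genotypes, trait))
    g_ss = sum((g - g_mean) ** 2 for g in genotypes)
    y_var = sum((y - y_mean) ** 2 for y in trait) / n  # variance under the null
    if g_ss == 0 or y_var == 0:
        raise ValueError("monomorphic SNP or constant trait")
    chi2 = u * u / (y_var * g_ss)  # equals n * r^2
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))

# Toy data: genotypes coded as 0/1/2 copies of an allele.
chi2, p = score_test([0, 1, 2, 0, 1, 2, 1, 0], [0.1, 0.9, 2.2, -0.3, 1.1, 1.8, 0.7, 0.2])
print(f"chi2 = {chi2:.3f}, P = {p:.3g}")
```

Because the trait mean and variance are computed once per trait and the genotype sums once per SNP, a statistic of this form is cheap to evaluate across many trait and SNP pairs.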
The example above applies an inverse normal transformation to all traits prior to the GWAS scan.
To save time writing to disk, only association results with P value < 5E-8 are printed, including both cis and trans associations.
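The inverse normal transformation mentioned above can be sketched in Python as follows. This is a common rank-based version, shown only for illustration: the rank offset of 0.5 and the averaging of tied ranks are assumptions, since the text does not state which variant KING uses, and the function name inverse_normal is hypothetical.

```python
from statistics import NormalDist

def inverse_normal(values, offset=0.5):
    """Rank-based inverse normal transform; tied values get the average rank.
    The offset of 0.5 is an assumption, not necessarily KING's choice."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # 1-based average rank of the group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    # Map each rank to the matching standard normal quantile.
    return [std_normal.inv_cdf((r - offset) / n) for r in ranks]

# A skewed toy trait: the outlier 50.0 is pulled in to a normal quantile.
print(inverse_normal([3.2, 1.1, 50.0, 2.4, 2.9]))
```

The transformed values depend only on ranks, so extreme trait values no longer dominate the association statistic.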
The following parameters can also be specified:
--prefix specifies the name of the file that stores GWAS scan results. "king" is used as default.
--cpus specifies the number of CPU cores to be used for parallel computing. If not specified, the default is half of the total number of (logical) cores.
--invnorm carries out inverse normal transformation for quantitative traits prior to association analysis.
--maxP specifies the maximum P value for results to be printed in the output files.
--trait specifies the trait names to be analyzed in the association analysis.
--covariate specifies the names of the covariates to be adjusted for in the association analysis.
Manichaikul A, Mychaleckyj JC, Rich SS, Daly K, Sale M, Chen WM
(2010) Robust relationship inference in genome-wide association studies.
Last updated: October 24, 2017 by Wei-Min Chen