DATE: 20-Jan-2020

README for: "Adaptive evolution in a conifer hybrid zone is driven by a mosaic of introgressed and standing genetic variants" (COMMSBIO-20-0007-T)

This directory lists all the custom scripts used in the analyses conducted in COMMSBIO-20-0007-T. For some of them, the input files are also provided. The bash scripts used to submit the jobs are not provided. Scripts are also not provided for analyses where the manuscript names the program or software and either states the settings used or relies on the default settings.

#############If you use these scripts:#################
Modify the paths to fit yours.
Install all the needed libraries (for R).
Lastly, remember to cite us if you use the scripts as is.
##########################################################

**************Sub-directory structure********************

SNPfiltering_AdaptIntro.ipynb: Script for conducting custom filtering of SNPs identified through dDocent.

##InData: This sub-directory contains a series of input files that are useful for the analyses mentioned in this directory. NOTE: It does not contain all input files, as some were generated as part of other programs or scripts.

98PopIDs: List of the 98 populations that were designated as hybrids and hence used throughout this paper.
periphery_climate1981-2010.txt: Climate data for the hybrid zone populations (obtained from ClimateWNA).
SoilGrid1km.txt: Soil data for the hybrid zone populations (obtained from SoilGrid).
pureLP_Normal_1981_2010.csv: Climate data for the pure P. flexilis populations.
pureSWWP_Normal_1981_2010.csv: Climate data for the pure P. strobiformis populations.
pureLP_soil1km.csv: Soil data for the pure P. flexilis populations.
pureSWWP_soil1km.csv: Soil data for the pure P. strobiformis populations.
PopInd.txt: Population ID and the likely group each population belongs to.
(note: this classification was done prior to running NGSAdmix)
coords_chpt2.txt: Site ID code and location information for all populations used in this study.
SNPdataset.txt: The full SNP dataset used in this paper, in 012 format, produced by following the procedure in SNPfiltering_AdaptIntro.ipynb.

##Bayenv: This sub-directory contains the scripts used to combine the output from multiple chains of Bayenv (the GEA part of it) to get a stringent set of outlier SNPs. Figure 2 provides a representation of this. The scripts assume that you already have the output from the Bayenv runs, so make sure your files follow the same naming pattern. Some of the files needed for the script are provided; those that are not provided are listed in the script.

Bayenv_forGit.R
baySNP_IDs: List of the IDs that Bayenv assigns to the input SNPs while performing the runs.

##RDA: This sub-directory contains the script for conducting the RDA variance partitioning and several input files. The output is represented in Table 1.

allelefreq_110pops.txt: Minor allele frequency estimates by locus and by population for the US range of Pinus strobiformis (note: not all are hybrids).
eigenVect_out251500_Bay.txt: First eigenvector estimates for all hybrid populations, generated from the var-cov matrix estimation in Bayenv.
NGSK2_nomaf_ordered_1122.meanQ: Estimates of the meanQ score for all individuals used in this study, obtained from running NGSAdmix at K=2.
RDA.R: Script for performing RDA as used in this paper.

##LD: This sub-directory contains scripts for conducting the LD-based analyses represented in various panels of Figure 3. Some of the steps here are very time consuming because they estimate pairwise LD, and hence were performed on a GridEngine. This is noted specifically within the scripts.

LDnr_archive.Rmd: Script for conducting LD-based network analyses.
OhtaD_pwEst.Rmd: Script for conducting pairwise estimation of Ohta's D using SNPs identified as outliers via Bayenv.
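
To illustrate why the pairwise LD step above is costly, here is a minimal sketch in Python. It is not the R code in LDnr_archive.Rmd or OhtaD_pwEst.Rmd and may not match the estimator used there. It assumes LD is measured as the squared Pearson correlation between genotype dosages in a 012-coded matrix (rows = individuals, columns = SNPs) and that missing calls are coded as -1. The function names and the toy data are hypothetical.

```python
from itertools import combinations

MISSING = -1  # assumption: missing genotypes are coded as -1


def r_squared(x, y):
    """Squared Pearson correlation between two SNPs coded 0/1/2.

    Individuals missing at either SNP are dropped. Returns None when
    fewer than two individuals remain or when either SNP is monomorphic.
    """
    pairs = [(a, b) for a, b in zip(x, y) if a != MISSING and b != MISSING]
    n = len(pairs)
    if n < 2:
        return None
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    var_x = sum((a - mean_x) ** 2 for a, _ in pairs)
    var_y = sum((b - mean_y) ** 2 for _, b in pairs)
    if var_x == 0 or var_y == 0:
        return None
    cov = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    return cov * cov / (var_x * var_y)


def pairwise_ld(genotypes):
    """Return {(i, j): r2} for every pair of SNP columns with i < j."""
    columns = list(zip(*genotypes))  # transpose to one tuple per SNP
    return {
        (i, j): r_squared(columns[i], columns[j])
        for i, j in combinations(range(len(columns)), 2)
    }


# Hypothetical toy data: 4 individuals x 3 SNPs.
toy = [
    [0, 0, 2],
    [1, 1, 1],
    [2, 2, 0],
    [1, 1, -1],
]
ld = pairwise_ld(toy)
for snp_pair, value in sorted(ld.items()):
    print(snp_pair, value)
```

The number of SNP pairs grows quadratically with the number of SNPs, which is why this kind of calculation is usually split across cluster jobs for large datasets.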

##AdapIntro: This sub-directory contains a series of scripts that were used to assess adaptive introgression. These include the script used for the FE analyses represented in Figure 4 and the analyses presented in Appendix S2. More information on the input files needed for both scripts is given in the files themselves.

AF_LP_EnvDiff.R
AdapIntro_FE.Rmd