This 2021 update was prompted by the discovery of an incorrect barcode data file for the MacGillivray's / Mourning Warbler GBS data. The files have now been completely re-processed, correcting the error. These updated versions of the original Dryad archive files are provided in this update:

scripts: converting GBS reads to genotypes
This text file ("warbler_genomics_processing_scripts_update2021.txt") contains scripts and notes for the steps used in converting raw Illumina GBS sequencing reads to individual genotypes (at both variant and invariant sites) across the genome. The resulting genotype files (in "012NA" format) were then used as input into R, for the rest of the analysis and production of figures.
warbler_genomics_processing_scripts_update2021.txt

custom R functions file
This R file ("genomics_R_functions_V2.R") contains functions written by Darren Irwin originally for the analysis of Greenish Warbler GBS variation (in Irwin et al. 2016, Molecular Ecology) and modified more recently for the analysis of 3 North American warbler species groups (in Irwin et al. in review, Molecular Ecology). These functions are designed to work more broadly for any dataset in similar input format. To reproduce the analysis in the paper, the main R script file ("warbler_GBS_analysis_script_for_Dryad.R") should be run; it calls functions in the present file.
genomics_R_functions_V2.R

warbler GBS analysis R script
This file ("warbler_GBS_analysis_script_2021_reanalysis.R") contains the main R scripts used to conduct the analysis and produce the figures. It uses as input the "012NA" files (and associated files) produced as described in the "warbler_genomics_processing_scripts.txt" file. Note that the metadata file "warbler.Fst_groups_14each.txt" (also contained in this Dryad package) is also needed.
Also crucial is a file of R functions ("genomics_R_functions_V2.R") written especially for this analysis, but designed to work more generally; these functions are called by this script file.
warbler_GBS_analysis_script_2021_reanalysis.R

warbler.Fst_groups_14each_correct
This file is required for conducting the 117-sample analysis (14 individuals per population, except 5 for goldmani) using the R script provided in this package. The file provides the names of each individual, the location code, the "group" (basically the specific or subspecific name) and "Fst_group" (the code used for defining groups in the Fst analysis, and for colouring the figures), and the "plot_order" (not used in the present paper).
warbler.Fst_groups_14each_correct.txt

warbler genotypes in "012NA" format
This folder contains genotypic information used in the 117-sample analysis (14 individuals for each population, plus 5 for goldmani). For each chromosome, there is a file containing the genotypes (ending in "012NA"), a file containing the list of individuals (ending in "012.indv"), and a file containing the list of positions on the chromosome (ending in "012.pos"). For details of how these files were produced see the file "warbler_genomics_processing_scripts.txt", also provided in this Dryad package. These files are ready to be used for subsequent analysis and presentation in the R scripts supplied in this package. The metadata file ("warbler.Fst_groups_14each_correct.txt"; also in this package) is also needed in the R processing.
warbler_GBS_14each_012NA_files_update2021.tar.gz

SiteStats and WindowStats R files
This folder contains files containing locus-based statistics ("SiteStats") and window-based statistics ("WindowStats") for each chromosome.
These files can be produced by the R script (in this package), and they can also be used by that script (whether the script saves and/or loads SiteStats and WindowStats files can be adjusted in that script using the setting for "calculate_or_load_stats" and the related settings below that). Producing these files can take days of processing time; I have included them here so you can produce most of the figures in the paper, by running the R script below the heading "GENOME-WIDE plots". That script will call the appropriate files (as long as you have designated a path/folder structure that matches the R script). Separate WindowStats files are included for window sizes of 10000 (the main analysis in the paper) and 5000 (referred to briefly in the paper, with one figure in the supplement).
SiteStats_and_WindowStats_files_update2021.tar.gz

genotypes at SNPs only across the whole genome
This folder contains genotypic information for variant sites only (no invariant sites) across the whole genome, among all individuals in the study. The folder contains a file containing the genotypes (ending in "012NA"), a file containing the list of individuals (ending in "012.indv"), and a file containing the list of positions on the chromosome (ending in "012.pos"). For details of how these files were produced see the file "warbler_genomics_processing_scripts.txt", also provided in this Dryad package. These files are ready to be used for subsequent analysis and presentation in the R scripts supplied in this package. The metadata file ("warbler.Fst_groups_14each_correct.txt"; also in this package) will be needed in the R processing.
warbler_GBS_14each_SNPs_only_whole_genome_012NA_files.tar.gz
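For readers who want to inspect the genotype files outside of the supplied R scripts, the following is a minimal Python sketch (not part of this package) of reading one genotype / individual / position file triple. It rests on several assumptions that this description does not confirm: that the files are whitespace-delimited; that the genotype file has one row per individual with a leading row-index column (as in VCFtools --012 output); that genotypes are coded 0, 1, 2 with "NA" for missing data; and that each line of the position file gives a chromosome name followed by a position. The function names read_012na and missing_fraction are hypothetical.

```python
from pathlib import Path


def read_012na(geno_path, indv_path, pos_path, has_row_index=True):
    """Read one genotype / individual / position file triple.

    Assumed layout (not confirmed by the dataset description):
      - genotype file: one row per individual, whitespace-delimited,
        values 0, 1, 2 or "NA", optionally preceded by a row-index column
      - individuals file: one individual name per line
      - positions file: chromosome name and position on each line
    Returns (individuals, positions, genotypes); missing genotypes are None.
    """
    individuals = [
        line.strip()
        for line in Path(indv_path).read_text().splitlines()
        if line.strip()
    ]

    positions = []
    for line in Path(pos_path).read_text().splitlines():
        if line.strip():
            chrom, bp = line.split()[:2]
            positions.append((chrom, int(bp)))

    genotypes = []
    for line in Path(geno_path).read_text().splitlines():
        if not line.strip():
            continue
        fields = line.split()
        if has_row_index:
            fields = fields[1:]  # drop the assumed leading row-index column
        genotypes.append([None if f == "NA" else int(f) for f in fields])

    # Check that the three files describe the same matrix.
    if len(genotypes) != len(individuals):
        raise ValueError("number of genotype rows != number of individuals")
    if any(len(row) != len(positions) for row in genotypes):
        raise ValueError("number of genotype columns != number of positions")
    return individuals, positions, genotypes


def missing_fraction(genotypes):
    """Fraction of individuals with a missing genotype at each site."""
    n_indv = len(genotypes)
    n_sites = len(genotypes[0]) if genotypes else 0
    return [
        sum(row[j] is None for row in genotypes) / n_indv
        for j in range(n_sites)
    ]
```

If the genotype files turn out not to have a leading row-index column, passing has_row_index=False keeps every column as a genotype; the dimension checks should flag a wrong guess either way.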
Detailed evaluations of genomic variation between sister species often reveal distinct chromosomal regions of high relative differentiation (i.e., “islands of differentiation” in FST), but there is much debate regarding the causes of this pattern. We briefly review the prominent models of genomic islands of differentiation and compare patterns of genomic differentiation in three closely related pairs of New World warblers with the goal of evaluating support for the four models. Each pair (MacGillivray's/mourning warblers; Townsend's/black-throated green warblers; and Audubon's/myrtle warblers) consists of forms that were likely separated in western and eastern North American refugia during cycles of Pleistocene glaciations and have now come into contact in western Canada, where each forms a narrow hybrid zone. While there are a few differentiation peaks shared between the species pairs, substantial differences between pairs in which regions have high FST suggest differing selective forces and/or differing genomic responses to similar selective forces among the three pairs. Across most of the genome, levels of within-group nucleotide diversity (πWithin) are almost as large as levels of between-group nucleotide distance (πBetween) within each pair, suggesting recent common ancestry and/or gene flow. In all three pairs, a pattern of high‐FST regions having lower πBetween (compared to moderate‐FST regions) suggests that selective sweeps spread between geographically differentiated groups, followed by local differentiation. This “sweep-before-differentiation” model is consistent with signatures of gene flow within the yellow-rumped warbler species complex. These findings add to our growing understanding of speciation as a complex process that can involve phases of adaptive introgression among partially differentiated populations.
Please see the 2018 paper and the 2021 Corrigendum, which together describe this in detail. Additional detail is provided in this file: warbler_genomics_processing_scripts_update2021.txt