View on GitHub

barley-agroclimatic-association

Association with high-resolution climate data reveals selection footprints in genomes of barley landraces across the Iberian peninsula

This repository contains the data files and source code of a project that tested genome-wide associations with high-resolution climate data of the Iberian Peninsula, with the goal of discovering selection footprints in the genomes of barley landraces from the Spanish Barley Core Collection (SBCC). The results of the project are summarized at https://doi.org/10.1111/mec.15009

URL: eead-csic-compbio.github.io/barley-agroclimatic-association

Authors

B Contreras-Moreira (1,2), R Serrano-Notivoli (1), NE Mohamed (1), CP Cantalapiedra (1), S Begueria (1), AM Casas (1), E Igartua (1)

  1. Estacion Experimental de Aula Dei-CSIC, Zaragoza, Spain
  2. Fundacion ARAID, Zaragoza, Spain

**Legend.** Flowchart of the analyses carried out in this work.

Several R Markdown documents describe the selection of agroclimatic variables, the mapping, and the protocols used for the association analyses:

| filename | summary |
| --- | --- |
| HOWTOclimate | Preparation and selection of climate variables |
| HOWTOstructure | Analysis of population structure of Spanish barleys |
| HOWTORDA | Redundancy Analysis |
| HOWTOLD | Linkage Disequilibrium |
| HOWTOsnps | Association between SNPs and climate variables (Bayenv2) |
| HOWTOsnpsLFMM | Association between SNPs and climate variables (LFMM) |
| HOWTOXtX | XtX subpopulation differentiation (Bayenv2) |
| HOWTOXtX_BAYPASS | XtX subpopulation differentiation (BayPass) |

Data files

The table below describes some of the data files used in this work. You can find them in the repository, mainly in the maps/ and raw/ folders, if you click on View on GitHub:

| filename | description |
| --- | --- |
| raw/barley_climate_updated.tsv | Values of agro-climatic and environmental variables at the geographical collection points of the barley accessions. |
| raw/barley_climate_pca_scores.tsv | Values of principal components of the agro-climatic and environmental variables at the geographical collection points of the barley accessions. |
| maps/climatologies_5km.RData | 5x5 km grids of agro-climatic and environmental (lat, lon, alt) variables over Spain, required for producing the maps in HOWTOclimate. |
| maps/climatologies_5km_pca.RData | 5x5 km grids of principal components of the agro-climatic and environmental variables over Spain, required for producing the maps in HOWTOclimate. |
| matrices/SBCCmatrix_nr_mean.txt | Non-redundant covariance matrix obtained by averaging 10 Bayenv2 replicates. |
| raw/SBCC_Kinship.full.tsv | Tab-separated file assigning SBCC landraces to 4 subpopulations. |
| raw/9920_SNPs_SBCC_50K.tsv | Tab-separated matrix with SBCC biallelic SNPs. |
| raw/9920_SNPs_SBCC_bp_map2017.curated.tsv | Tab-separated file with physical positions of SNPs assigned by BARLEYMAP. |
| raw/9920_SNPs_SBCC_cM_map2017.curated.tsv | Tab-separated file with genetic positions of SNPs assigned by BARLEYMAP. |
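
For a first look at any of the tab-separated files listed above, a small shell helper can report the file's dimensions and print its first lines. This is only a sketch: `tsv_shape` and `tsv_peek` are hypothetical names, not scripts from this repository, and the column count is taken from the first line only, so it assumes every line has the same number of fields.

```shell
# Hypothetical helper (not part of the repository): report the number of
# lines and the number of tab-separated fields found on the first line.
tsv_shape() {
  awk -F'\t' 'NR == 1 { cols = NF } END { print NR " rows, " cols " columns" }' "$1"
}

# Hypothetical helper: show the first five fields of the first N lines
# (default 3), so that wide SNP matrices stay readable in the terminal.
tsv_peek() {
  head -n "${2:-3}" "$1" | cut -f 1-5
}
```

From the repository root, `tsv_shape raw/barley_climate_updated.tsv` would print the size of the climate table. Note that "rows" counts every line, including a header line if the file has one.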

Bayenv demo

A demo dataset to learn how to run bayenv2 locally is available at bayenv/BAYENV_EXAMPLE.tgz. It can be downloaded in the terminal with: wget https://eead-csic-compbio.github.io/barley-agroclimatic-association/bayenv/BAYENV_EXAMPLE.tgz
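
Once the archive has been downloaded, it can be listed and unpacked with `tar`. The sketch below wraps both steps in one helper; the function name `unpack_demo` and the default destination directory `bayenv_demo` are assumptions made for illustration, and nothing is assumed about the files inside the archive.

```shell
# Hypothetical helper (not part of the repository): list the contents of a
# gzipped tarball, then extract it into a destination directory.
unpack_demo() {
  archive=$1
  dest=${2:-bayenv_demo}             # assumed directory name, change as needed
  mkdir -p "$dest" || return 1
  tar -tzf "$archive" || return 1    # list the files stored in the archive
  tar -xzf "$archive" -C "$dest"     # extract them under $dest
}
```

After the download, `unpack_demo BAYENV_EXAMPLE.tgz` would print the archive contents and extract them into `bayenv_demo/`. Listing before extracting makes it easy to see what will be written to disk.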

Downloads

It is possible to get the complete dataset, source code and documentation in two ways:

Dependencies

In addition to bayenv2, LFMM and a few Perl scripts, included in this repository, these protocols require the following R packages, which must be installed to reproduce the results:

LDcorSV, ape, calibrate, cluster, corrplot, dendextend, devtools, dplyr, ggplot2, gplots, grid, knitr, maptools, pracma, qqman, raster, vegan.

The BayPass XtX protocol additionally requires BayPass to be installed on the system.