Allele Frequencies between worldwide domestic sheep and Asiatic Mouflon
Supplementary Data 9: Allele frequencies for 14 million SNPs with MAF > 0.05
A total of 70 animals were sampled from 43 domestic breeds and subjected to genome sequencing. These comprise 46 animals selected from an earlier SNP-array-based global survey of breed diversity [45] and another six animals used for SNP discovery, construction of the SNP50 BeadChip and CNV detection. The final group of 18 individuals has not been examined before. Breeds were drawn from Asia (12), Africa (6), the Middle East (13), the Americas (8), the United Kingdom (8) and continental Europe (23). Whole genome sequence data for 19 Asian mouflon (Ovis orientalis) were collected and made available by the NEXTGEN project (http://nextgen.epfl.ch/). Fastq files were downloaded from the ENA public repository (http://www.ebi.ac.uk:/ena/data/view/ERP001583) and processed as described below for the domestic sheep genomes.
Genome sequencing, variant detection and annotation.
Paired-end short insert libraries were constructed using 5 µg of genomic DNA and sequenced on the Illumina HiSeq 2000 platform. Reads were mapped against the sheep reference assembly v3.1 using BWA aligner v0.7.12 (bwa aln + bwa sampe, default parameters). Animals were sequenced to an average median depth of 11.8x (8.4-17.2x) (Supplementary Table Data 1). Duplicate reads were removed using Picard tools (http://broadinstitute.github.io/picard/), and local realignment around INDELs was performed using GATK v3.2. Variant detection and SNP diversity analyses were performed using SAMTOOLS 1.2.1 mpileup and annotated using VCFtools v0.1.14. After obtaining genotype calls for a total of 89 samples (70 domestic and 19 mouflon), the following filters were applied using a combination of VCFtools and in-house scripts: i) SNP were retained in positions with read depth between 5x and twice the average depth per sample; ii) a minimum mapping quality of 30 and base quality of 20 were applied; iii) SNP within 5 bp of INDELs were removed; iv) for SNP pairs separated by less than 4 bp, the lower quality variant was excluded; v) tri-allelic variants were removed; vi) SNP called in fewer than 90% of animals were excluded; and vii) SNP displaying an excess of heterozygosity were excluded (--hwe 0.001). This defined a set of 28,100,631 SNP across domestic (67) and mouflon (17) genomes. A total of five low coverage animals were excluded (3 domestic and 2 mouflon). PLINK v1.9 was used to perform genetic diversity estimates and PCA (https://www.cog-genomics.org/plink2). The variant effect predictor tool from Ensembl (version 78) was used to identify 24 separate SNP classifications, including coding, missense and non-synonymous substitutions, intron and intergenic, in relation to the gene models annotated on reference assembly OARv3.1.
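As an illustration of filters i) and vi) above, the following is a minimal Python sketch, not the in-house scripts used for this collection. The function names (passes_depth, call_rate, keep_site) are hypothetical, and the sketch assumes that the depth window is applied to each genotype using that sample's own average depth and that a missing genotype is represented as None.

```python
# Hypothetical helpers sketching filters i) and vi); not the authors' scripts.

def passes_depth(depth, sample_mean_depth, min_depth=5):
    """Filter i): keep a genotype if its read depth lies between min_depth
    and twice the sample's average depth (assumed to be inclusive bounds)."""
    return min_depth <= depth <= 2 * sample_mean_depth


def call_rate(genotypes):
    """Fraction of animals with a called (non-missing) genotype at a site."""
    called = sum(g is not None for g in genotypes)
    return called / len(genotypes)


def keep_site(genotypes, min_call_rate=0.90):
    """Filter vi): exclude sites called in fewer than 90% of animals."""
    return call_rate(genotypes) >= min_call_rate


# Toy site with ten animals: genotypes coded as alternate-allele counts,
# None marks a genotype that failed the depth window.
site = [0, 1, 2, 0, 0, 1, None, None, 2, 1]
print(call_rate(site), keep_site(site))
```

In this toy example only 8 of 10 animals are called, so the site would be dropped under the 90% threshold.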
Allele frequency (AF) was estimated for each SNP separately for domestic and wild sheep genomes using PLINK v1.9 (--freq --within).
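The per-group estimate described above can be sketched in a few lines of Python. This is an illustrative sketch only, not PLINK's implementation or output format: the function name allele_freq_by_group, the sample identifiers and the group labels are all hypothetical, genotypes are assumed to be coded as counts of the alternate allele (0, 1 or 2) at a biallelic SNP, and missing calls are assumed to be None.

```python
from collections import defaultdict


def allele_freq_by_group(genotypes, groups):
    """Alternate-allele frequency at one SNP, computed within each group.

    genotypes: dict of sample id -> alternate-allele count (0, 1, 2) or None
    groups:    dict of sample id -> group label (e.g. 'domestic', 'mouflon')
    """
    alt_alleles = defaultdict(int)   # alternate alleles observed per group
    chromosomes = defaultdict(int)   # called chromosomes per group
    for sample, count in genotypes.items():
        if count is None:            # skip missing genotypes
            continue
        group = groups[sample]
        alt_alleles[group] += count
        chromosomes[group] += 2      # diploid: two chromosomes per animal
    return {g: alt_alleles[g] / chromosomes[g] for g in chromosomes}


# Toy data (hypothetical sample ids).
genotypes = {"d1": 0, "d2": 1, "d3": 2, "d4": None, "m1": 2, "m2": 2}
groups = {"d1": "domestic", "d2": "domestic", "d3": "domestic",
          "d4": "domestic", "m1": "mouflon", "m2": "mouflon"}

freqs = allele_freq_by_group(genotypes, groups)
print(freqs)
print(abs(freqs["domestic"] - freqs["mouflon"]))
```

The absolute difference between the two group frequencies is one simple way to express allele frequency divergence at a site; whether it matches the statistic used in the associated publication is not stated here.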
Marina Naval-Sanchez and James Kijas
Affiliation: CSIRO Agriculture & Food, 306 Carmody Road, St. Lucia, 4067, QLD, Australia
Creative Commons Attribution 4.0 International Licence
Naval Sanchez, Marina; Kijas, James (2018): Allele Frequencies between worldwide domestic sheep and Asiatic Mouflon. v1. CSIRO. Data Collection.
All Rights (including copyright) CSIRO 2018.
The metadata and files (if any) are available to the public.
OCE Post Doc - Conseq Animal Domesticati
Domestication fundamentally reshaped animal morphology, physiology and behaviour, offering the opportunity to investigate the molecular processes driving evolutionary change. Here, we assess sheep domestication and artificial selection by comparing genome sequence from 43 modern breeds (Ovis aries) and their Asian mouflon ancestor (O. orientalis) to identify selection sweeps. Next, we provide a comparative functional annotation of the sheep genome, validated using experimental ChIP-Seq of sheep tissue. Using these annotations, we evaluate the impact of selection and domestication on regulatory sequences and find that sweeps are significantly enriched for protein coding genes, proximal regulatory elements of genes and genome features associated with active transcription. Finally, we find individual sites displaying strong allele frequency divergence are enriched for the same regulatory features. Our data demonstrates that remodelling of gene expression is likely to have been one of the evolutionary forces that drove phenotypic diversification of this common livestock species.
Marina Naval Sanchez