Download the data used in RiceVarMap v2.0

Raw Genotypes:

chr01  chr02  chr03  chr04  chr05  chr06  chr07  chr08  chr09  chr10  chr11  chr12 

Imputed Genotypes:

Tab-separated genotype files

chr01  chr02  chr03  chr04  chr05  chr06  chr07  chr08  chr09  chr10  chr11  chr12 

VCF format

Note: Some variants appear as missing genotypes in a large number of varieties sequenced at high coverage (depth > 12×). At these positions we assigned the genotype "DEL" to those varieties, because the missing calls probably reflect large deletions relative to the reference genome, which are difficult to identify using GATK. You can download files with or without "DEL", depending on your needs.

With 'DEL':

rice4k_geno_add_del.vcf.gz

Without 'DEL':

rice4k_geno_no_del.vcf.gz
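The "DEL" assignment described in the note above can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name, the per-variant dictionaries, the use of None for a missing call, and the `min_fraction` threshold are all hypothetical, since the page only says "a large number" of high-coverage varieties. Only the depth cutoff of 12× comes from the note.

```python
def mark_probable_deletions(genotypes, depths, min_depth=12.0, min_fraction=0.5):
    """Relabel missing calls as "DEL" for ONE variant (illustrative only).

    genotypes: dict mapping variety name -> genotype string, or None if missing.
    depths:    dict mapping variety name -> sequencing depth of that variety.
    min_depth: a variety counts as high-coverage if its depth exceeds this
               value (12x, taken from the note).
    min_fraction: HYPOTHETICAL threshold; the share of high-coverage varieties
               that must be missing before the calls are relabelled.
    """
    high_cov = [v for v in genotypes if depths.get(v, 0.0) > min_depth]
    if not high_cov:
        return dict(genotypes)

    missing = [v for v in high_cov if genotypes[v] is None]
    if len(missing) / len(high_cov) < min_fraction:
        return dict(genotypes)

    updated = dict(genotypes)
    for variety in missing:
        # Only high-coverage varieties are relabelled; low-coverage
        # varieties keep their missing call.
        updated[variety] = "DEL"
    return updated


example_calls = {"A": None, "B": None, "C": "A/A", "D": None}
example_depths = {"A": 15.0, "B": 20.0, "C": 14.0, "D": 3.0}
relabelled = mark_probable_deletions(example_calls, example_depths)
```

In this toy example, varieties A and B are high-coverage and missing, so they become "DEL", while the low-coverage variety D stays missing.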

Plink format

With 'DEL':

rice4k_geno_add_del.bed  rice4k_geno_add_del.bim  rice4k_geno_add_del.fam 

Without 'DEL':

rice4k_geno_no_del.bed  rice4k_geno_no_del.bim  rice4k_geno_no_del.fam 
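For readers who want to inspect the Plink files without installing PLINK, the following Python sketch decodes the standard PLINK 1 binary .bed layout (three magic bytes, then two bits per sample in variant-major order). The function name `decode_bed` is an illustrative assumption, and the sketch makes no claim about how "DEL" calls are represented in these particular files.

```python
import math

# Magic bytes of a PLINK 1 .bed file; the third byte 0x01 means variant-major.
PLINK_BED_MAGIC = bytes([0x6C, 0x1B, 0x01])

# Two-bit genotype code -> copies of the first allele listed in the .bim file.
# 00 = homozygous first allele, 01 = missing, 10 = heterozygous,
# 11 = homozygous second allele.
CODE_TO_A1_COUNT = {0b00: 2, 0b01: None, 0b10: 1, 0b11: 0}


def decode_bed(data, n_samples):
    """Decode the raw bytes of a .bed file into one list of calls per variant.

    n_samples is the number of lines in the matching .fam file; the returned
    variants are in the same order as the lines of the matching .bim file.
    """
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    if data[:3] != PLINK_BED_MAGIC:
        raise ValueError("not a variant-major PLINK 1 .bed file")

    bytes_per_variant = math.ceil(n_samples / 4)
    body = data[3:]
    if len(body) % bytes_per_variant != 0:
        raise ValueError("file size does not match the sample count")

    variants = []
    for start in range(0, len(body), bytes_per_variant):
        block = body[start:start + bytes_per_variant]
        calls = []
        for i in range(n_samples):
            # Each byte holds four samples, lowest-order bit pair first.
            code = (block[i // 4] >> (2 * (i % 4))) & 0b11
            calls.append(CODE_TO_A1_COUNT[code])
        variants.append(calls)
    return variants


# Toy input: one variant, five samples (not real RiceVarMap data).
toy_bed = bytes([0x6C, 0x1B, 0x01, 0x78, 0x03])
toy_calls = decode_bed(toy_bed, n_samples=5)
```

For files of this size, an established reader is likely to be a better choice; the sketch is only meant to show how the three Plink files fit together.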

Phenotypes:

Phenotype data

Raw sequence data:

The raw sequence data can be downloaded from NCBI or the EBI European Nucleotide Archive under accession numbers PRJNA171289, ERP000106, ERP000729, and PRJEB6180.

Short tandem repeats (STRs):

STRs, which consist of repetitive 1–6 bp DNA sequence motifs, represent a significant fraction of polymorphic variation in eukaryotic genomes. STRs are typically characterized as extremely unstable and hypervariable, with average mutation rates approximately 10- to 10^4-fold higher than the estimated rates in other parts of the genome. The vast majority of STR mutations are length polymorphisms that are thought to arise primarily from replication-associated strand slippage. Because of these characteristics, STRs have been used extensively as molecular markers for population genetic analysis and genetic mapping.
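To make the definition concrete, the Python sketch below scans a DNA string for tandem repeats of 1–6 bp motifs using a regular expression. It is only an illustration of what an STR tract looks like, not the method used to produce the STR calls offered here; the function name and the `min_repeats` and `min_length` thresholds are hypothetical.

```python
import re


def find_strs(seq, min_repeats=3, min_length=8):
    """Return (start, motif, copy_number) for each perfect STR tract in seq.

    A tract is a 1-6 bp motif repeated in tandem at least min_repeats times
    and spanning at least min_length bases. Both thresholds are illustrative.
    """
    # The lazy {1,6}? tries the shortest motif first, so "ACACACAC" is
    # reported with motif "AC" rather than "ACAC".
    pattern = re.compile(r"([ACGT]{1,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        tract = match.group(0)
        if len(tract) >= min_length:
            hits.append((match.start(), motif, len(tract) // len(motif)))
    return hits


# Toy sequence with a (CA)5 tract starting at 0-based position 2.
toy_hits = find_strs("GGCACACACACATT")
```

Length polymorphism at such a locus then corresponds to different copy numbers of the same motif across varieties, for example (CA)5 in one variety and (CA)7 in another.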

VCF format

STRs_rice3k.vcf.gz

References:

  • Tan, X. et al. Comprehensive analysis of STR variations and their impact on gene expression in rice population (in preparation).
  • Zhao, H. et al. An inferred functional impact map of genetic variants in rice. Mol. Plant 14, 1584–1599 (2021).
  • Zhao, H. et al. RiceVarMap: a comprehensive database of rice genomic variations. Nucleic Acids Res. 43, D1018–D1022 (2015).


The data originated from the research group led by Dr. Xie Xianrong at South China Agricultural University.