Download the data used in RiceVarMap v2.0

Raw Genotypes:

chr01  chr02  chr03  chr04  chr05  chr06  chr07  chr08  chr09  chr10  chr11  chr12 

Imputed Genotypes:

Tab-separated genotype files

chr01  chr02  chr03  chr04  chr05  chr06  chr07  chr08  chr09  chr10  chr11  chr12 

VCF format

Note: Some variants had missing genotypes in a large number of varieties sequenced at high coverage (depth > 12×). For these variants, we set the genotypes in those varieties to "DEL", because the missing calls probably reflect large deletions relative to the reference genome at these positions, which are difficult to identify using GATK. You can download the files with or without "DEL", depending on your needs.

With 'DEL':

rice4k_geno_add_del.vcf.gz

Without 'DEL':

rice4k_geno_no_del.vcf.gz
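
As a quick sanity check after downloading, the compressed VCF can be streamed without unpacking it to disk. The sketch below uses only the Python standard library and relies on the general VCF layout (header lines start with "#", data lines are tab-separated with the chromosome in the first column). The function name count_variants_per_chromosome is hypothetical and not part of RiceVarMap.

```python
import gzip
from collections import Counter


def count_variants_per_chromosome(vcf_gz_path):
    """Count data records per chromosome in a gzip-compressed VCF file."""
    counts = Counter()
    # "rt" opens the gzip stream in text mode so lines arrive as str.
    with gzip.open(vcf_gz_path, "rt") as handle:
        for line in handle:
            # Skip meta-information ("##...") and the column header ("#CHROM...").
            if line.startswith("#") or not line.strip():
                continue
            # The first tab-separated field of a VCF record is the chromosome.
            chrom = line.split("\t", 1)[0]
            counts[chrom] += 1
    return counts


# Example call, assuming the file has been downloaded to the working directory:
# counts = count_variants_per_chromosome("rice4k_geno_no_del.vcf.gz")
```

Because the file is read line by line, memory use stays small even for a large genotype file.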

Plink format

With 'DEL':

rice4k_geno_add_del.bed  rice4k_geno_add_del.bim  rice4k_geno_add_del.fam 

Without 'DEL':

rice4k_geno_no_del.bed  rice4k_geno_no_del.bim  rice4k_geno_no_del.fam 

Phenotypes:

Phenotype data

Raw sequence data:

The raw sequence data can be downloaded from NCBI or the EBI European Nucleotide Archive under accession numbers PRJNA171289, ERP000106, ERP000729 and PRJEB6180.

Results of variant effect annotation:

The variant effect annotations below were generated using SnpEff, CooVar, PolyPhen-2, and SIFT. The data are stored in HDF5 format, with the chromosome name as the key (e.g., "chr01"), and can be read with the pandas.read_hdf function in Python. The results are also provided in CSV format for use with other tools and downstream analyses.

chr01_snpeff_coovar_polyphen_sift_merge_anno.h5   chr01_snpeff_coovar_polyphen_sift_merge_anno.csv.gz

chr02_snpeff_coovar_polyphen_sift_merge_anno.h5   chr02_snpeff_coovar_polyphen_sift_merge_anno.csv.gz

chr03_snpeff_coovar_polyphen_sift_merge_anno.h5   chr03_snpeff_coovar_polyphen_sift_merge_anno.csv.gz

chr04_snpeff_coovar_polyphen_sift_merge_anno.h5   chr04_snpeff_coovar_polyphen_sift_merge_anno.csv.gz

chr05_snpeff_coovar_polyphen_sift_merge_anno.h5   chr05_snpeff_coovar_polyphen_sift_merge_anno.csv.gz

chr06_snpeff_coovar_polyphen_sift_merge_anno.h5   chr06_snpeff_coovar_polyphen_sift_merge_anno.csv.gz

chr07_snpeff_coovar_polyphen_sift_merge_anno.h5   chr07_snpeff_coovar_polyphen_sift_merge_anno.csv.gz

chr08_snpeff_coovar_polyphen_sift_merge_anno.h5   chr08_snpeff_coovar_polyphen_sift_merge_anno.csv.gz

chr09_snpeff_coovar_polyphen_sift_merge_anno.h5   chr09_snpeff_coovar_polyphen_sift_merge_anno.csv.gz

chr10_snpeff_coovar_polyphen_sift_merge_anno.h5   chr10_snpeff_coovar_polyphen_sift_merge_anno.csv.gz

chr11_snpeff_coovar_polyphen_sift_merge_anno.h5   chr11_snpeff_coovar_polyphen_sift_merge_anno.csv.gz

chr12_snpeff_coovar_polyphen_sift_merge_anno.h5   chr12_snpeff_coovar_polyphen_sift_merge_anno.csv.gz
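
A minimal sketch of loading one of these annotation tables with pandas, assuming the files have been downloaded into a local directory. The helper name load_annotation and its data_dir parameter are hypothetical. The file naming pattern and the chromosome key follow the listing and description above. Reading the HDF5 file with pandas additionally requires the PyTables package, so the sketch falls back to the CSV copy when the .h5 file is not present.

```python
from pathlib import Path

import pandas as pd


def load_annotation(chrom, data_dir="."):
    """Load the merged annotation table for one chromosome, e.g. "chr01"."""
    stem = f"{chrom}_snpeff_coovar_polyphen_sift_merge_anno"
    h5_path = Path(data_dir) / f"{stem}.h5"
    csv_path = Path(data_dir) / f"{stem}.csv.gz"
    if h5_path.exists():
        # The chromosome name is used as the HDF5 key (needs PyTables installed).
        return pd.read_hdf(h5_path, key=chrom)
    # Otherwise read the gzip-compressed CSV version of the results.
    return pd.read_csv(csv_path, compression="gzip")


# Example call, assuming chr01 files are in the current directory:
# df = load_annotation("chr01")
# print(df.shape)
```

The column names of the returned table are not documented on this page, so inspect df.columns before filtering.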

Short tandem repeats (STRs):

STRs, which consist of repeated 1–6 bp DNA sequence motifs, represent a significant fraction of polymorphic variation in eukaryotic genomes. STRs are typically characterized as extremely unstable and hypervariable, with average mutation rates approximately 10- to 10⁴-fold higher than the estimated rates in other parts of the genome. The vast majority of STR mutations are length polymorphisms that are thought to arise primarily through replication-associated strand slippage. Because of these characteristics, STRs have been extensively used as molecular markers for population genetic analysis and genetic mapping.

VCF format

STRs_rice3k.vcf.gz

References:

  • Tan, X. et al. Comprehensive analysis of STR variations and their impact on gene expression in rice population (2025). https://doi.org/10.1016/j.jgg.2025.03.005
  • Zhao, H. et al. An inferred functional impact map of genetic variants in rice. Mol. Plant 14, 1584–1599 (2021).
  • Zhao, H. et al. RiceVarMap: a comprehensive database of rice genomic variations. Nucleic Acids Res. 43, D1018–D1022 (2015).


The data originated from the research group led by Dr. Xie Xianrong at South China Agricultural University.