Welcome to Rice Variation Map, a comprehensive database of rice genomic variations.

Database contents:

RiceVarMap provides comprehensive information on 6,551,358 single nucleotide polymorphisms (SNPs) and 1,214,627 insertions/deletions (INDELs) identified from sequencing data of 1,479 rice accessions. The SNP genotypes of all accessions were imputed and evaluated, resulting in an overall missing data rate of 0.42% and an estimated accuracy greater than 99%. The SNP/INDEL genotypes of all accessions are available for online queries and for download. Users can search for SNPs/INDELs by variant identifier, genomic region, gene identifier or gene annotation keyword. For each SNP/INDEL, the database lists allele frequencies within various sub-populations and the effects of variations that may alter the protein sequence of a gene. The database provides a tool to compare any two accessions and identify the polymorphisms between them. It also provides geographical details and phenotype images for various rice accessions. In addition, the database provides tools to construct haplotype networks and to design PCR primers that take surrounding known genomic variations into account.
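To make two of the features above concrete (comparing two accessions and listing sub-population allele frequencies), here is a minimal Python sketch. It is not the database's implementation: the in-memory genotype table, the function names and the example identifiers are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical genotype table: variant ID -> {accession ID -> allele, or None if missing}.
Genotypes = Dict[str, Dict[str, Optional[str]]]


def polymorphic_sites(genotypes: Genotypes, acc_a: str, acc_b: str) -> List[str]:
    """Return IDs of variants where both accessions are genotyped and their alleles differ."""
    diffs = []
    for variant_id, calls in genotypes.items():
        a = calls.get(acc_a)
        b = calls.get(acc_b)
        if a is not None and b is not None and a != b:
            diffs.append(variant_id)
    return diffs


def allele_frequency(genotypes: Genotypes, variant_id: str,
                     accessions: List[str], allele: str) -> Optional[float]:
    """Frequency of `allele` among the non-missing calls of the given accessions.

    Returns None when none of the accessions has a call at this variant.
    """
    calls = [genotypes[variant_id].get(acc) for acc in accessions]
    called = [c for c in calls if c is not None]
    if not called:
        return None
    return sum(1 for c in called if c == allele) / len(called)


# Made-up example data (identifiers are illustrative, not real RiceVarMap IDs).
example: Genotypes = {
    "variant_1": {"acc_A": "G", "acc_B": "T", "acc_C": "G"},
    "variant_2": {"acc_A": "C", "acc_B": "C", "acc_C": None},
    "variant_3": {"acc_A": None, "acc_B": "A", "acc_C": "A"},
}

differing = polymorphic_sites(example, "acc_A", "acc_B")
freq_g = allele_frequency(example, "variant_1", ["acc_A", "acc_B", "acc_C"], "G")
print(differing, freq_g)
```

Sites with a missing call in either accession are skipped rather than reported as polymorphic, which is one reasonable convention; a real tool might report them separately.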

Data source:

So far, we have collected sequencing data from two sets of rice germplasm comprising a total of 1,479 accessions of cultivated rice (Oryza sativa L.):
The first set consisted of 529 accessions selected to represent both usefulness in rice improvement and the genetic diversity of the cultivated species. We sequenced these accessions on the Illumina HiSeq 2000 as 90-bp paired-end reads, generating more than one gigabase of high-quality sequence per accession (>2.5x per genome; 6.7 billion reads in total). The raw data are available from NCBI under BioProject accession number PRJNA171289. We originally sequenced 533 accessions in this project. After initial analysis, three accessions (C126, W196 and W232) showed excessive heterozygosity and one (W190) had a low mapping rate, so these four accessions were excluded from further analysis.
The second set consisted of 950 rice accessions sequenced by Huang et al. (2012, Nat. Genet. 44:32-39). These data were downloaded from the EBI European Nucleotide Archive (accession numbers ERP000106 and ERP000729) and comprise 4.6 billion 73-bp paired-end reads (~1x per genome).
Together, these two sets include both landraces and improved varieties from 73 countries, and their sequences provide approximately 2,400-fold coverage of the rice genome.
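The coverage figures above can be checked with simple arithmetic. The Python sketch below uses only the read counts and read lengths stated in the text. The genome size is an assumption, since the text does not give one: about 389 Mb is a commonly cited estimate for the rice genome. With that assumption, the combined coverage comes out near 2,400-fold.

```python
# Read counts, read lengths and accession counts as stated in the text.
SET1_READS = 6.7e9       # first set: 90-bp reads
SET1_READ_LEN = 90
SET1_ACCESSIONS = 529
SET2_READS = 4.6e9       # second set: 73-bp reads
SET2_READ_LEN = 73
SET2_ACCESSIONS = 950

# ASSUMPTION: genome size is not stated in the text; ~389 Mb is a commonly
# cited estimate for the rice genome. Change this value to test other estimates.
GENOME_SIZE_BP = 389e6


def total_bases(reads: float, read_len: int) -> float:
    """Total sequenced bases = number of reads x read length."""
    return reads * read_len


def fold_coverage(bases: float, genome_size: float = GENOME_SIZE_BP) -> float:
    """Fold coverage = sequenced bases / genome size."""
    return bases / genome_size


set1_bases = total_bases(SET1_READS, SET1_READ_LEN)
set2_bases = total_bases(SET2_READS, SET2_READ_LEN)

# Average sequence and coverage per accession in each set.
set1_bases_per_accession = set1_bases / SET1_ACCESSIONS
set1_cov_per_accession = fold_coverage(set1_bases_per_accession)
set2_cov_per_accession = fold_coverage(set2_bases / SET2_ACCESSIONS)

# Combined coverage over all 1,479 accessions.
combined_coverage = fold_coverage(set1_bases + set2_bases)

print(f"Set 1: {set1_bases_per_accession / 1e9:.2f} Gb and "
      f"{set1_cov_per_accession:.2f}x per accession")
print(f"Set 2: {set2_cov_per_accession:.2f}x per accession")
print(f"Combined: {combined_coverage:.0f}x")
```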

Data processing:

Reads were aligned to the rice reference genome (Nipponbare, MSU version 6.1) using BWA. SNPs/INDELs were identified using SAMtools and BCFtools. Synonymous/non-synonymous SNPs and SNPs/INDELs with large-effect changes were annotated using SNP effector, based on the MSU version 6.1 gene models of Nipponbare. We then performed imputation using an in-house modified k-nearest-neighbour algorithm. Details of data processing and evaluation are described in the Notes and Data Evaluation page.
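The in-house imputation algorithm is not described here, so the following Python sketch shows only the general idea of k-nearest-neighbour genotype imputation, not the modified method used for RiceVarMap. It assumes homozygous calls stored as single-character alleles, a mismatch-fraction distance, and a majority vote among the k closest accessions that have a call at the missing site. All of these choices, and the function names, are illustrative assumptions.

```python
from collections import Counter
from typing import List, Optional, Sequence

Genotype = Optional[str]  # allele call, or None when missing


def distance(a: Sequence[Genotype], b: Sequence[Genotype]) -> float:
    """Fraction of mismatches over sites called in both rows (1.0 if none are shared)."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return 1.0
    return sum(x != y for x, y in shared) / len(shared)


def knn_impute(matrix: Sequence[Sequence[Genotype]], k: int = 3) -> List[List[Genotype]]:
    """Return a copy of `matrix` (accessions x sites) with missing calls filled in.

    Each missing call is replaced by the most common allele among the k nearest
    accessions that have a call at that site. Distances are computed on the
    original matrix, so imputed values never influence other imputations.
    """
    imputed = [list(row) for row in matrix]
    for i, row in enumerate(matrix):
        # Rank all other accessions from most to least similar to accession i.
        neighbours = sorted(
            (j for j in range(len(matrix)) if j != i),
            key=lambda j: distance(row, matrix[j]),
        )
        for site, call in enumerate(row):
            if call is not None:
                continue
            # Take the k nearest neighbours that actually have a call at this site.
            votes = [matrix[j][site] for j in neighbours if matrix[j][site] is not None][:k]
            if votes:
                imputed[i][site] = Counter(votes).most_common(1)[0][0]
    return imputed


# Made-up toy data: four accessions, four sites, one missing call.
toy = [
    ["A", "C", "G", "T"],
    ["A", "C", "G", None],
    ["A", "C", "G", "T"],
    ["T", "G", "A", "C"],
]
filled = knn_impute(toy, k=2)
print(filled[1])
```

In the toy data, the accession with the missing call is identical to two others at every shared site, so their allele wins the vote over the dissimilar fourth accession.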

Acknowledgements:

This work was supported by grants from the National High Technology Research and Development Program of China (863 Program: 2012AA10A304 and 2014AA10A602), the National Natural Science Foundation of China (31100962, 31123009 and J1103510), the Research Fund for the Doctoral Program of Higher Education of China (Grant No. 20110146120013), and the Fundamental Research Funds for the Central Universities (2011PY068).

Comments or Questions?

For any questions, please contact Hu Zhao (zhaohu@webmail.hzau.edu.cn).

Recommended browsers:

The recommended browsers are Chrome, Firefox, Safari and Internet Explorer 8 or later. IE7 and earlier versions have poorer support and may give a degraded experience.

Citations:

Researchers who use RiceVarMap are encouraged to cite our publication:

Zhao, H., Yao, W., Ouyang, Y., Yang, W., Wang, G., Lian, X., Xing, Y., Chen, L. and Xie, W. (2014) RiceVarMap: a comprehensive database of rice genomic variations. Nucleic Acids Research. doi: 10.1093/nar/gku894