Note: Some variants showed up as missing genotypes in a large number of varieties sequenced at high coverage (depth > 12×). For these variants, we assigned the genotype "DEL" in those varieties, because the missing calls probably indicate large deletions at these positions relative to the reference genome, which are difficult to identify using GATK. You can download the files with or without "DEL" depending on your needs.
With 'DEL':
rice4k_geno_add_del.bed rice4k_geno_add_del.bim rice4k_geno_add_del.fam
Without 'DEL':
rice4k_geno_no_del.bed rice4k_geno_no_del.bim rice4k_geno_no_del.fam
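The .bed/.bim/.fam extensions suggest that each download is a PLINK binary fileset; that is an assumption based on the file names, not something stated here. Under that assumption, a minimal sketch of a post-download sanity check could look like the following. It counts samples (.fam lines) and variants (.bim lines) and checks that the .bed file starts with the PLINK variant-major header and has the size implied by those counts (2 bits per genotype, each variant padded to a whole byte). The function names `count_lines` and `check_plink_fileset` are hypothetical helpers introduced for illustration. The check does not reveal how "DEL" genotypes are encoded in the files.

```python
import math
import os

# Header bytes of a variant-major PLINK 1 .bed file.
PLINK_MAGIC = bytes([0x6C, 0x1B, 0x01])


def count_lines(path):
    """Count non-empty lines in a text file."""
    with open(path, "rb") as handle:
        return sum(1 for line in handle if line.strip())


def check_plink_fileset(prefix):
    """Return (n_samples, n_variants, ok) for prefix.bed/.bim/.fam.

    Hypothetical helper: assumes the files form a PLINK 1 binary fileset.
    """
    n_samples = count_lines(prefix + ".fam")   # one line per sample
    n_variants = count_lines(prefix + ".bim")  # one line per variant
    with open(prefix + ".bed", "rb") as handle:
        magic_ok = handle.read(3) == PLINK_MAGIC
    # Each variant occupies ceil(n_samples / 4) bytes after the 3-byte header.
    expected_size = 3 + n_variants * math.ceil(n_samples / 4)
    size_ok = os.path.getsize(prefix + ".bed") == expected_size
    return n_samples, n_variants, magic_ok and size_ok
```

For example, `check_plink_fileset("rice4k_geno_no_del")` would be called from the directory holding the three downloaded files; a final value of `False` would point to a truncated or mismatched download.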
The raw sequence data can be downloaded from the NCBI or EBI European Nucleotide Archive under accession numbers PRJNA171289, ERP000106, ERP000729 and PRJEB6180.
Short tandem repeats (STRs), which consist of repeated 1–6 bp DNA sequence motifs, represent a significant fraction of the polymorphic variation in eukaryotic genomes. STRs are typically characterized as extremely unstable and hypervariable, with average mutation rates approximately 10- to 10⁴-fold higher than the estimated rates in other parts of the genome. The vast majority of STR mutations are length polymorphisms that are thought to arise primarily from replication-associated strand slippage. Because of these characteristics, STRs have been used extensively as molecular markers for population genetic analysis and genetic mapping.
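To make the definition concrete, here is a minimal sketch that scans a DNA string for perfect tandem repeats of a 1–6 bp motif. It is an illustration only, not the method used to call STRs in this dataset. The function name `find_strs` and the default threshold of three copies are assumptions chosen for the example.

```python
import re


def find_strs(sequence, min_repeats=3):
    """Return (start, motif, copies) for each perfect tandem repeat found.

    Illustrative only: the 1-6 bp motif range follows the definition above,
    while min_repeats is an arbitrary threshold chosen for this sketch.
    """
    if min_repeats < 2:
        raise ValueError("min_repeats must be at least 2")
    # Shortest motif first (lazy quantifier), then at least
    # min_repeats - 1 further copies of that same motif.
    pattern = re.compile(r"([ACGT]{1,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


print(find_strs("GGCATATATATCC"))  # [(3, 'AT', 4)]
```

A length polymorphism of the kind described above would show up as a different `copies` value for the same motif at the same locus in two varieties.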