Rice Genome Variant ID Conversion (IRGSP-1.0 / ZS97RS3 / MH63RS3)

Convert coordinates using a variation ID:

You can input an IRGSP-1.0 / ZS97RS3 / MH63RS3 variation ID (e.g. vg0100001641 / vz0100014211 / vm0100006204).

Convert coordinates using a chromosome and position:

Genome:
Chromosome:
Position:


Information

This page enables coordinate- and variant ID–based conversion across multiple rice reference genomes, including IRGSP-1.0 (Japonica reference; Kawahara et al., 2013) and ZS97RS3 / MH63RS3 (Indica references; Song et al., 2021). It allows users to convert variant identifiers or genomic coordinates from one reference genome to homologous positions in other rice genomes.

Variant ID-Based Conversion: Users may input a variant identifier (e.g. IDs prefixed with vg, vz, or vm). The system automatically infers the source reference genome from the identifier prefix and parses the associated chromosome and genomic position. If a precomputed coordinate mapping is available in the database, the result is returned directly. Otherwise, coordinate conversion is performed dynamically using sequence-based alignment.
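The prefix-based parsing described above could look roughly like the following sketch. It assumes that an ID consists of a two-letter prefix, a two-digit chromosome number, and an eight-digit zero-padded position, which matches the example IDs on this page but is not stated explicitly. The prefix-to-genome table is likewise inferred from the order of those examples, and the names parse_variant_id and ParsedVariant are hypothetical.

```python
import re
from typing import NamedTuple

# Assumed prefix-to-genome table, inferred from the example IDs on this page.
PREFIX_TO_GENOME = {"vg": "IRGSP-1.0", "vz": "ZS97RS3", "vm": "MH63RS3"}

# Assumed layout: 2-letter prefix + 2-digit chromosome + 8-digit position.
_ID_PATTERN = re.compile(r"^(vg|vz|vm)(\d{2})(\d{8})$")


class ParsedVariant(NamedTuple):
    genome: str
    chromosome: int
    position: int


def parse_variant_id(variant_id: str) -> ParsedVariant:
    """Infer the source genome, chromosome, and position from a variant ID."""
    match = _ID_PATTERN.match(variant_id.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized variant ID: {variant_id!r}")
    prefix, chrom, pos = match.groups()
    return ParsedVariant(PREFIX_TO_GENOME[prefix], int(chrom), int(pos))
```

Under these assumptions, parse_variant_id("vg0100001641") yields genome IRGSP-1.0, chromosome 1, position 1641.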

Coordinate-Based Conversion: Alternatively, users can specify a genomic coordinate by selecting a reference genome and providing a chromosome and position. The input coordinate is converted to homologous positions in the remaining reference genomes based on local sequence similarity, without requiring a variant identifier.

Sequence Windows for Coordinate Mapping: Coordinate conversion is based on short anchor sequences centered on the query position. Three complementary window configurations are used to improve robustness against local insertions, deletions, and assembly-specific differences:

  • Up 100 bp - 100 bp upstream of the query position
  • Up & Down 50 bp - symmetric 50 bp flanking window
  • Down 100 bp - 100 bp downstream of the query position
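As an illustration of the three window configurations, the sketch below extracts the anchors from a chromosome sequence held in memory. It makes several assumptions that the page does not confirm: positions are 1-based, every anchor includes the query base itself, and a window that would extend past a chromosome end is skipped. The function name anchor_windows and the window labels are hypothetical.

```python
from typing import Dict, Tuple

# (label, bases taken upstream of the query, bases taken downstream)
WINDOWS = (("up100", 100, 0), ("updown50", 50, 50), ("down100", 0, 100))


def anchor_windows(chrom_seq: str, position: int) -> Dict[str, Tuple[str, int]]:
    """Return {label: (anchor_sequence, offset)} for a 1-based position.

    offset is the 0-based index of the query base inside the anchor, which
    is what allows an alignment start to be turned back into a coordinate.
    """
    if not 1 <= position <= len(chrom_seq):
        raise ValueError("position lies outside the chromosome")
    idx = position - 1  # 1-based coordinate -> 0-based string index
    anchors = {}
    for label, up, down in WINDOWS:
        start, end = idx - up, idx + down + 1
        if start < 0 or end > len(chrom_seq):
            continue  # assumed behavior: skip windows that do not fit
        anchors[label] = (chrom_seq[start:end], up)
    return anchors
```

Using three anchors means that an insertion or deletion on one side of the query position can invalidate at most the windows that span it, leaving the others available for mapping.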

Alignment Tools and Conversion Strategy: Sequence-based coordinate conversion is performed using high-stringency nucleotide sequence alignment. Large-scale coordinate mappings stored in the database were generated offline using minimap2 (Li, 2018), whereas on-the-fly conversion for uncached variants or user-defined coordinates is performed using BLAT. Despite differences in implementation, both approaches employ the same anchor-based strategy and filtering criteria.

Only alignments satisfying all of the following conditions are retained: (i) high sequence identity, (ii) full-length coverage of the query anchor sequence, and (iii) absence of insertions or deletions within the aligned region. When multiple valid alignments are detected, the most recently processed valid hit is used to determine the projected coordinate, ensuring consistent behavior across repetitive or duplicated regions.
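The three retention conditions and the choice among multiple valid hits can be sketched as follows. This is a simplified illustration and not the site's implementation: the Hit record is a hypothetical stand-in for a parsed minimap2 or BLAT alignment, the 0.99 identity threshold is invented because the page only says "high", and the selection rule is read here as taking the last valid hit in iteration order. Only forward-strand hits are considered.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class Hit:
    target_chrom: str
    target_start: int    # 1-based target coordinate of the first aligned base
    query_length: int    # length of the anchor sequence
    aligned_length: int  # number of anchor bases covered by the alignment
    matches: int         # number of identical bases
    gap_bases: int       # inserted plus deleted bases in the alignment


MIN_IDENTITY = 0.99  # hypothetical threshold, not taken from the page


def is_valid(hit: Hit, min_identity: float = MIN_IDENTITY) -> bool:
    """Apply the three filters: identity, full-length coverage, no indels."""
    full_length = hit.aligned_length == hit.query_length
    identity = hit.matches / hit.query_length
    return full_length and hit.gap_bases == 0 and identity >= min_identity


def project(hits: Iterable[Hit], offset: int) -> Optional[Tuple[str, int]]:
    """Project the query base onto the target using the last valid hit.

    offset is the 0-based index of the query base within the anchor.
    """
    chosen = None
    for hit in hits:
        if is_valid(hit):
            chosen = hit  # a later valid hit replaces an earlier one
    if chosen is None:
        return None
    # With full-length, gap-free alignment the offset carries over directly.
    return chosen.target_chrom, chosen.target_start + offset
```

Because gapped and partial alignments are rejected, the query base always sits at the same offset in the target as in the anchor, so no per-base alignment walk is needed.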

Precomputed Mapping and On-the-Fly Conversion: To improve query performance, coordinate mappings for large numbers of variants were precomputed offline using minimap2 and stored in the database. During user queries, the system first checks for an existing mapping. If no cached record is found, BLAT-based alignment is performed dynamically using the same window-based rules, and the resulting coordinates may subsequently be cached for future queries.
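The lookup order described above (cached mapping first, dynamic alignment as fallback, optional write-back) follows a standard cache-aside pattern. The sketch below uses an in-memory dictionary in place of the database and a caller-supplied function in place of the BLAT step. All names are hypothetical, and whether failed conversions are cached is not stated on this page; here they are not.

```python
from typing import Callable, Dict, Optional, Tuple

Coord = Tuple[str, int]          # (chromosome, position) in the target genome
Key = Tuple[str, str, int, str]  # (source genome, chromosome, position, target genome)


def convert(
    key: Key,
    cache: Dict[Key, Coord],
    align_fn: Callable[[Key], Optional[Coord]],
) -> Optional[Coord]:
    """Return a cached mapping if present, otherwise align and cache."""
    if key in cache:
        return cache[key]      # precomputed or previously cached result
    result = align_fn(key)     # stand-in for the on-the-fly alignment step
    if result is not None:
        cache[key] = result    # store successful conversions for reuse
    return result
```

A second query for the same key is then served from the cache without invoking the alignment function again.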

Converted coordinates indicate homologous genomic positions inferred from local sequence similarity. The presence of a mapped position does not imply that a corresponding variant exists or has been identified in the target genome. This framework supports coordinate-level comparison across genomes and is applicable to any genomic locus, independent of variant annotation status.