A LOCAL-GLOBAL GENE COMPARISON FOR ORTHOLOG DETECTION IN TWO CLOSELY RELATED EUKARYOTE SPECIES
Keywords:
Ortholog Detection Algorithms, Similarity Measures, Bipartite Graph Partitioning

Abstract
Ortholog detection has relied on comparing different gene features to build either a phylogenetic tree or an initial genome correspondence graph. Many pre-processing procedures have been applied to prune the graph structure before potential orthologs are clustered, and post-processing improvements have then contributed to precision above 90%. Although some algorithms yield high precision, improving it remains the main target of the comparative genomics community. In this paper, we present an ortholog detection algorithm that combines sequence homology, length and global genome rearrangements into a novel local-global gene dissimilarity measure for comparing two closely related eukaryote species. To take global genome rearrangements into account, we use the Locally Collinear Blocks reported by the “Multiple Alignment of Conserved Genomic Sequence with Rearrangements” software (MAUVE). We build a weighted, undirected, complete bipartite graph that represents the comparison of the two genomes under this dissimilarity measure. The pre-processing step eliminates all edges with weight over 20% of the minimum weight. Next, we resolve ambiguities by keeping matches within synteny blocks. Finally, in the clustering step we search for Best Unambiguous Subsets, which represent homology groups and pairs of orthologs. In an experiment with S. cerevisiae and S. bayanus, the algorithm achieves 98.45% true classifications.
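The graph construction and pruning steps can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the Gene record, the equal weighting of the three terms, the identities input (pairwise alignment identity in [0, 1]) and the per-gene reading of the 20% threshold (keep an edge if its weight is at most 1.2 times the smallest weight incident to the same gene) are all assumptions made only for illustration.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    length: int
    lcb: int  # id of the Locally Collinear Block holding the gene (hypothetical encoding)


def dissimilarity(a, b, identity):
    """Hypothetical local-global dissimilarity in [0, 1] (equal weights assumed)."""
    seq_term = 1.0 - identity                                   # sequence homology
    len_term = abs(a.length - b.length) / max(a.length, b.length)  # length difference
    lcb_term = 0.0 if a.lcb == b.lcb else 1.0                   # global rearrangement context
    return (seq_term + len_term + lcb_term) / 3.0


def build_graph(genome_a, genome_b, identities):
    """Complete bipartite graph as a dict: (gene in A, gene in B) -> weight."""
    return {
        (a.name, b.name): dissimilarity(a, b, identities.get((a.name, b.name), 0.0))
        for a in genome_a
        for b in genome_b
    }


def prune(graph, tolerance=0.20):
    """Keep edges within `tolerance` of the best edge of each gene in A (assumed rule)."""
    best = {}
    for (a, _), w in graph.items():
        best[a] = min(w, best.get(a, w))
    return {e: w for e, w in graph.items() if w <= best[e[0]] * (1.0 + tolerance)}


# Toy data, invented for this example only.
genome_a = [Gene("a1", 900, 1), Gene("a2", 1200, 2)]
genome_b = [Gene("b1", 900, 1), Gene("b2", 1200, 2)]
identities = {("a1", "b1"): 0.95, ("a2", "b2"): 0.90, ("a1", "b2"): 0.30}

graph = build_graph(genome_a, genome_b, identities)
pruned = prune(graph)
print(sorted(pruned))  # only the edges a1-b1 and a2-b2 survive pruning
```

On this toy input the four edges of the complete graph are reduced to two candidate pairs. The synteny-based disambiguation and the search for Best Unambiguous Subsets would then operate on the pruned graph and are not sketched here.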


