Statistical approaches to detecting and analyzing tandem repeats in genomic sequences.
-
Anisimova M
Institute of Applied Simulation, School of Life Sciences and Facility Management, Zürich University of Applied Sciences (ZHAW), Wädenswil, Switzerland.
-
Pečerska J
Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland; Department of Computer Science, ETH Zürich, Zürich, Switzerland.
-
Schaper E
Department of Computer Science, ETH Zürich, Zürich, Switzerland; Vital-IT Competency Center, Swiss Institute for Bioinformatics, Lausanne, Switzerland.
Published in:
- Frontiers in Bioengineering and Biotechnology, 2015
Language: English
Tandem repeats (TRs) are frequently observed in genomes across all domains of life. Evidence suggests that some TRs are crucial for proteins with fundamental biological functions and can be associated with virulence, resistance, and infectious or neurodegenerative diseases. Genome-scale systematic studies of TRs have the potential to unveil the core mechanisms governing TR evolution and the roles of TRs in shaping genomes. However, TR-related studies are often non-trivial because TR regions are heterogeneous and sometimes fast-evolving. In this review, we discuss these intricacies and their consequences. We present our recent contributions to computational and statistical approaches for TR significance testing, sequence profile-based TR annotation, TR-aware sequence alignment, phylogenetic analyses of TR unit number and order, and TR benchmarks. Importantly, all these methods rely explicitly on the evolutionary definition of a tandem repeat as a sequence of adjacent repeat units stemming from a common ancestor. The work discussed focuses on protein TRs but is generally applicable to nucleic acid TRs, which share similar features.
Open access status
-
gold
-
Persistent URL
-
https://sonar.ch/global/documents/55335