Lower bounds on multiple sequence alignment using exact 3-way alignment
Genre
Journal Article
Date
2007-04-30
Author
Colbourn, CJ
Kumar, S
Subject
Algorithms
Base Sequence
DNA
Molecular Sequence Data
Reproducibility of Results
Sensitivity and Specificity
Sequence Alignment
Sequence Analysis, DNA
Permanent link to this record
http://hdl.handle.net/20.500.12613/5623
DOI
10.1186/1471-2105-8-140
Abstract
Background: Multiple sequence alignment is fundamental. Exponential growth in computation time appears to be inevitable when an optimal alignment is required for many sequences. Exact costs of optimum alignments are therefore rarely computed. Consequently, much effort has been invested in algorithms for alignment that are heuristic, or explore a restricted class of solutions. These give an upper bound on the alignment cost, but it is equally important to determine the quality of the solution obtained. In the absence of an optimal alignment with which to compare, lower bounds may be calculated to assess the quality of the alignment. As more effort is invested in improving upper bounds (alignment algorithms), it is therefore important to improve lower bounds as well. Although numerous cost metrics can be used to determine the quality of an alignment, many are based on sum-of-pairs (SP) measures and their generalizations.

Results: Two standard and two new methods are considered for using exact 2-way and 3-way alignments to compute lower bounds on total SP alignment cost; one new method fares well with respect to accuracy, while the other reduces the computation time. The first employs exhaustive computation of exact 3-way alignments, while the second employs an efficient heuristic to compute a much smaller number of exact 3-way alignments. Calculating all 3-way alignments exactly and computing their average improves lower bounds on sum of SP cost in v-way alignments. However, judicious selection of a subset of all 3-way alignments can yield a further improvement with minimal additional effort. On the other hand, a simple heuristic to select a random subset of 3-way alignments (a random packing) yields accuracy comparable to averaging all 3-way alignments with substantially less computational effort.

Conclusion: Calculation of lower bounds on SP cost (and thus the quality of an alignment) can be improved by employing a mixture of 3-way and 2-way alignments.
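As a rough illustration of the bounds described in the abstract, the following Python sketch computes three lower bounds on the SP cost of a v-way alignment: the sum of exact 2-way costs, the average over all exact 3-way alignments (each pair of sequences lies in v - 2 triples), and a bound from a random packing of pair-disjoint triples, with the leftover pairs covered by 2-way alignments. The unit mismatch and gap costs, the function names, and the greedy packing procedure are assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
import random
from itertools import combinations, product

GAP = "-"

def pair_cost(a, b):
    # Assumed cost model: match 0, mismatch 1, letter-vs-gap 1, gap-vs-gap 0.
    if a == GAP and b == GAP:
        return 0
    if a == GAP or b == GAP:
        return 1
    return 0 if a == b else 1

def column_cost(col):
    # SP cost of one alignment column: sum over all pairs of rows.
    return sum(pair_cost(x, y) for x, y in combinations(col, 2))

def exact_sp_cost(seqs):
    """Optimal SP cost of aligning a few sequences by full dynamic programming."""
    k = len(seqs)
    lengths = [len(s) for s in seqs]
    # A move consumes one character from each sequence flagged with 1.
    moves = [m for m in product((0, 1), repeat=k) if any(m)]
    best = {(0,) * k: 0}
    # Lexicographic order guarantees every predecessor cell is already filled.
    for cell in product(*(range(n + 1) for n in lengths)):
        if not any(cell):
            continue
        candidates = []
        for move in moves:
            prev = tuple(c - m for c, m in zip(cell, move))
            if min(prev) < 0:
                continue
            col = [seqs[i][prev[i]] if move[i] else GAP for i in range(k)]
            candidates.append(best[prev] + column_cost(col))
        best[cell] = min(candidates)
    return best[tuple(lengths)]

def pairwise_bound(seqs):
    # Sum of exact 2-way costs over all pairs.
    return sum(exact_sp_cost([a, b]) for a, b in combinations(seqs, 2))

def all_triples_bound(seqs):
    # Every pair appears in v - 2 triples, so divide the total by v - 2.
    v = len(seqs)
    total = sum(exact_sp_cost(list(t)) for t in combinations(seqs, 3))
    return total / (v - 2)

def random_packing_bound(seqs, rng):
    # Greedily pick triples (in random order) that share no pair of sequences,
    # then cover every remaining pair with an exact 2-way alignment.
    v = len(seqs)
    uncovered = set(combinations(range(v), 2))
    triples = list(combinations(range(v), 3))
    rng.shuffle(triples)
    bound = 0
    for t in triples:
        edges = set(combinations(t, 2))
        if edges <= uncovered:
            bound += exact_sp_cost([seqs[i] for i in t])
            uncovered -= edges
    bound += sum(exact_sp_cost([seqs[i], seqs[j]]) for i, j in uncovered)
    return bound

if __name__ == "__main__":
    demo = ["ACGT", "AGT", "ACT", "CGT"]  # toy sequences, not from the paper
    print(pairwise_bound(demo))
    print(all_triples_bound(demo))
    print(random_packing_bound(demo, random.Random(0)))
```

Under this cost model each bound is valid because projecting any v-way alignment onto a pair or triple of sequences cannot cost less than the optimal alignment of that pair or triple. The packing variant needs far fewer 3-way alignments than the all-triples average, which is the trade-off the abstract points to.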
© 2007 Colbourn and Kumar; licensee BioMed Central Ltd.
Citation to related work
Springer Science and Business Media LLC
Has part
BMC Bioinformatics
ADA compliance
For Americans with Disabilities Act (ADA) accommodation, including help with reading this content, please contact scholarshare@temple.edu
http://dx.doi.org/10.34944/dspace/5605
Related items
Showing items related by title, author, creator and subject.
-
Evolutionary interactions between N-Linked glycosylation sites in the HIV-1 envelope
Poon, AFY; Lewis, FI; Kosakovsky Pond, SL; Frost, SDW; Pond, Sergei L. Kosakovsky|0000-0003-4817-4029 (2007-01-01)
The addition of asparagine (N)-linked polysaccharide chains (i.e., glycans) to the gp120 and gp41 glycoproteins of human immunodeficiency virus type 1 (HIV-1) envelope is not only required for correct protein folding, but also may provide protection against neutralizing antibodies as a "glycan shield." As a result, strong host-specific selection is frequently associated with codon positions where nonsynonymous substitutions can create or disrupt potential N-linked glycosylation sites (PNGSs). Moreover, empirical data suggest that the individual contribution of PNGSs to the neutralization sensitivity or infectivity of HIV-1 may be critically dependent on the presence or absence of other PNGSs in the envelope sequence. Here we evaluate how glycan-glycan interactions have shaped the evolution of HIV-1 envelope sequences by analyzing the distribution of PNGSs in a large-sequence alignment. Using a "covarion"-type phylogenetic model, we find that the rates at which individual PNGSs are gained or lost vary significantly over time, suggesting that the selective advantage of having a PNGS may depend on the presence or absence of other PNGSs in the sequence. Consequently, we identify specific interactions between PNGSs in the alignment using a new paired-character phylogenetic model of evolution, and a Bayesian graphical model. Despite the fundamental differences between these two methods, several interactions are jointly identified by both. Mapping these interactions onto a structural model of HIV-1 gp120 reveals that negative (exclusive) interactions occur significantly more often between colocalized glycans, while positive (inclusive) interactions are restricted to more distant glycans. Our results imply that the adaptive repertoire of alternative configurations in the HIV-1 glycan shield is limited by functional interactions between the N-linked glycans. This represents a potential vulnerability of rapidly evolving HIV-1 populations that may provide useful glycan-based targets for neutralizing antibodies. © 2007 Poon et al.
-
A cyclic nucleotide-gated channel mutation associated with canine daylight blindness provides insight into a role for the S2 segment Tri-Asp motif in channel biogenesis
Tanaka, N; Delemotte, L; Klein, ML; Komáromy, AM; Tanaka, JC (2014-02-21)
Cone cyclic nucleotide-gated channels are tetramers formed by CNGA3 and CNGB3 subunits; CNGA3 subunits function as homotetrameric channels but CNGB3 exhibits channel function only when co-expressed with CNGA3. An aspartic acid (Asp) to asparagine (Asn) missense mutation at position 262 in the canine CNGB3 (D262N) subunit results in loss of cone function (daylight blindness), suggesting an important role for this aspartic acid residue in channel biogenesis and/or function. Asp 262 is located in a conserved region of the second transmembrane segment containing three Asp residues designated the Tri-Asp motif. This motif is conserved in all CNG channels. Here we examine mutations in canine CNGA3 homomeric channels using a combination of experimental and computational approaches. Mutations of these conserved Asp residues result in the absence of nucleotide-activated currents in heterologous expression. A fluorescent tag on CNGA3 shows mislocalization of mutant channels. Co-expressing CNGB3 Tri-Asp mutants with wild type CNGA3 results in some functional channels; however, their electrophysiological characterization matches the properties of homomeric CNGA3 channels. This failure to record heteromeric currents suggests that Asp/Asn mutations affect heteromeric subunit assembly. A homology model of S1-S6 of the CNGA3 channel was generated and relaxed in a membrane using molecular dynamics simulations. The model predicts that the Tri-Asp motif is involved in non-specific salt bridge pairings with positive residues of S3/S4. We propose that the D262N mutation in dogs with CNGB3-day blindness results in the loss of these inter-helical interactions, altering the electrostatic equilibrium within the S1-S4 bundle. Because residues analogous to Tri-Asp in the voltage-gated Shaker potassium channel family were implicated in monomer folding, we hypothesize that destabilizing these electrostatic interactions impairs the monomer folding state in D262N mutant CNG channels during biogenesis. © 2014 Tanaka et al.
-
Organization and differential expression of the GACA/GATA tagged somatic and spermatozoal transcriptomes in Buffalo Bubalus bubalis
Srivastava, J; Premi, S; Kumar, S; Ali, S; Kumar, Sudhir|0000-0002-9918-8212 (2008-03-20)
Background: Simple sequence repeats (SSRs) of GACA/GATA have been implicated in the differentiation of sex chromosomes and in speciation. However, the organization of these repeats within genomes and transcriptomes, even in the best characterized organisms including human, remains unclear. The main objective of this study was to explore the buffalo transcriptome for its association with GACA/GATA repeats, and study the structural organization and differential expression of the GACA/GATA repeat tagged transcripts. Moreover, the distribution of GACA and GATA repeats in the prokaryotic and eukaryotic genomes was studied to highlight their significance in genome evolution. Results: We explored several genomes and transcriptomes, and observed total absence of these repeats in the prokaryotes, with their gradual accumulation in higher eukaryotes. Further, employing a novel microsatellite associated sequence amplification (MASA) approach using varying length oligos based on GACA and GATA repeats, we identified and characterized 44 types of known and novel mRNA transcripts tagged with these repeats from different somatic tissues, gonads and spermatozoa of water buffalo Bubalus bubalis. GACA was found to be associated with a higher number of transcripts than GATA. Exclusive presence of several GACA-tagged transcripts in a tissue or spermatozoa, and absence of the GATA-tagged ones in lung/heart, highlights their tissue-specific significance. Of all the GACA/GATA tagged transcripts, ∼30% demonstrated inter-tissue and/or tissue-spermatozoal sequence polymorphisms. Significantly, ∼60% of the GACA-tagged and all the GATA-tagged transcripts showed highest or unique expression in the testis and/or spermatozoa. Moreover, ∼75% of the GACA-tagged and all the GATA-tagged transcripts were found to be conserved across the species. Conclusion: The present study is a pioneering attempt at exploring the GACA/GATA tagged transcriptome in any mammalian species, highlighting their tissue, stage and species-specific expression profiles. Comparative analysis suggests the gradual accumulation of these repeats in the higher eukaryotes, and establishes the GACA richness of the buffalo transcriptome. This is envisaged to establish the roles of integral simple sequence repeats and tagged transcripts in gene expression or regulation. © 2008 Srivastava et al; licensee BioMed Central Ltd.