    Machine Learning Algorithms for Characterization and Prediction of Protein Structural Properties

    Name:
    Shapovalov_temple_0225E_13894.pdf
    Size:
    6.652Mb
    Format:
    PDF
    Genre
    Thesis/Dissertation
    Date
    2019
    Author
    Shapovalov, Maxim V
    Advisor
    Vucetic, Slobodan
    Committee member
    Vucetic, Slobodan
    Obradovic, Zoran
    Zhang, Kai
    Dunbrack, Roland L.
    Carnevale, Vincenzo
    Department
    Computer and Information Science
    Subject
    Computer Science
    Bioinformatics
    Clustering
    Loop Modeling
    Machine Learning
    Neural Network
    Protein
    Structural Biology
    Permanent link to this record
    http://hdl.handle.net/20.500.12613/2356
    
    DOI
    http://dx.doi.org/10.34944/dspace/2338
    Abstract
    Proteins are large biomolecules that serve as the functional building blocks of living organisms. The human genome contains about 22,000 protein-coding genes. Each gene encodes a unique protein sequence, typically 100 to 1,000 residues long, written in a 20-letter alphabet of amino acids. Each protein folds into a unique 3D shape that enables it to perform its function. Each protein structure consists of some number of helical segments, extended segments that assemble into sheets, and loops that connect these elements. Over the last two decades, machine learning methods, coupled with exponentially growing biological databases and computational power, have enabled significant progress in computational biology. In this dissertation, I carry out machine learning research on three major interconnected problems to advance protein structural biology. A separate chapter is devoted to each problem. After the three chapters, I conclude with a summary and directions for future work.

    Chapter 1 describes the design, training, and application of a convolutional neural network (SecNet) that achieves 84% accuracy on the 60-year-old problem of predicting protein secondary structure from a protein sequence. This accuracy is 2-3% better than any previous result; prior accuracy had risen only 5% in the last 20 years. We identified the key factors for successful prediction in a detailed ablation study. A paper submitted for publication includes our secondary-structure prediction software, data set generation, and training and testing protocols [1].

    Chapter 2 describes the design and development of a protocol for clustering beta turns, i.e., short structural motifs responsible for U-turns in protein loops. We identified 18 turn types, 11 of which are newly described [2]. We also developed a turn library and cross-platform software for assigning turns in new structures.

    In Chapter 3, I build on the results of these two problems and predict geometries in loops of unknown structure with custom residual neural networks (ResNets). I demonstrate solid results on (a) locating turns and predicting their 18 types and (b) predicting backbone torsion angles in loops. Given the recent progress in machine learning, these two results provide a strong foundation for successful loop modeling and encourage us to develop a new loop structure prediction program, a critical step in protein structure prediction and modeling.
    Collections
    Theses and Dissertations
