    Budgeted Online Kernel Classifiers for Large Scale Learning

Name: Wang_temple_0225E_10430.pdf
Size: 764.5 KB
Format: PDF
Genre: Thesis/Dissertation
Date: 2010
Author: Wang, Zhuang
Advisor: Vucetic, Slobodan
Committee members: Bai, Li; Latecki, Longin; Obradovic, Zoran
Department: Computer and Information Science
Subjects: Computer Science; Kernel Method; Large-scale Learning; Online Learning; Perceptron; Support Vector Machine
Permanent link to this record: http://hdl.handle.net/20.500.12613/3777
    
DOI: http://dx.doi.org/10.34944/dspace/3759
    Abstract
In an environment where new large-scale problems are emerging in various disciplines and pervasive computing applications are becoming more common, there is an urgent need for machine learning algorithms that can process increasing amounts of data using comparatively small computing resources in a computationally efficient way. Previous research has produced many successful learning algorithms that scale linearly or even sub-linearly with sample size and dimension, both in runtime and in space. However, linear or even sub-linear space scaling is often not sufficient, because it implies unbounded growth in memory with sample size. This opens another challenge: how to learn from large, or practically infinite, data sets or data streams using limited memory.

Online learning is an important learning scenario in which a potentially unlimited sequence of training examples is presented one example at a time and can be seen only in a single pass, as opposed to offline learning, where the whole collection of training examples is at hand. The objective is to learn an accurate prediction model from the training stream. Upon receiving each fresh example from the stream, online learning algorithms typically attempt to update the existing model without retraining. The invention of the Support Vector Machine (SVM) attracted considerable interest in adapting kernel methods to both offline and online learning. Typical online learning with kernel classifiers consists of observing a stream of training examples and including them as prototypes when specified conditions are met. However, such a procedure can result in unbounded growth in the number of prototypes. Besides the danger of exceeding physical memory, this also implies unlimited growth in both update and prediction time. To address this issue, in my dissertation I propose a series of kernel-based budgeted online algorithms, which have constant space and constant update and prediction time. This is achieved by maintaining a fixed number of prototypes under a memory budget.

Most previous work on budgeted online algorithms focuses on the kernel perceptron. In the first part of the thesis, I review and discuss these existing algorithms and then propose a kernel perceptron algorithm that maintains the budget by removing the prototype with the minimal impact on classification accuracy. This is achieved by dual use of cached prototypes for both model representation and validation. In the second part, I propose a family of budgeted online algorithms in the Passive-Aggressive (PA) style. Budget maintenance is achieved by introducing an additional constraint into the original PA optimization problem. A closed-form solution is derived for the budget maintenance and model update. In the third part, I propose a budgeted online SVM algorithm. The proposed algorithm guarantees that the optimal SVM solution is maintained on all prototype examples at all times. To maximize accuracy, prototypes are constructed to approximate the data distribution near the decision boundary. In the fourth part, I propose a family of budgeted online algorithms for multi-class classification. The proposed algorithms are based on the recently proposed SVM training algorithm Pegasos. I prove that the gap between budgeted Pegasos and the optimal SVM solution depends directly on the average model degradation due to budget maintenance.
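The budgeted online loop described above can be made concrete with a short sketch. The following is a minimal, illustrative budgeted kernel perceptron in Python, assuming an RBF kernel and oldest-first prototype removal as a simple stand-in for the dissertation's accuracy-based removal criterion; the class and method names (`BudgetedKernelPerceptron`, `partial_fit`) are hypothetical, not taken from the thesis.

```python
import numpy as np

def rbf_kernel(x, z, gamma=1.0):
    """Gaussian RBF kernel k(x, z) = exp(-gamma * ||x - z||^2)."""
    d = np.asarray(x) - np.asarray(z)
    return np.exp(-gamma * np.dot(d, d))

class BudgetedKernelPerceptron:
    """Mistake-driven online kernel perceptron with at most `budget` prototypes.

    Illustrative sketch only: the dissertation removes the prototype with
    the minimal impact on classification accuracy, whereas this version
    drops the oldest prototype. Either way, update and prediction time
    stay constant because the prototype set never exceeds the budget.
    """

    def __init__(self, budget=100, gamma=1.0):
        self.budget = budget
        self.gamma = gamma
        self.prototypes = []  # list of (x, y) pairs kept under the budget

    def decision(self, x):
        # f(x) = sum over prototypes of y_i * k(x_i, x)
        return sum(y * rbf_kernel(p, x, self.gamma) for p, y in self.prototypes)

    def partial_fit(self, x, y):
        """Process one streamed example (y in {-1, +1}) in a single pass."""
        if y * self.decision(x) <= 0:      # prediction mistake: add a prototype
            self.prototypes.append((x, y))
            if len(self.prototypes) > self.budget:
                self.prototypes.pop(0)     # budget maintenance by removal

    def predict(self, x):
        return 1 if self.decision(x) >= 0 else -1
```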
Following this analysis, I study greedy multi-class budget maintenance methods based on removal, projection, and merging of support vectors. In each of these four parts, the proposed algorithms are experimentally evaluated against state-of-the-art competitors. The results show that the proposed budgeted online algorithms outperform the competing algorithms and achieve accuracy comparable to their non-budgeted counterparts while being extremely computationally efficient.
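Of the three budget-maintenance operations just mentioned, merging is the least obvious, so a small sketch may help. The Python function below is a generic, illustrative RBF-kernel merge of two support vectors into one pseudo-prototype, not the dissertation's derivation for Pegasos: the merged point is the coefficient-weighted mean of the pair (assuming both coefficients share a sign), and its coefficient is set so the merged prototype reproduces the pair's combined contribution at the new point. The name `merge_svs` is hypothetical.

```python
import numpy as np

def rbf_kernel(x, z, gamma=1.0):
    """Gaussian RBF kernel k(x, z) = exp(-gamma * ||x - z||^2)."""
    d = np.asarray(x) - np.asarray(z)
    return np.exp(-gamma * np.dot(d, d))

def merge_svs(x_i, a_i, x_j, a_j, gamma=1.0):
    """Merge support vectors (x_i, a_i) and (x_j, a_j) into one prototype.

    Illustrative only. The merged point z is the |coefficient|-weighted
    mean of the two inputs, and its coefficient a_z is chosen so that
    a_z * k(z, z) matches the old pair's contribution
    a_i * k(x_i, z) + a_j * k(x_j, z) at the merged point.
    For an RBF kernel, k(z, z) = 1.
    """
    w_i, w_j = abs(a_i), abs(a_j)
    z = (w_i * np.asarray(x_i) + w_j * np.asarray(x_j)) / (w_i + w_j)
    a_z = (a_i * rbf_kernel(x_i, z, gamma)
           + a_j * rbf_kernel(x_j, z, gamma)) / rbf_kernel(z, z, gamma)
    return z, a_z
```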
Collections: Theses and Dissertations
