• Login
    View Item 
    •   Home
    • Theses and Dissertations
    • Theses and Dissertations
    • View Item
    •   Home
    • Theses and Dissertations
    • Theses and Dissertations
    • View Item
    JavaScript is disabled for your browser. Some features of this site may not work without it.

    Browse

    All of TUScholarShareCommunitiesDateAuthorsTitlesSubjectsGenresThis CollectionDateAuthorsTitlesSubjectsGenres

    My Account

    LoginRegister

    Help

    AboutPeoplePoliciesHelp for DepositorsData DepositFAQs

    Statistics

    Most Popular ItemsStatistics by CountryMost Popular Authors

    DESIGN OF A KEYWORD SPOTTING SYSTEM USING MODIFIED CROSS-CORRELATION IN THE TIME AND THE MFCC DOMAIN

    • CSV
    • RefMan
    • EndNote
    • BibTex
    • RefWorks
    Thumbnail
    Name:
    Anifowose_temple_0225M_10918.pdf
    Size:
    1.389Mb
    Format:
    PDF
    Download
    Genre
    Thesis/Dissertation
    Date
    2012
    Author
    Anifowose, Olakunle
    Advisor
    Yantorno, Robert E.
    Committee member
    Picone, Joseph
    Silage, Dennis
    Department
    Electrical and Computer Engineering
    Subject
    Electrical Engineering
    Cross-correlation
    Matching Score
    Mfcc
    Time
    Zrr
    Permanent link to this record
    http://hdl.handle.net/20.500.12613/694
    
    Metadata
    Show full item record
    DOI
    http://dx.doi.org/10.34944/dspace/676
    Abstract
    Abstract A Keyword Spotting System (KWS) is a system that recognizes predefined keywords in spoken utterances or written documents. The objective is to obtain the highest possible keyword detection rate without increasing the number of false detections in a system. The common approach to keyword spotting is the use of a Hidden Markov Model (HMM). These are usually complex systems which require training speech data. The Typical HMM approach uses garbage templates or HMM models to match non-keyword speech and non-speech sounds. The purpose of this research is to design a simple Keyword Spotting System. The system will be designed to spot English words and should be easily adaptable to other languages There are many challenges in designing a keyword spotting system such as variations in speech like pitch, loudness, timbre that make recognition difficult. There can be wide variations in utterances even from the same speaker. In this research, the use of cross-correlation, as an alternative means for detecting keywords in an utterance, was investigated. This research also involves the modeling of a global keyword using a quantized dynamic time warping algorithm, which can function effectively with multi-speakers. The global keyword is an aggregation of the features from several occurrences of the same keyword. This research also investigates the effect of pitch normalization on keyword detection. The use of cross-correlation as a method for keyword spotting was investigated in both the time and MFCC domain. In the time domain the global keyword was cross-correlated with a pitch-normalized utterance. A zero lag ratio (the ratio of the power around the zero lag obtained from a cross correlation to the power in the rest of the signal is computed) was computed for each speech frame, a threshold was then used to determine if the keyword is present. For the MFCC domain the MFCC features of each keyword were computed, normalized and cross-correlated with the normalized MFCC features of portions of the utterance of the same size as the keyword. Cross-correlation of MFCC features of the keyword with that of each portion of the utterance yields a single value between 0-1. The portion with the highest value is usually the location of the keyword. Results in the time domain varied from keyword to keyword, some words showed a 60% hit rate while the average obtained from various keywords from the Call Home database had an average of 41%. Cross-correlation of the keywords and utterance in the MFCC domain yielded a 66% hit rate in test conducted on all different keywords in the Call Home and Switchboard corpus. The system accuracy is keyword dependent with some keywords having an 85% hit rate
    ADA compliance
    For Americans with Disabilities Act (ADA) accommodation, including help with reading this content, please contact scholarshare@temple.edu
    Collections
    Theses and Dissertations

    entitlement

     
    DSpace software (copyright © 2002 - 2023)  DuraSpace
    Temple University Libraries | 1900 N. 13th Street | Philadelphia, PA 19122
    (215) 204-8212 | scholarshare@temple.edu
    Open Repository is a service operated by 
    Atmire NV
     

    Export search results

    The export option will allow you to export the current search results of the entered query to a file. Different formats are available for download. To export the items, click on the button corresponding with the preferred download format.

    By default, clicking on the export buttons will result in a download of the allowed maximum amount of items.

    To select a subset of the search results, click "Selective Export" button and make a selection of the items you want to export. The amount of items that can be exported at once is similarly restricted as the full export.

    After making a selection, click one of the export format buttons. The amount of items that will be exported is indicated in the bubble next to export format.