• VISUAL AND SEMANTIC KNOWLEDGE TRANSFER FOR NOVEL TASKS

      Guo, Yuhong; Shi, Justin Y.; Vucetic, Slobodan; Dragut, Eduard Constantin; Du, Liang (Temple University. Libraries, 2019)
      Data is a critical component in a supervised machine learning system. Many successful applications of learning systems on various tasks are based on a large amount of labeled data. For example, deep convolutional neural networks have surpassed human performance on ImageNet classification, which consists of millions of labeled images. However, one challenge in conventional supervised learning systems is their generalization ability. Once a model is trained on a specific dataset, it can only perform the task on those \emph{seen} classes and cannot be used for novel \emph{unseen} classes. In order to make the model work on new classes, one has to collect and label new data and then re-train the model. However, collecting data and labeling them is labor-intensive and costly, in some cases, it is even impossible. Also, there is an enormous amount of different tasks in the real world. It is not applicable to create a dataset for each of them. These problems raise the need for Transfer Learning, which is aimed at using data from the \emph{source} domain to improve the performance of a model on the \emph{target} domain, and these two domains have different data or different tasks. One specific case of transfer learning is Zero-Shot Learning. It deals with the situation where \emph{source} domain and \emph{target} domain have the same data distribution but do not have the same set of classes. For example, a model is given animal images of `cat' and `dog' for training and will be tested on classifying 'tiger' and 'wolf' images, which it has never seen. Different from conventional supervised learning, Zero-Shot Learning does not require training data in the \emph{target} domain to perform classification. This property gives ZSL the potential to be broadly applied in various applications where a system is expected to tackle unexpected situations. In this dissertation, we develop algorithms that can help a model effectively transfer visual and semantic knowledge learned from \emph{source} task to \emph{target} task. More specifically, first we develop a model that learns a uniform visual representation of semantic attributes, which help alleviate the domain shift problem in Zero-Shot Learning. Second, we develop an ensemble network architecture with a progressive training scheme, which transfers \emph{source} domain knowledge to the \emph{target} domain in an end-to-end manner. Lastly, we move a step beyond ZSL and explore Label-less Classification, which transfers knowledge from pre-trained object detectors into scene classification tasks. Our label-less classification takes advantage of word embeddings trained from unorganized online text, thus eliminating the need for expert-defined semantic attributes for each class. Through comprehensive experiments, we show that the proposed methods can effectively transfer visual and semantic knowledge between tasks, and achieve state-of-the-art performances on standard datasets.