Efficient human-machine interaction for cooperative learning of shared categories

Project details

Project start (month / year):
Project end (month / year):



Artificial Intelligence

Research fields of the university

DFG research fields

Teaching and research areas according to official statistics


Successful collaboration between a system and a human requires shared visual concepts for referring to objects in the environment. When a system is delivered, its pre-trained concepts will only partially match the concepts required to solve user-specific tasks in an unfamiliar environment. Therefore, the system must be able to learn new concepts and adapt existing ones over time. The goal of this project is to research a communication strategy for efficient interactive learning of visual categories and to implement it in an online system. A system for online learning of visual categories has been proposed in the literature, in which the human rotates an object in front of a camera and provides the corresponding shape and color category labels via speech. The system learns to represent each category individually by selecting typical prototypes and relevant feature dimensions. For a new input object, the system names the detected categories, and the user can correct the system output. This minimal form of supervision by the user and output by the system makes interactive learning quite difficult. For example, the visual support of a category named by the user might be only a part of an object, which is potentially occluded in some of its views. Also, the set of categories is usually not homogeneous: categories can be of different types and relate to each other. The user can therefore help the system learn a concept by directing its attention more specifically to relevant parts of an object, e.g. via pointing gestures, occlusion, or gaze direction, and by providing meta-data on the categories and their interactions. From the opposite perspective, a more verbose system output could help the user to better judge and control the learning process and, ultimately, to develop trust in the system. The system should therefore be able to intuitively visualize its current representation of a category, e.g. by showing positive and negative examples and highlighting relevant parts, or even to actively ask the user for more examples and explanations.

The goal of this research project is to develop a natural, efficient learning process between human and machine, together with a flexible underlying representation based on deep learning features, to establish new shared concepts and adapt existing ones. The user should provide richer information about the concepts to be learned than just their names, e.g. meta-information on categories or an indication of relevant object parts. The system should make use of this meta-information, which might also be pre-defined in ontologies. The system should account for potential self-occlusion of the visual support of a concept. The system should visualize its understanding of a concept in an intuitive way and actively ask for further information and examples. The ease of teaching new concepts and the success of their application should be measured.

Last updated 2021-18-01 at 13:14