2 Projects, page 1 of 1

  • Funder: CHIST-ERA. Project Code: M2CR
    Partners: Computer Vision Center, Université de Montréal / LISA, Université du Mans / LIUM

    Communication is one of the necessary conditions for developing intelligence in living beings. Humans use several modalities to exchange information: speech and written text (each in many languages), gestures, images, and more. There is evidence that human learning is more effective when several modalities are used, and there is a large body of research on making computers process these modalities and, ultimately, understand human language. These modalities have, however, generally been addressed independently or at most in pairs, even though merging information from multiple modalities is best done at the highest levels of abstraction, which deep learning models are trained to capture. The M2CR project aims at developing a revolutionary approach that combines all these modalities and their respective tasks in one unified architecture based on deep neural networks, including both a discriminative and a generative component across multiple levels of representation. Our system will jointly learn from resources in several modalities, including but not limited to text in several languages (European languages, Chinese and Arabic), speech and images. In doing so, the system will learn one common semantic representation of the underlying information, both at a channel-specific level and at a higher, channel-independent level. Pushing these ideas to large scale, e.g. training on very large corpora, the M2CR project has the ambition to advance the state of the art in human language understanding (HLU). M2CR will address all major HLU tasks with one unified architecture: speech understanding and translation, multilingual image retrieval and description, and more. The project will collect existing multimodal and multilingual corpora, extend them as needed, and make them freely available to the community. M2CR will also define shared tasks to set up a common evaluation framework and ease research for institutions beyond the partners of this consortium. All developed software and tools will be open source. By these means, we hope to help advance the field of human language understanding. (A minimal, hypothetical code sketch of the shared-representation idea appears after the project list below.)

  • Funder: CHIST-ERA. Project Code: IGLU
    Partners: Inria Bordeaux Sud-Ouest / Flowers Team, University of Mons / Numediart Research Institute, University of Zaragoza, Université de Sherbrooke, Université de Lille 1, KTH Royal Institute of Technology

    Language is an ability that develops in young children through joint interaction with their caretakers and their physical environment. At this level, human language understanding can be described as interpreting and expressing semantic concepts (e.g. objects, actions and relations) through what can be perceived, or inferred, from the current context in the environment. Previous work in artificial intelligence has failed to address the acquisition of such perceptually grounded knowledge in virtual agents (avatars), mainly because they lack physical embodiment (the ability to interact physically) and dialogue and communication skills (the ability to interact verbally). We believe that robotic agents are better suited to this task, and that interaction is such an important aspect of human language learning and understanding that pragmatic knowledge (identifying or conveying intention) must be present to complement semantic knowledge. Through a developmental approach in which knowledge grows in complexity, driven by multimodal experience and language interaction with a human, we propose an agent that incorporates models of dialogue, human emotions and intentions into its decision-making process. This will lead to anticipation and reaction based not only on the agent's internal state (its own goal and intention, its perception of the environment) but also on the perceived state and intention of the human interactant. This will be made possible by developing advanced machine learning methods (combining developmental, deep and reinforcement learning) to handle large-scale multimodal inputs, in addition to leveraging state-of-the-art components of a language-based dialogue system available within the consortium. Learned skills and knowledge will be evaluated with an integrated architecture in a culinary use case, and novel databases enabling research in grounded human language understanding will be released. IGLU gathers an interdisciplinary consortium of committed and experienced researchers in machine learning, neuroscience and cognitive science, developmental robotics, speech and language technologies, and multimodal/multimedia signal processing. We expect key impacts on the development of more interactive and adaptable systems that share our everyday environment. (A hypothetical sketch of such an interaction loop also follows the project list.)
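
The M2CR abstract describes per-modality ("channel-specific") encoders feeding a higher, channel-independent semantic representation learned jointly from text, speech and images. As a rough illustration only, the sketch below shows one common way to wire such a setup in plain PyTorch; the module names, dimensions and the contrastive pairing objective are assumptions made for this sketch, not M2CR's actual design.

```python
# Hypothetical sketch (not M2CR's code): channel-specific encoders project
# each modality into a shared, channel-independent semantic space, trained
# with a simple contrastive objective on paired examples.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityEncoder(nn.Module):
    """Channel-specific level: one encoder per modality."""
    def __init__(self, input_dim: int, hidden_dim: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
        )

    def forward(self, x):
        return self.net(x)

class SharedSemanticSpace(nn.Module):
    """Channel-independent level: map every modality into one common space."""
    def __init__(self, hidden_dim: int = 512, shared_dim: int = 256):
        super().__init__()
        self.proj = nn.Linear(hidden_dim, shared_dim)

    def forward(self, h):
        return F.normalize(self.proj(h), dim=-1)

def contrastive_loss(a, b, temperature: float = 0.07):
    """Pull paired cross-modal embeddings together, push unpaired ones apart."""
    logits = a @ b.t() / temperature
    targets = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, targets)

# Toy usage with made-up feature dimensions for text (300) and images (2048).
text_enc, image_enc = ModalityEncoder(300), ModalityEncoder(2048)
shared = SharedSemanticSpace()
text, image = torch.randn(8, 300), torch.randn(8, 2048)
loss = contrastive_loss(shared(text_enc(text)), shared(image_enc(image)))
loss.backward()
```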
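The IGLU abstract describes an agent whose next action depends both on its internal state and on the intention it attributes to the human partner. The skeleton below is a hypothetical illustration of that interaction loop in the culinary use case; every function name and the rule-based intention heuristic are invented for illustration, and in the project a learned policy (combining developmental, deep and reinforcement learning) would take the place of the hand-written rules.

```python
# Illustrative skeleton only (names and structure are assumptions, not IGLU's
# released code): one step of an interaction loop in which the agent's action
# depends on its internal state and on the estimated intention of the human.
from dataclasses import dataclass, field
import random

@dataclass
class AgentState:
    goal: str = "assist with recipe"              # the agent's own goal/intention
    beliefs: dict = field(default_factory=dict)   # its perception of the environment

def perceive(observation: dict) -> dict:
    """Fuse multimodal input (speech transcript, vision labels, gesture cues)."""
    return {k: v for k, v in observation.items() if v is not None}

def estimate_human_intention(percept: dict) -> str:
    """Pragmatic level: infer what the human wants from the fused percept."""
    utterance = percept.get("speech", "")
    return "request_object" if "pass me" in utterance else "inform"

def choose_action(state: AgentState, human_intention: str) -> str:
    """Policy conditioned on the internal state AND the perceived human intention.
    A learned RL policy would replace these hand-written rules."""
    if human_intention == "request_object":
        return "hand_over(" + state.beliefs.get("vision", "unknown_object") + ")"
    return random.choice(["ask_clarification", "describe_scene"])

# One step of the culinary use-case interaction loop.
state = AgentState()
obs = {"speech": "pass me the bowl", "vision": "bowl", "gesture": "point"}
state.beliefs = perceive(obs)
print(choose_action(state, estimate_human_intention(state.beliefs)))
```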
