You’re in a meeting making a plan. Everyone is taking notes, but the conversation roams, going from one item, and one speaker, to another, and as the hour comes to a close, it’s hard to remember who said what and which assignments were doled out to whom. Some of the questions that came up went unanswered. Worse than that, despite all the smarts in the room, several complications were overlooked and nobody noticed.
It doesn’t have to be this way, say the researchers in the newly formed Cognitive and Immersive Systems Laboratory at the Curtis R. Priem Experimental Media and Performing Arts Center, aka CISL@EMPAC. The lab is a collaboration between IBM Research and Rensselaer Polytechnic Institute (RPI) that aims to pioneer new frontiers in immersive cognitive systems as an aid to group problem-solving and decision-making.
Hui Su, director of the new lab, puts the problem this way:
Today, when we need to make a decision as a group in a meeting room, the only thing we have is a projector, personal computers, and a whiteboard. We really want to change that environment in a way that will help people make more intelligent decisions.
Let’s go back to that meeting, only this time with the vision of CISL@EMPAC. In this version, the participants enter a cognitive and immersive room. The room acts as a facilitator. It records the contributions of the participants and prepares a summary of the conversation and a list of action items. It calls up specific information when queried and presents it to the group. And it offers information that would otherwise be overlooked.
Such a cognitive and immersive space could be useful in any scenario where a group of people must share information and collaborate on a decision. CISL wants to develop cognitive situations rooms, examples of which include cognitive classrooms, cognitive meeting rooms, a cognitive design studio, and a cognitive diagnosis room.
This vision requires innovation. Su outlines three major attributes of a cognitive space, each of which poses technical challenges.
In brief, the room must:
- understand natural, long-term, multimodal group interactions
- generate natural responses through multimodal story-telling
- contribute to the discussion with relevant context and points of view, facilitate the discussion, and support the decision
In each case, Su outlined realistic initial goals for CISL.
A room that understands natural, long-term, multimodal group interactions will understand natural human expressions (speech, facial expressions, gestures) and dialogues, and should be equipped with computer interfaces that are easy to use and allow people to share information.
One goal of CISL is to develop new computer interfaces that help groups of people working in daily decision-making environments.
An example of one such device is the Campfire, a tool developed at EMPAC that allows groups of people to share information in a projection device shaped much like a large, deep fire pit. The Campfire projects information onto the continuous interior cylindrical wall as well as the flat circular floor of the device. Users gathered round the Campfire can collaboratively view and manipulate different types of data projected onto the wall and floor, with the edge between the surfaces acting as a blending site. Here’s what Su has to say about the Campfire:
Campfire is a device that allows people to collaborate with the same information – they can not only look at the same thing, but they can also look from different perspectives, and they can play with the information they are discussing. We need more devices like this that serve this purpose in group decision-making contexts.
To meet the need for natural multimodal dialogue management, CISL can draw on emerging technologies in speech recognition, face recognition, and natural language processing, including the IBM Watson Deep Q&A system. But, as an example of the ground yet to be gained, Su pointed out that while Watson proved itself unbeatable in the Jeopardy context of short-term question-and-answer, the system needs to be extended to carry out a sustained conversation with a group of people. As Su says:
In a meeting, we have long-term conversations that involve speech, gestures, and facial expressions. Building on existing technologies, we want to challenge ourselves to create something new around long-term natural multimodal dialogue management among people.
A room that can generate natural multimodal responses can help to present information or a point of view, or tell a story. For example, if a group of people meet to decide which demonstrations of cognitive computing research should be offered at an upcoming event, the room could draw upon a slate of known options, present images of the options, determine which representatives of the research would be available to offer the demonstrations during the event, and contact those representatives to request their participation. The first goal for this thread is to create technologies that can generate multimodal narratives, which involves natural language generation to communicate with the human participants in the meeting, coupled with presentation of multimedia information.
A room that can contribute to the discussion could pace the conversation, offer relevant information, or generate a summary of the meeting or action items. A first goal for that attribute might be generation of a multimodal summary of the meeting.
Each of the attributes of the cognitive and immersive environment CISL envisions is very challenging, Su said. But the goals are within reach.