Fujitsu, OIST Team for AI R&D

October 17, 2016

The Okinawa Institute of Science and Technology Graduate University (OIST) and Fujitsu Laboratories have commenced joint research to develop reinforcement learning algorithms with human-like applied skills, putting to use the latest neuroscience knowledge.

Recently, a variety of successful cases has put the spotlight on reinforcement learning, in which a computer acquires an action selection policy suited to the environment through trial and error, based on rewards for certain actions. With reinforcement learning techniques to date, however, the designer had to specify the information of interest beforehand, and the learning process had to be done over again for each problem, limiting applicability in the real world.

In this joint research, the partners will look at how the human brain learns, and incorporate those mechanisms into reinforcement learning algorithms, with the goal of producing an artificial intelligence (AI) with human-like applied skills to tackle a wide range of real-world problems.


Machine learning, which builds systems that perform a variety of tasks from data, has made practical progress in areas such as image and voice recognition, and now forms the core of AI technology. One particularly appealing subcategory is reinforcement learning, in which the computer acquires an action-selection policy adapted to an environment through trial and error, based on rewards for certain actions.
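As a rough illustration of what acquiring an action-selection policy through trial and error can look like, the following is a minimal sketch of tabular Q-learning, a standard textbook reinforcement learning method. It is not the algorithm under development in this joint research. The five-cell corridor environment, the function names, and all parameter values are assumptions made only for this example.

```python
import random

# Hypothetical toy environment: a corridor of cells 0..4, with the goal at cell 4.
N_STATES = 5
ACTIONS = (-1, +1)  # index 0 moves left, index 1 moves right


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0  # reward is given only on reaching the goal
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values by trial and error (Q-learning)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when both values are tied.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Move the estimate toward reward plus discounted future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every non-goal cell."""
    return [ACTIONS[0] if row[0] > row[1] else ACTIONS[1] for row in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this sketch the designer fixes the state representation and the reward in advance, and the learned table applies only to this one corridor. Those are the kinds of limitations the research described here sets out to address.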


The human brain is capable of learning applied skills: it can select what is important from different kinds of information, apply past learning to new problems, and choose as needed among behaviors suited to a particular situation or behaviors that offer greater certainty and safety. For example, a person in a crowd can instantly identify the people or obstacles they need to watch out for, depending on the direction they wish to take, and avoid collisions. A person who already knows how to play chess can generally pick up shogi (a Japanese game similar to chess) quickly. Moreover, a good player can make an appropriate choice according to the situation, judging whether a standard move will do or whether a move based on deeper thought is required. Existing reinforcement learning techniques, by contrast, need a designer to specify the information of interest beforehand and must be retrained for every problem, which limits their applicability in the real world.

About the Joint Research

OIST and Fujitsu Laboratories will focus on these learning mechanisms of the human brain and, drawing on the latest neuroscience insights, incorporate them into reinforcement learning algorithms with greater applied skills. The aim is to create an AI that can modulate itself autonomously, unlike earlier AI that needed human intervention.

Figure: Image of joint research results

Specifically, they plan to develop new technologies in the following three areas, chosen from among the issues facing practical application because demand for them is high:

  1. Technology to automatically extract information suitable for reinforcement learning from enormous volumes of data that change automatically.
  2. Transfer learning technology to apply past experience to create an action selection policy for a separate problem.
  3. Cooperative-concurrent reinforcement learning technology that selects, from among many policies, the one appropriate to the current conditions and uses it to take an action.

Professor Kenji Doya of OIST and his research team will focus on mathematical modeling of neural computation structures from a neuroscience perspective, and apply that to reinforcement learning algorithms. Fujitsu Laboratories will jointly develop algorithms from an optimization and control engineering perspective, and investigate implementation methods that make full use of computing resources.

Future Plans

Moving forward, OIST and Fujitsu Laboratories will begin work on two problems: handling massive volumes of input data, and selecting actions when multiple policies are learned in parallel, such as policies that adapt flexibly to changes in the environment and policies that respond more conservatively.

Fujitsu Laboratories aims to build on the results of this joint research to develop AI solutions for real-world applications, such as ICT system management and energy management. Computers will thereby be able to acquire policies adapted to their environments more efficiently, without manual setup or adjustment.

Fujitsu Laboratories also aims to develop new technologies that can serve as the core of Fujitsu's AI technology, Human Centric AI Zinrai.
