
Human Sensing & Interaction

Designing the ultimate harmony between systems and human beings

Natural User Interface

This theme covers the development of “Natural User Interface” technologies, in which natural user actions such as “seeing,” “speaking,” and “moving the body” serve as inputs, and outputs are returned to users intuitively, without conscious effort. For example, haptics, one of these output technologies, is already applied in games: by combining high-definition dynamic haptic feedback with other sensory presentations designed around human sensory characteristics, it delivers striking realism and a wide range of immersive experiences. We are also creating new experiential value at the interface between people and devices by developing output technologies such as sound generation and odor presentation, and input technologies such as gaze UI and voice UI. In addition, we are focusing on accessibility technology, applying these five-senses technologies to provide user-friendly experiences for people with disabilities.

Image of access to the senses with Natural User Interface
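As an illustration of how haptic feedback can be derived from content, the following sketch converts a game-audio track into an actuator drive signal by tracking the audio’s energy envelope and modulating a carrier near the skin’s peak vibrotactile sensitivity. The function name, parameter values, and 200 Hz carrier are illustrative assumptions, not Sony’s implementation.

# Sketch: derive a vibration signal from game audio by tracking its
# energy envelope and re-centering it on a frequency the skin senses
# well (~200 Hz, near peak vibrotactile sensitivity). Illustrative only.
import numpy as np

def audio_to_haptics(audio: np.ndarray, sr: int, carrier_hz: float = 200.0,
                     window: int = 256) -> np.ndarray:
    """Turn a mono audio track into a drive signal for a vibration actuator."""
    # Short-window RMS tracks how "impactful" the sound is moment to moment.
    padded = np.pad(audio.astype(float) ** 2, (window // 2, window // 2))
    envelope = np.sqrt(np.convolve(padded, np.ones(window) / window, "valid"))
    envelope = envelope[: len(audio)] / max(envelope.max(), 1e-9)
    # Modulate a carrier the actuator (and skin) responds to strongly.
    t = np.arange(len(audio)) / sr
    return envelope * np.sin(2 * np.pi * carrier_hz * t)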

Motion Sensing

By capturing human motion in real time, a CG avatar can be controlled, or audio and visual content can be rendered accordingly. Using compact, low-power, inexpensive inertial sensors (accelerometers and gyroscopes) together with sensor fusion and deep learning, we are developing motion-sensing technologies that are easy for both creators and users to use. Specifically, we are addressing R&D themes such as self-position estimation that accurately tracks a pedestrian’s movement, and motion capture that estimates whole-body motion from a minimal number of sensors. These technologies will be utilized in products and services including, but not limited to, Sony’s entertainment business.

Image of Motion Sensing from input to output
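A minimal example of the sensor fusion mentioned above: a complementary filter that combines a gyroscope’s short-term accuracy with an accelerometer’s long-term stability to estimate pitch. The function and the 0.98 blend weight are illustrative assumptions, not Sony’s method.

# Sketch: complementary filter fusing gyroscope and accelerometer
# readings into a drift-corrected pitch estimate. Illustrative only.
import math

def complementary_filter(gyro_rate, accel, pitch_prev, dt, alpha=0.98):
    """Fuse one gyro/accelerometer sample into a pitch estimate (radians).

    gyro_rate: angular velocity around the pitch axis (rad/s)
    accel:     (ax, ay, az) accelerometer reading (m/s^2)
    """
    # Integrate the gyroscope: accurate short-term, but drifts over time.
    pitch_gyro = pitch_prev + gyro_rate * dt
    # Derive pitch from gravity: noisy, but drift-free in the long term.
    ax, ay, az = accel
    pitch_accel = math.atan2(-ax, math.hypot(ay, az))
    # Blend: trust the gyro for fast motion, the accelerometer for drift.
    return alpha * pitch_gyro + (1.0 - alpha) * pitch_accel

# Usage: update the estimate at each 100 Hz sensor sample.
pitch = 0.0
for gyro_rate, accel in [(0.01, (0.0, 0.1, 9.8)), (0.02, (0.1, 0.0, 9.8))]:
    pitch = complementary_filter(gyro_rate, accel, pitch, dt=0.01)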

Vital Sensing and Emotion Estimation

We have been working on vital sensing and emotion estimation technologies that allow us to “get closer to people” and understand them better. Our vital sensing technology builds on device development and signal processing, while our emotion estimation technology derives from experiments grounded in machine learning, neuroscience, and physiological psychology. By combining the two, we are advancing fundamental technologies for accurate personalization services that respond to users’ real-time emotional changes, as well as feedback for the evaluation and production of entertainment content.

Schematic of Vital Sensing and Emotion Estimation
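The pipeline described above could be sketched as follows: heart-rate-variability features are computed from a wearable’s inter-beat intervals and scored by a pre-trained classifier. The feature set and the two-class arousal model are assumptions for illustration, not the actual system.

# Sketch: HRV features from inter-beat intervals, scored by any
# scikit-learn-style classifier. Illustrative assumptions only.
import numpy as np

def hrv_features(ibi_ms: np.ndarray) -> np.ndarray:
    """Compute basic HRV features from inter-beat intervals (milliseconds)."""
    mean_hr = 60000.0 / ibi_ms.mean()       # average heart rate (bpm)
    sdnn = ibi_ms.std()                     # overall beat-to-beat variability
    diffs = np.diff(ibi_ms)
    rmssd = np.sqrt(np.mean(diffs ** 2))    # short-term variability
    return np.array([mean_hr, sdnn, rmssd])

def estimate_arousal(ibi_ms: np.ndarray, model) -> float:
    """Return P(high arousal) from a pre-trained two-class classifier."""
    features = hrv_features(ibi_ms).reshape(1, -1)
    return float(model.predict_proba(features)[0, 1])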

Remote Spectator Assistance System

We are developing a spectator assistance system that connects remote spectators to sports and live-music events with a “sense of being there” and shared “enthusiasm.” By sensing the movements and mental states of the audience in real time, we can quantify degrees of concentration and interest that are difficult for the human eye to grasp accurately. By expressing these measures through images and sounds that make remote users feel as if they were at the venue, and by conducting interactive interventions, we provide an experience in which people in remote locations share the sense of presence, unity, and enthusiasm of the on-site audience.

Image of utilization of Remote Spectator Assistance System
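One plausible way to quantify the “degree of concentration and interest” described above is to smooth and normalize per-spectator motion energy into a venue-level engagement curve. The window length and aggregation scheme below are illustrative assumptions, not the production system.

# Sketch: aggregate per-spectator motion energy (e.g., from pose
# keypoints or wearables) into a venue-level engagement score that can
# drive the remote rendering. Illustrative only.
import numpy as np

def engagement_score(motion_energy: np.ndarray, window: int = 30) -> np.ndarray:
    """Smooth per-frame motion energy into a 0-1 engagement curve.

    motion_energy: shape (frames, spectators), arbitrary units.
    """
    # Moving average over `window` frames suppresses single-frame noise.
    kernel = np.ones(window) / window
    smoothed = np.apply_along_axis(
        lambda m: np.convolve(m, kernel, mode="same"), 0, motion_energy)
    # Normalize each spectator to [0, 1], then average across the crowd.
    lo, hi = smoothed.min(axis=0), smoothed.max(axis=0)
    per_spectator = (smoothed - lo) / np.maximum(hi - lo, 1e-9)
    return per_spectator.mean(axis=1)  # venue-level score per frame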

Sound AR Interaction

We are exploring the realization of an auditory AR experience that expands our world through the power of sound by superimposing the “sound of the virtual world” onto the “sound of the real world”. This technology is already being used in Sony’s new sound experience, Sound AR™. It has three features. The first is human-sensing technology for sensing the location and behavior of users in real time. The second is game-inspired sound-engine technology that generates real-time, interactive sound based on the sensing results. The third is 360 Spatial Sound technology for superimposing the generated sound onto three-dimensional space. By integrating these technologies into interactions tailored to human perceptual characteristics, Sony provides a natural and immersive AR experience. We are also working on accessibility applications that enable people with visual impairments to perceive space through sound.

Image of realizing auditory AR experience by integrating the three technologies
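The interaction loop formed by these three technologies might look like the following sketch, in which sensed user position and heading drive a sound engine that pans and attenuates a virtual source so it stays anchored in real space. The constant-power stereo pan stands in for full 360 Spatial Sound rendering and is an assumption for illustration.

# Sketch: position- and heading-driven rendering of one virtual sound
# source. Constant-power pan and inverse-distance gain are stand-ins
# for full spatial audio rendering. Illustrative only.
import math

def render_virtual_source(user_pos, user_heading, source_pos):
    """Return (left_gain, right_gain) for one virtual sound source.

    user_pos / source_pos: (x, y) in meters; user_heading: radians.
    """
    dx, dy = source_pos[0] - user_pos[0], source_pos[1] - user_pos[1]
    distance = math.hypot(dx, dy)
    # Source angle relative to where the user faces (sign convention
    # assumed here: positive pans right).
    azimuth = math.atan2(dy, dx) - user_heading
    # Constant-power stereo pan: -pi/2 = full left, +pi/2 = full right.
    pan = max(-math.pi / 2, min(math.pi / 2, azimuth))
    left = math.cos((pan + math.pi / 2) / 2)
    right = math.sin((pan + math.pi / 2) / 2)
    # Inverse-distance attenuation keeps the source anchored in space.
    gain = 1.0 / max(distance, 1.0)
    return left * gain, right * gain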

Telepresence

Sony’s “MADO” (window) telepresence system leverages the elements of “reality” and “aura” to realize natural communication across distant spaces. For “reality,” we applied the best of Sony’s video and audio capabilities. For “aura,” we drew on the state of the art in cognitive psychology, that is, how humans perceive people and spaces. For example, instead of handling only “central vision” information such as the other person’s face, remarks, and materials, as existing video conferencing does, “MADO” also conveys “peripheral vision” information by displaying a life-sized image of the other person on a large vertical screen. Bi-directional, high-quality sound technology allows natural real-time conversation, as if the users were sitting in front of each other.

Image of utilization of Telepresence
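One “reality” element, life-sized display, reduces to a simple calculation: given the physical size and resolution of the screen, determine how many pixels the remote person must span to appear at their true height. The display dimensions below are illustrative assumptions, not MADO’s actual specifications.

# Sketch: pixels-per-meter calculation for life-sized rendering on a
# vertical display. Illustrative dimensions only.
def life_size_pixels(person_height_m, screen_height_m, screen_height_px):
    """Pixels a person must span to appear life-sized on this display."""
    pixels_per_meter = screen_height_px / screen_height_m
    return person_height_m * pixels_per_meter

# Example: a 1.70 m person on an assumed 2.0 m-tall 4K panel mounted
# in portrait orientation (3840 px tall).
print(life_size_pixels(1.70, 2.0, 3840))  # -> 3264.0 px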

Biometrics & Behavior Authentication

In recent years, multi-factor authentication methods have been introduced to address the security shortcomings of passwords. A disadvantage of these methods is that they require manual intervention from the end user, so authentication is not immediately available when needed. At Sony, we are working on the development of high-precision biometric devices that are small enough to be worn, as well as technologies that realize advanced functions, including anti-spoofing, through the use of general-purpose sensors. In addition, we are developing methods to continuously identify individuals based on their behavioral or biometric characteristics, without requiring any interaction from the user. For example, by using data captured from mobile or wearable devices, a user’s walking style or favorite routes can be learned and used for authentication purposes.
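The walking-style example above might be sketched as follows: windowed accelerometer features describing a user’s gait are compared against an enrolled profile, running continuously in the background. The feature set, window handling, and distance threshold are illustrative assumptions, not a production design.

# Sketch: continuous gait-based authentication from 3-axis
# accelerometer windows. Illustrative assumptions only.
import numpy as np

def gait_features(accel: np.ndarray) -> np.ndarray:
    """Summarize one window of 3-axis accelerometer data (N x 3)."""
    magnitude = np.linalg.norm(accel, axis=1)
    spectrum = np.abs(np.fft.rfft(magnitude - magnitude.mean()))
    cadence_bin = spectrum.argmax()          # dominant step-frequency bin
    return np.array([magnitude.mean(), magnitude.std(),
                     float(cadence_bin), spectrum.max()])

def is_same_user(window: np.ndarray, profile: np.ndarray,
                 threshold: float = 2.5) -> bool:
    """Accept if the window's features are close to the enrolled profile."""
    return bool(np.linalg.norm(gait_features(window) - profile) < threshold)

# Enrollment: average gait_features over many walking windows of the
# owner; then score each new window silently as the user walks.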
