MPEG: Key technology for the future of entertainment

Mar 1, 2019

In recent years, streaming has become the primary way people enjoy video and music entertainment, and the volume of streamed content keeps growing. Data sizes will only continue to balloon as 3D video and 3D audio streaming become commonplace. Enter the critical technology of MPEG. As entertainment evolves, the role of MPEG will continue to diversify rapidly. We asked the three engineers in charge of audio, video, and systems to help us better understand the latest trends on the front lines of MPEG.

Profile
  • Toru Chinen

    Audio Technology Development Department,
    Fundamental Technology Research and Development Division 1,
    R&D Center,
    Sony Corporation

  • Ryohei Takahashi

    Connectivity Technology Development Department,
    Fundamental Technology Research and Development Division 1,
    R&D Center,
    Sony Corporation

  • Ohji Nakagami

    Visual Technology Development Department 1,
    Fundamental Technology Research and Development Division 1,
    R&D Center,
    Sony Corporation

At Sony, the starting line is entirely different

──Chinen-san, let’s start with you, since you have the most experience with MPEG among the three of you. Please tell us about your experience so far.

Toru Chinen: I have been in charge of audio codecs ever since I joined in 2003, and recently I have been working on MPEG-H 3D Audio standardization.
Before joining Sony, I spent a long time at a venture company. It was a big surprise to see that, at Sony, it was readily possible to do things that venture companies typically struggled with. For example, at a venture company, if you wanted to work on video and audio standardization, you would first need to gather a lot of video and audio data. At Sony, however, we already have plenty of these resources within the group. This makes for a great environment for engineers. On top of that, there are a lot of veteran engineers you can consult with and a wealth of information available to you. In other words, whenever you start something here at Sony, the starting line is completely different from anywhere else. I am very pleased that I joined Sony.

──Next, Takahashi-san, please tell us about your history at Sony.

Ryohei Takahashi: I joined Sony in 2008 to work on software implementation for Blu-ray recorders, but my first assignment took me to a department working on Blu-ray disc format development. From that time onward, I have been in charge of so-called standardization work. Currently, I am working on international standardization of the Omnidirectional Media Format (OMAF), the MPEG systems standard for 360-degree video.

──Please tell us a little bit about your role as a person in charge of systems.

Takahashi: When distributing audio and video content to users as a single application, or when storing it, the two must be multiplexed into a single piece of content. My role is to develop the systems technology that combines audio and video so that end users get value that neither the audio nor the video could deliver on its own. This is basically what I do.
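
As a simple illustration of what multiplexing means here, the sketch below interleaves audio and video samples by timestamp into one stream. It is a toy example under assumed sample timestamps and track names, not an implementation of an actual MPEG systems format such as the ISO Base Media File Format or MPEG-2 TS.

```python
# Conceptual sketch of multiplexing: interleave audio and video samples by
# timestamp so they can be stored or delivered as a single stream. Toy
# illustration only; not an actual MPEG systems format.

def multiplex(video_samples, audio_samples):
    """Merge two timestamp-ordered lists of (timestamp_ms, payload) samples
    into one stream, tagging each sample with its track name."""
    tagged = ([(ts, "video", data) for ts, data in video_samples] +
              [(ts, "audio", data) for ts, data in audio_samples])
    # Ordering by timestamp keeps audio and video close together in the
    # stream, which is what lets a player present them in sync.
    return sorted(tagged, key=lambda sample: sample[0])

# Hypothetical samples: one video frame every 40 ms, one audio frame every ~21 ms.
video = [(0, "V0"), (40, "V1"), (80, "V2")]
audio = [(0, "A0"), (21, "A1"), (43, "A2"), (64, "A3")]
for ts, track, payload in multiplex(video, audio):
    print(f"{ts:3d} ms  {track:5s}  {payload}")
```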

──Thank you. All right, last but not least, Nakagami-san. Please tell us about your career.

Ohji Nakagami: I joined Sony in 2004, and since then I have been in charge of developing video compression technology. I was doing research on video compression when I was in school, which led my professor to recommend me for an internship at Sony. I ended up joining Sony after that and have been here ever since.

Reconciling conflicting values: proliferation of streaming and increasing data size

──A question for all of you: What is the focus of your respective research?

Takahashi: Virtual reality (VR) has finally become pretty commonplace, and all signs point to exponential growth in video and audio data volumes. At the same time, streaming of video and audio content via platforms like Netflix, YouTube and Spotify has become the norm. If the massive data volumes of the era to come could be delivered to users as is, it would make for an incredible experience, but the reality is that this kind of big data cannot be streamed smoothly in real time due to network bandwidth limitations. Moreover, conditions such as the network environment and device performance vary depending on the user. My focus is to develop distribution technologies that enable the optimum video and audio experience for each user, despite all these challenges.

──I imagine you can hardly wait for 5G.

Takahashi: Of course, 5G will help. However, every time communication technology advances, data sizes grow even faster, so we always have to keep improving video and audio compression and distribution technology. Even as compression efficiency increases, the data gets so much larger that we need to be creative to keep up. The key will be finding ways to maintain the quality of the user experience while reducing data size, by developing new distribution methods based on how the content is viewed. Take 360-degree VR video, for example: we can deliver high-quality data for just the area the user is looking at right now, and low-quality data for the areas they aren’t looking at in that moment. This would reduce data volume while maintaining the quality of the experience.
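
To make that viewport-dependent idea concrete, here is a minimal sketch that assumes a tiled 360-degree stream in which each tile is available at several bit rates. The tile grid, field-of-view test, and bit-rate values below are illustrative assumptions, not part of any particular OMAF profile.

```python
# Minimal sketch of viewport-adaptive quality selection for tiled 360-degree
# video: tiles inside the viewer's field of view get a high bit rate, the
# rest get a low bit rate. All numbers here are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class Tile:
    yaw_deg: float    # horizontal center of the tile, in degrees
    pitch_deg: float  # vertical center of the tile, in degrees

def angular_distance(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def pick_bitrates(tiles, view_yaw, view_pitch, fov_deg=90.0,
                  high_kbps=8000, low_kbps=1000):
    """Assign a high bit rate to tiles inside the field of view, low elsewhere."""
    plan = {}
    for i, t in enumerate(tiles):
        in_view = (angular_distance(t.yaw_deg, view_yaw) <= fov_deg / 2
                   and abs(t.pitch_deg - view_pitch) <= fov_deg / 2)
        plan[i] = high_kbps if in_view else low_kbps
    return plan

# Example: a 4x2 tile grid, with the viewer looking straight ahead.
tiles = [Tile(yaw, pitch) for pitch in (-45, 45) for yaw in (-135, -45, 45, 135)]
print(pick_bitrates(tiles, view_yaw=0.0, view_pitch=0.0))
```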

──What about you, Chinen-san and Nakagami-san? What kind of research are you doing in your respective areas of expertise?

Chinen: In the audio world, streaming is in full bloom. You no longer need to save music to your smartphone or other device; you just listen whenever you want from the cloud. On the other hand, content-wise, it is still pretty much the same traditional two-channel stereo content. It is against this backdrop that we are working on technology for creating 3D music, as well as on MPEG-H 3D Audio standardization for compressing and transmitting 3D audio sources. Of course, compared to two-channel stereo, as the number of sounds increases, the amount of data also goes up, so the trick will be finding ways to compress it and reduce the bit rate.
Conventionally, compression efficiency has been the focus of MPEG. However, in the audio segment, with the anticipated exponential increase in the number of sounds, the rendering method has become a new area of focus. In addition to compression, rendering is now also subject to standardization: how to express the sound field, and how to express broadcast content and music content. In other words, from the audio perspective, the scope of what we do with MPEG is rapidly expanding.
This is why, in addition to technical proposals, discussions, and final standardization related to compression technology, we also need to work to develop and propose the rendering technologies that we think are the best, and to move forward with standardization accordingly. This is what we are up to today.

Nakagami: In terms of video, there are two major trends. One is the evolution of 2D video. Camera image sensors are evolving rapidly right now. In the past, a low-resolution photo was the best you could manage, but now we have HD and 4K, and 8K is knocking on the door. Resolution has been increasing exponentially, and with it the size of the captured video data, which calls for technology to compress this massive data and transmit it to large-screen TVs and mobile devices. This is one trend in video compression.
The other trend is the advancement of 3D video. For example, it has become possible to capture a space in 3D using multiple cameras. Compared to 2D data, this kind of 3D data, for use on 3D displays and VR headsets, is of course much larger, so the technology to compress it will be critical going forward.

The ultimate goal is to provide things people will use

──At the beginning, Chinen-san spoke about group synergy at Sony. Do you in R&D receive specific requests from the business groups that design products?

Chinen: Yes, we do. In addition to internal requests from the business groups, there are also requests that arise from external trends around the world. Those of us involved in standardization don’t just contribute to the evolution of technology by writing articles. Our ultimate goal is for the technology to be used by people, that is to say, we want our technology to be commercialized. So, naturally, we are keen to take up business needs. For example, we get internal requests to keep the computational load within a certain limit or to raise the sound quality to a certain level.

Nakagami: It is only natural that what Sony wants in a product and what other companies want in a product will be different. I think that standardization exists to sort this kind of gap out. It is often the case that, when Sony makes a technological proposal for what we envision, other companies have a different view. When that happens, it is hard to see eye to eye. That is one of the difficult aspects of working in the area of standardization.

Takahashi: Looking back, when I was involved in standardization of the Blu-ray disc format, which had closer business implications, a more strategic approach to standardization was employed to make the most of the strength of our technology and products. Since our work would directly affect our business, there was a lot of pressure, but at the same time, I found the challenge enjoyable and rewarding.

──By the way, do you ever have any side-by-side collaborations among yourselves?

Chinen: Yes. For example, Nakagami-san was talking about 3D video. That is closely linked to my area of 3D audio. When you render a 3D video in space, the sound that accompanies it naturally has to be in 3D. So, we need to maintain a shared vision for 3D audio and 3D video. Otherwise, we would likely end up with inconsistent standards.

Takahashi: My area, systems, is all about the technology that combines audio and video to create new value. So, I always strive to collaborate and work closely with both of them. Finding ways to combine 3D video and 3D audio and deliver the resulting value to users will be the focus of our development going forward.

Group synergies boost standardization

──This discussion has certainly helped reveal the importance and difficulty of the process leading up to standardization. In light of these reflections, let me ask: What precisely are the strengths unique to Sony?

Chinen: This overlaps a bit with what I said earlier, but we have Sony Pictures Entertainment, which creates video content, Sony Music Entertainment, which creates music content, and Sony Interactive Entertainment as well. That is one of our biggest strengths: pretty much anything can be done within the Sony Group. It is important to get on the same page in the early stages of standardization, and that is fairly easy at Sony because we have all the relevant content holders within the Group.
For example, at a startup in Silicon Valley, I hear there is often a great deal of discussion about which use cases to target, so it can take time even to reach the starting line for actual research and development. At Sony, however, because all the content companies are within our group, it is easy to choose customer use cases, identify needs, and set the conditions ultimately needed for standardization. That is a real strength.

──What does Sony need in order to take the initiative on standardization?

Chinen: One requirement, I think, is to continue delivering attractive content such as movies, music and games. If we lose that, we lose our differentiation from other companies. I can’t emphasize enough just how important and advantageous it is to have the content when it comes to identifying the needs of content creators and developing the technology to support them.

Nakagami: In terms of images, 2D compression has a long history; it has continued to evolve since the JPEG era, gradually improving. We have also started discussions on a new codec in MPEG. In a situation like this, where every individual technology is becoming more advanced, I think it is important for Sony to keep making steady improvements as well. At the same time, we must also think about 3D compression, but since this is a new field where the discussion has only just started, I want to leverage the standardization process to prepare various technologies.

Takahashi: To give some examples, distribution of free-viewpoint video and immersive audio has yet to be realized in the world at large. VR is emerging, but it will still take some time for it to become a deep part of everyday life. Under such circumstances, Sony is seen by the other companies involved in the standardization process as a company with strong synergies, one that has both content and a powerful platform like PlayStation®. So, keeping that presence fully in mind, I would like to actively bring what we want to do into standardization. I hope these efforts will help prime the pump for the whole Sony Group to start up new services.
