People
Standardization creates social infrastructure
Imagine that you are able to watch a soccer match as if you were standing on the field. The era of this kind of immersive experience is nearer than you think. But to make this possible, engineers believe they will have to take a different approach from the streaming of today. What kind of technical developments will be required in the audio and visual fields to pave the way for the entertainment of the future? We asked a few key engineers who are deeply involved in this endeavor to share their visions with us.
Profile
- Mitsuhiro Hirabayashi
- Yuki Yamamoto
- Teruhiko Suzuki
The partners we talk with have dramatically changed in the last 25 years
──First of all, please tell us a little bit about your respective careers here at Sony.
Mitsuhiro Hirabayashi:I joined Sony mid-career in 1991, but I didn’t start in R&D. I was initially assigned to a business unit and tasked with designing professional equipment. I transferred to R&D in 1997, and since 2000, I have been working on developing camcorder formats. Originally, Sony was quite strong in the traditional magnetic media formats like tapes and in discs, but when the age of internet media arose, we didn’t really know what to make of it all and how best to approach it. That was about the time I started getting involved.
Teruhiko Suzuki:I joined Sony in 1992, and since then I have been researching and developing visual signal compression technology. Since the MPEG standard was already in use at that time, you could say that I have been doing more or less the same thing ever since. When I joined the company, it was about the time we were launching DVD and digital broadcasting businesses using MPEG-2. I was fortunate to be able to be a part of that standardization from the beginning. Now, the times are changing, and the industry has just started to standardize for higher compression ratios.
──What kind of changes have you seen, Suzuki-san, in visual technology trends over the past 25 or so years that you have been with Sony?
Suzuki:When you’re working on international standardization, you talk with people from various companies, not just people within Sony. So, in that regard, one thing I have noticed is that the partners I talk with have dramatically changed over the years. When I was involved in the standardization of the first MPEG-2 and DVD, it was mainly Japanese manufacturers. Then, in the next generation, IT companies had built a stronger presence, and after that, the smartphone companies became the key players. Nowadays, Chinese companies are becoming more important. China had been promoting standardization on its own, but it seems that they have now shifted toward embracing the international framework.
──How about you, Yamamoto-san? What about your career?
Yuki Yamamoto:I joined Sony in 2008. At that time, IT companies were booming and really starting to make their presence felt. I studied IT at school, and many of my fellow students were hoping to land a job in the IT sector. The reason I chose Sony was actually a very personal one. I have played the piano since I was young and was also in a band when I was a student, so I wanted to find a job that had something to do with music. It just so happened that I ended up interning in an audio department here at Sony. In the end, I decided on Sony because of my passion for music and because this job also allows me to leverage what I studied at school.
By 2008, when I joined Sony, the standardization of MPEG audio compression was complete. So, for the first few years I wasn’t involved in standardization, but in the development of peripheral technologies. Specifically, I worked on technology for restoring sound that has been degraded by compression to high-quality sound. Then, from 2010 onward, I worked on applying this technology to compression and standardizing it for MPEG. More recently, I have been working to develop 3D audio technology and technology for applying machine learning to audio.
Going forward, an approach different from streaming will be needed
──What are your views on the standardization of immersive media technologies, such as VR, that provide a strong sense of immersion?
Suzuki:Immersive media is a very broad term. So far, we have built on 2D media such as TV and we can now capture the space around users in order to create an immersive experience. The next step will be to enable users to explore the space and view it any way they like. However, that is still about 10 to 20 years out. Right now we are working little by little, starting in areas where we can, to develop the elemental technologies that will make this kind of experience possible, for instance filming, compression, various signal processing, and recognition technology. I think the first step is to do what can be done with head-worn displays and then from there we will gradually move into such futuristic approaches.
──Already it is possible to put on a head-worn display and see the whole sky. What kind of immersive experiences do you envision in the future?
Suzuki:I think the next step will allow users, for example, to put themselves on a soccer field and view soccer matches from any point on the field that they wish. We might be able to make at least something like this experience available, in a limited fashion, in time for 2020.
Yamamoto:Of course, it is a given that we will be developing technology for immersive media. However, I think it’s important to understand that in order for Sony to leverage its strengths, Sony has to continue to make great content. Immersive media is a new medium, so the way content is expressed is quite different from that of conventional media. The way content is created will change quite a bit as well. For that reason, we need to involve creators and stay focused on shaping the precise nature of immersive media going forward. These are the kinds of activities we are working on right now.
Hirabayashi:In terms of the distribution technology that we are working on, right now most streaming involves simply transmitting content, which is then received by a device. However, when it comes to sending spatial data, whether audio or visual, I think it will be difficult unless we take a different approach from the streaming of today.
──Maybe you cannot share details right now, but does that mean you are already researching the concept for the next-generation distribution format that will replace today’s streaming?
Hirabayashi:Yes, that is correct. I think you will start to see some of our research results emerge in 5 to 10 years.
Suzuki:We are talking about a radical change that would transform the world quite a bit. So, I think the change will take place one step at a time, rather than all at once. While 2D and TV haven’t changed much from when we were children, what we are doing now comes with a great deal of freedom, for instance totally reexamining what display devices should even be like. The most recent example of this is head-worn displays, but this, too, will keep changing gradually. Perhaps another major change will occur in about 10 or 20 years. However, since we cannot just wait until then, we are trying to focus on what can be done now and starting there.
Not just RGB images anymore
──So, since we have been discussing the future, let me ask: Which sci-fi movie best matches the future you yourself envision?
Yamamoto:I like “Vanilla Sky”. Its story, in a sense, matches what we call immersive.
Hirabayashi:I have two favorite sci-fi movies. One is “Back to the Future”. I love the technical aspect such as the hovering skateboard.
The other is “Planet of the Apes”. I like it because it invites me to contemplate questions like, “Is society really OK the way it is now?” and “Could a huge social change like that of humans reverting back to primate status actually occur?” I see it as thought-provoking for those of us involved in developing technology.
Suzuki:I like “Blade Runner”. Replicants are kind of a symbol for a world where AI continues to evolve. I am not sure that such a future awaits us, but it makes me think about risks like that.
──So, what kind of future do you think your own research and development may lead to?
Suzuki:I think immersive is one possible direction. Conventionally, images have been something for people to see. However, on another front, as exemplified by self-driving vehicles, with today’s increase in sensing devices, the images we deal with are no longer just traditional RGB images. So, with this all happening, we need to consider what kind of new processing technology will be needed as we go beyond mere RGB images.
Yamamoto:Humans, unlike other animals, enjoy entertainment. I don’t think that point will ever change. And forms of entertainment are often driven by technology. So, my technology will possibly be a factor in the entertainment of the future. In the past, technologies for recording sound and pictures led to the entertainment of music and movies. I think we are at another such turning point. For example, I cannot imagine that 200 years from now we will still just be using two speakers. In two centuries, it will be commonplace for speakers to be everywhere, and all of space will be filled with visuals, delivering truly immersive entertainment. I see myself as being involved in developing the kind of technology that puts us on track for that future.
Hirabayashi:I think there are two things to consider. One is the expansion of UGC, or user-generated content. The UGC market has expanded tremendously. I expect that this market will probably grow even larger than anything we can imagine right now. This is just my opinion, but I think that society is going to continue to get more and more segmented, to the point that everyone is enjoying content in their own unique way. And I think that our technology will help make this future possible. In particular, I think that the sending side of distribution technology will undergo a huge change.
Right now we already have the means to convey to others what we are thinking. In the future, though, I want to contribute to the realization of a diversified society where people can make fun, interesting expressions easily using immersive visual and audio media.
Another thing to consider is entertainment in our aging society. From a viewpoint of improving accessibility and usability for easy viewing and easy listening, content diversification is also important for elderly people. Furthermore, I think that we can help support our aging society by combining mechanical technologies that support physical functions with immersive visual and audio distribution technologies.
Standardization creates social infrastructure
──I think audiovisual technologies have been at the heart of Sony since the beginning. What about several decades from now? Will audiovisual technologies continue to be indispensable to society?
Yamamoto:I think so. I mean, of the five senses, the amount of information we get from our eyes and ears is huge. And there really are no other organs that provide us with so much wonder. I know that it is different for each person, but the amount of wonder that we obtain from our eyes and ears is immeasurable.
Hirabayashi:Surely there will be more entertainment that takes advantage of the other organs going forward, but I don’t think any of that will take away from the impact of audiovisual.
Suzuki:No one knows if devices will stay the same, but audiovisual will never stop being a major source of information for humans.
──Hirabayashi-san alluded to the possibilities of entertainment in the aging society. Do you think there is any chance that your own research and development will intersect with Sony’s social contribution in the future?
Yamamoto:I think “time” is very critical. Even when doing the same thing, it is obviously better if it takes less time to do it. In that respect, the ultimate goal of the transmission and compression technology we are working on is exactly that, to reduce the time. For example, if you can transmit space, you can eliminate traveling time. Moreover, if it can be transmitted without decreasing the amount of information, you can obtain the same exact output in a shorter amount of time, thereby making a huge contribution to society.
Suzuki:Sorry to bring the discussion down to earth after such fascinating talk about our dreams, but the international standardization we are working on now is like creating the infrastructure for the future. So, in that general sense, our daily activities are already leading to social contributions.
If we are building the next foundation for new audio and visual, it means that we are creating a foundation that will enable people to enjoy themselves. Also, by incorporating a viewpoint that enhances accessibility to support older people and those with disabilities, we can help realize a society where everyone can enjoy content.
Hirabayashi:That’s so true. The standardization itself is social infrastructure. You’ve hit the nail on the head!