True 3D telepresence getting closer – Weekend Feature

Jan 07, 2012

Researchers at the University of North Carolina have been working on life-size 3D telepresence that allows remote viewers to look around a distant scene without wearing markers or 3D glasses.


The team, led by Dr. Henry Fuchs and graduate student Andrew Maimone, has created a telepresence system with room-sized real-time 3D capture and a life-size tracked display wall.

The prototype shown in the video below utilises ten Microsoft Kinect cameras, a two-panel display wall (the 'window' to the remote scene), complex algorithms and GPU-accelerated data processing to let a remote viewer look into a live scene whose perspective changes as the viewer moves his or her head. It is as if the displays were a window into another room: as you walk past, you can look around objects.
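The 'window' effect itself is a well-known rendering technique: head-coupled perspective, where an asymmetric (off-axis) projection frustum is recomputed every frame from the tracked head position and the physical screen corners. Below is a minimal sketch of that projection in the style of Kooima's generalized perspective projection – not the UNC team's actual code – assuming a planar panel whose corner positions are known in metres:

```python
import numpy as np

def off_axis_projection(eye, screen_ll, screen_lr, screen_ul,
                        near=0.1, far=10.0):
    """Asymmetric frustum for a head-tracked 'window' display.

    eye        -- tracked viewer eye position (world space, metres)
    screen_ll, screen_lr, screen_ul -- lower-left, lower-right and
                  upper-left corners of the physical display panel
    Returns a 4x4 OpenGL-style projection matrix; combine it with a
    view matrix that rotates into the screen basis and translates
    by -eye.
    """
    # Orthonormal basis of the screen plane
    vr = screen_lr - screen_ll; vr /= np.linalg.norm(vr)   # right
    vu = screen_ul - screen_ll; vu /= np.linalg.norm(vu)   # up
    vn = np.cross(vr, vu)                                  # towards viewer

    # Vectors from the eye to the screen corners
    va, vb, vc = screen_ll - eye, screen_lr - eye, screen_ul - eye
    d = -np.dot(va, vn)                   # eye-to-screen distance

    # Frustum extents projected onto the near plane
    l = np.dot(vr, va) * near / d
    r = np.dot(vr, vb) * near / d
    b = np.dot(vu, va) * near / d
    t = np.dot(vu, vc) * near / d

    # Standard glFrustum-style matrix
    m = np.zeros((4, 4))
    m[0, 0] = 2 * near / (r - l); m[0, 2] = (r + l) / (r - l)
    m[1, 1] = 2 * near / (t - b); m[1, 2] = (t + b) / (t - b)
    m[2, 2] = -(far + near) / (far - near)
    m[2, 3] = -2 * far * near / (far - near)
    m[3, 2] = -1.0
    return m
```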

The Microsoft Kinect cameras provide 3D scene capture of the remote room, including depth information. The system then merges the data from multiple cameras, reads the depth information, and applies processing to change the view presented, giving the viewer of the remote room an illusion of depth that shifts as the user's perspective changes.
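In outline, each Kinect's depth image is back-projected into a 3D point cloud using the depth camera intrinsics, and the clouds are transformed into a shared world frame with calibrated extrinsics before being rendered from the viewer's position. A simplified sketch of that back-projection step (illustrative only – the intrinsics below are placeholder values, not UNC's calibration):

```python
import numpy as np

# Placeholder intrinsics for a 640x480 Kinect depth camera (assumed values)
FX, FY, CX, CY = 594.0, 591.0, 320.0, 240.0

def depth_to_points(depth_m, extrinsic):
    """Back-project a depth image (metres) into world-space 3D points.

    depth_m   -- (H, W) array of depth values in metres, 0 where invalid
    extrinsic -- 4x4 camera-to-world transform from calibration
    """
    h, w = depth_m.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth_m
    valid = z > 0
    # Pinhole back-projection: x = (u - cx) * z / fx, y = (v - cy) * z / fy
    x = (u - CX) * z / FX
    y = (v - CY) * z / FY
    pts_cam = np.stack([x[valid], y[valid], z[valid],
                        np.ones(valid.sum())])
    return (extrinsic @ pts_cam)[:3].T        # (N, 3) world-space points

# Merging: concatenate the clouds from each calibrated Kinect
# cloud = np.vstack([depth_to_points(d, T) for d, T in zip(depths, extrinsics)])
```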

UNC Chapel Hill is one of three universities participating in an international consortium called the Being There International Research Center for Telepresence and Telecollaboration, which also includes Nanyang Technological University (NTU, Singapore) and the Swiss Federal Institute of Technology Zurich (ETH Zurich, Switzerland).

The above model is an extension of the work on a stereoscopic 3D version shown in June last year. The 2D display above creates an illusion of depth through head-tracking, but the team has also made the system work with glasses-free 3D displays, bringing us a step closer to true 3D telepresence.

3D for cinema is based on filming with two cameras side by side. This is a very primitive way of producing 3D content and has been around since the dawn of photography. The next stage of 3D capture is likely to offer multiple viewpoints within a 3D scene using 'depth' information (two side-by-side cameras provide just one view – you cannot 'look around' objects), and the UNC Chapel Hill team is using the Microsoft Kinect to provide this depth data for telepresence.

They have presented a Kinect-based markerless tracking system that combines 2D eye recognition with depth information, allowing head-tracked stereo views to be rendered on a parallax barrier autostereoscopic display. As in the 2D system, a single Microsoft Kinect mounted on the glasses-free 3D display tracks the remote viewer's head, as shown in the video below.
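Combining 2D recognition with depth works roughly like this: a detector finds the eyes in the Kinect's colour image, and the registered depth map lifts those pixel coordinates into 3D eye positions that drive the stereo rendering. A hypothetical sketch using OpenCV's stock Haar cascade – not the team's tracker, and with assumed intrinsics:

```python
import cv2
import numpy as np

# Assumed RGB camera intrinsics for a 640x480 Kinect colour stream
FX, FY, CX, CY = 525.0, 525.0, 320.0, 240.0

eye_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")

def eye_positions_3d(bgr, depth_m):
    """Detect eyes in the colour image and lift them to 3D using a
    depth map registered to (and the same resolution as) the colour image."""
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
    eyes = eye_cascade.detectMultiScale(gray, 1.1, 5)
    points = []
    for (x, y, w, h) in eyes:
        u, v = x + w // 2, y + h // 2        # eye centre in pixels
        z = float(depth_m[v, u])
        if z > 0:                            # skip invalid depth readings
            points.append(((u - CX) * z / FX, (v - CY) * z / FY, z))
    return points   # 3D eye positions in the camera frame (metres)
```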

“It is both [2D and 3D],” said Andrew Maimone. “The system displays the scene in stereoscopic 3D (each eye sees a different image), providing a similar sense of depth as current 3D TV and 3D cinema, but does not require glasses. The system also allows the user to ‘look around’ the scene.”

When asked why a teleconferencing system needs to be in 3D and allow the user to look around the remote scene, Andrew Maimone replied…

“The value of these features is two-fold. First, they increase the sense of "presence" or "being there", the feeling that one is actually co-located with the remote user and his or her environment. This sense helps the user forget he or she is looking at a display screen and communicate naturally as if talking to someone on the other side of a table. Second, the ability to "look around" the scene helps preserve information that is lost during normal 2D video conferencing. For example, imagine that you are seated in a meeting room and someone's head is blocking your view of a whiteboard. In our system, as in real life, you would naturally move your head around for an unobstructed view. In traditional video conferencing, you must interrupt the meeting and ask that the remote user move his or her head. As another example, imagine an engineer is holding up a new prototype part to show to a remote participant. With our system, the remote participant could simply move his or her head around to inspect the part. With traditional 2D video conferencing, the remote participant must communicate back and forth with the engineer regarding the different angles the part should be held until it is fully inspected.”

[Image: Mixed reality collaboration]

One of the biggest challenges the team had to confront is the considerable overlap and interference between images when using multiple Microsoft Kinect cameras. Several algorithms, such as hole filling and colour matching, were developed to create a more true-to-life image. The image below shows the various stages of data processing; as you can see, the quality of the final result is still not as high as regular television, but this is bound to improve over time.

[Image: Data processing results]
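Hole filling of the kind mentioned above is commonly done by replacing invalid depth pixels with statistics of their valid neighbours. A toy version of the idea follows – the published system does this per-frame on the GPU, so treat this only as an illustration of the principle:

```python
import numpy as np

def fill_depth_holes(depth_m, radius=2):
    """Fill zero-valued (invalid) depth pixels with the median of valid
    neighbours in a (2*radius+1)^2 window. Toy CPU version of the
    hole-filling idea; real systems run this on the GPU per frame."""
    filled = depth_m.copy()
    for v, u in np.argwhere(depth_m == 0):
        window = depth_m[max(0, v - radius):v + radius + 1,
                         max(0, u - radius):u + radius + 1]
        valid = window[window > 0]
        if valid.size:                   # leave the hole if no data nearby
            filled[v, u] = np.median(valid)
    return filled
```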

When asked how this differs from earlier Kinect 3D video capture, Andrew Maimone replied…

"Earlier Kinect video capture utilized one or two Kinect units, which do not provide enough coverage to allow a user to look around a small room without large missing areas. Additionally, when using two Kinect units the data was not smoothly merged, presenting quality problems. Utilizing more Kinects presents a challenge since the units interfere with each other, causing holes to appear in the output. Our current system utilizes five Kinect units to provide more comprehensive scene coverage and new algorithms to overcome the interference problem and merge data with improved quality."

A few weeks ago we reported on a solution that enables direct eye contact for video conferencing. High-tech R&D firm Fraunhofer Heinrich Hertz Institute in Berlin, Germany showed 3D Focus the Virtual Eye Contact Engine – a software module that analyses a scene in real-time 3D from three cameras mounted around the video-conferencing display. It computes the depth structure of the person's head, which is used to generate a 3D model. The 3D model is then used to compute the view of a virtual camera for both parties, and the rendered output appears to show each person looking directly at the other.
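Conceptually, the eye-contact trick is a viewpoint change: once a 3D head model exists, its points can be re-projected through a virtual pinhole camera placed where the remote person's eyes appear on screen. A bare-bones sketch of such a re-projection – purely illustrative, since the Fraunhofer engine's internals are not public:

```python
import numpy as np

def reproject(points_world, cam_pos, cam_rot,
              fx=525.0, fy=525.0, cx=320.0, cy=240.0):
    """Project world-space 3D points through a virtual pinhole camera.

    points_world -- (N, 3) reconstructed head-model vertices
    cam_pos      -- virtual camera position, e.g. the on-screen eye location
    cam_rot      -- 3x3 world-to-camera rotation matrix
    """
    pts = (points_world - cam_pos) @ cam_rot.T   # into the camera frame
    pts = pts[pts[:, 2] > 0]                     # keep points in front
    u = fx * pts[:, 0] / pts[:, 2] + cx          # pinhole projection
    v = fy * pts[:, 1] / pts[:, 2] + cy
    return np.stack([u, v], axis=1)              # pixel coordinates
```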

Both systems are crude in appearance and the quality is still not high enough for commercialisation. However, it will be very interesting to see what researchers will do with the upcoming Microsoft Kinect 2, which is rumoured to be so accurate it will be able to lip read. The current Microsoft Kinect offers a resolution of only 320 x 240, but it has demonstrated what applications can be created using depth capture, including more realistic telepresence.

For more information about the work of the Department of Computer Science at the University of North Carolina at Chapel Hill, click here.

Related story (Telepresence in Dallas)
 
