(Stereoscopic Displays and Applications VII, San Jose, California, February 1996, Proceedings of the SPIE volume 2653A)
3D Video Standards Conversion
Andrew Woods 1, Tom Docherty 2 and Rolf Koch 3
1 Centre for Marine Science and Technology,
Curtin University of Technology,
GPO Box U1987, Perth 6845, AUSTRALIA
WWW: http://www.about3d.info
2 School of Electrical and Computer Engineering,
Curtin University of Technology
GPO Box U1987, Perth 6845, AUSTRALIA
3 School of Physical Sciences, Engineering & Technology,
Murdoch University,
South Street, Murdoch 6150, AUSTRALIA
ABSTRACT
This paper discusses the conversion of 3D video between the three world video standards of NTSC, PAL and
SECAM. An overview is given of the five main methods of achieving 3D with consumer video and the principles of
video standards conversion are discussed. A solution for converting field-sequential 3D video between standards is
presented and a number of other advantages which the system offers are discussed.
Keywords: 3D video, stereoscopic video, field-sequential, standards conversion, PAL, NTSC, SECAM.
1. INTRODUCTION
The most commonly used format for 3D video is the field-sequential method. In what has become a de facto standard,
the left and right images are stored in the even and odd fields of the video signal. The standard is popular because it
uses relatively simple equipment to generate and display the 3D video signal and also because the 3D video signal can
be stored on a single video tape.
Unfortunately, despite its simplicity, field-sequential 3D video cannot be converted between the 60Hz (NTSC) and
50Hz (PAL & SECAM) video standards using conventional standards converters. Most, if not all, video standards
converters corrupt the 3D content of the signal by mixing the odd and even fields, so the output signal is
unviewable in 3D in the new standard.
2. BACKGROUND
There are three main video standards in use around the world today - NTSC, PAL and SECAM 1. NTSC is used
extensively in North America and Japan, and the PAL system is used primarily in Europe and Australia. SECAM is a
French-developed system. The remaining countries around the world are fairly evenly distributed between these
three standards, mainly for political reasons.
The standards differ in three main respects: the field rate (number of fields per second), the number of lines per frame
and the method of encoding colour. The parameters for each of the three standards are summarised in Table 1.
|                 | NTSC        | PAL             | SECAM            |
| field rate      | 60Hz        | 50Hz            | 50Hz             |
| lines/frame     | 525         | 625             | 625              |
| colour encoding | QAM 3.58MHz | QAM PAL 4.43MHz | FM 4.25, 4.40MHz |
Table 1: Differences between World Video Standards
All three standards use 2:1 interlacing. This means that each frame of 525 or 625 lines is scanned in two parts called
fields. Firstly all the odd numbered lines are scanned (called the odd field) and then the even numbered lines are
scanned (called the even field). Therefore there are 262.5 lines per field in the 60Hz standard (NTSC) and 312.5
lines per field in the 50Hz standards (PAL & SECAM). Interlacing is used because it allows a high vertical resolution
with a low amount of flicker while keeping the signal bandwidth to a minimum.
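As a rough illustration of 2:1 interlacing, the sketch below splits a frame (represented as a list of scan lines) into its odd and even fields and computes the lines per field quoted above. The function name `split_fields` and the toy 7-line frame are assumptions made for this example only.

```python
def split_fields(frame_lines):
    """Split a frame into (odd_field, even_field).

    Lines are numbered from 1, so the odd field holds lines 1, 3, 5, ...
    and the even field holds lines 2, 4, 6, ...
    """
    odd_field = frame_lines[0::2]   # lines 1, 3, 5, ...
    even_field = frame_lines[1::2]  # lines 2, 4, 6, ...
    return odd_field, even_field


# A toy 7-line "frame" in which each entry is just its own line number.
frame = list(range(1, 8))
odd_field, even_field = split_fields(frame)
print(odd_field)   # lines of the odd field
print(even_field)  # lines of the even field

# Lines per field for the two line standards.
lines_per_field_ntsc = 525 / 2
lines_per_field_pal = 625 / 2
print(lines_per_field_ntsc, lines_per_field_pal)
```

Because the frame has an odd number of lines, the two fields cannot hold the same whole number of lines, which is where the half line per field comes from.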
3. 3D VIDEO STANDARDS
There are five main techniques by which 3D/stereoscopic imagery can be encoded onto a standard video signal. These
are field-sequential, sidefields (side-by-side), subfields (over-under), separate channels and anaglyph. The field-
sequential and sidefield methods are the most commonly used today.
3.1 Field-Sequential
In this system, video fields are alternately encoded with right or left information. The popularity of this system is a
result of its simplicity. Field-sequential 3D Video is easily generated from a pair of genlocked video cameras by using
a video multiplexer which selects odd fields from the right camera and even fields from the left camera. The 3D
video signal can be recorded and played back with standard video cassette recorders (VCRs) and it can be viewed in
3D quite simply using a standard television, a pair of liquid crystal shutter glasses and a small device which
synchronises the glasses with the left and right images being displayed on the screen.
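The multiplexing step described above can be sketched as follows. This is a minimal model, assuming each camera delivers one labelled field per field period and that field numbering starts at 1; the function name `multiplex_fields` and the string labels are hypothetical.

```python
def multiplex_fields(left_fields, right_fields):
    """Build a field-sequential stream from two genlocked cameras.

    Field numbering starts at 1. Odd-numbered fields are taken from the
    right camera and even-numbered fields from the left camera; the
    unused field from the other camera is discarded.
    """
    stream = []
    pairs = zip(left_fields, right_fields)
    for index, (left, right) in enumerate(pairs, start=1):
        stream.append(right if index % 2 == 1 else left)
    return stream


# Four field periods from each camera, labelled by eye and field number.
left = ["L1", "L2", "L3", "L4"]
right = ["R1", "R2", "R3", "R4"]
stream = multiplex_fields(left, right)
print(stream)
```

Each camera contributes only every second field, which is why each eye receives half the overall field rate.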
Field-sequential 3D video does have a problem with flicker when used with a standard television because each eye
only receives half the overall field rate (25Hz for PAL and 30Hz for NTSC). The flicker problem can be overcome
by using commercially available field doublers 2.
With field-sequential 3D video there are two polarities by which left and right images can be stored in the odd and
even fields. Most companies have chosen to store right images in the odd fields and left images in the even fields
(3DTV Corporation, VRex Inc, Virtual I/O, SOCS Research, etc). Some systems, however, use the opposite polarity,
e.g. the Toshiba 3D camcorder. The result of this is that 3D video generated with one system cannot be viewed
correctly on a system with the opposite polarity. The incorrect image will be sent to each eye and a pseudoscopic
(reversed stereo) image will be seen by the viewer and incorrect depth information will be perceived. Some systems
are, however, compatible with both polarities by changing an external switch.
3.2 Sidefields
The sidefield method (sometimes called the side-by-side method) stores the left and right images side-by-side on the
left and right halves of the video signal. There are actually two ways in which this can be done: (a) the left and right
images are squeezed in the horizontal direction by a factor of two, or (b) they are stored without the 2:1 squeeze. The former is the
main system in use today.
The squeezing method has the advantage that it allows the 4:3 aspect ratio (ratio of image width to image height) of the
left and right images to be retained (after they are unsqueezed). Digital video electronics are used to squeeze the left
and right images from a pair of video cameras to generate the sidefield 3D video signal. Digital video electronics are
again used at the display to convert the sidefield format signal into a 120Hz field-sequential signal. The sidefield 3D
video signal can be recorded and played back with a standard VCR. To our knowledge, the generation of 3D video in
this format is only supported by equipment available from StereoGraphics Corporation 3 (San Rafael, California).
3D video in this format can also be displayed on some equipment available from 3DTV Corporation (San Rafael,
California).
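A minimal sketch of the 2:1 horizontal squeeze is given below. It assumes images are lists of pixel rows and halves the width by simply keeping every second pixel; real equipment would be expected to filter before decimating, and the function names are hypothetical.

```python
def squeeze_row(row):
    """Halve the width of a row by keeping every second pixel (no filtering)."""
    return row[0::2]


def make_sidefield(left_image, right_image):
    """Place the squeezed left image in the left half of each row and the
    squeezed right image in the right half."""
    return [
        squeeze_row(left_row) + squeeze_row(right_row)
        for left_row, right_row in zip(left_image, right_image)
    ]


# One-row toy images, four pixels wide.
left_image = [[1, 2, 3, 4]]
right_image = [[5, 6, 7, 8]]
sidefield = make_sidefield(left_image, right_image)
print(sidefield)
```

The combined image has the same width as each input, so it fits in a standard video signal at the cost of half the horizontal resolution per eye.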
The sidefield method without the 2:1 horizontal image squeeze is generally only used for amateur purposes because the
images have a tall, narrow aspect ratio of 2:3. Sidefield 3D video in this format is generally produced by a
single video camera fitted with an optical beam splitter. This is basically a device containing four mirrors and is quite
commonly used in 3D still photography. The image is viewed in 3D either by free-viewing the stereo-pair or by
viewing the display while using some optical aid (containing either mirrors or lenses).
3.3 Subfields
The subfield format (sometimes called the over-under format) is used extensively in computer graphics as the primary
method of producing stereoscopic imagery from computers with standard computer graphics hardware. Basically the
left and right images are stored in the top and bottom halves of the video signal. The left and right images are
squeezed in the vertical direction by a factor of two.
We are only aware of one system which used this format with standard video. It was developed by StereoGraphics
Corporation and implemented in the NTSC video standard.4 This system was discontinued several years ago when
StereoGraphics' sidefield system was released.3 The use of the subfield format with standard video is not supported
by any currently available equipment.
Since this system was invented before inexpensive digital video electronics were available, it required the use of
specially modified video cameras to generate the subfield format 3D Video. The signal was displayed on a monitor
whose vertical deflection scanned at twice the normal rate so that the left and right images were displayed overlapping
each other. The image was then viewed through a pair of shutter glasses which were driven in synchronisation with
the left and right images being displayed on the screen. This system had the big advantage that the 3D imagery was
displayed flicker-free; however, this advantage was offset by the complexity of the cameras. The subfield format 3D
video signal can also be recorded and played back with a standard VCR.
3.4 Separate Channels
This system maintains two separate video signals - one containing the left images (from the left video camera) and one
containing the right images (from the right video camera). There are two main problems associated with this system:
recording/play-back and display. The two signals must be recorded and played back with a pair of synchronised video
cassette recorders. The synchronisation capability is generally only found on professional level video cassette
recorders. The signals must also be displayed on a dual channel stereoscopic display device (such as a pair of video
projectors or a dual monitor stereo display). This system does, however, have the advantage of maintaining full video
resolution from both cameras.
3.5 Anaglyph
This system encodes the left and right images by way of colour. The left and right images are stored as two different
primary colour channels - usually red and blue. The 3D image is viewed by wearing a pair of glasses with
appropriately coloured filters in the eye pieces. This system does not work particularly well with video because of the
low bandwidth allocated to the colour signal and the way in which the colour is encoded in the signal. A full colour
stereoscopic image cannot be displayed with this technique.
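The colour-channel encoding can be sketched as below. The example assumes pixels are `(red, green, blue)` tuples, that the left image supplies the red channel and the right image the blue channel, and that green is left empty; the text does not state which eye receives which colour, so that assignment is an assumption.

```python
def make_anaglyph(left_image, right_image):
    """Combine two RGB images into a red/blue anaglyph.

    Red is taken from the left image and blue from the right image
    (an assumed assignment); all other channel data is discarded.
    """
    anaglyph = []
    for left_row, right_row in zip(left_image, right_image):
        row = []
        for (left_r, _, _), (_, _, right_b) in zip(left_row, right_row):
            row.append((left_r, 0, right_b))
        anaglyph.append(row)
    return anaglyph


# One-row toy images, two pixels wide.
left = [[(200, 100, 50), (10, 20, 30)]]
right = [[(5, 15, 25), (90, 80, 70)]]
result = make_anaglyph(left, right)
print(result)
```

Each eye ends up with a single colour channel, which is why a full colour stereoscopic image cannot be reproduced this way.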
4. STANDARDS CONVERSION
In order for a video signal to be converted to another standard, three aspects of the video signal may need to be
changed - field rate, lines/frame and colour encoding. When converting PAL to SECAM, it is only necessary to
change the colour encoding of the video signal (since the field rate and the number of lines per frame are the same).
When converting from NTSC to PAL, however, it is necessary to change all three parameters.
Conversion of the colour encoding method is a fairly simple process and can be relatively easily achieved using linear
analog electronics. Unfortunately, the process of changing the field rate and the number of lines per frame is more
complicated and is generally performed using digital electronics. There are three main ways in which the number of
fields per second and the number of lines per field are converted: Field/Line Omission/Duplication, Field/Line
Interpolation and Motion Estimation.
4.1 Field/Line Omission/Duplication
This is the simplest process and requires the least complicated electronics. In what can be considered a two step
process, the number of lines per field is first converted to the new number and then the number of fields per second
is converted. In a PAL to NTSC conversion, firstly the number of lines per field is converted from 312.5 lines/field
to 262.5 lines per field. This is done by omitting one line from every six. This is illustrated in Figure 1(a). The
field rate is then converted from 50 fields per second to 60 fields per second. This is done by duplicating or
repeating one field in every five. Note that because each field now consists of only 262.5 lines it is possible to display
60 fields per second. This is illustrated in Figure 1(b).
With an NTSC to PAL conversion, it is necessary to repeat one in every five lines and omit one in every six fields to
obtain 312.5 lines per field and 50 fields per second.

Figure 1: PAL to NTSC conversion. (a) Omission of lines when converting a 312.5 line field to a 262.5 line field (b) Duplication of fields when converting from 50Hz to 60Hz.
This is the simplest and lowest quality conversion technique. It introduces some conversion artefacts especially when
motion is present in the scene. Subjectively the conversion is acceptable.
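The omission and duplication rules for both conversion directions can be sketched as follows. The helper names `drop_one_in` and `repeat_one_in` are hypothetical, and which particular line or field in each group is dropped or repeated is an assumption; only the ratios come from the text.

```python
def drop_one_in(items, n):
    """Omit every n-th item (the n-th, 2n-th, ...), counting from 1."""
    return [item for i, item in enumerate(items, start=1) if i % n != 0]


def repeat_one_in(items, n):
    """Repeat every n-th item once, counting from 1."""
    out = []
    for i, item in enumerate(items, start=1):
        out.append(item)
        if i % n == 0:
            out.append(item)
    return out


# PAL to NTSC: omit one line in six, duplicate one field in five.
pal_lines = list(range(1, 13))
pal_fields = list(range(1, 11))
ntsc_lines = drop_one_in(pal_lines, 6)
ntsc_fields = repeat_one_in(pal_fields, 5)
print(ntsc_lines)
print(ntsc_fields)

# NTSC to PAL: repeat one line in five, omit one field in six.
one_second_of_ntsc_fields = list(range(1, 61))
one_second_of_pal_fields = drop_one_in(one_second_of_ntsc_fields, 6)
print(len(one_second_of_pal_fields))
```

Note that the line ratios are approximate: dropping one line in six from 312.5 gives about 260.4 rather than exactly 262.5, so a practical converter has to make a small additional adjustment.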
4.2 Field/Line Interpolation
In this method, individual lines and fields in the output standard are a product of several lines or fields of the input
standard. This is an extension of the previous scheme where individual lines and fields in the output standard were
based on single lines or fields from the input standard.
In a simple implementation of such a system, a new line in the output standard is calculated as a linear interpolation
between two lines from the input standard. The particular input lines from which the output line is calculated and the
weightings used are determined from the position in the scan where the output line must be generated. This is
illustrated in Figure 2(a) which shows a PAL to NTSC conversion. For example, line 5 in the output standard is
calculated as 24% of line 5 and 76% of line 6 from the input standard. This calculation continues such that the
correct number of output lines is generated from the input lines.
The conversion of the number of fields per second is a similar process and is illustrated in Figure 2(b). For example,
output field number 3 occurs at t=2/60 seconds. It is calculated from input field numbers 2 and 3 (which occur at
t=1/50 and t=2/50 seconds) at a weighting of 33% of field 2 and 67% of field 3.

Figure 2: PAL to NTSC conversion. (a) interpolation of the new line rate and (b) interpolation of the new field rate.
Four line/four field converters are also available. They work in a similar way to the process explained above except
that each individual output line is based on a weighted average of four input lines and each individual output field is
based on the weighted average of four input fields.
The performance of this conversion method with standard video is much better than the previous method; however,
some conversion artefacts are still evident, particularly with scene motion. It should be noted that the details we have
provided give only a brief description of the process. A full explanation is contained in Sandbank (1990).
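The two-point linear interpolation weights quoted above can be reproduced with a short calculation. The sketch assumes lines and fields are numbered from 1 and that the first output item coincides with the first input item; the function name `interpolation_weights` is hypothetical.

```python
import math


def interpolation_weights(output_index, input_count, output_count):
    """Return (first_input, second_input, weight_first, weight_second).

    Indices are 1-based. Output item k sits at position
    (k - 1) * input_count / output_count, measured in input-item
    spacings from the first input item.
    """
    position = (output_index - 1) * input_count / output_count
    lower = math.floor(position)
    frac = position - lower
    return lower + 1, lower + 2, 1.0 - frac, frac


# Line interpolation, PAL to NTSC: 312.5 input lines to 262.5 output lines.
line_a, line_b, w_a, w_b = interpolation_weights(5, 312.5, 262.5)
print(line_a, line_b, round(w_a, 2), round(w_b, 2))

# Field interpolation, PAL to NTSC: 50 input fields to 60 output fields.
field_a, field_b, v_a, v_b = interpolation_weights(3, 50, 60)
print(field_a, field_b, round(v_a, 2), round(v_b, 2))
```

When an output item lands exactly on an input item, the second weight is zero and the output is a copy of a single input item.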
4.3 Motion Estimation
In this method, a motion vector array is calculated between consecutive fields in the video stream. The motion vector
array shows how the objects move and change in the video image from field to field. When an output field is to be
generated at a time interval between two input fields, the motion vector array is used to calculate a new intermediate
field with a percentage of the motion between the two fields.
Motion estimation is generally only found in broadcast quality standards converters. The quality of conversion will
obviously vary with the quality of the algorithm which calculates the motion vector array.
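The idea can be illustrated with a heavily simplified sketch. It assumes one-dimensional "fields" (lists of brightness values) and a single global motion vector found by exhaustive search on the sum of absolute differences, whereas a real converter would compute an array of vectors over a two-dimensional image. All function names here are hypothetical.

```python
def shift_field(field, shift):
    """Move the contents of a field right by `shift` samples, padding with 0."""
    n = len(field)
    return [field[x - shift] if 0 <= x - shift < n else 0 for x in range(n)]


def estimate_shift(field_a, field_b, max_shift=3):
    """Find the integer shift that best maps field_a onto field_b."""
    best_shift, best_cost = 0, float("inf")
    for shift in range(-max_shift, max_shift + 1):
        shifted = shift_field(field_a, shift)
        cost = sum(abs(a - b) for a, b in zip(shifted, field_b))
        if cost < best_cost:
            best_shift, best_cost = shift, cost
    return best_shift


def motion_interpolate(field_a, field_b, fraction):
    """Generate a field at `fraction` of the way from field_a to field_b by
    applying that fraction of the estimated motion to field_a."""
    shift = estimate_shift(field_a, field_b)
    return shift_field(field_a, round(shift * fraction))


# A bright object moves two samples to the right between the input fields.
field_a = [0, 9, 0, 0, 0, 0, 0, 0]
field_b = [0, 0, 0, 9, 0, 0, 0, 0]
halfway = motion_interpolate(field_a, field_b, 0.5)
print(estimate_shift(field_a, field_b))
print(halfway)
```

Unlike linear interpolation, the intermediate field shows the object at an intermediate position rather than as a blend of two ghost images.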
5. STANDARDS CONVERSION OF 3D VIDEO
Of all the 3D video methods mentioned, field-sequential 3D video is the only method which has serious problems with
video standards conversion. This arises because the odd and even fields are used to store the different left and right
images. The three conversion techniques described either upset the order in which left and right images are presented
in the output standard or mix the left and right images to generate fields in the output standard. The problem actually
lies with the field rate conversion process - the conversion of the line rate and colour encoding does not corrupt the
signal. Therefore, field-sequential 3D video is only corrupted when a standards conversion is performed which
changes the field rate. For example, SECAM<->PAL conversions do not corrupt field-sequential 3D video whereas
NTSC<->PAL and NTSC<->SECAM conversions do corrupt field-sequential 3D video.
A different type of problem occurs with each of the three methods of standards conversion. These problems are
illustrated in Figure 3 for a PAL to NTSC conversion. Figure 3(a) shows the native PAL field-sequential 3D video
signal. The black and white squares represent the odd and even fields which contain right and left images. The
field/line omission/duplication method (illustrated in Figure 3(b)) does not mix fields, but the field polarity of the
output signal changes every five or six fields. It can be seen that where a field is duplicated (the two consecutive
white fields or the two consecutive black fields), the field polarity changes. This obviously destroys the 3D effect.
The field/line interpolation method corrupts the 3D information because it produces the output fields from a mixture of
odd and even fields. As can be seen in Figure 3(c), very rarely in the output field sequence does a complete left
image or complete right image exist. Generally the output fields are a mixture of left and right input fields
(represented by the different shades of grey). The motion estimation method corrupts the field-sequential 3D video
signal because output fields are motion estimated between consecutive odd and even fields and therefore between a pair
of right and left images. The corruption of the 3D information would not occur if even output fields were motion
estimated from a consecutive pair of even fields from the input standard (and likewise for odd fields). Unfortunately, this is not the case with
currently available equipment.

Figure 3: Conversion problems with field-sequential 3D Video. (a) Native 3D-PAL signal (b) converted to NTSC using field duplication and (c) converted to NTSC using field interpolation.
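Two of the failure modes in Figure 3 can be reproduced numerically. The sketch below assumes the polarity described earlier (right images in odd fields), duplicates every fifth field for the duplication case, and uses two-field linear interpolation for the interpolation case. The function names and the choice of which field is duplicated are assumptions.

```python
import math


def eye_of(field_number):
    """Odd fields carry right (R) images, even fields carry left (L) images."""
    return "R" if field_number % 2 == 1 else "L"


def duplicate_every_fifth(fields):
    """50Hz to 60Hz by repeating one field in every five."""
    out = []
    for i, field in enumerate(fields, start=1):
        out.append(field)
        if i % 5 == 0:
            out.append(field)
    return out


def right_share(output_field):
    """Fraction of right-eye content in an interpolated 60Hz output field."""
    position = (output_field - 1) * 50 / 60
    lower = math.floor(position)
    frac = position - lower
    first = 1.0 if eye_of(lower + 1) == "R" else 0.0
    second = 1.0 if eye_of(lower + 2) == "R" else 0.0
    return (1.0 - frac) * first + frac * second


# Duplication: compare what each output field holds with what a display
# using the same polarity expects in that field slot.
pal = [eye_of(n) for n in range(1, 11)]
ntsc = duplicate_every_fifth(pal)
expected = [eye_of(n) for n in range(1, len(ntsc) + 1)]
polarity_ok = [got == want for got, want in zip(ntsc, expected)]
print("".join(ntsc))
print(polarity_ok)

# Interpolation: right-eye share of the first seven output fields.
shares = [round(right_share(k), 2) for k in range(1, 8)]
print(shares)
```

In the duplication case the polarity is wrong from the first repeated field until the next repeat restores it. In the interpolation case only output fields that coincide with an input field are pure, and even those can land in the wrong field slot.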
The separate channels method of 3D video has slight problems with standards conversion. Care must be taken that a
time shift is not generated between the two channels when the conversion takes place. If a time difference is
introduced between the two channels, temporal stereoscopic effects would be introduced which could upset the
stereoscopic information. Ideally both channels would be converted simultaneously with a pair of standards converters
(with synchronised output timebase) and with the input video signal coming from a pair of synchronised VCRs.
The 3D information in the other three 3D video methods (sidefields, subfields and anaglyph) is not corrupted by the
standards conversion process. The only conversion artefacts present are the same as those present when converting
normal (non-3D) video between standards but this does not corrupt the 3D information.
6. DISCUSSION
In order for field-sequential 3D video to be successfully converted between standards, it is important to keep the left
and right images separate during the conversion process. It is equally important that the left and right images are
stored in the even and odd fields of the output standard.
We have extended the capabilities of a commercially available video standards converter to allow the successful
conversion of field-sequential 3D video between the PAL, NTSC and SECAM video standards. The converter allows
field-sequential 3D-PAL, 3D-NTSC and 3D-SECAM material to be converted to field-sequential 3D-NTSC or 3D-
PAL. Particular care is taken to keep the odd and even fields separate in the conversion process and to ensure that the
left and right images from the input standard are stored on the even and odd fields of the output video standard.
Figure 4 shows how the converter can be used to convert field-sequential 3D-NTSC to 3D-PAL.

Figure 4: A block diagram of how the converter can be used to convert field-sequential 3D-NTSC to 3D-PAL.
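The two requirements above (keep the eyes separate, and keep each eye in the correct field of the output) can be illustrated with a simple sketch that treats each odd/even field pair as an indivisible unit and drops or repeats whole pairs. This is only one way of meeting the requirements and is not a description of the converter's internal algorithm, which is not detailed here; the function name `convert_pairs` is hypothetical and line-count conversion is ignored.

```python
def convert_pairs(fields, src_pairs_per_sec, dst_pairs_per_sec):
    """Change the field rate by dropping or repeating whole stereo pairs.

    `fields` is a field-sequential stream starting on an odd field.
    Each output pair is a copy of the most recent input pair, so left
    and right images are never mixed and never swap field slots.
    """
    pairs = [fields[i:i + 2] for i in range(0, len(fields) - 1, 2)]
    n_out = len(pairs) * dst_pairs_per_sec // src_pairs_per_sec
    out = []
    for k in range(n_out):
        src = k * src_pairs_per_sec // dst_pairs_per_sec
        out.extend(pairs[src])
    return out


# 3D-NTSC (30 pairs per second) to 3D-PAL (25 pairs per second).
ntsc = ["R1", "L1", "R2", "L2", "R3", "L3", "R4", "L4", "R5", "L5", "R6", "L6"]
pal = convert_pairs(ntsc, 30, 25)
print(pal)

# 3D-PAL (25 pairs per second) to 3D-NTSC (30 pairs per second).
back = convert_pairs(pal, 25, 30)
print(back)
```

In both directions every odd output field holds a right image and every even output field holds a left image, so the stream remains viewable in 3D.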
The advantage of using digital video and digital frame store technology in the implementation of a standards converter
is that it also allows a number of other functions to be achieved. The converter can be used for (a) the conversion of
field-sequential 3D video to 2D and (b) the conversion of field-sequential 3D video to the opposite field polarity (field
inversion). In the 3D to 2D conversion mode the output video signal consists of only odd fields (right images in the de facto polarity) or only
even fields (left images) of the original field-sequential 3D video signal as chosen by the user. This mode could be
used to convert a 3D video sequence to 2D so that the footage could be shown to an audience without need for 3D
viewing apparatus. Another application of this mode is for 3D video projection. If two converters are used along
with two video projectors, one converter could be configured to provide the first video projector with left images only
and the other converter could be configured to provide the second video projector with right images only. If
polarising filters are placed in front of each of the projectors and a silvered projection screen is used, a stereoscopic
video projection display would be achieved. This configuration is illustrated in Figure 5.

Figure 5: The use of two converters for polarised projection of field-sequential 3D video.
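The 3D to 2D mode can be sketched as a field selector. The example assumes the selected field is repeated to fill the slot of the discarded field so that the output field rate is unchanged; that detail, the function name `to_2d` and its `keep` parameter are assumptions.

```python
def to_2d(fields, keep="odd"):
    """Keep only the odd or only the even fields of a field-sequential stream.

    `fields` starts on an odd field. Each kept field is output twice so
    that the output has the same field rate as the input (an assumption).
    """
    if keep not in ("odd", "even"):
        raise ValueError("keep must be 'odd' or 'even'")
    start = 0 if keep == "odd" else 1
    out = []
    for field in fields[start::2]:
        out.extend([field, field])
    return out


stream = ["odd1", "even2", "odd3", "even4"]
projector_a = to_2d(stream, keep="odd")
projector_b = to_2d(stream, keep="even")
print(projector_a)
print(projector_b)
```

For the two-projector arrangement, one converter would be set to `keep="odd"` and the other to `keep="even"`, giving each projector a single-eye signal.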
The field-reversal mode swaps the field polarity of the incoming video signal. Images stored on the odd fields are
shifted to the even fields and vice versa. For example, this mode could be used to convert field-sequential 3D video
which has been recorded with the Toshiba 3D camcorder (which stores left images in the odd fields) to the de facto
standard field polarity (left images stored in the even fields). This configuration is illustrated in Figure 6.

Figure 6: Conversion of the field-polarity of field-sequential 3D Video.
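A minimal sketch of the polarity swap is shown below. It assumes the swap is done by exchanging the two fields within each odd/even pair; delaying the whole stream by one field would be another way to achieve a similar result. The function name `reverse_polarity` is hypothetical.

```python
def reverse_polarity(fields):
    """Swap the odd and even field of every pair in a field-sequential stream.

    `fields` starts on an odd field. A trailing unpaired field is dropped.
    """
    out = []
    for i in range(0, len(fields) - 1, 2):
        out.extend([fields[i + 1], fields[i]])
    return out


# Left images in the odd fields, as recorded by the camcorder in the example.
recorded = ["L1", "R2", "L3", "R4"]
swapped = reverse_polarity(recorded)
print(swapped)
```

Applying the swap twice returns the original stream, so the same mode can convert in either direction.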
The converter also acts as a time base corrector to stabilise the timing of the video signal and clean up the
synchronisation signals.
7. CONCLUSION
We have described the effects standards conversion has on the various forms of 3D video. Most notably, field-
sequential 3D video is the only method which encounters serious problems with conventional standards conversion.
To solve this problem, we have presented a system which is capable of converting field-sequential 3D video between
standards without corrupting the 3D information. This will ease the difficulties encountered when collaborative work
is performed between researchers from different countries.
8. ACKNOWLEDGMENTS
This work was inspired by the wish to share field-sequential 3D video with people around the world. Thanks to all
those people who provided that motivation - particularly the attendees and organisers of the Stereoscopic Displays and
Applications Conferences. The authors would also like to thank Woodside Offshore Petroleum for their continued
support of the stereoscopic video research being undertaken at Curtin University.
9. REFERENCES
1. Keith Jack, "Video Demystified: A Handbook for the Digital Engineer", Hightext Publications, California, 1993.
2. Andrew Woods, Tom Docherty and Rolf Koch, "Field Trials of Stereoscopic Video with an Underwater Remotely
Operated Vehicle", Stereoscopic Displays and Applications V, Stereoscopic Displays and Virtual Reality Systems,
J. Merritt, S. Fisher, Editors, Proceedings of the SPIE volume 2177, pp. 203-210, 1994.
3. Lenny Lipton, "Stereoscopic Real-Time and Multiplexed Video System", Stereoscopic Displays and Applications
IV, J. Merritt, S. Fisher, Editors, Proceedings of the SPIE volume 1915, pp. 6-11, 1993.
4. Lenny Lipton, Lhary Meyer, "A Time-Multiplexed Two Times Vertical Frequency Stereoscopic Video System",
1984 SID International Symposium, Society for Information Display.
5. C.P. Sandbank, "Digital Television", John Wiley and Sons, Ltd., West Sussex, 1990.
The 3D Video Multi-standard Converter mentioned in this article is available for purchase.
Copyright on this document is retained by Curtin University. This document
is not public domain. Permission is hereby given to reprint this paper on a non-profit basis for
scholarly purposes provided the document is unaltered and this notice is
intact. This paper may not be reprinted for profit or in an
anthology without prior written permission. If you wish to reprint this paper
on this basis, please contact the primary author at the address shown on the
first page of this document.
Last modified: 16th February, 1996.
Maintained by: Andrew Woods