Generators of the Third Dimension

Before explaining how the classic types of Graphical Perspective are known to operate, it is essential to understand certain aspects of the way that we humans look at spatial reality and ascribe meaning to the images that we see.

We present a rudimentary analysis of how humans obtain an impression of three-dimensional space or depth while: A) looking at physical reality; and B) looking at flat representations—or pictures—of spatial reality.

Spatial Limitation of Human Vision

It is salient to begin by considering a vitally important fact concerning the nature of human visual perception. A key part of the eye is the detecting surface—the retina—which perceives light rays emanating (or reflecting) from objects present in spatial reality. However, the retina can only perceive the direction of said light rays and not their distances!

Put another way, a ‘pixel’ present on the retina stimulated by a light ray arriving from a particular direction could result from an object point at any position along the same direction! Object points located at any number of distances from the eye could theoretically result in the same ‘pixel’ stimulation!
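The point above can be checked numerically. The following is a minimal sketch, assuming an idealised eye located at the origin; the function name `direction_of` and the sample coordinates are illustrative only and do not come from any vision library.

```python
import math

def direction_of(point):
    """Return the unit direction from an eye at the origin to a 3D point."""
    x, y, z = point
    length = math.sqrt(x * x + y * y + z * z)
    return (x / length, y / length, z / length)

# Three object points on one line of sight, at 1x, 5x and 50x the distance.
base = (0.3, 0.2, 1.0)  # assumed sample point, arbitrary units
points = [tuple(k * c for c in base) for k in (1.0, 5.0, 50.0)]
directions = [direction_of(p) for p in points]

for p, d in zip(points, directions):
    print(p, "->", tuple(round(c, 6) for c in d))
```

All three points yield the same direction, so a detector that registers direction alone cannot tell them apart.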

Ergo, in theory and in practice, the eye cannot directly detect distance in any way whatsoever! Whence then comes the impression that we perceive space or depth through the images we receive from our eyes? The answer is that, while we cannot perceive depth directly, our visual system employs a range of factors to obtain three-dimensional apperceptions (these being more or less exact depth measurements depending upon the factor(s) used).

In Figure 1 below we have listed commonly known ‘Generators of the Third Dimension’, being a set of capabilities present in the human visual perceptive system that have also been termed the Depth Cues.

Extrinsic Factors
1. Psychological
  1. Aerial Perspective (colours of distant objects systematically change and contours are systematically softened)
  2. Distribution of Light and Shade
  3. Overlapping Contours (Occlusion)
  4. Geometrical Perspective
  5. Interpretation of Size
2. Parallactic movements
  1. Motion Parallax
Adjustment Factors
1. Efforts of Accommodation (focus)
2. Efforts of Convergence (range-finding stereopsis)
Intrinsic Visual Factor – Type 1 – (Stereoscopic or Binocular Methods)
1. The Stereoscopic Influence of Dissimilar Images (shapes/positions)
Intrinsic Visual Factor – Type 2 – (Monocular Methods)
1. Additional Optical Cues from image form analysis (analysis of visual features)
Additional Environmental Factors
1. Optic Flow Patterns – images change according to the movement of observer.
2. Role of Invariants – when we move our head and eyes or walk around our environment, some ‘things’ move in and out of our viewing fields (or certain visual factors change in some other way), but other visual factors remain fixed. Both types of visual factor, changing and fixed, give unique structural and depth information.
3. Affordances (environmental cues that aid in perception) – Optical Array (Patterns), Relative Brightness, Texture Gradients, Relative Size, Superposition, Height in Visual Field, etc.

Figure 1: Generators of the Third-Dimension (Depth Cues)

Human Vision and the Depth Cues

Perspective is the science of spatial form or, more specifically, the appearance of form. Accordingly, perspective has close connections to the third dimension or depth. But before we can explore how perspective is able to probe spatial reality, we must understand certain aspects of, and the fundamental connections between, space, human vision and representation.

People often say to me that 3D as depicted in drawings, paintings, or on a television or computer screen is not “true 3D”. But they are then unable to explain this statement any further, and sometimes they also add that flat screen images cannot be true 3D because they do not show stereoscopic views. Along the way people occasionally mention that stereoscopic glasses are needed for 3D, with one lens being red and the other blue for the left and right eyes. In one sense what these people are saying is correct, in that a flat picture plane does not show stereoscopic images. However, they are entirely incorrect when they assume that only stereoscopic images are “true” 3D.

Vision experts have long known about the different aspects of depth perception, and that these can be grouped into two categories: monocular cues (cues available from the input of one eye) and binocular or stereoscopic cues (cues available from both eyes). Each of these different “cues” is used by our brains, either independently, or else together, in order for us to perceive the third dimension.

It is important to note that not all of the cues are required to be present simultaneously in order to give us an accurate or realistic impression of 3D. It has been demonstrated, for example, that we can get a realistic impression of 3D when just one or two cues are present, as in a perspective drawing for example.

I think it is worth reminding ourselves of the cues in a list at this point. Monocular cues include Perspective, Motion Parallax, Color Vision, Distance Fog, Focus, Occlusion, and Peripheral Vision. Binocular cues include stereopsis (or binocular disparity, sometimes also called binocular parallax), which is the difference in shapes and positions of images due to the different vantage points from which the two eyes see the world. The other binocular cue is convergence, or range-finding stereopsis, which is the human ability to judge the distances to objects from the angle of convergence between the eyes.
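To give a feel for the convergence cue, here is a minimal sketch of the angle between the two lines of sight when both eyes fixate a point straight ahead. The eye separation of 0.065 m is an assumed round figure for an adult, and the function name is hypothetical.

```python
import math

def convergence_angle_deg(distance_m, eye_separation_m=0.065):
    """Angle (degrees) between the two lines of sight for a point straight ahead.

    eye_separation_m is an assumed typical value, not a measured one.
    """
    return math.degrees(2.0 * math.atan(eye_separation_m / (2.0 * distance_m)))

for d in (0.25, 1.0, 10.0, 100.0):
    print(f"{d:7.2f} m -> {convergence_angle_deg(d):.3f} degrees")
```

The angle shrinks rapidly with distance, which suggests why convergence is mainly useful as a range-finder for nearby objects.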

Note that some vision experts would argue for the inclusion of other yet more subtle (monocular) optical cues, including occluding edges, horizons and other effects due to the “direct perception of surface layout”, but to simplify an obviously complex topic we shall ignore these additional factors here.

The greater number of items on the monocular list gives a first clue that perhaps stereoscopic vision effects are not the primary way in which we as humans perceive depth or the third dimension. You can easily test this yourself by closing one eye: you will immediately notice that the world still appears to be spread out before you in all of its three-dimensional glory! With one eye closed you may have difficulty with the finer points of depth perception, such as picking up a pin off the floor. However, if you lost one eye, then for most ordinary tasks you would still be able to rely on the other eye for 3D vision, relying exclusively on the monocular visual cues.

In passing I would like to note here that those who suggest that stereoscopic 3D (aka red-blue parallax films) is the only true 3D, and further that its mechanisms are well known, are in fact claiming that they have more than a head start on some of the greatest experts in human vision who ever lived. World-renowned scientists agree that science has yet to even begin to understand the mechanism by which human beings combine or overlay two different parallax views in real time into a single correlated image sensation.

Some even proclaim this image combination feat to be a miracle of the human perceptive system – and so it may be – because the source images are thoroughly misshapen and also distorted one relative to the other.

Perspective Defined (Generally)

Let us pause at this point and take stock of where we are.

I hope I have been able to convince you of the fact that stereoscopics are not required to give an impression of 3D. If they were we would not be able to make much sense at all of television, films, photographs or even the vast majority of drawings and paintings. These methods, one and all, solely rely on the monocular cues for depth depiction, yet we have no difficulty understanding the 3D worlds depicted in which objects lie at different apparent distances from the viewer.

One of the most important of the monocular cues for depth perception is perspective. Let us now agree on a very simple definition of perspective. Perspective (from Latin perspicere, to see through) is an approximate projected representation of a scene as seen from a particular viewing location.

The two most characteristic features of perspective (or Renaissance/Linear Perspective) are that objects are represented at a smaller scale as their distance from the observer increases, and that the scene experiences so-called spatial foreshortening, which is the distortion of items when viewed at an angle.
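Both features can be illustrated with a simple central projection onto a flat picture plane. This is only a sketch under stated assumptions: the eye sits at the origin looking along the z axis, the picture plane is at z = focal_length, and the names `project` and `projected_height`, the pole height and the eye height of 1.5 units are all hypothetical choices.

```python
import math

def project(point, focal_length=1.0):
    """Central (linear) projection of a 3D point onto the plane z = focal_length."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the eye (z > 0)")
    return (focal_length * x / z, focal_length * y / z)

def projected_height(height, distance, focal_length=1.0):
    """Image height of an upright pole of the given height at the given distance."""
    _, y_top = project((0.0, height, distance), focal_length)
    _, y_base = project((0.0, 0.0, distance), focal_length)
    return y_top - y_base

# Diminution: the same 2-unit pole at twice the distance has half the image height.
near = projected_height(2.0, 4.0)
far = projected_height(2.0, 8.0)

# Foreshortening: two 1-unit segments on the ground, 1.5 units below the eye.
start = (0.0, -1.5, 4.0)
facing = math.dist(project(start), project((1.0, -1.5, 4.0)))    # lies across the view
receding = math.dist(project(start), project((0.0, -1.5, 5.0)))  # points away from the eye

print(near, far, facing, receding)
```

The segment that recedes from the viewer occupies far less of the picture than the equally long segment that lies across the view.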

A rudimentary knowledge of the different types of perspective is essential if we wish to understand how we are able to see in 3D.

Types of Perspective

At this point I would like to make a distinction between two different types of perspective. Firstly, there is the type that arises from the perception of depth in human vision (sometimes called Visual Perspective [2nd Type] or Natural/True Perspective), and secondly there is the type that is created to facilitate the perception of depth in graphical images (Graphical Perspective).

Regardless of the features of the specific definition adopted, experts are in agreement that perspective is a very powerful depth cue in both the graphical and vision forms. It stands to reason therefore that in order to maximize the effectiveness of this cue in any representative method, it is important to mimic the overall optical effects of Visual Perspective as closely as possible. However, once again there are complications and disagreements over which is the most natural and realistic form of Graphical Perspective.

It turns out that there are many different forms of Graphical Perspective, including Linear, Curvilinear, Spherical, Parallel, and Axial etc. Arguments continue to rage over which is the more natural. Linear perspective, which was first developed during the period of the Italian Renaissance, is perhaps the most familiar form of perspective to the Western eye. Nevertheless, vision and optical experts have noted that linear perspective is not (always) a good approximation to so-called Natural or Visual Perspective.

We shall explore Linear Perspective in great detail on the ‘Central Perspective’ page accessible under the Classic Forms section of this site. Hence we shall not repeat that information here, but rather take a brief look at another form of Central Perspective, named Curvilinear Perspective.

Curvilinear Perspective

In particular, at the outer extremes of the human visual field, parallel lines become curved, as in a photo taken through a fish-eye lens. It may surprise you to learn that the human visual field has a natural curvilinear shape! However painters, building designers and scientists have been aware of this fact for hundreds and possibly even thousands of years.

It has been claimed for example that the Ancient Greeks made the Parthenon columns bow outwards to account for – and correct – the curvilinear shape of the human visual field. Also painters like Leonardo Da Vinci and Turner added curvilinear effects into their depictions to more closely mimic reality as seen by the human eye.

It has also long been known that it is possible to graphically re-create scenes in which the geometry conforms to an overall curvilinear shape similar in form to the views projected by a fish-eye lens. This form of perspective has sometimes been called Curvilinear Perspective, and it is a form of perspective which has an undeniable origin in the natural optics of scenes. Curvilinear perspective was ably explored by Albert Flocon and Andre Barre in their 1986 book “Curvilinear Perspective, From Visual Space to the Constructed Image”. Artist Dick Termes has also produced many works based on curvilinear and 6-point perspective.

Curvilinear Perspective has a geometry which is closely related to the human visual field. In particular the rules of optics cause objects located at large distances from the central visual plane to be contracted in size, a true to life effect that is not depicted by Linear Perspective. Also others have noted that the human eye projects images onto a spherical retina, causing images to curve outwards in the same way as images in a wide field lens.
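The contrast between the two forms can be sketched in a few lines. The text does not specify a particular curvilinear construction, so this example assumes the common equidistant fish-eye model (image radius proportional to the angle off the viewing axis) as a stand-in; the function names are hypothetical.

```python
import math

def rectilinear(point, f=1.0):
    """Linear perspective: project onto a flat plane at distance f."""
    x, y, z = point
    return (f * x / z, f * y / z)

def equidistant_fisheye(point, f=1.0):
    """Assumed curvilinear model: image radius grows with angle, not tan(angle)."""
    x, y, z = point
    theta = math.atan2(math.hypot(x, y), z)  # angle off the viewing axis
    phi = math.atan2(y, x)                   # direction around the axis
    r = f * theta
    return (r * math.cos(phi), r * math.sin(phi))

# A straight horizontal line one unit above the axis, at depth z = 1.
line = [(x, 1.0, 1.0) for x in (-4.0, -2.0, 0.0, 2.0, 4.0)]
flat = [rectilinear(p) for p in line]
curved = [equidistant_fisheye(p) for p in line]

for p, a, b in zip(line, flat, curved):
    print(p, "linear:", a, "curvilinear:", tuple(round(c, 3) for c in b))
```

Under the linear model every image point keeps the same height, so the line stays straight. Under the assumed curvilinear model the ends of the line are drawn closer to the centre than its middle, so the line bows and peripheral objects are contracted.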

In Figure 2 below, we see two images from Flocon and Barre’s detailed mathematical study of Curvilinear Perspective, being drawings which ably represent the basic features of the natural curvilinear shape of the human visual field. Especially noteworthy here is the curvilinear shape of wide-angle scenes, and the “realistic” (if exaggerated) foreshortening of scale in the lateral dimension.

Curvilinear Realities

The facts of human vision presented here will come as a complete surprise to many. The question arises as to why the facts of human vision should surprise us. Perhaps we are all too close to our own sense of vision to notice the natural curvilinear shape of every wide-field scene we ever look at, and likewise we do not generally take any notice of the miracle of 3D perception because it is ever present. Or perhaps we are all too familiar with concepts such as Linear Perspective and/or the narrow field of view of photographs. In fact narrow-field photographic images do work rather well in terms of 3D impression.

Next time you are looking at 2D television or at a photograph, notice how strong the effect of depth or the third dimension really is. You have no trouble here forming a good conception of the different depths of the objects that are depicted, and can form an accurate overall impression of scene geometries. No 3D glasses are used here, and in each case we use “monocular” cues to form an accurate internal mental “model” of these scenes, which aids and supports our comprehension of the third dimension or depth.

At this point you may be asking yourself why it is that photographic, film and also television images do not exhibit scene curvatures. The answer is that they would if they covered a wide enough field of view – say around 180 degrees, and in any case optical designers have worked hard to ensure that the camera lenses involved eliminate such “distortions”. Note here that the so-called “Fish-Eye” wide-field lenses do show extreme curvilinear distortions similar to those depicted in Flocon and Barre’s detailed mathematical study.
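A rough calculation shows why a flat, linear projection cannot comfortably hold a very wide field of view. This is a sketch only: it compares the half-width of picture needed by a flat projection with that needed by an equidistant fish-eye model, which is assumed here as one representative wide-field mapping. The function names are hypothetical.

```python
import math

def rectilinear_half_width(fov_deg, f=1.0):
    """Half-width of a flat picture plane needed to hold the given field of view."""
    return f * math.tan(math.radians(fov_deg) / 2.0)

def equidistant_half_width(fov_deg, f=1.0):
    """Half-width needed under the assumed equidistant fish-eye model."""
    return f * math.radians(fov_deg) / 2.0

for fov in (40, 90, 120, 170, 179):
    print(f"{fov:4d} deg  flat: {rectilinear_half_width(fov):9.3f}"
          f"  fish-eye: {equidistant_half_width(fov):6.3f}")
```

The flat picture grows without limit as the field approaches 180 degrees, whereas the fish-eye picture stays finite, at the price of curving straight lines.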

Also when you have a moment, get a 30 cm ruler (longer is better), and whilst looking forward bring it close to the bottom of your nose, and notice how its shape at the outer edges curves upwards and forwards. It may take you a few minutes to be able to see this effect, because you are so accustomed to not noticing it! But once you do you will be amazed to see your curved field of view as it really is for the first time.

It is an established fact that wide-field optical perspective views are naturally curvilinear in form. Fish-eye lens views are not curved because of any effect introduced by the lens itself, but rather because that is how reality looks when you decide to project a specific scene over a very wide field of view (onto a flat picture plane)! Eagles and other birds see the world like this, that is, in the ultra-wide-field aspect. This fact leads me to conclude that curvilinear perspective has a strong foundation in reality.

I am not claiming here that curvilinear perspective is necessarily a more real depiction technique than the linear ones that we are used to seeing, but only that all things considered it is an equally valid form of representation! Perhaps the main reason why Curvilinear Perspective seems so strange to us is that we have become so used to seeing everything in terms of straight lines and right angles. We may be missing out on some quite spectacular images as a result.

Therefore, although curvilinear scenes may at first seem like a distortion of reality, our discussion has shown that this shape is in fact rooted in the natural optics of scenes and also at the same time in the human visual field which is inherently curved in shape.

Can the 3D World ever be “Truly” Represented?

Overall, many experts are in agreement that the human visual field is in fact curvilinear in shape.

It is important to note here that Curvilinear Perspective is related to one of the monocular depth cues experienced when viewing real scenes, that of peripheral distortion experienced when viewing wide-angled scenes. Despite the arguments in favour of Curvilinear Perspective being a good approximation to human vision when observing extremely wide-field scenes, it is nevertheless true that for ordinary vision, and over a normal field angle (say less than 90 degrees), the rules of Linear Perspective (one-point perspective) do in fact serve as a good approximation to human vision as it is employed in everyday circumstances.

Thus graphical images projected according to the rules of Linear Perspective do enable the viewer to adequately ‘perceive’ the changes in appearance of objects in relation to the Perspective Components of Visual Transformation (and underlying Perspective Phenomena) that happen as a result of depth (said object being viewed from a particular station-point, or from a distance and in relation to a particular angle-of-view).

The most realistic 3D would be one which employed all of the depth cues. However, no method devised to date has been able to employ them all. In fact it may not even be an achievable goal to construct a system so realistic that it employs all of these cues. Such a system would be indistinguishable from reality, and may in fact be an impossibility because it is known that human vision uses other yet more subtle scene-based optical cues to form an impression of 3D.

The fact that no single method employs all of the different depth cues (perhaps) leads to the conclusion that no one method of depth representation can be claimed to be more “real” than any other. What about holograms, you may ask; don’t they employ all of the cues, both mono and stereo? I am afraid not. It is true that holograms do employ both monocular and binocular cues, but they do not usually employ moving images and so miss out on the moving cues. Other cues are often missed here, including changes in color, shadow and occlusion, and also peripheral vision due to the relatively narrow field of view of most holograms.

As the name suggests, perhaps Virtual Reality (VR) systems come closest to being able to fully recreate a highly realistic spatial reality complete with convincing depth, by means of many of the stereoscopic and monocular depth cues. VR systems typically employ Linear Perspective to aid in the representation and perception of 3D, and some systems have even begun to use Curvilinear Perspective to mimic the wide-field curvilinear optics of natural scenes.

Overall I would conclude that no representative method currently employs all of the depth cues, and so none is true 3D in the strictest sense of the word.

As an aside, the author has invented a new type of mirror, named the “Hologram Mirror”, which produces an image of the self which “floats” in space in front of the mirror’s surface. Here, unlike with holograms, image occlusion effects are created, and the viewer obtains a strong and realistic impression of 3D. Similar optical devices may be used for producing improved types of 3D displays, specifically for interfacing (naturally) with future computing systems.

A short explanation of the “Hologram Mirror” principle is salient. In Figure 3 (left drawing) we can see that the mirrors labelled 2 and 3 form an upside-down image of a subject (1) at 4, whereupon a (partially transparent) mirror labelled as 5 re-images this intermediate image into an upright, life-sized reflection of a person (7) that is observable “floating” in space at a short distance in front of the same person (1).

Conclusion

In conclusion, I hope that I have been able to convince you of the fact that you don’t need stereoscopic 3D glasses to see things on a flat representation – a drawing/picture/painting or even a photograph – in “true” 3D, and also not to dismiss out of hand the “reality” of Curvilinear Perspective scenes and/or multiple and distorted perspective views.

In any case, we have seen evidence of how the methods of Central Perspective (Linear, Curvilinear forms) are able to help ‘fool’ the human visual system into ‘seeing’ a flat representation as a three-dimensional world complete with an accurate depiction of a variety of Perspective Phenomena and other non-perspectival depth cues.