Visual Optics

Perspective is closely linked to human vision and to geometrical optics. In this section, we explore the physical and physiological sources of perspective effects, in order to define the fundamental nature of perspective.

The Physical Basis of Perspective

The primary functions of the human eye are the perception of colour, form, and space.

Our visual system evolved to correctly interpret apparent changes to the Visual Features of a scene or object as viewed from a particular location. The theory, processes, and technical procedures involved in determining how and why these features change are collectively named perspective (in the general sense).

The mechanisms of perspective rest, in general, on the fact that while we can hear around corners, we cannot see around them, because light propagates in straight lines. In other words, unlike sound, light is not diffracted at sharp edges (at least not to any noticeable degree under ordinary visual conditions). Also, light typically does lose the [spatial] information it carries upon reflection, again unlike sound, which does not. Our task shall be to understand the many repercussions of this simple statement of fact, and to explore the geometry of classic perspective forms and related visual processes.

Linear Perspective is perhaps the most familiar form of perspective, being a type of Graphical Perspective that attempts to mimic what the eye sees while looking at physical space. It is the geometrical basis of representational painting and (in some senses) photography.

Linear Perspective provides a linear structure for the depiction, on a surface, of the apparent shape, size, and relative position of the objects constituting a three-dimensional (3D) scene. However, as we shall learn, it is by no means the only type of perspective, and it can make no legitimate claim to being the most realistic type, or even to matching reality exactly in all situations.

Linear Perspective is based on a series of geometrical and optical assumptions, each of which may or may not be true in a specific visual scenario. It does not help that several falsehoods, misconceptions, and outright myths have become attached to this category of perspective. It is therefore helpful to explore the physical basis of Graphical Perspective, applying strict logical definitions and analytical precision to a sometimes confusing topic.

We begin with a brief introduction to geometrical optics (or light ray optics) and the human visual system.

Figure 1: Visual Pyramid projected onto a perspective window.
Abraham Bosse Instructional Drawing on Perspective (1665). 

Figure 2: The basic principle of Linear Perspective: illustration of an artist forming a perspective image
by painting on a ‘Perspective Window’ – Albrecht Durer (1520s)

Figure 3: Draftsman making a perspective drawing of a reclining woman – Albrecht Durer (1520s)

Figure 4: Visual Pyramid projected onto a perspective window.
(coincidentally an image formed by looking at a pyramid-shaped polyhedron!).
Shows a 3D object (a pyramid) projected onto a transparent plane.
Illustration of the Basic Principle of Linear Perspective (modern drawing).
From: The Perception of the Visual World by J.J.Gibson (1950).

Light Rays and Image Formation

The physical basis of light and vision has long been a source of mystery, awe, wonder, and fascination. Scientists and philosophers have expended significant time and effort, and not a little brain power, in attempts to explain all kinds of naturally occurring optical effects.

The concept of light rays arose naturally from a consideration of such phenomena as the shadows cast by illuminated objects and the beams of sunlight that enter a darkened room, whose straight paths are made visible by dust or smoke in the air. According to Isaac Newton (1643 – 1727):

The least Light or part of light, which may be stopped alone without the rest of the Light, or propagated alone, or do or suffer anything alone, which the rest of the light doth not or suffers not, I call a Ray of Light.

Isaac Newton

According to the light-ray theory, each luminous point (or each illuminated object point in a three-dimensional scene) sends innumerable light rays into the surrounding space. Today we know that matters are a little more complex: according to wave-particle duality, light sometimes behaves as if it were composed of particles (photons, which travel along straight-line paths, or ‘rays’), whereas at other times it exhibits a wavelike nature and produces effects that are not predicted by light-ray theory.

Nevertheless, the light-ray theory remains a good approximation to the behaviour of light in many everyday circumstances, and it is still useful because it provides significant explanation and predictive power. In particular the light-ray theory does provide a reasonable explanation of how images are formed in the eye and by optical instruments.

For over 500 years, scientists have known the basic principles of ray optics and of imaging processes. For example, in the 15th century, Leonardo da Vinci understood how images can be conveyed by light, and he wrote:

The air is full of an infinity of straight and radiating light rays intersected and interwoven with one another without one occupying the place of another… All bodies together, and each by itself, give off to the surrounding air an infinite number of images which are all in all and each in each part, each conveying the nature, colour, and form of the body which produces it.

Leonardo da Vinci

Ray optics, however, has an even older lineage. Much earlier, Lucretius (95-51 B.C.) gave an account of the centripetal theory of vision similar to that of Leonardo, as follows:

I maintain therefore, that replicas or insubstantial shapes of things are thrown off from the surface of objects. These we must denote as an outer skin or film, because each particular floating image wears the aspect and form of the object from whose body it has emanated.


Once again, these rays are a mathematical abstraction of the more complex wave-particle behaviour of light. Nevertheless, geometrical optics (or ray optics) remains as useful a concept as ever for explaining how many optical phenomena arise; in particular, nothing more than ray optics is required to explain Geometrical or Linear Perspective.

Figure 5: Light Pyramid: the Theoretical Reflecting Point and its Corresponding Focus Point.
From: The Perception of the Visual World by J.J.Gibson (1950).

Figure 6: Visual Pyramids radiated from a spherical body – Leonardo da Vinci – 1400s

Figure 7: Study of light rays falling on a human face, by Leonardo da Vinci; circa 1400s.
Pyramid of Light: Depicts light rays diverging from a single object point.

Figure 8: Studies of light rays reflected from a spherical mirror. Leonardo da Vinci; circa 1400s.

Figure 9: Isaac Newton experimenting with light rays in a darkened room, and so
discovering light dispersion (that white light is composed of multiple colours),
whilst also demonstrating the rectilinear propagation of light rays (late 1600s).

The Eye

It was Johannes Kepler (1571-1630) who first published the modern theory of how the eye works. In 1604, in his book Ad Vitellionem Paralipomena, he wrote:

Vision, I say, occurs when the image of the whole hemisphere of the external world in front of the eye – in fact a little more than a hemisphere – is projected onto the pink superficial layer of the concave retina.

Johannes Kepler

Kepler understood that light from external objects forms an inverted image of those objects on the retina. However, it seems that 100 years earlier, Leonardo da Vinci (1452 – 1519) already had a very sophisticated theory of visual optics. He understood clearly that the eye operates like a miniature camera obscura, projecting images onto the retina, which are then perceived by the brain.

Oddly, however, Leonardo did not believe that the image within the eyeball was inverted. This is strange, because he had access to eyes from cadavers, performed many eye dissections, made detailed anatomical drawings, and even carried out optical experiments with eyeballs.

Below we see diagrams of the primary structures of the eye; these features are explained in the following sections.

Figure 10: Diagram of ocular refraction. Credit: Wellcome Library, London.
Woodcut by: Rene Descartes, 1637

Figure 11: Human Eye (Horizontal Section)

The shape of the human eye approximates a sphere, about 1 inch in diameter. Its outer coat consists of a fibrous membrane called the sclera. The sclera is replaced by a transparent window in the front of the eye, called the cornea.

The iris forms a variable aperture that controls the amount of light that passes through the cornea and on to the light-focusing lens. The lens sends light into the eye’s interior, where it strikes the retina, which covers the greater part of the inside of the eyeball. The retina is covered with numerous receptor cells, the rods and cones, which are stimulated by the light pattern that constitutes the retinal image. These light-sensitive cells are connected to the many optic nerve fibres that converge towards the optic disc, where they emerge from the eyeball and go on to the brain.

The eye is essentially a camera obscura filled with water. The cornea and lens are responsible for refracting the light entering the eye. Most of this refraction occurs at the cornea, which may be regarded as a convex surface separating the air outside from the aqueous humour inside the eye. The lens bends the light further, causing the rays that reach it through the pupil to converge sufficiently to come to a focus on the retina.
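
As a rough illustration of the refraction at the cornea, the power of a single spherical refracting surface can be estimated as P = (n2 - n1) / R. The sketch below applies this to the cornea regarded, as above, as one convex surface between air and aqueous humour. The radius of curvature and the refractive indices are assumed, typical textbook values that do not come from this text, and the function name is purely illustrative.

```python
def surface_power_dioptres(n_outside: float, n_inside: float, radius_m: float) -> float:
    """Refracting power of a single spherical surface: P = (n_inside - n_outside) / R."""
    return (n_inside - n_outside) / radius_m


# Assumed, illustrative values (not taken from the text):
N_AIR = 1.000             # refractive index of air
N_AQUEOUS = 1.336         # refractive index of the aqueous humour
CORNEA_RADIUS_M = 0.0078  # radius of curvature of the front of the cornea, in metres

cornea_power = surface_power_dioptres(N_AIR, N_AQUEOUS, CORNEA_RADIUS_M)
print(f"Approximate corneal power: {cornea_power:.1f} D")
```

Under these assumed values the single-surface estimate comes to roughly 43 dioptres. The same formula shows that the power depends on the difference in refractive index across the surface, so a different outside medium would change the result.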

Muscles act on the lens to change its shape, and hence its converging power; they are responsible for accommodation, that is, for sharp focussing on objects at varying distances from the eye. Focussing is thus achieved in an entirely different way than in the ordinary photographic camera, which adjusts the distance between lens and film to focus on objects at different distances.
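
The difference between the two focussing strategies can be sketched with the thin-lens equation, 1/f = 1/d_o + 1/d_i. The example below is a simplified model under stated assumptions: the eye is reduced to a single thin lens in air at a fixed 17 mm from the retina, and the camera has a fixed 50 mm lens. Both figures, and the function names, are illustrative assumptions rather than values from this text.

```python
import math

RETINA_DISTANCE_M = 0.017      # assumed lens-to-retina distance of a simplified eye
CAMERA_FOCAL_LENGTH_M = 0.050  # assumed fixed focal length of a camera lens


def eye_power_needed(object_distance_m: float) -> float:
    """Power (dioptres) the simplified eye needs to focus an object on the retina.

    The image distance is fixed, so the power must change: P = 1/d_o + 1/d_i.
    """
    return 1.0 / object_distance_m + 1.0 / RETINA_DISTANCE_M


def camera_lens_to_film_m(object_distance_m: float) -> float:
    """Lens-to-film distance needed by a fixed-focal-length camera.

    The power is fixed, so the image distance must change: 1/d_i = 1/f - 1/d_o.
    """
    return 1.0 / (1.0 / CAMERA_FOCAL_LENGTH_M - 1.0 / object_distance_m)


# Compare the two strategies for a distant, a middle, and a near object.
for distance_m in (math.inf, 1.0, 0.25):
    print(
        f"object at {distance_m} m: "
        f"eye power {eye_power_needed(distance_m):.1f} D, "
        f"camera lens-to-film {camera_lens_to_film_m(distance_m) * 1000:.1f} mm"
    )
```

In this sketch the eye keeps its image distance fixed and adds about 4 dioptres of power to focus at 25 cm, whereas the camera keeps its power fixed and moves the lens from 50 mm to 62.5 mm from the film.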

What is of primary importance for human vision is that the optical system of the eye achieves a ‘point to point’ correspondence between an object and its image on the light-sensitive nervous layer (the retina). The eye must produce distinct, visibly sharp representations (images) of a three-dimensional scene. The image pattern is largely in focus, without suffering significant optical blurring, aberrations, or shape distortions.

The image produced by the eye is a good approximation of how an object looks from a particular vantage point. Still, the perceptual processes necessary to achieve sufficient clarity of vision are complex and may involve several trade-offs between image sharpness, field of view, and the apparent perception of three dimensions, or depth.

It is essential to realise that many ostensibly purely optical processes happening within the eye are complemented (and often overridden) by the human perceptual system, through real-time visual processing and psychological processes in the brain.

Figure 12: Basic Optics of the Human Eye (operates like a Camera Obscura)

Figure 13: Camera Obscura – drawing by Leonardo da Vinci

Figure 14: The Human Eye in relaxed state (focussed on infinity)

Figure 15: Accommodation of the crystalline lens (the lens changes shape according to muscular action),
increasing the curvature of its surfaces, and hence its optical power, so as to form a sharp image of close objects.

The Visual Pyramid (of Sight)

Leonardo da Vinci explained perspective by postulating the existence of a ‘point in the eye’, which was the apex of his ‘pyramid of sight’, saying:

Perspective is nothing else than seeing a place or objects behind a pane of glass, quite transparent, on the surface of which the objects that lie behind the glass are drawn. These can be traced in pyramids to the point of the eye, and these pyramids are intersected by the glass plane… By a pyramid of lines I mean those which start from the surface and edges of bodies; and converging from a distance, meet in a single point. A point is said to be that which [having no dimensions] cannot be divided, and this point placed in the eye receives all the points of the cone.

Leonardo da Vinci

Figure 16: The Pyramid of Sight

Figure 17: Nodal Points of the Eye (page 33)

Figure 18: Field of View

The Pyramid of Sight remains a fundamental concept of visual optics. It explains many features of human vision, not least the various phenomena and methods of Visual Perspective (2nd type) and Graphical Perspective. However, irrespective of Leonardo’s neat explanation of perspective, we must ask: what precisely is the Visual Pyramid, and to what extent is it based on real-world optics?

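The Visual Pyramid may be quite simply defined as the bundle of straight lines that run from the points of a scene or object and converge on a single point in the eye, the apex of the pyramid.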

For images focussed on the central retina, the apex of the pyramid may be taken to be the optical centre K of Figure 4 above. Note, however, that when dealing with visual angles and perspective, an exact determination of this apex is a complex matter, because the images of objects at different distances from the eye cannot, as a rule, be focussed simultaneously. The apparent position of the apex will shift very slightly in three dimensions, according to the position of the object point, the state of focus, and the imaging characteristics of the human eye (aberrations, etc.).

Despite the fact that the light emerging from every object point is divergent, there are certain rays (the ‘main rays’, ‘chief rays’, or ‘principal rays’) which do converge towards a point in the eye from all of the different object points, and which determine the retinal image of the external scene or object as a whole. These convergent straight lines, outside the eye, form a pyramid of sight; its intersection with a picture surface is the image of the objects in linear perspective on that surface. Euclid named these rays the ‘visual rays’. The chief rays define the Visual Angles subtended by the objects, and these angles are the main subject of Natural Perspective (Visual Perspective of the 2nd type). The study of Visual Angles may therefore be called Natural Perspective.
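
The visual angle subtended by an object can be computed directly from this geometry. The sketch below assumes a flat object of a given size, centred on and perpendicular to the line of sight, at a given distance from the apex of the pyramid of sight. The function name and the sample figures are illustrative assumptions.

```python
import math


def visual_angle_deg(object_size: float, distance: float) -> float:
    """Visual angle (degrees) subtended at the apex by a centred, frontal object.

    The two chief rays to the object's edges each make an angle of
    atan((size / 2) / distance) with the line of sight.
    Size and distance must be in the same units.
    """
    return math.degrees(2.0 * math.atan(object_size / (2.0 * distance)))


# The same 1.8 m tall figure seen from two distances (assumed example values).
for distance_m in (10.0, 20.0):
    angle = visual_angle_deg(1.8, distance_m)
    print(f"1.8 m object at {distance_m} m subtends {angle:.2f} degrees")
```

An object twice as large at twice the distance subtends exactly the same angle, so the visual angle on its own fixes only the ratio of size to distance.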

The Pyramid of Sight refers to the Visual Field of each eye in a stationary position. The Visual Field is the region of outside space in which objects can be seen by the eye when it is in a fixed position and does not rotate in its orbit. However, if the head is kept immobile but the eye rotates in its orbit, the visual field moves with the eye, and with it the point of fixation, or centre of vision. Consequently, the total Field of View covered by the moving eye is considerably greater than the Visual Field itself.

The fovea centralis is located in the centre of the macula lutea, a small, flat spot located exactly in the centre of the posterior portion of the retina. As the fovea is responsible for high-acuity vision, it is densely packed with cone photoreceptors. The macula is about 5.5 mm in diameter, while the fovea is 0.35 mm in diameter. Furthermore, the fovea has about 50 cone cells per 100 micrometres squared and has a horizontally elliptical shape.

Given this high concentration of cones, the fovea is, as would be expected, the location of the highest visual acuity, or resolution, in the eye.

Natural Perspective

Linear Perspective refers to the pattern of lines given by the central projection of the objects onto a surface (the surface of the picture), the centre of projection being the relevant point in the eye (the apex of the visual cone, centrally placed relative to the visual angle or object extent). The perspective projection thus consists of the intersection of the pyramid of sight with the picture surface. Natural Perspective (Visual Perspective of the 2nd type) is more general in scope than Linear Perspective, since each different surface gives a different section of the same pyramid of sight: picture surfaces of different shapes, positions, and angles (for example conical, pyramidal, or spherical picture surfaces) give different perspective views.
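
The central projection described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the picture surface is a flat plane perpendicular to the line of sight, the eye looks along the +z axis, and the function and parameter names are hypothetical.

```python
from typing import Tuple

Point3D = Tuple[float, float, float]
Point2D = Tuple[float, float]


def project_to_picture_plane(point: Point3D, eye: Point3D, plane_distance: float) -> Point2D:
    """Intersect the straight line from `point` to `eye` with a flat picture plane.

    The plane is perpendicular to the z axis and lies `plane_distance` in
    front of the eye. The result is given in picture-plane coordinates
    measured from the point directly in front of the eye.
    """
    dx = point[0] - eye[0]
    dy = point[1] - eye[1]
    dz = point[2] - eye[2]
    if dz <= 0:
        raise ValueError("point must lie in front of the eye")
    scale = plane_distance / dz  # similar triangles along the visual ray
    return (dx * scale, dy * scale)


eye_position = (0.0, 0.0, 0.0)
near_top = (0.0, 2.0, 4.0)  # top of a 2-unit-high post, 4 units away
far_top = (0.0, 2.0, 8.0)   # top of an identical post, 8 units away

print(project_to_picture_plane(near_top, eye_position, 1.0))
print(project_to_picture_plane(far_top, eye_position, 1.0))
```

The two posts have the same height, yet the farther one is drawn half as tall on the picture plane. Passing a different `eye` position changes every projected point, which is consistent with a linear perspective picture being valid for one eye position only.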

Linear Perspective defines the size, shape, and disposition of the objects as drawn in the picture, with their foreshortening and the apparent overlapping of some near objects upon far objects, for one eye position – and for this position only.

In brief, a perspective projection is the section, by a surface, of the Pyramid of Sight that is seen issuing out of the eye.