Chapter 2

Light theory

Elizabeth Allen, Sophie Triantaphillidou

All images © Elizabeth Allen, Sophie Triantaphillidou unless indicated.

INTRODUCTION

The study of imaging must necessarily begin with an investigation into light. An understanding of light in the context of imaging includes its function and interactions throughout the imaging chain: its emission from a light source, reflection and absorption by surfaces within a scene, detection and transformation by an image sensor (which may be the human eye) and finally its interpretation. Alongside physics, a number of other scientific disciplines are involved, including biology, physiology, chemistry and psychology, each of which has contributed to the theories of light that have developed throughout history.

This chapter introduces optics, the branch of physics describing the behaviour and properties of light. Optics can describe light in terms of how it is emitted and how it interacts with other matter: for example, its deviation when moving from air into a denser transparent medium such as the glass of a lens, how it behaves when moving through a small aperture or round the edge of an object, or how it is scattered by the earth’s atmosphere. It also explains how light energy may be transformed into other forms of energy, such as electrical or chemical energy, when it hits an image sensor. Further optics is the subject of Chapter 6.

A BRIEF HISTORY OF LIGHT THEORY

Understanding the nature of light has played an important role in the history of science. The following chronology is incomplete, but summarizes some of the key ideas that have led to modern scientific understanding of light:

•  Around 500 BC the ideas of a number of the great Greek philosophers fell broadly into two main schools of thought. These were concerned less with light itself than with the mechanism of vision and the relationship between light, observed objects and the eye. Tactile theory, suggested by Empedocles and further developed by Plato, modelled light as something emitted from the eyeball capable of touching things too far away to be physically touched by the observer. Emission theory, proposed by Pythagoras among others, postulated that light was a stream of particles emitted from all objects which bombarded and caused pressure on the eye. At the time emission theory was not widely accepted, but it led later to the theories of the ‘atomists’, who believed that everything was made up of minute indivisible particles – atoms. For over a thousand years, tactile theory was widely accepted, despite its inability to explain certain phenomena, such as the reason that some bright bodies were able to light up neighbouring bodies.

•  Around AD 1000, Alhazen, a scholar and astronomer from Basra, established that light was external to the eye, and the tactile theory was finally rejected. Alhazen studied optics and the behaviour of light moving in a straight line, establishing the Law of Reflection (page 104).

•  A revived interest in the study of optics in Europe after the thirteenth century led to the development of the refracting telescope. In 1621 Willebrord Snell developed the Law of Refraction, also known as Snell’s Law (page 104). Descartes independently derived the same law in 1637.

•  Isaac Newton (1642–1727) considered light to be a stream of moving particles and this became known as corpuscular theory. The theory suggested that the particles travelled in straight lines, providing an explanation for both reflection and the casting of shadows. Famously, Newton also used a prism to disperse white light and concluded correctly that white light is made of a mixture of colours.

•  Francesco Grimaldi (1618–1663) observed that shadows were slightly smaller than predicted due to ‘fringing’ at the edges (diffraction effects).

•  Robert Hooke (1635–1703) proposed that light travelled in the form of a wave rather than a particle, which allowed diffraction effects to be explained as a result of constructive and destructive interference of the waves.

•  Christiaan Huygens (1629–1695) further developed the idea that light was a wave in a universal medium, the ‘aether’.

•  Thomas Young (1773–1829) experimented with shining coherent light through a pair of closely spaced slits in a screen to produce a diffraction pattern of bright and dark fringes. The results could not be easily explained by particle (corpuscular) theory and led him to further develop the wave theory of light, with the Principle of Interference. He was able to explain Newton’s results with dispersion of white light through a prism in terms of wave theory and even determined wavelengths for the colours.

•  Michael Faraday (1791–1867) first established a relationship between light and electromagnetism. James Clerk Maxwell proved theoretically that electric and magnetic fields could continually propagate one another, travelling as a wave at a specific speed, close to experimentally determined values for the speed of light. This was a huge advance – enough to allow the theory that light was an electromagnetic wave to replace Newton’s corpuscular theory.

•  In 1900 Max Planck, investigating black body radiation (page 38), proposed that light energy was always emitted in multiples of certain energy units, or quanta (a quantum in physics refers to an indivisible unit of energy), although he maintained his belief that light was a continuous wave of electromagnetic radiation.

•  In 1905 Einstein, following his work on the photoelectric effect (page 39), went further, suggesting that light was emitted in the form of localized particle-like packets of energy, later called photons.

WAVE–PARTICLE DUALITY

As is evident from the chronology, the issue of whether light is a wave or a particle has been an ongoing theme in the study of the nature of light. Although wave theory was generally accepted as a result of the work of Maxwell, there remained questions in terms of light’s behaviour under certain circumstances that wave theory could not easily explain. It became clear that the concepts of light as particles and waves, which had seemed mutually exclusive, must both be considered when viewing the world at a submicroscopic level.

The work of Planck and Einstein initiated the development of quantum theory and quantum mechanics in physics, which centre around the fundamental idea that energy is not continuous but exists in discrete quantities. It is now accepted that a complete understanding of the nature of light must recognize both wave and particle theories as valid. This is termed wave–particle duality and is extended in current scientific theory to all matter at a quantum level: the idea that any object existing in the form of particles must also exhibit wave-like behaviour.

THE NATURE OF LIGHT

Light is a form of electromagnetic radiation (or radiant flux), which is emitted from various sources, most obviously the sun. Electromagnetic radiation is a self-propagating wave of energy which travels through empty space at a particular speed and consists of oscillations in electric and magnetic fields. As established by Maxwell, all electromagnetic radiation travels at the same speed in a vacuum, approximately 2.998 × 10⁸ m s⁻¹.

When electric charges move, they are accompanied by a magnetic field. If the charges accelerate or decelerate, then disturbances occur in both the electric and magnetic fields and radiation is emitted in a wave motion, carrying energy away from the disturbed charges. The fields oscillate at right angles to each other. They are also at right angles to the direction in which the wave is moving. Furthermore, electromagnetic waves exhibit rectilinear propagation – that is, a single point on a wave, if unimpeded, is propagated in a straight line.

Electromagnetic radiation exists at a wide range of wavelengths or frequencies and these are collectively termed the electromagnetic spectrum (see page 35). In photography and digital imaging we are generally concerned with the visible spectrum – that is, the range of wavelengths of electromagnetic radiation that are detectable by the human visual system. However, there is a range of imaging processes, more commonly employed in scientific imaging, that use non-visible electromagnetic radiation, such as ultraviolet or infrared radiation and X-rays, all of which have wavelengths outside the visible range.

When the behaviour of electromagnetic radiation in its interactions at a subatomic level is examined, the wave model is no longer adequate, as it implies that radiation is continuous. As originally proposed by Planck, energy can only be transferred in discrete amounts or quanta. Electromagnetic radiation may thus also be represented as a stream of individual quanta of energy or photons. A beam of light may be envisaged as a huge quantity of photons of energy travelling at the speed of light. The energy associated with a photon is related to the frequency of the electromagnetic radiation (see page 40). Photons may be scattered if they collide with other particles; they may also be emitted or absorbed by charged particles such as electrons. We cannot see individual photons but we may observe the results of absorption or emission.

RADIOMETRY AND PHOTOMETRY

Radiometry and photometry are scientific disciplines concerned with the measurement of electromagnetic radiation. Radiometry involves the measurement of quantities of radiation from any part of the electromagnetic spectrum and therefore includes the ultraviolet, visible and infrared regions of the spectrum. Photometry, by contrast, deals with the measurement of radiation as it is perceived by the human visual system and therefore only deals with visible light, wavelengths 360–770 nm. This is achieved by weighting the radiant power at each wavelength by the average response of the human visual system to light of different wavelengths.

Radiometric and photometric measures usually quantify one of the following:

•  Energy, which is a property possessed by a system that allows it to do work.

•  Power or flux, which is the rate at which work is done or energy is converted.

Radiometric definitions

Radiometric definitions and units are summarized in Table 2.1.

Photometric definitions

Photometry takes into account the unequal response of the human visual system to different wavelengths. This process is accomplished by multiplying the measured values by the relevant value of a standard function, V(λ), wavelength by wavelength. V(λ) is the CIE spectral luminous efficiency function for photopic vision (see Chapter 4), which is a plot of the average relative response of the human visual system against wavelength. The V(λ) function is used as a spectral weighting function to convert radiometric quantities to photometric quantities via spectral integration:

$$\Phi_v = K_m \int_{360}^{770} \Phi_{e,\lambda}\, V(\lambda)\, \mathrm{d}\lambda \qquad (2.1)$$

where Φv is the photometric quantity defined by the corresponding spectral radiometric quantity, Φe,λ, in the calculation, and Km is the maximum spectral luminous efficacy, 683 lumens per watt.
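As a concrete illustration of this spectral weighting, the following minimal sketch evaluates Eqn 2.1 as a discrete sum over a coarse three-sample spectrum. The radiant flux values are invented for the example; the V(λ) samples and the constant Km = 683 lm W⁻¹ are standard photopic values.

```python
# A minimal numerical version of the spectral integration in Eqn 2.1:
# luminous flux = K_m * sum of Phi_e(lambda) * V(lambda) * d_lambda.
# The three-sample spectrum below is illustrative, not measured data.

K_M = 683.0        # lm/W, maximum spectral luminous efficacy (photopic)
D_LAMBDA = 100.0   # nm, spacing of the coarse wavelength grid

# (wavelength nm, spectral radiant flux W/nm, CIE V(lambda))
spectrum = [
    (450, 0.001, 0.038),
    (550, 0.001, 0.995),
    (650, 0.001, 0.107),
]

luminous_flux = K_M * sum(phi * v for _, phi, v in spectrum) * D_LAMBDA
print(f"Luminous flux ~ {luminous_flux:.1f} lm")
# Equal radiant power at each wavelength, but the 550 nm sample dominates
# because the eye is most sensitive near the middle of the visible range.
```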

The equivalent photometric definitions to those given in Table 2.1 are summarized in Table 2.2.

See also Chapter 3 for further definition of luminance, luminous intensity and luminous flux.

OPTICS

The study of the behaviour and properties of light and the way in which it interacts with matter is termed optics. The various areas of optics may be broadly classed according to the different models used to explain the particular properties.

Table 2.1   Radiometric definitions and units

image

Table 2.2   Photometric definitions and units

image

Physical or wave optics models the propagation of light through an optical system as a complex wavefront. When a wavefront reaches an obstacle it may change direction, resulting in interference between different wavefronts.

Geometrical optics discounts the wave nature of light and studies its propagation in terms of rays. The path of any single point on the wavefront referred to above is a straight line with direction perpendicular to the wavefront. Hence we say that light travels in straight lines, in rays. The concept of light rays is helpful in studying the formation of an image by a lens.

Both are branches of classical optics and use approximations to model the behaviour of light. Physical optics successfully explains the diffraction, interference and polarization of light. Geometrical optics models the path of a ray of light as it moves through an optical system and may be used to describe reflection and refraction, which form the basis of imaging by lenses and are fully described in Chapter 10.

Quantum optics considers radiation in terms of quanta of energy. The effects of this are only evident at an atomic or subatomic level with the emission or absorption of light energy by matter, for example by a photographic emulsion or silicon in a digital sensor.

Aspects of wave theory, physical optics and quantum optics are covered in more detail in the remainder of this chapter.

WAVE THEORY

Many of the properties of light can be described by the fact that it takes the form of a wave, i.e. a disturbance that propagates in space. The family of waves that propagate in the same fashion as light waves are termed electromagnetic waves; therefore, the wave nature of light is described here in the context of electromagnetic theory. Whereas sound waves propagate in a mechanical fashion through matter – for example, air, rock and water – electromagnetic waves are self-propagating, travelling freely in empty space (the so-called vacuum) with a maximum velocity, c, of approximately 2.998 × 10⁸ metres per second (or 300,000 kilometres per second). In media denser than a vacuum, electromagnetic waves slow down. In air the velocity is nearly the same as in a vacuum, but in water or glass it is reduced to about two-thirds of its value in a vacuum.

Sound waves are longitudinal, i.e. they oscillate along the direction of propagation of the wave; electromagnetic waves, however, are transverse waves, and oscillate in a direction perpendicular to the direction of propagation. According to Maxwell, electromagnetic waves travel at the speed of light and oscillate in two planes, the electric field and the magnetic field planes, at right angles to each other and to the direction of propagation. The oscillation is generally described as sinusoidal (see below). Figure 2.1 shows an electromagnetic wave at a fixed instant of time, with the direction of propagation x, the electric field vibrating in direction y and the magnetic field vibrating in direction z. The maximum displacement from the axis x, which corresponds to the crest or trough of the wave, is termed the amplitude of the wave, a. At any given time, the amplitudes of the two fields are proportional and their oscillations are always in phase (see below). In practice we cannot observe the oscillations, as they occur far too fast. We observe instead the power transmitted by the light beam, averaged over a long time (millions of oscillations). This average power is the light intensity measured by light detectors, and it is related to the amplitude of the wave.

image

Figure 2.1   An electromagnetic wave at a fixed moment of time.

Simple harmonic motion

The simplest regular waveform is known as simple harmonic or sinusoidal motion; it follows the profile of a cosine or a sine curve. Any complex waveform can be constructed by a superposition of harmonic waves (see Chapter 7). The regular oscillatory movement of simple harmonic motion generates a sinusoidal waveform and is that of the projected motion, Sy, of a point, S, rotating at constant (angular) speed in a circle. Figure 2.2 shows the point S moving around the circle, its path projected on segment AB and the generated waveform as a function of time, t. The radius of the circle gives the amplitude, a. The time of completion of one rotation (i.e. an angle of 2π radians) is the wave’s temporal period, T.

image

Figure 2.2   Simple harmonic motion as a function of time, t.

To visualize the generation of a sinusoidal waveform, consider a point of initial oscillation affixed to a flexible cord (see Figure 2.3). The up and down movements transmit the signal to each point on the cord, and energy moves along, forming the waveform. During the motion from O to A, back to O, to B and back to O, one complete oscillation occurs. During the time required for such an oscillation (the temporal period, T) the wave is propagated along axis x and travels a distance equal to the wavelength (or spatial period) of the wave. The wavelength is denoted by the Greek letter λ (lambda). Points on the cord separated by a distance equal to (or any multiple of) λ, such as S1 and S2 in Figure 2.3, will have the same displacement. These points are in phase. The customary unit of λ for light waves is the nanometre, where 1 nm = 10⁻⁹ m, but the micron (1 μm = 10⁻⁶ m) and the older angstrom (1 Å = 10⁻¹⁰ m) are also used.

The distance the wave travels during T depends on the velocity of the wave in the particular medium. The velocity V and the wavelength λ are connected by the fundamental equation of wave motion:

$$V = \frac{\lambda}{T} \qquad (2.2)$$

For electromagnetic waves in a vacuum, V = c (i.e. the speed of light) and thus Eqn 2.2 becomes:

$$c = \frac{\lambda}{T} \qquad (2.3)$$

The number of complete oscillations per unit time, usually per second, is the temporal frequency, usually denoted by the Greek letter ν (nu), and measured in hertz (Hz; the SI unit of frequency), after the German physicist Heinrich Hertz. It is equal to:

$$\nu = \frac{1}{T} = \frac{c}{\lambda} \qquad (2.4)$$

which indicates that the speed of light equals the frequency times the wavelength.
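As a quick worked example of this relationship, the snippet below computes the frequency and temporal period of 550 nm light in a vacuum (the wavelength is an arbitrary, illustrative choice for green light).

```python
# Eqn 2.4 in use: frequency and temporal period of 550 nm light in a vacuum.
C = 2.998e8          # speed of light in a vacuum, m/s

wavelength = 550e-9  # m (assumed, green light)
frequency = C / wavelength   # nu = c / lambda
period = 1.0 / frequency     # T = 1 / nu
print(f"frequency nu ~ {frequency:.3e} Hz")   # ~5.45e14 Hz
print(f"temporal period T ~ {period:.3e} s")  # ~1.83e-15 s
```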

A true sine wave is infinite in space and time. From this ideal, one can then consider a frozen episode in time (e.g. in Figure 2.1) or a fixed position in space (e.g. Figure 2.3).

Another quantity related to harmonic waves is the spatial frequency, f, which describes the number of waves (or cycles) per unit length:

$$f = \frac{1}{\lambda} \qquad (2.5)$$

The spatial frequency, f, for space is the equivalent of temporal frequency, ν, for time.

Now let t in Figure 2.2 be the time required for the rotation of S from S0 (its starting position on the circle) and S0OS the angle covered during t. The related harmonic wave is expressed by:

$$S_y = a \sin(\omega t) \qquad (2.6)$$

or by:

$$S_y = a \cos(\omega t - \varepsilon) \qquad (2.7)$$

where a is the amplitude of the wave, t is time and ω is the angular temporal frequency (often referred to as angular speed and measured in radians per second). ω is given by:

$$\omega = \frac{2\pi}{T} = 2\pi\nu \qquad (2.8)$$

image

Figure 2.3   Oscillating flexible cord.

Equation 2.7 indicates that a cosine wave has a phase difference, ε, equal to π/2 radians from a sine wave.

If the wave in Eqn 2.7 is expressed as a function of space rather than time, then:

$$S_y = a \cos(kx - \varepsilon) \qquad (2.9)$$

where k is a positive constant known as the propagation number (equivalent to ω) and is measured in radians per metre. The propagation number for any waveform is obtained by:

$$k = \frac{2\pi}{\lambda} \qquad (2.10)$$

Figure 2.4 shows the harmonic function described as a function of space, x, and of phase, ϕ. One wavelength therefore corresponds to a change in phase ϕ of 2π radians (rad) or one cycle.

The wave in Eqn 2.7 can be transformed to a progressive wave, travelling with a velocity V in the positive direction x. At a distance x from its source the time difference will be x/V seconds (since t = x/V); therefore, replacing t by t − (x/V) transforms Eqn 2.7 to:

$$S_y = a \cos\left[\omega\left(t - \frac{x}{V}\right) - \varepsilon\right] \qquad (2.11)$$

The wave is then periodic in both space (x) and time (t). The spatial period is the wavelength, λ, and T is the temporal period, corresponding to the amount of time taken for one complete wave to pass a stationary observer. Holding either t or x fixed results in a sinusoidal disturbance.

Definitions of the variables associated with harmonic waves are summarized in Appendix A. There are a number of alternative equivalent expressions that can be used for travelling harmonic waves; these are also summarized in Appendix A.

Simple harmonic waves have a single constant frequency (and single wavelength) and are therefore monochromatic, the monochromatic wave equation being given by any of the expressions above. For example:

$$\Psi(x, t) = a \sin\left[\frac{2\pi}{\lambda}(x - ct)\right] \qquad (2.12)$$

Ψ(x, t) can be thought of as the electric field strength. The frequency is ν, so that c = λν. As t increases, the whole wave moves to the right with velocity c = λν.

In reality, electromagnetic waves are never monochromatic, instead consisting of a band of frequencies. Even ‘perfect’ sinusoidal generators emit waves encompassing a narrow band of frequencies and are said to be quasimonochromatic.

Principle of superposition

When two (or more) separate waves travelling in the same medium and at the same time meet, they will overlap and simply add or subtract from one another, then move away from this location unaffected by the encounter. The resulting disturbance at each point in the overlapping region is the algebraic sum of the separate disturbances at this location. This is known as the principle of superposition, given by:

$$y = y_1 + y_2 \qquad (2.13)$$

where y is the resultant wave and y1 and y2 the two interacting waves. Figure 2.5 illustrates examples of the superposition of sinusoidal waves.
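The sketch below applies Eqn 2.13 numerically to two illustrative waves of equal amplitude and wavelength with a phase difference of π/2, in the spirit of the top example of Figure 2.5; the amplitude and sampling grid are arbitrary choices.

```python
import math

# Superposition (Eqn 2.13): the resultant disturbance at each point is the
# algebraic sum of the separate disturbances.
A = 1.0                   # amplitude of each wave (arbitrary units)
PHASE_DIFF = math.pi / 2  # phase difference between the two waves

for i in range(9):                       # sample one cycle in eighths
    phi = 2.0 * math.pi * i / 8.0
    y1 = A * math.sin(phi)
    y2 = A * math.sin(phi + PHASE_DIFF)
    y = y1 + y2                          # Eqn 2.13
    print(f"phi={phi:5.2f}  y1={y1:6.3f}  y2={y2:6.3f}  y={y:6.3f}")
```

The printed values confirm that the resultant is itself a sinusoid of the same wavelength, with amplitude √2 times that of either constituent.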

image

Figure 2.4   A sinusoidal function can be described in terms of space, x, and of phase, ϕ.

image

Figure 2.5   Superposition of waves, y = y1 + y2. Top: the superposition of two waves with the same wavelength, the same amplitude and a phase difference of π/2. Middle: the superposition of two different waves with the same wavelength, the same amplitude and a phase difference of 2π/3. Bottom: the superposition of two waves of different wavelengths and the same amplitude.

In electromagnetic theory, light is described as a superposition of waves of different wavelength and polarization (see later) moving in different directions. The principle holds for waves travelling in non-dispersive media, i.e. a vacuum, where all frequencies travel with the same speed. If the medium were dispersive then the waves’ shapes would change after the encounter. The superposition principle is also valid when the amplitudes involved are relatively small – linear superposition – and do not affect the medium in a non-linear fashion, as happens, for example, with extremely high-intensity light sources. Most optical phenomena, such as interference, diffraction and polarization, involve wave superposition in one way or another.

image

Figure 2.6   Point source in the centre and waves radiating outwards like waves on a pond. Wavefronts are planes with separation equal to one wavelength (or multiples of it).

Plane waves

Up to this point wave motion has been considered as a one-dimensional vibration. Real light waves travelling in an isotropic homogeneous medium spread out at the same speed in all directions. Thus we need to consider light waves propagating out from a point source in three dimensions, similar to water waves moving out from a central point in a pond. At any given time, the surfaces on which the disturbance has constant phase (e.g. peaks of the co-sinusoidal oscillations) form a set of surfaces in space. These surfaces are referred to as wavefronts (see Figure 2.6).

According to Huygens, every point on a wavefront is considered to be a new point source, from which secondary ‘wavelets’ spread out in a forward direction. The surface which envelops all these wavelets forms a new wavefront, as illustrated in Figure 2.7. The advancing wave as a whole may be regarded as the sum of all the secondary waves arising from points in the medium. This continues as long as the medium remains isotropic (i.e. has the same properties in all directions) and homogeneous (i.e. has the same properties throughout its mass), both of which are true for most optical media. The straight line of propagation or of ‘energy flow’, which is always perpendicular to the wavefront, is considered as a light ray, and many optical effects treat light as rays (as in geometrical optics).

image

Figure 2.7   Propagation of light as wavefronts and wavelets.

The plane wave is the simplest representation of a three-dimensional wave (Figure 2.8a), where wavefronts are parallel to each other and perpendicular to the direction of propagation. Light propagating in plane wavefronts is referred to as collimated light. Far enough from the source a small area of a spherical wavefront – described above – resembles a part of a plane wave, as illustrated in Figure 2.8b.

The mathematical description of a monochromatic (i.e. single constant frequency) plane wave is the vector equivalent of the expressions describing the simple harmonic motion (see definition in Appendix A).

Light intensity

In practice, the wave oscillations are not usually observed, as their frequency is too high. We observe the power transmitted by the light beam, averaged over some millions of oscillations. The instantaneous light power is proportional to Ψ²(x, t):

$$\Psi^2(x, t) = a^2 \sin^2\left[\frac{2\pi}{\lambda}(x - ct)\right] \qquad (2.14)$$

The average of this over a long time is equal to the light intensity, or irradiance, I, which is proportional to the square of the amplitude, a, of the electric field:

$$I \propto a^2 \qquad (2.15)$$

This is the quantity measured by light detectors.
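This time-averaging is easy to verify numerically. The sketch below averages Ψ² of Eqn 2.14 over many complete cycles (the amplitude and sampling resolution are arbitrary) and recovers a constant proportional to a², here a²/2.

```python
import math

# Averaging the squared disturbance of a harmonic wave over whole cycles
# gives a^2/2, i.e. the measured intensity is proportional to a^2 (Eqn 2.15).
a = 2.0                    # wave amplitude (arbitrary units)
SAMPLES_PER_CYCLE = 100
N_CYCLES = 1000

total = 0.0
for i in range(SAMPLES_PER_CYCLE * N_CYCLES):
    psi = a * math.sin(2.0 * math.pi * i / SAMPLES_PER_CYCLE)
    total += psi * psi
average = total / (SAMPLES_PER_CYCLE * N_CYCLES)
print(f"<Psi^2> ~ {average:.3f}, a^2/2 = {a * a / 2.0:.3f}")  # both 2.000
```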

Refraction and dispersion

As mentioned in the introduction to wave theory, electromagnetic waves slow down when travelling through media denser than vacuum. A measure of this reduction in speed in transmitting materials such as air, glass and water is the refractive index n, which increases with the density of the medium. It is given by:

$$n = \frac{c}{v} \qquad (2.16)$$

where v is the phase velocity of radiation of a specific frequency and is equal to:

$$v = \frac{\omega}{k}$$

ω and k are as before. The refractive indices at a wavelength of 589 nm for a vacuum, air and crown glass are 1.0, 1.0029 and 1.52 respectively.

image

Figure 2.8   (a) The progression of a plane wavefront. The vector denotes the direction of the plane wave. (b) Far away from a point source a portion of a spherical wave is described as a plane wave.

When light waves travel between transmitting media with different refractive indices, their speed changes at the boundary and the waves entering the new medium change direction. This is refraction and is illustrated in Figure 2.9 (see also Chapter 6 for the Law of Refraction). The frequency of the waves does not change when the waves pass from one medium to another, but the wavelength increases or decreases (since V = νλ). The refractive index is therefore wavelength dependent and the speed of propagation in media other than a vacuum is frequency dependent. In general, n varies inversely with wavelength: it is greater for shorter wavelengths. This causes light in denser media to be refracted by different amounts, depending on wavelength. Refraction is therefore responsible for the splitting of white light by a prism into different wavelengths, which are perceived as different hues. Because the refractive index of glass is higher than that of air, the different frequencies travelling with different speeds are refracted at different angles. This separation of electromagnetic waves consisting of multiple frequencies into components of different wavelengths is the phenomenon of dispersion (see Figure 2.26).

image

Figure 2.9   The refraction of waves. Here the refractive index of the transmitting medium, nt, is greater than the refractive index of the incident medium, ni. The travelling wavefront bends at the boundary because of the change of speed.
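A small numerical sketch of these relationships, using the crown glass index quoted above and an assumed vacuum wavelength of 589 nm: the frequency is unchanged, while the speed and the wavelength are both divided by n.

```python
# Speed and wavelength of light entering a denser medium: v = c/n and
# lambda_medium = lambda_vacuum / n, while the frequency stays the same.
C = 2.998e8   # m/s, speed of light in a vacuum

def in_medium(wavelength_vacuum_nm: float, n: float) -> tuple[float, float]:
    v = C / n                           # reduced phase velocity
    lam = wavelength_vacuum_nm / n      # shortened wavelength
    return v, lam

v_glass, lam_glass = in_medium(589.0, 1.52)   # crown glass
print(f"speed in crown glass ~ {v_glass:.3e} m/s")   # ~1.97e8 m/s (about 2/3 c)
print(f"wavelength in glass ~ {lam_glass:.0f} nm")   # ~388 nm
```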

Polarization

So far we have considered electromagnetic wave propagation but not the direction of wave oscillations. Polarization in electromagnetic waves describes the direction of oscillation (or disturbance) of the electric field in a plane perpendicular to the direction of propagation – the plane of polarization. In polarization we describe the electric field only, since the magnetic field is perpendicular to it and their amplitudes are proportional. With natural light, which arises from a very large number of randomly oriented atomic emitters, the direction of wave oscillations is changing constantly and randomly. So light in its natural form is considered as randomly polarized, or unpolarized (see Figure 2.10).

To illustrate, it is useful to visualize a plane harmonic wave with electric field E, split into two component waves with perpendicular electric fields, Ex and Ey, with orientations x and y respectively, and with z the direction of propagation. The two component waves will have the same frequency but their amplitudes and phases may differ. The state of polarization of the wave is obtained by examining the shape outlined by the electric vector, E, in the plane of polarization as the wave progresses, which depends on the amplitudes and phase difference of the two component waves.

image

Figure 2.10   A head-on view of an approaching wavefront of unpolarized light at a fixed instant in time. At each position the direction of vibration in the plane of polarization is randomly aligned.

In linear polarization or plane polarization the two orthogonal components are in phase (or have a phase difference of an odd integer multiple of π – i.e. opposite phase) but their amplitudes may differ. A linearly polarized wave has (a) a fixed amplitude and (b) a constant direction of the electric field (although it may vary in magnitude and sign with time). That is, the tip of the electric field vector outlines a single line in the plane of polarization, as illustrated in Figure 2.11. One completed line corresponds to one complete cycle as the wave advances one wavelength.

This process is equally valid in reverse. It is possible to examine the direction of oscillation (i.e. the state of polarization) of a wave resulting from the superposition of two linearly polarized light waves of the same frequency and with electric field vectors perpendicular to each other and to the direction of propagation.

So far we have illustrated only harmonic waves in which the orientation of the electric field is constant, i.e. plane polarized or linearly polarized light waves (e.g. Figure 2.1). Light can be converted from an unpolarized form to a linearly polarized form by passing it through a linear polarizer (e.g. polarizing sunglasses or filters). A polarizer has a transmission axis such that only incident oscillations parallel to this axis are transmitted, as illustrated in Figure 2.12.

In circular polarization the two constituent waves have equal amplitude and a phase difference of π/2, or odd multiples of it. That is, when one component is zero the other component is at maximum or minimum amplitude. In this case the wave is circularly polarized. It has (a) a constant amplitude and (b) a direction of the electric field that varies with time and is not restricted to one plane (Figure 2.13). The wave’s electric field vector traces out a helix along the direction of wave propagation. After one complete oscillation the vector makes one complete rotation and outlines a circle in the plane of polarization. The direction of rotation depends on the phase difference of the two constituent waves. If the wave is moving towards an observer, in right circular polarization (where the phase difference is equal to −π/2, 3π/2, 7π/2, etc.) the wave’s electric field vector appears to rotate clockwise, whereas in left circular polarization (phase difference equal to π/2, 5π/2, 9π/2, etc.) it appears to rotate anticlockwise. A linearly polarized wave can be composed of two oppositely circularly polarized waves of the same amplitude.
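The sketch below traces the tip of the electric field vector (Ex, Ey) over one cycle for the two cases just described, assuming illustrative unit amplitudes: components in phase trace a straight line, while equal amplitudes a quarter-cycle apart give a vector of constant magnitude rotating in a circle.

```python
import math

# Trace the tip of the electric field vector (Ex, Ey) in the plane of
# polarization over one cycle, for a chosen phase difference.
def trace(a_x: float, a_y: float, phase: float, steps: int = 8) -> None:
    for i in range(steps):
        wt = 2.0 * math.pi * i / steps
        e_x = a_x * math.cos(wt)
        e_y = a_y * math.cos(wt + phase)
        print(f"wt={wt:4.2f}  Ex={e_x:6.3f}  Ey={e_y:6.3f}  "
              f"|E|={math.hypot(e_x, e_y):5.3f}")

print("linear (components in phase):")
trace(1.0, 1.0, 0.0)            # |E| varies; direction fixed at 45 degrees
print("circular (pi/2 phase difference):")
trace(1.0, 1.0, math.pi / 2.0)  # |E| constant; direction rotates
```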

image

Figure 2.11   Linear light. The two wave components with electric fields Ex and Ey are in phase. The wave’s electric field, E, is linearly polarized in the first and third quadrants.

image

Figure 2.12   Transmission through a linear polarizer.

image

Figure 2.13   Rotation of the electric field vector in a right circular wave.

Adapted from Hecht (2002).

In all other cases – when the two orthogonal components are neither in phase nor out of phase by exactly π/2 (or odd multiples of it) with equal amplitudes – the tip of the vector outlines an ellipse in the plane of polarization as the wave advances a wavelength, and the polarization is called elliptical polarization. Both linear and circular polarization are special cases of elliptical polarization. Figure 2.14 illustrates various polarization configurations.

Plate retarders (such as photographic filters used with auto-focus lenses for light polarization) are crystals that retard the phase of wave vibrations in one direction relative to the direction at right angles. A quarter-wave plate, for example, will retard one direction by π/2 relative to the other direction.

Generally light, whether natural or artificial, is neither completely polarized nor unpolarized. Commonly, the electric field vector varies in a way which is neither totally regular nor totally irregular. We refer to this light as being partially polarized. It is often useful to imagine this as the superposition of specific amounts of natural and polarized light.

Interference

Interference occurs when two or more light wavefronts are superposed, resulting in a new wave pattern. According to the Principle of Superposition the resulting disturbance in the overlapping region is the algebraic sum of the separate disturbances. For simplicity, when we describe the effect of the superposition – that is, the interference effect – we use coherent wavefronts (i.e. monochromatic waves of the same frequency with a constant phase relationship), which interfere constructively or destructively. If two wave crests meet at a specific point in space then they interfere constructively and the resulting amplitude, a, is greater than either of the individual amplitudes a1 and a2 (i.e. a = a1 + a2). If a crest of a wave meets a trough of another wave then they interfere destructively, and the overall amplitude is decreased (i.e. a = |a1 − a2|). Figure 2.15a illustrates an example of constructive interference of two monochromatic waves in phase, where the amplitude of the combined wave is equal to the sum of the constituent amplitudes. Figure 2.15b illustrates an example of destructive interference, where the two waves are π radians out of phase. In this case one wave’s crest coincides with the other wave’s trough and the amplitudes totally cancel out.

image

Figure 2.14   Various polarization configurations.

Adapted from Hecht (2002)

image

Figure 2.15   (a) Total constructive interference. (b) Total destructive interference.

Consider two coherent point sources, S1 and S2, emitting light in a homogeneous medium, having separation d. At a point P far away from the sources the overlapping wavefronts from the sources will be almost planar, as illustrated in Figure 2.16. The irradiance, I, at point P is equal to:

$$I = I_1 + I_2 + 2\sqrt{I_1 I_2}\cos\delta \qquad (2.17)$$

where δ is the phase difference arising from the combined path length and the initial phase differences in the emitted waves.

At various points in space the irradiance can be greater, equal to or less than I1 + I2, depending on δ:

•  The maximum irradiance occurs when cos(δ) = 1 (δ = 0, ±2π, ±4π, …, etc.). This is the case of total constructive interference (as in Figure 2.15a), where the phase difference, δ, is an integer multiple of 2π:

$$I_{\max} = I_1 + I_2 + 2\sqrt{I_1 I_2} \qquad (2.18)$$

image

Figure 2.16   Waves from two point sources overlapping in space.

Adapted from Hecht (2002).

•   When 0 < cos(δ) < 1 the waves are out of phase and I1 + I2 < I < Imax, but the result is still constructive interference.

•   When cos(δ) = 0, at δ = π/2, the waves are a quarter of a cycle out of phase and I = I1 + I2.

•   When 0 > cos(δ) > −1 we have destructive interference and I1 + I2 > I > Imin.

•  The minimum irradiance occurs when cos(δ) = −1 (δ = ±π, ±3π, ±5π, …, etc.). This is the case of total destructive interference (as in Figure 2.15b), where δ is odd multiples of π:

$$I_{\min} = I_1 + I_2 - 2\sqrt{I_1 I_2} \qquad (2.19)$$

When the intensities of the interfering waves reaching P are equal, denoted here as I0 (i.e. I1 = I2 = I0), Eqn 2.17 becomes:

$$I = 2I_0(1 + \cos\delta) = 4I_0\cos^2\left(\frac{\delta}{2}\right) \qquad (2.20)$$

Thus, Imin = 0 and Imax = 4I0.
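A quick numerical check of Eqn 2.20 for a few phase differences (I0 is an arbitrary unit intensity):

```python
import math

# Eqn 2.20 evaluated for equal-intensity beams: I = 2*I0*(1 + cos(delta)).
I0 = 1.0
for delta in (0.0, math.pi / 2.0, math.pi):
    intensity = 2.0 * I0 * (1.0 + math.cos(delta))
    print(f"delta = {delta:4.2f} rad  ->  I = {intensity:.2f}")
# delta = 0    -> 4*I0 (total constructive interference, Eqn 2.18)
# delta = pi/2 -> 2*I0 (= I1 + I2)
# delta = pi   -> 0    (total destructive interference, Eqn 2.19)
```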

Interference finds practical applications in single- or multi-layer anti-reflection coatings on lenses and as interference or dichroic filters for selective filtration of broad or narrow spectral bands.

Diffraction

Diffraction effects are general characteristics of wave phenomena, occurring whenever a portion of the otherwise infinite wavefront is removed by an obstacle. Light ‘bends’ around the edge of the obstacle, producing a spreading of the image beyond the area expected. A shadow of a solid object lit by a point source shows fringes near its edges due to diffraction effects. Diffraction also occurs when all but part of the wavefront is removed by an aperture or stop. The importance of diffraction depends on the size of the obstacle or aperture relative to the wavelength. It is most noticeable when the wavelength is close to the size of the diffracting obstacles or apertures. The complex patterns in the intensity of a diffracted wave are a result of interference between different parts of the wave reaching the observer by different paths. The haloes seen around stars or around the sun are effects of the diffraction of light by particles in the atmosphere.

Different types of diffraction are observed, depending on the distance of the diffracting object from the plane of observation and the type and position of the light source. If an aperture is illuminated by plane monochromatic waves coming from a distant source and its image is projected on to a screen parallel to the plane of the aperture, then Fresnel or near-field diffraction is the phenomenon observed when the plane of observation – in this case the screen – is near enough to the aperture. The projected image of the aperture appears structured with prominent fringes (Figure 2.17a). This is the more general case of diffraction; it is very complex to describe mathematically. When the screen is at a large distance from the diffracting object (i.e. so that planar or almost planar wavefronts hit the screen), or a lens is inserted between the object and the screen, focusing the image at infinity on to the screen, the result is Fraunhofer diffraction or far-field diffraction. In this case the projected image of the aperture is considerably spread out (Figure 2.17b). Both the incoming and outgoing waves are planar, or nearly planar – differing from planar by only a small fraction of a wavelength. Generally, Fraunhofer diffraction occurs at an aperture (or obstacle) of width d, or smaller, when:

image

Figure 2.17   Diffraction patterns from a single slit. (a) Fresnel diffraction. (b) Fraunhofer diffraction.

$$R > \frac{d^2}{\lambda} \qquad (2.22)$$

where R is the smaller of the following distances: (i) the distance from the point source to the aperture plane, or (ii) the distance from the aperture plane to the plane of observation. Equation 2.22 indicates that aperture width, wavelength and the distance R are the parameters determining whether Fraunhofer diffraction occurs. Fraunhofer diffraction is a special case of Fresnel diffraction but is more commonly examined because of its inherent mathematical simplicity.
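As a rough worked example of Eqn 2.22, the sketch below estimates the minimum far-field distance for an assumed 1 mm aperture illuminated with 500 nm light.

```python
# Far-field (Fraunhofer) condition of Eqn 2.22: R > d^2 / lambda.
d = 1e-3              # aperture width, m (assumed)
wavelength = 500e-9   # m (assumed)

r_min = d ** 2 / wavelength
print(f"Fraunhofer regime for R > {r_min:.1f} m")   # 2.0 m
```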

Diffraction of a circular aperture and a single slit

Fraunhofer diffraction is observed when plane waves pass through a small circular aperture. Instead of producing a bright spot on a distant screen, the consequent far-field diffraction will produce a diffuse circular disc, known as an Airy disc, surrounded by much fainter concentric circular dark and bright rings, as shown in the examples in Figure 2.18. This example of diffraction is very important because the eye, cameras, telescopes and many other optical instruments have circular apertures.

image

Figure 2.18   Computer-generated Airy discs for apertures with: (a) diameter d; (b) diameter 2d.

The radius, y1, from the centre of the Airy disc to the first dark ring, the first minimum or first zero, is given by:

$$y_1 = \frac{1.22\,\lambda R}{d} \qquad (2.23)$$

where λ is the wavelength, d the aperture diameter and R the distance from the aperture’s centre to the first minimum of the Airy disc on the screen (see Figure 2.19). As R is very large compared to the radius of the disc, it is often represented as approximately the distance from the aperture plane to the screen plane. The distance between each dark ring is equal to y1, and thus the distances between the centre of the Airy disc and the second, third, …, nth dark ring (i.e. the second, third, …, nth minimum) are multiples of y1.

image

Figure 2.19   Set-up for Fraunhofer diffraction from a circular aperture producing an Airy pattern on screen.

Note that, if a lens focuses the pattern on the distant screen, the lens’s focal length, f, is approximately equal to R and thus f replaces R in Eqn 2.23. In the usual notation N = f/d, where N is the f-number or relative aperture of the lens (see Chapter 6), so the first minimum according to Eqn 2.23 occurs at y1 = 1.22λN, and the diameter of the Airy disc – the diameter, D, of the first dark ring of the pattern – is given by D = 2y1 = 2.44λN. For light of wavelength (λ) 400 nm, D is approximately N/1000 mm, i.e. 0.008 mm (8 μm) for an f/8 lens. As shown in Figure 2.18, the diameter of the Airy disc increases as the aperture diameter decreases.
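The relation D = 2.44λN is straightforward to evaluate; the sketch below tabulates the Airy disc diameter for 400 nm light at a few illustrative f-numbers, reproducing the ~0.008 mm figure quoted above for f/8.

```python
# Airy disc diameter for a diffraction-limited lens: D = 2.44 * lambda * N.
def airy_disc_diameter_mm(wavelength_nm: float, f_number: float) -> float:
    return 2.44 * wavelength_nm * 1e-6 * f_number   # convert nm to mm

for n_stop in (2.8, 8.0, 22.0):
    d_mm = airy_disc_diameter_mm(400.0, n_stop)
    print(f"f/{n_stop:<4}: Airy disc diameter ~ {d_mm:.4f} mm")
# f/8 with 400 nm light gives ~0.0078 mm, i.e. the ~0.008 mm quoted above.
```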

An optical system is diffraction limited if its resolution is limited only by diffraction and not by other forms of spreading or degradation in the system. Diffraction-limited optical systems are very high quality systems, free of the optical aberrations (see Chapter 10) which otherwise mask the diffraction effect. Most non-specialist lenses are aberration limited, as residual aberrations are greater than the diffraction effects, certainly at larger apertures. The Airy pattern of a diffraction-limited optical system is considered as the image of a point formed by the system and represents the distribution of light at the focus point. Instead of being concentrated at a single point, the amplitude is distributed according to a Bessel function (a rather complicated mathematical expression). This represents the point spread function of the system (see also Chapter 7).

The one-dimensional approximation of the Airy disc is similar to the Fraunhofer diffraction pattern from a single long slit (see Figure 2.20). The one-dimensional analysis represents the behaviour of a lens aperture invariant in one direction (a long slit) when imaging a one-dimensional point (a line). The amplitude of the diffracted pattern is given (instead of a Bessel function) by a sinc function, while the intensity (or irradiance) is its square:

image

Figure 2.20   Fraunhofer diffraction of a circular aperture (left) and of a thin slit (right).

$$I(y) = I(0)\,\mathrm{sinc}^2\!\left(\frac{\pi d y}{\lambda R}\right) \qquad (2.24)$$

where d is the width of the slit, y is any point in the diffracted pattern, R approximately the distance between the slit and the screen, and λ the wavelength.

Hence, the sinc² function obtained in Eqn 2.24 represents the distribution of intensity of the image of a line. It is the so-called line spread function of the optical system, illustrated in Figure 2.21.

Notice the shape of the functions. There is a maximum value (i.e. representing the maximum intensity I(0)) when y = 0 and the function drops to zero at:

$$y = \pm\frac{\lambda R}{d} \qquad (2.25)$$

and then again at ±2(λR/d), ±3(λR/d), and so on. These correspond to the first, second, third, …, etc. minima of the intensity profile. As in the case of circular apertures, the width of the central bright line varies inversely with the width of the slit, d, while R represents the slit-to-screen distance – or, for a lens, its focal length, f.

Rayleigh criterion

Consider a diffraction-limited lens system imaging two incoherent point sources, such as two closely spaced stars. The images formed by the system will consist of two partially overlapping Airy distributions. According to the Rayleigh criterion, the minimum resolvable angular separation Δθ, or angular limit of resolution, is:

$$\Delta\theta_{\min} = \frac{1.22\,\lambda}{d} \qquad (2.26)$$

where d is the aperture diameter, λ the wavelength, and θ the angle between the optical axis and the axis from the centre of the aperture to the first minimum on the distant screen (Figure 2.19).

If Δl is the centre-to-centre separation of the two Airy discs, the spatial limit of resolution is given by:

$$\Delta l_{\min} = f\,\Delta\theta_{\min} = \frac{1.22\,\lambda f}{d} = 1.22\,\lambda N \qquad (2.27)$$

where f is the focal length of the lens and N the f-number (see Figure 2.22). Δlmin in Eqn 2.27 is essentially y1 in Eqn 2.23, and is often quoted as the just-resolvable separation or resolution limit of the optical system for two incoherent point sources. Note that for square apertures it is equal to λN. The resolving power (see Chapter 24) of the lens system is defined as 1/Δθmin or 1/Δlmin.
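The sketch below evaluates Eqns 2.26 and 2.27 for an assumed diffraction-limited lens (25 mm aperture, 200 mm focal length, i.e. f/8) at 550 nm; all three values are illustrative choices.

```python
# Rayleigh criterion: angular limit (Eqn 2.26) and spatial limit (Eqn 2.27).
WAVELENGTH = 550e-9   # m (assumed)
D_APERTURE = 25e-3    # m, aperture diameter (assumed)
F_LENGTH = 200e-3     # m, focal length (assumed), so N = f/d = 8

delta_theta = 1.22 * WAVELENGTH / D_APERTURE   # radians
delta_l = delta_theta * F_LENGTH               # = 1.22 * lambda * N
print(f"angular limit ~ {delta_theta:.2e} rad")             # ~2.7e-5 rad
print(f"spatial limit ~ {delta_l * 1e6:.2f} micrometres")   # ~5.4 um
print(f"resolving power ~ {1e-3 / delta_l:.0f} lines/mm")   # ~186 lines/mm
```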

image

Figure 2.21   Amplitude (thin line) and intensity (thick line) distributions for Fraunhofer diffraction from a thin slit. The latter also represents the line spread function of a diffraction-limited optical system, with value I(0) at y = 0.

image

Figure 2.22   Intensity profiles of two overlapping images. (a) Well resolved. (b) Just resolved – the centre of the one Airy disc coincides with the first minimum of the other. (c) Not resolved.

THE ELECTROMAGNETIC SPECTRUM

As mentioned earlier in the chapter, there are other waves consisting of alternating electric and magnetic fields and travelling with the speed of light, but with shorter or longer wavelengths than light. The complete series of waves, arranged in order of wavelengths (or of frequency), is referred to as the electromagnetic spectrum. This is illustrated in Figure 2.23. There is no clear-cut line between one wave band and another, or between one type of radiation and another; the series of waves is continuous.

Radio waves, discovered by H. Hertz in 1887, have very long wavelengths (from 0.3 metres to many kilometres). Their frequency range is very large and extends to about 10⁹ Hz. Radio waves have no effect on the body, i.e. they cannot be seen or felt, although they can readily be detected by radio and television receivers. Radio frequencies are used in astronomy and are captured by radio antennas. Otherwise radio waves are used for transmitting telephone communications, guiding aeroplanes, in speed-detection radar and in remote sensing. Moving along the spectrum to shorter wavelengths, the microwave region extends to about 3 × 10¹¹ Hz. Water molecules are heated by the absorption of microwave radiation, hence its use in microwave ovens.

Infrared radiation, or IR, covers the region of frequencies just below those seen as light, i.e. the visible spectrum. IR was first detected by the English astronomer Sir William Herschel in 1800. Heat emission from all matter at temperatures above absolute zero (0 K or −273°C) is accompanied by IR radiation, with hotter bodies emitting more. IR is emitted by all light sources and, naturally, the sun, which produces almost half of its electromagnetic energy in the IR. The IR spectrum extends roughly from 3 × 10¹¹ Hz to 4 × 10¹⁴ Hz and is divided into four regions: the near IR, nearest to the visible spectrum, from 720 to 3000 nm; the intermediate IR, from 3000 to 6000 nm; the far IR, from 6000 to 15,000 nm; and the extreme IR, from 15,000 nm to 1 mm. Specially sensitized photographic films and digital camera sensors respond to part of the near IR (<1500 nm). The landscape in Figure 2.24 was photographed in the near IR, showing foliage (tree leaves, grass) strongly reflecting IR in the same way that snow reflects light. Far IR is used in thermal imaging, where IR detectors capture small differences in the temperatures of objects, which are then rendered as an image, often referred to as a ‘heat map’ or thermograph – often used for diagnostic purposes and for ‘night vision’. IR is widely used in astronomical, remote sensing and forensic applications. In biometric and museum imaging IR is also very valuable due to its penetrating properties, allowing imaging beneath the surface layers of objects.

image

Figure 2.23   The electromagnetic spectrum.

image

Figure 2.24   Infrared photograph captured digitally using Sony DSC F828 camera and Hoya R72 filter.

© Andy Finney, www.invisiblelight.co.uk

The visible spectrum occupies a minute part of the total range of electromagnetic radiation, comprising frequencies between approximately 4.15 × 10¹⁴ and 7.85 × 10¹⁴ Hz (720 and 380 nm). Within these limits, the human eye sees changes of wavelength as changes of hue (see more on the perception of light and colour in Chapters 4 and 5). The change from one perceived hue to another depends on many variables influencing the human visual system, but the spectrum may be divided up roughly as shown in Figure 2.25. More than 300 years ago, Newton discovered that white light was a mixture of hues by allowing it to pass through a triangular glass prism. A narrow beam of sunlight was dispersed into a band showing the hues of the rainbow. These represent the visible spectrum. Newton’s experiment is illustrated in Figure 2.26. It was later found that the dispersed light could be recombined using a second prism, producing white light once again.

Just beyond the visible in the electromagnetic spectrum is the ultraviolet, or UV, region, with frequencies from approximately 8 × 10¹⁴ to 3 × 10¹⁶ Hz. UV radiation is also subdivided into near, intermediate, far and extreme or vacuum UV. Much UV radiation is harmful to humans, causing sunburn as well as other damage to the skin and the eyes. Fortunately, ozone (O3) in the atmosphere absorbs most of the harmful UV coming from the sun, but considerable amounts are found at sea level due to scattering. UV is not as penetrative as infrared, although long-wave UV radiation penetrates deeper into the skin than visible light. UVA is the part of the UV spectrum with the longest wavelengths; it is known as black light. Black light artificial sources (also known as Wood’s lamps) are used in UV reflectance and UV fluorescence photography. UVB is medium-wavelength and UVC is short-wavelength UV radiation. UVB and UVC radiation are partially responsible for the generation of the ozone layer and cause sunburn to human skin. A positive effect of UVB exposure nevertheless is that it induces the production of vitamin D in the skin. Humans cannot see UV because the eye’s cornea absorbs it, but a number of insects – such as honey bees – and other animals create images from UV reflected from various objects. UV is also used to excite some types of molecules, for example in flowers’ pollen, which in turn emit other (usually longer) wavelengths (a phenomenon known as luminescence). Similarly to IR, UV is widely used in astronomical imaging and in biomedical and forensic applications.

image

Figure 2.25   The visible spectrum.

image

Figure 2.26   Dispersion of white light by a prism.

X-rays were discovered by Wilhelm Conrad Röntgen in 1895. X-radiation ranges roughly from 2.4 × 10¹⁶ to 5 × 10¹⁹ Hz and has extremely small wavelengths, most of them smaller than the size of atoms. X-rays are very energetic and penetrate the human body and several centimetres into metals. Medical applications of X-rays include medical film, digital radiography and computed axial tomography. X-ray microscopes and telescopes produce images of otherwise invisible structures over a huge range of spatial scales. Finally, gamma-radiation is the most energetic radiation in the spectrum and is also very penetrative. Both X-radiation and gamma-radiation, unless properly controlled, are harmful to human beings.

BLACK-BODY RADIATION

A black-body radiator is a theoretical surface, which absorbs all radiation that is incident on it (therefore appearing perfectly black) and also behaves as a perfect emitter of radiation.

Black-body radiation theory began with the work of Gustav Robert Kirchhoff in 1859, studying the electromagnetic radiation emitted from bodies in thermal equilibrium. All objects at non-zero temperatures emit electromagnetic energy. Kirchhoff proposed that different surfaces would absorb and emit different amounts characterized by two coefficients, αλ and ελ. The absorption coefficient, αλ, is the fraction of incident energy of wavelength λ absorbed per unit area per unit time in a tiny band of wavelengths around λ and the emission coefficient, ελ, is the energy emitted per unit area per unit time for the same tiny band of wavelengths. The coefficients are specific to the material, depending on its surface properties, and wavelength dependent, i.e. a body may emit or absorb well at some wavelengths but not others.

Kirchhoff suggested that for a surface in the wall of an isolated cavity at thermal equilibrium in which all radiation was absorbed and emitted by the walls, the total amount of energy absorbed at all wavelengths must equal that emitted, or the temperature would change. He defined a wavelength distribution function Iλ(λ), independent of the material, but dependent only on T, the absolute temperature of the cavity.

The energy absorbed at wavelength λ (for any material) will be αλIλ, which will be equal to the energy emitted, ελ. Kirchhoff’s Radiation Law can be expressed as:

$$\varepsilon_\lambda = \alpha_\lambda I_\lambda \qquad (2.28)$$

If Kirchhoff’s law is applied to a black-body radiator, then by definition it will absorb 100% of the incident radiation; αλ will become 1, and:

$$\varepsilon_\lambda = I_\lambda \qquad (2.29)$$

In other words, the radiation emitted from a black body heated to a particular temperature is equivalent to Iλ at that temperature. Iλ is commonly called a black-body radiation curve and is illustrated in Figure 2.27.

From Kirchhoff’s work, the scientific community began to try to obtain Iλ for different temperatures, theoretically and from experimentation, but experienced difficulty in finding suitable sources approximating black-body radiation. From some experimental results, Josef Stefan in 1879 realized that the total energy radiated per unit surface area per unit time (its power) was proportional to T⁴. T is the temperature measured in kelvin. The Kelvin scale is an absolute temperature scale where 0 K is absolute zero (which corresponds to −273.15°C). Ludwig Boltzmann, in 1884, used laws of thermodynamics to provide a theoretical basis for Stefan’s findings and together their results were expressed as the Stefan–Boltzmann Law:

$$P = \sigma A T^4 \qquad (2.30)$$

where P is the total energy radiated per unit time at all wavelengths, A is the radiating surface area, T is the absolute temperature and σ is a universal constant, 5.67033 × 10⁻⁸ W m⁻² K⁻⁴; P/A is equivalent to the area under the Iλ curve at that temperature.

image

Figure 2.27   Black-body radiation curves (Iλ) at different temperatures.

Wien’s Displacement Law

A further development in black-body radiation came with the work of the German physicist Wilhelm Wien, who in 1893 used thermodynamics to derive a law describing the way in which the peak wavelength of a black-body radiation curve (the wavelength at which most energy is radiated) shifts as the temperature changes. The peak wavelength defines the dominant colour of the radiation. Wien’s Displacement Law defines the peak wavelength as inversely proportional to the temperature:

$$\lambda_{\max} = \frac{b}{T} \qquad (2.31)$$

where λmax is the peak wavelength in nanometres (nm), T is the temperature in kelvin (K) and b is Wien’s displacement constant, 2.8977685(51) × 10⁶ nm K.

Because the two variables are inversely proportional, as the temperature increases, the peak wavelength gets shorter. A black body, when heated from cold, will begin to radiate electromagnetic energy, with the dominant energy emitted first in the infrared long wavelength regions, and as it gets hotter will begin to emit visible radiation, glowing red, then yellow and eventually blue–white (a similar effect can be visualized by considering a piece of metal being heated in a furnace). The change in peak wavelength is illustrated by P1 and P2 in Figure 2.27, which are 580 and 724 nm respectively.
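Both laws are easy to evaluate numerically. The sketch below uses the constants quoted in this section to compute the power radiated per unit area (Stefan–Boltzmann) and the peak wavelength (Wien) at 4000 K and 5000 K, recovering the P2 and P1 peaks of Figure 2.27.

```python
# Stefan-Boltzmann law (P = sigma * A * T^4) and Wien's displacement law
# (lambda_max = b / T), using the constants given in the text.
SIGMA = 5.67033e-8    # W m^-2 K^-4
B_WIEN = 2.8977685e6  # nm K

def power_per_area(temp_k: float) -> float:
    return SIGMA * temp_k ** 4          # P/A for a black body

def peak_wavelength_nm(temp_k: float) -> float:
    return B_WIEN / temp_k              # dominant wavelength

for temp in (4000.0, 5000.0):
    print(f"T = {temp:.0f} K: P/A ~ {power_per_area(temp):.3e} W/m^2, "
          f"peak ~ {peak_wavelength_nm(temp):.0f} nm")
# 5000 K peaks near 580 nm (P1) and 4000 K near 724 nm (P2), as in Figure 2.27.
```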

Wien’s Displacement Law has particular relevance in photography in the characterization of light sources in terms of colour temperature. Colour temperature is a measure that may be used to describe the spectral characteristics of a light source and is derived from black-body theory. The measurement of colour temperature plays an important part in trying to match the spectral response of the sensitive material to the light source to ensure correct or adequate colour reproduction. The measurement of colour temperature and characterization of light sources is dealt with in Chapter 3.

The practical problem of an appropriate source of black-body radiation was solved at the end of the nineteenth century: the radiation from a black body will be the same as that from an isolated cavity in thermal equilibrium at the same temperature; therefore, it may be approximated by that emerging from a small opening in the wall of a furnace at the same temperature. This meant that actual data could be obtained to produce experimental curves for Iλ. It became clear, however, that the theoretical approaches used so far were not adequate; they did not produce curves that fitted the data. Until this point, the expressions derived had been obtained by applying classical wave theory, the assumption being that the electromagnetic waves in the cavity travelled as a continuous flow of energy. It was possible to use the expressions to obtain curves that would fit the practical results, but only for some parts of the electromagnetic spectrum, either fitting the experimental curves at short wavelengths but deviating from them wildly at longer wavelengths, or vice versa. It was clear, finally, that wave theory alone could not adequately describe electromagnetic radiation.

Planck’s Law

As already stated, all objects emit electromagnetic energy and this is the result of the constituent atoms moving randomly. The atoms in the walls of the cavity in the experimental set-up to determine black-body radiation are in constant motion, and are continuously emitting and absorbing radiation. Overall, the atoms must also be in an equilibrium state, i.e. the amount of absorption and emission is the same, to maintain a constant temperature. In attempting to derive an expression to predict the measured Iλ at a particular temperature, classical wave theory modelled the radiation in the cavity as standing waves, assuming that radiation could be absorbed or emitted in any quantity, however small. It was therefore assumed to be a continuous quantity.

Max Planck, in 1900, turned to the work of Maxwell and Boltzmann, which used statistical analysis to predict the behaviour of gas molecules enclosed in a chamber at constant temperature. This led him to an expression which successfully fitted the experimental data for black bodies, but only if he made an important assumption: that the energy exchanges of the atoms in the wall of the cavity came in discrete quantities only. His hypothesis was that the amount of energy released depended on the frequency of the radiation, and he defined the relationship between energy and frequency in what is now known as Planck’s Law:

$$E = h\nu$$

where E is the energy, ν is the frequency and h is a universal constant, known as Planck’s constant, with a value of 6.626 × 10⁻³⁴ joule seconds. Planck’s assumption led him to an expression which could predict black-body radiation where others had failed; however, he held on to the classical wave picture of light, not yet realizing that light itself came in quantum amounts.
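
The size of a single quantum is easily illustrated (a minimal sketch; the frequency chosen, roughly that of green light, is an assumption for illustration):

```python
# Planck's Law: E = h * nu
H = 6.626e-34  # Planck's constant in joule seconds, as quoted above

nu = 5.45e14  # Hz; approximately the frequency of green light (~550 nm), illustrative
E = H * nu
print(f"E = {E:.3e} J")  # about 3.6e-19 J: individual quanta are extremely small
```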

THE PHOTOELECTRIC EFFECT

It had been observed in the 1890s that when certain metals were bathed in either ultraviolet radiation or visible wavelengths of light, electrons were released from the metal surface (Figure 2.28). This phenomenon became known as the photoelectric effect. It occurs as a result of electrons gaining enough energy to escape the ionic attractions in the metal’s surface by absorbing the incident electromagnetic energy. For each metal it was found that there was a maximum wavelength (or minimum frequency) of radiation that would produce this effect; at wavelengths longer than this, no electrons would be emitted (for example, a metal might emit under short-wavelength radiation but not under long).

Classical wave theory predicts that the energy in an electromagnetic wave is continuous and evenly spread across the entire wavefront. If this were true, then electrons would gradually gain energy from the incident wave and be able to escape from the surface, and the intensity of the incident light would be proportional to the amount of energy it carried. This theory could not explain, however, why a minimum threshold frequency of light was required for each metal, below which electrons would not be emitted regardless of the light intensity. Additionally, it was observed that the electrons were emitted as soon as the surface was irradiated, which should not happen if there were a slow cumulative gain of energy over time. It was also established that the electrons had various kinetic energies from zero up to a certain maximum energy, and that this maximum was independent of the light intensity, again unexplained by classical theory.

Albert Einstein, in 1905, published a different approach to the mechanism of the photoelectric effect by assuming that there was a work function for the metal surface. The work function represents the minimum amount of work required to give the electron enough energy to escape the ionic attractions at the metal’s surface. Using Planck’s hypothesis, Einstein concluded that if the electrons in the surface were irradiated with light of frequency ν, they would receive energy only in quantities of hν, called photons.

If the quantity hν was less than the work function, then the energy gained would be lost immediately in collisions with neighbouring atoms and the electrons would not escape. There was no intermediate state of ‘partial escape’. This meant that electrons could not be gaining energy cumulatively: they must do so in a single step. Such an assumption explains why light intensity has no influence on whether electrons are emitted; an increase in intensity results only in more energy ‘packets’ of the same size, not an increase in the amount of energy that a single electron can receive.

If hν was greater than the work function of the material, then the electron would have enough energy to escape the surface and any additional photon energy would be converted to kinetic energy in the electron. The escaping electrons might lose some energy during the escape, due to collisions with other atoms, meaning that they would emerge with a range of kinetic energies up to a maximum, which was related to the frequency of the radiation. This is summarized in Einstein’s photoelectric equation:

$$h\nu = \Psi_0 + \mathrm{k.e.}_{\max}$$

where k.e.max is the maximum kinetic energy of the escaping electrons and Ψ₀ is the work function of the material.

Figure 2.28   Demonstration of the photoelectric effect. A zinc plate is placed on a gold leaf electroscope and negatively charged. In (a), the gold leaf is repelled as a result of the negative charge. In (b), when the zinc plate is bathed in UV radiation, the gold leaf gradually collapses due to the emission of electrons.

The existence of a threshold frequency for each metal can be explained by the work function of the material; the minimum photon energy capable of causing the photoelectric effect will eject the electron with zero kinetic energy. The photoelectric equation becomes:

$$h\nu_0 = \Psi_0$$

where ν₀ is the threshold frequency.
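
The two equations can be tied together numerically (a hedged sketch: the work function used for zinc, about 4.3 eV, is an illustrative value not given in the text):

```python
# Photoelectric effect: h*nu = psi_0 + k.e._max, with threshold h*nu_0 = psi_0
H = 6.626e-34   # Planck's constant, J s
C = 2.998e8     # speed of light, m/s
EV = 1.602e-19  # joules per electronvolt

psi_0 = 4.3 * EV  # assumed work function of zinc (illustrative)

nu_0 = psi_0 / H      # threshold frequency below which no electrons are emitted
lambda_0 = C / nu_0   # equivalent maximum wavelength
print(f"threshold: {nu_0:.3e} Hz (max wavelength {lambda_0 * 1e9:.0f} nm, in the UV)")

# Irradiating with 250 nm UV (shorter than the threshold wavelength):
nu = C / 250e-9
ke_max = H * nu - psi_0
print(f"k.e.max = {ke_max / EV:.2f} eV")  # electrons escape with up to ~0.7 eV
```

A threshold wavelength of roughly 290 nm for this assumed work function is consistent with the zinc-plate demonstration in Figure 2.28, which requires ultraviolet rather than visible light.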

The existence of light in quantities called photons provided solutions to the problems raised by the photoelectric effect that classical wave theory had failed to explain. Einstein’s work was a theoretical treatment and caused much controversy at the time, appearing as it did to contradict the wave theory of light based upon Maxwell’s equations for electromagnetic waves and, perhaps more fundamentally, the assumption that energy was continuous in nature, i.e. something that could be divided into infinitely small units.

It was not until 1915, with the work of Robert Andrews Millikan, that Einstein’s theory was shown to be correct: Einstein had predicted that the energy of the ejected electrons would increase linearly with the frequency of the radiating light, and Millikan was able to demonstrate this linear relationship experimentally. The work on the photoelectric effect had turned the world of physics on its head, resulting in the introduction of the concept of the duality of light: that it exhibits the characteristics of both waves and particles in different circumstances. Wave–particle duality would eventually be extended to other elementary particles such as the electron.

THE PHOTON

A photon is a single quantum of energy that exists only while travelling at the speed of light. It is the fundamental element carrying all electromagnetic radiation. It is indivisible (it cannot be split into smaller units) and is a stable elementary particle. Photons differ from other elementary particles such as electrons in that they have no mass or electric charge and consist purely of energy. When many photons are present, for example in a light beam, their numbers are so large that the inherent discontinuity or granularity of the light beam disappears and it appears as a continuous phenomenon. The photon can exhibit wave-like behaviour, resulting in phenomena such as refraction or diffraction, described earlier in this chapter. It also behaves as a particle when interacting with matter at a subatomic level, exchanging energy in discrete amounts. The amount of energy E (from Planck’s Law) exchanged during such an interaction depends on the frequency of the light, and is known as the photon energy. This relationship may be rewritten in terms of the wavelength of the light:

$$E = h\nu = \frac{hc}{\lambda}$$

where c is the speed of light, h is Planck’s constant, ν is the frequency and λ is the wavelength. This means that shorter wavelengths of light have higher photon energy (see Figure 2.23, which shows the electromagnetic spectrum in terms of wavelength, frequency and photon energy). This becomes important later on, in understanding why imaging materials are sensitive to some wavelengths of light and not others (spectral sensitivity).
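
Tabulating photon energies across the spectrum (a minimal sketch; the wavelengths are illustrative) makes the inverse relationship concrete:

```python
# Photon energy from wavelength: E = h * c / lambda
H = 6.626e-34   # Planck's constant, J s
C = 2.998e8     # speed of light, m/s
EV = 1.602e-19  # joules per electronvolt

# Shorter wavelengths carry more energy per photon:
for name, lam_nm in [("ultraviolet", 300), ("blue", 450), ("green", 550), ("red", 650)]:
    E = H * C / (lam_nm * 1e-9)
    print(f"{name:11s} {lam_nm} nm: {E / EV:.2f} eV per photon")
```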

BOHR MODEL OF THE ATOM

In the early twentieth century, the commonly accepted model of the atom was the planetary model of Rutherford in 1911 (Figure 2.29), which consisted of a small densely packed positively charged nucleus, made up of protons and neutrons, about which clouds of negative electrons orbited, like planets around the sun. Rutherford’s model assumed that some outer electrons were loosely bound to the nucleus and could therefore be easily removed by applying energy.

There is a problem with this model, however: the positive charge of the nucleus and the negative charge of the electrons mean that there is an electrostatic attraction between them. The electrons must have a certain amount of energy to allow them to orbit at a distance from the nucleus. The orbiting electrons are constantly changing direction; therefore they are accelerating. Classical laws of electromagnetism state that accelerating charges must radiate energy. A charge oscillating with frequency ν (the frequency of its circular orbit) will radiate energy of frequency ν, and if the electron is radiating energy, it is losing energy. The radius r of the orbit must then decrease, leading the electron to spiral into the nucleus, which would indicate that all matter is inherently unstable. As the orbit of the electron decreases, the frequency of the emitted radiation should also change, continuously. Emission spectrum experiments with electric discharges through low-pressure gases show, however, that atoms will only emit electromagnetic radiation of specific discrete frequencies.


Figure 2.29   Rutherford model of the hydrogen atom.

Niels Bohr, in 1913, used quantum theory to propose a modification of the Rutherford model. He postulated three fundamental laws:

1.   Electrons can only travel in orbits at certain quantized speeds; therefore, they can only possess certain discrete energies. While in a state corresponding to these discrete energies (a stationary state), the electrons will not emit radiation. The energy of a particular orbit will define the distance of the electron from the nucleus.

2.   Rather than continuously emitting energy as they accelerate, electrons only emit (or absorb) energy when they move between different orbits.

3.   The electron will absorb or emit a single photon of light during a jump between orbits. The photon will have energy equal to the energy difference between the two orbits. If E2 and E1 are the energy levels of the two orbits, then using Planck’s Law, ΔE = hν = E2 − E1; therefore, the frequency of the radiated energy will be (E2 − E1)/h (a numerical sketch follows this list).
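
As a numerical sketch of the third postulate (using the standard hydrogen energy levels En = −13.6/n² eV, which are not derived in this chapter and are assumed here for illustration), the jump from the third to the second orbit yields the familiar red hydrogen emission line:

```python
# Bohr model: frequency of the photon emitted in a jump, nu = (E2 - E1) / h
H = 6.626e-34   # Planck's constant, J s
C = 2.998e8     # speed of light, m/s
EV = 1.602e-19  # joules per electronvolt

def hydrogen_level(n):
    """Energy (J) of the nth Bohr orbit of hydrogen, relative to ionization."""
    return -13.6 * EV / n ** 2

delta_E = hydrogen_level(3) - hydrogen_level(2)  # energy released dropping from n=3 to n=2
nu = delta_E / H
print(f"nu = {nu:.3e} Hz, wavelength = {C / nu * 1e9:.0f} nm")  # ~656 nm (red)
```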

Fundamentally Bohr’s model led to the idea that at the subatomic level, the laws of classical mechanics did not apply and quantum mechanics must be used to describe the behaviour of electrons around a nucleus.

Bohr initially developed his model for the hydrogen atom, but later extended it to apply to atoms containing more electrons (balancing the larger positive charge of a nucleus containing more protons). He suggested that each electron orbit, corresponding to a stationary state, could contain a certain number of electrons. Once the maximum had been reached for an orbit, the next energy level would have to be used. These levels became known as shells, each one corresponding to a particular orbit and energy level. The shell model of the atom was able to explain many of the properties of elements in the periodic table. It eventually led to a more complex but accurate model of the atom using quantum mechanics to describe the behaviour of electrons in atomic orbitals, but the shell model is often used as an introduction to quantum theory.

Many of the chemical properties of elements are determined by the outer (valence) shell of the atom. If the atom has a full outer shell, it will tend to be very stable, and will not tend to form chemical bonds. The inert (noble) gases, e.g. helium or neon, are examples. Elements with outer shells containing single electrons, i.e. with spaces to fill, will tend to be highly reactive. Atoms of the elements silver (Ag) and the halide family (chlorine (Cl), bromine (Br) and iodine (I)) are of particular importance in photography as together they form the sensitive material in photographic emulsions. Of note is the fact that silver has a single electron in its outer shell, while chlorine has seven electrons in an outer shell that can contain eight; essentially it has a space for one more electron. This is covered in more detail in Chapter 13.

THE EMISSION OF ELECTROMAGNETIC RADIATION IN ATOMS

The Bohr model has now been superseded by more complex models; however, the fundamental idea remains. The major mechanism for the generation of electromagnetic radiation is generally accepted as the emission of a photon that occurs when an electron drops from a higher energy level to a lower one.

An atom is said to be in its ground state when all its electrons are in the lowest possible energy states available to them. The lowest energy level that an electron may occupy, i.e. the one closest to the nucleus, is termed the normal level and may be taken as the zero energy state (as there are no lower levels to which an electron may drop). If any of the electrons occupies a higher energy level than its ground state, it is said to be excited (these stationary states are also called radiating, critical or resonance states), and such states are unstable and temporary. The energy level at which an electron can escape the pull of the nucleus completely is called the ionization level (at this point the electron is no longer in a stationary state and is promoted into the conduction band).

To excite or ionize an atom, it must be supplied with energy, either through electron impact, for example as a result of heating, which leads to atomic collisions (for example, in the walls of the heated cavity in a black-body radiation experiment), or as the result of photon impact when the atom is irradiated (this is the mechanism responsible for the photoelectric effect). During an atomic collision, if the incident electron has enough energy, it will be able to transfer this to elevate one of the valence electrons to a higher energy level, either an excited or an ionized state, with any additional energy retained by the incident electron as kinetic energy. In the case of photon impact, the incident photon energy must correspond exactly to the difference between two energy states for the photon to be absorbed and an excited state created. Alternatively, if the photon contains more energy than the difference between the valence shell energy level and the ionization level, then the atom will be ionized.

Whatever the mechanism for absorption of energy, an atom will generally only remain in an excited state for approximately 10⁻⁸ or 10⁻⁹ seconds before returning to its ground state. It loses the excitation energy in this process, either by conversion to thermal energy or by emission of electromagnetic energy. In the latter case, the excited electron will usually return to its ground state in a single step, with the emission of a photon of energy equal to the difference between the two states. The entire mechanism is illustrated in Figure 2.30. In some special cases it may return to the ground state in several steps, emitting a photon of lower energy at each step. This is the mechanism of fluorescence or luminescence, which usually occurs when certain materials are irradiated with ultraviolet wavelengths and then emit visible radiation of longer blue or green wavelengths.


Figure 2.30   Absorption and emission of electromagnetic radiation in an atom.


The speed of light is defined in a perfect or absolute vacuum. Absolute vacuum does not exist in reality. The speed of light in a vacuum is therefore a theoretical concept.

The Greek character ν (nu), used as the symbol for frequency, should not be confused with the English character v, which is used as a symbol of phase velocity introduced later in this chapter.

§ The sinc(x) function is defined as sin(x)/x.
