a) Exposure & Lens
The camera is our successful attempt to focus light onto a medium that can either fix the information directly or indirectly transmit it to a storage device, such that this information can then be developed or decoded and viewed later. Making the transient permanent.
It would take some understanding of the nature of light, whether wave or particle, to master camera-related technologies, but the phenomenon of the pinhole image – a tiny hole in a wall letting through only the light rays travelling in a straight line from their source of emission – had been observed in antiquity and was actively used in the form of the camera obscura by the early 17th century, both to assist in drawing and even to make celestial observations when coupled with the recently invented telescope lenses. In fact, this is the very principle of the human eye we looked into in S2 Section 7.b, in which the cornea and the lens focus light on the retina and photoreceptors are activated, thereby transducing light stimuli into electrical signals. The human eye had tens of millions of years to develop along our evolutionary tree; fortunately, it took us a lot less time to wilfully design a mechanism not simply to focus light but also to fix it onto a medium. The first success took the form of heliography, devised by Niépce, a process relying on the hardening of a material in sunlight; its sibling process, photolithography, using photoresist materials, would lead to great things in the manufacture of semiconductors (refer to S4 Section 1.d). This was followed shortly thereafter by the much more practical daguerreotype, named after Daguerre, which consisted of exposing a light-sensitive material, triggering a reaction to make the image visible, neutralizing the light sensitivity, and then sealing the result; I include a link to the relevant Wikipedia entry so you can check out some of those pictures, they really do have character. This was nearly the mid-19th century, and we were then well on our way to the pre-digital photographic film, the topic of the next section. In this section, we will focus on the control of the input: light.
The quantity and the quality of light are essential variables in photography. Too much light and the result will be overexposed, washing out colours; not enough and it will be underexposed, making parts of the content too dark. Granted, this effect can have an artistic intent, so here I am talking about the ability to achieve the desired degree of exposure, a relative rather than an absolute notion. Light exposure is a function of three parameters: the luminance of the scene being captured (that’s the light intensity that will traverse the camera lens), how wide or narrow the access channel for the light is, and how long it remains open. Using the analogy of a river, the amount of water flowing depends on the current, the width and depth of the riverbed, and the amount of time during which you make the measurement.
The timing is quite straightforward: the film or sensor is kept in the dark behind a shutter which, for the selected amount of time, moves to permit exposure and then resumes its original position to end it. This is called shutter speed or exposure time and can be as short as a few microseconds or as long as several minutes in the case of night-time photography – if you have seen scenes of stars tracing arcs in the sky, this is the result of a very long exposure which is not overexposed because luminance is very low and the aperture might be quite narrow.
As for the control of the opening size, the technical term is aperture and it is generally set in increments of 2x, meaning that widening the aperture by one level lets twice as much light enter in a given duration. When coupled with a light meter (on modern auto-focus devices all of this is automated), it is possible to achieve the desired exposure by adjusting either the aperture or the shutter speed. Which one should the photographer use then?
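To make the trade-off concrete, here is a minimal Python sketch (the numbers are my own illustrative choices) using the standard exposure value EV = log2(N²/t) for f-number N and shutter time t: opening up one stop while halving the exposure time leaves the overall exposure essentially unchanged.

```python
import math

def exposure_value(f_number: float, shutter_s: float) -> float:
    """Exposure value: EV = log2(N^2 / t). Equal EV means equal exposure."""
    return math.log2(f_number ** 2 / shutter_s)

# Two settings admitting (almost) the same amount of light:
# f/5.6 at 1/125 s versus f/4 at 1/250 s (one stop wider, one stop shorter).
print(round(exposure_value(5.6, 1 / 125), 2))  # ~11.94
print(round(exposure_value(4.0, 1 / 250), 2))  # ~11.97
```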
It so happens that the level of aperture has optical consequences in terms of depth of field. A shallow depth of field will produce a crisp image only for objects within a narrow band, starting quite close to the lens (that distance depends on the lens as well); anything closer or beyond this band will not be in focus. Thus portraits with a blurred background are realized using a very wide aperture. Conversely, for landscape photography, where one wishes to have much of the scene in focus, we should opt for more depth of field. The resulting focus will not be as sharp but it is possible to have nothing out of focus. Depth of field behaves quite straightforwardly in mathematical terms: it is proportional to the square of the distance at which items are in sharpest focus – so the further away the subject in focus, the more distance in front of and behind it will also be in focus – and inversely proportional to the square of the focal length.
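As a rough illustration of those proportionalities, a sketch using the common approximation DoF ≈ 2·N·c·s²/f² (N the f-number, c the circle of confusion – here the conventional 0.03 mm full-frame value – s the subject distance and f the focal length); the numbers are my own examples, not figures from the text.

```python
def depth_of_field_m(f_number: float, subject_m: float,
                     focal_mm: float, coc_mm: float = 0.03) -> float:
    """Approximate total depth of field in metres: 2*N*c*s^2 / f^2."""
    s_mm = subject_m * 1000.0
    return 2 * f_number * coc_mm * s_mm ** 2 / focal_mm ** 2 / 1000.0

# Doubling the subject distance roughly quadruples the depth of field...
print(round(depth_of_field_m(2.8, 2.0, 85), 2))   # ~0.09 m (shallow, portrait-like)
print(round(depth_of_field_m(2.8, 4.0, 85), 2))   # ~0.37 m
# ...while doubling the focal length divides it by about four.
print(round(depth_of_field_m(2.8, 4.0, 170), 2))  # ~0.09 m
```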
We already came across the concept of the focal length in S3 Section 9.c when explaining the workings of the optical telescope. It is noted “f” and corresponds to the distance between the lens and the focal point, the point where light is made to converge by the convex lens (it would be a negative number in the case of a concave lens). This implies that, with a wide aperture, many of the light rays will come in at slightly different angles and miss the focal point after travelling through the lens, whereas with a small aperture the spread of light-ray angles will be narrower and many more of them will be in focus or nearly so, i.e. they will converge near the focal point. Focus is such a crucial aspect of photography that the aperture increments are referred to as f-stops rather than a-stops and, because the mathematical relationship is to the second power, the f-stops are incremented by the square root of 2 (such as 1.4, 2, 2.8, 4) to yield a doubling of the amount of light – not surprising, because the focal distance is one-dimensional and the aperture is two-dimensional.
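A small sketch of this √2 progression: since the f-number N is the focal length divided by the aperture diameter, the light-gathering area goes as 1/N², and each step in the standard sequence halves it (the 50 mm lens below is just an illustrative choice).

```python
import math

# The standard f-stop sequence: successive powers of sqrt(2)
# (lens barrels round these to 1, 1.4, 2, 2.8, 4, 5.6, 8, 11...).
print([round(math.sqrt(2) ** i, 1) for i in range(8)])
# -> [1.0, 1.4, 2.0, 2.8, 4.0, 5.7, 8.0, 11.3]

focal_mm = 50  # an illustrative 50 mm lens
for n in (2.0, 2.8, 4.0):
    diameter = focal_mm / n               # aperture diameter d = f / N
    area = math.pi * (diameter / 2) ** 2  # light-gathering area
    print(f"f/{n}: diameter {diameter:.1f} mm, area {area:.0f} mm^2")
# Each one-stop increase of N (a factor of sqrt(2)) halves the area, hence the light.
```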
So now that we understand the concepts of exposure and focal length, the last aspect we need to get our heads around is the focusing role of the lens, often called the “glass” since this is the most valuable (and heaviest) part of the equipment. The focal length is typically expressed in millimetres and gives its name to the lens; it might be a range in the case of a zoom lens or a fixed number for a prime lens. The advantage of the zoom lens is the ability to enlarge objects, with the degree of magnification being the ratio of the longer focal length over the shorter one, whereas a prime lens can provide a higher quality image, as well as a lighter lens, though you or the subject may need to move to be in focus. Coming back to the focal point, its distance is a function of the distance D1 between the object and the frontal plane of the lens as well as the distance D2 between the rear plane of the lens and the image plane where the film or sensor resides. The mathematical relationship is: 1/f = 1/D1 + 1/D2.
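Rearranging that relationship gives the image distance D2 = 1/(1/f − 1/D1), which is roughly what the focusing mechanism adjusts when you turn the focus ring; a short sketch with illustrative values:

```python
def image_distance_mm(focal_mm: float, subject_mm: float) -> float:
    """Solve the thin-lens relationship 1/f = 1/D1 + 1/D2 for D2."""
    return 1.0 / (1.0 / focal_mm - 1.0 / subject_mm)

# For an illustrative 50 mm lens:
print(image_distance_mm(50, 5_000))       # subject at 5 m   -> ~50.5 mm
print(image_distance_mm(50, 1_000))       # subject at 1 m   -> ~52.6 mm
print(image_distance_mm(50, 10_000_000))  # distant subject  -> ~50.0 mm (D2 tends to f)
```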
However, there is more to a lens than adjusting the focal length, because light is made of a mix of waves with different wavelengths for which the refractive index of the glass is not the same, resulting in dispersion. This is called chromatic aberration and it can be minimized by increasing the focal length or by combining several materials with different refractive indexes. In addition, high-quality lenses will also correct several types of monochromatic aberrations, so called because they occur even with light rays of identical wavelength. These include, inter alia:
- Some light reflection at the interface between the air and the glass, which can be minimized by applying an antireflective coating made of alternating layers designed so that the reflected beams interfere destructively while the transmitted beams interfere constructively.
- Spherical aberration, a result of light refraction, meaning that with a perfectly spherical surface the focal distance for light hitting the glass near the outer edge will be shorter than for beams passing at or near the centre. To correct for this, either the lens should be aspherical, something not easy to manufacture, or several glass elements mixing concave and convex shapes can be combined.
- Distortion, which changes the shape of the scene. This can be unwanted or a necessity, as with wide-angle lenses that cram light information coming from a larger distant surface area into the frame by decreasing the magnification away from the centre of the scene (barrel distortion), or the opposite (pincushion distortion).
If the topic is of interest, I invite you to read the Wikipedia entry for Optical aberration; the hyperlink is included in the last section of this chapter.
b) The photographic film
We now have the equipment to control exposure and ensure the image is focused on the film or sensor, at the image plane. How do we know this is the case, however, and for that matter how do we know what scene or objects we are framing exactly? For that we need to see what the lens is going to capture and, before the new generation of mirrorless camera bodies, in Single-Lens Reflex (SLR) cameras this was accomplished by placing a mirror in front of the image plane and reflecting the image through a prism into the optical viewfinder. The viewfinder is the part situated at the top of the rear side of the camera body where the photographer can see the subject or scene as if she were looking through the lens. At the time of actually taking the picture, the mirror is lifted (and the shutter opened) to let the light hit the film or digital sensor. Nowadays, with digital sensors, the mirror is no longer necessary as the image detected by the sensor can be reproduced electronically, in real time, on a small screen placed in an electronic version of the viewfinder. This mirrorless construct saves a significant amount of space between the sensor and the rear plane of the lens, as well as where the 5-sided prism used to be, allowing for a simpler mechanism and a lighter, less voluminous body. By the way, I refer to the body of the camera because higher-end cameras use interchangeable lenses, and therefore lens and body can be mechanically separated and one need not buy a new body every time new glass is acquired.
Even those readers who may never have used film-based cameras should be interested in the film technology, so we’ll begin with this and then cover digital sensors in the next section. The whole concept of storing light information is predicated on the fact that certain atoms and molecules absorb photons of different energy levels, hence some substances will react to light of certain wavelengths and not to others. In the case of film photography, the light-sensitive compounds are silver halide crystals, which are sensitive to blue light and may be combined with dyes that adsorb onto the crystals and make them sensitive to other wavelengths (in addition to blue). So on a colour film there would be the standard silver halide layer on top, then a yellow filter blocking all remaining blue light, followed by a layer sensitive to blue and green (and since no blue light remains, it works as a green detector and fixator), and then a final layer sensitive to blue and red (which works as a red detector). The crystals that respond to light of the relevant wavelength turn black upon development, so the result is a negative of the latent image “impressed” on the film. It doesn’t matter because the information is sufficient to develop photographs with the correct colours through a series of chemical reactions.
The brightness of the photograph will be a function of the exposure, as explained in the previous section, as well as of the sensitivity of the film, called its speed, which follows international standards (ISO) – the idea being that more sensitive films require less exposure. The most common films have ISO values of 100 to 800 and, while a higher ISO is more sensitive and works best in low-light conditions, its resolution will not be as good as that of a lower-ISO film and the image will look grainier (this is not the only factor in film resolution, there are other variables such as the size of the crystals). So why not always use a low ISO then? Because it may require longer exposure times and result in blurred images… It is all about trade-offs, or at least it was back in the day of chemical silver films.
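As a back-of-the-envelope illustration of that trade-off, assuming the usual convention that doubling the film speed halves the required exposure time, all else being equal:

```python
def shutter_time_s(base_time_s: float, base_iso: int, iso: int) -> float:
    """Required exposure time scales inversely with film speed (ISO)."""
    return base_time_s * base_iso / iso

# If an ISO 100 film needs 1/30 s for a given scene and aperture (made-up figure),
# faster films need proportionally less time:
for iso in (100, 200, 400, 800):
    print(iso, f"{shutter_time_s(1 / 30, 100, iso):.4f} s")
# ISO 800 gets away with roughly 1/240 s, trading grain for less motion blur.
```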
c) Digital sensors
The main drawbacks of photographic film are its inability to immediately display the content of the picture captured, since it needs to be developed with chemical reagents, the limited number of pictures that can be taken on one roll (generally 24 or 36 for 35mm film), and the cost of the film, for there is no possibility of reusing it once exposure has taken place. For the record, the reference to 35mm film corresponds to a specific frame size of 36x24mm known as “full frame”. The other main standards are APS-C at around 23x15mm (there is variation depending on brands) and 17.3x13mm for the Four Thirds system, but there is also the medium format, with frames of around 60x90mm using the 120 film that is approx. 61mm wide, and the large format, which is 90x120mm or larger. In comparison, smartphones have sensors with diagonals measuring up to 12mm, resulting in a crop factor of about 3.5, meaning their diagonal is 1/3.5 the diagonal of a full-frame sensor and therefore the surface area is about 12 times smaller (3.5² = 12.25).
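A minimal sketch of that arithmetic, using the sensor dimensions quoted above and the 12 mm smartphone diagonal as given in the text:

```python
import math

def diagonal_mm(width_mm: float, height_mm: float) -> float:
    return math.hypot(width_mm, height_mm)

full_frame = diagonal_mm(36, 24)     # ~43.3 mm
sensors = {
    "APS-C": diagonal_mm(23, 15),    # ~27.5 mm
    "Four Thirds": diagonal_mm(17.3, 13),
    "smartphone": 12.0,              # diagonal figure quoted in the text
}
for name, diag in sensors.items():
    crop = full_frame / diag
    print(f"{name}: crop factor {crop:.2f}, area about {crop ** 2:.1f}x smaller")
```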
Larger formats offer higher quality pictures because the larger the surface area, the more information can be captured. Speaking in terms of digital pixels (the word “pixel” is a contraction of “picture element”), though the idea is the same for silver halide crystals, a larger surface allows either for more pixels, and thus better resolution, or for an identical number of pixels with larger individual areas, so that each of them can capture more light – a substantial advantage in low-light conditions such as indoor photography.
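To put rough numbers on the pixel-size argument, here is a hedged sketch comparing the pixel pitch of a hypothetical 24-megapixel full-frame sensor with the same pixel count on APS-C (the 24 MP figure is my own example, not from the text):

```python
def pixel_pitch_um(width_mm: float, height_mm: float, megapixels: float) -> float:
    """Approximate pixel pitch in micrometres, assuming square pixels."""
    horizontal_pixels = (megapixels * 1e6 * width_mm / height_mm) ** 0.5
    return width_mm * 1000.0 / horizontal_pixels

ff = pixel_pitch_um(36, 24, 24)    # ~6.0 um on full frame
apsc = pixel_pitch_um(23, 15, 24)  # ~3.8 um on APS-C
print(ff, apsc)
print((ff / apsc) ** 2)  # each full-frame pixel gathers roughly 2.5x the light
```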
As we have already seen in S2 Section 7.b on the optical harvesting of light by the human eye, every colour we perceive can be deconstructed into a combination of red, green and blue light intensities. Hence, very much like the original photographic film, a digital camera sensor will capture this red-green-blue (RGB) information and store it in the form of a binary number. What we call “true colour” has 24 bits, 8 each for R, G and B, which in our base-10 numbering system translates into 16,777,216 different colours. It is nicknamed true colour, I presume, because that is the rough magnitude humans can perceive (the estimate is about 10 million) so in that sense there is theoretically no loss of information, although our perception might not be spread equally across the spectrum, especially near the red and blue ends of the visible range.
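A quick sketch of that arithmetic: packing one 8-bit value per channel into a single 24-bit “true colour” number and counting the possible combinations.

```python
def pack_rgb(r: int, g: int, b: int) -> int:
    """Pack three 8-bit channel values into a single 24-bit 'true colour' number."""
    return (r << 16) | (g << 8) | b

print(2 ** 24)                     # 16777216 possible colours
print(hex(pack_rgb(255, 128, 0)))  # 0xff8000 - an orange
```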
There are currently two technologies dominating the digital sensor industry: the charge-coupled device (CCD) and the active-pixel sensor, which uses complementary metal-oxide-semiconductor (CMOS) technology. For colour pictures, a CCD consists of an array of capacitors (miniature devices that can store an electrical charge) we call pixels; these are inherently monochromatic, though a colour filter is layered on top of them so that each capacitor is only exposed to red, green or blue light (there are also competing technologies using vertically stacked arrangements so that each pixel is made of three RGB-specific sub-pixels). From a high-level technical standpoint, the information capture and storage of a CCD comprises two steps: the photon capture and the measurement. As soon as the shutter opens, the sensor is exposed and the capacitors’ charge increases as a result of photon absorption, so the more luminance in the red, green or blue spectrum at a given spot on the sensor, the more charge contained within this pixel.
After the shutter closes, the charges of all the capacitors are shifted vertically, with the last horizontal row of pixels dumping its electrons into charge amplifiers; the voltage created is then measured and stored in turn for each column of what used to be the last row. This process is then repeated with the charges being moved another row down, and so on until the original content of the top row has found its way to the bottom, to be amplified and measured in turn. This ability to shift charges is the reason behind the expression “charge-coupled”. When the shifting and measurement have been completed, the array is back to its original depleted condition. When you are using live view on the back screen or in an electronic viewfinder, the process is actually the same, except the values are not stored – so don’t be surprised if the battery drains faster compared to bulkier models using optical viewfinders.
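Here is a toy model of that bucket-brigade readout on a tiny 3x3 array of made-up charge values: the rows shift down one at a time, the bottom row is read out column by column, and the array ends up depleted. Real CCDs shift the bottom row through a separate horizontal register, so this is only a sketch of the principle.

```python
# Toy model of CCD readout on a 3x3 array of made-up charges: shift all rows
# down, read (amplify and measure) the bottom row column by column, repeat
# until the array is depleted.
charges = [
    [3, 7, 1],
    [4, 0, 9],
    [5, 2, 6],
]

readout = []
for _ in range(len(charges)):
    readout.extend(charges[-1])                       # measure the bottom row
    charges = [[0] * len(charges[0])] + charges[:-1]  # shift every row down by one

print(readout)  # [5, 2, 6, 4, 0, 9, 3, 7, 1] - bottom row read first
print(charges)  # [[0, 0, 0], [0, 0, 0], [0, 0, 0]] - back to the depleted state
```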
In the CMOS technology, the conversion from charge to voltage takes place in each pixel (hence the name “active-pixel sensor”), meaning there is additional circuitry embedded, but this results in much faster processing even though the data for each pixel still needs to be read and stored. This in-pixel amplification is achieved thanks to MOS field-effect transistors (MOSFETs) and, if necessary, you may want to re-read S4 Section 1.c to understand how these work.
d) Movie and video cameras
I think it is universally understood that movies and videos are essentially a series of images called frames that are recorded and projected in succession. The higher the frame rate (expressed in frames per second or “fps”), the smoother the visual experience and the greater the ability to watch scenes in slow motion. The difference between movie and video cameras thus doesn’t lie in a different recording architecture; it is the encoding medium and process that differ. We have just seen in section b) how images are captured on film, so we can turn our attention directly to video.
Unlike film, which relies on chemical reactions, a video signal is electrical in nature and made possible by the photoconductive effect, whereby exposure to light increases the electrical conductivity of a material. By exposing a photoconductive surface to the scene being recorded and then creating a current at a given location, it is possible to measure the brightness of that point of the image through the changes in voltage produced – this is analogous to the image sensors of digital cameras. The current is generated by beaming electrons in sequence across all points of the image sensor – these electron beams are also called cathode rays, a term and technology we will see more of in S4 Section 8.d on CRT television.
Because an electron beam can be focused so finely, video allows for the delineation of several hundred rows, with each row providing an analog signal that can be stored and transmitted in either analog or digital format. For storage, videotapes were used, relying on magnetic storage, a technology we already covered in S4 Section 2.d, though nowadays Flash memory is the go-to alternative for most use cases (refer to S4 Section 2.c) and magnetic storage is mostly employed for archival purposes.
In order to generate separate signals corresponding to red, green and blue – which could be encoded and then used to reconstruct the right colours, or chroma, in addition to the right luminance – the main method used was to split the light into three beams of different wavelengths by combining two prisms, which requires three camera tubes, each with its own electron beam and sensor. The other obvious solution is the use of filters, working very much like those of photographic film, but this creates issues in terms of luminance attenuation. If you want to know more about analog video recording and its various iterations, I am including the link to the Wikipedia entry for the video camera tube at the end of this chapter.
In many ways, analog video can be seen as a technological interlude stemming from the combined demands of television broadcasting, with its many live programs, and the ability to watch these programs as a stream of data carried by radio waves. And so, whereas in the old days film was used for both still and moving images, with the adoption of digital recording and display technologies the same sensors can now be used to record both still pictures and what we generically still call videos, owing to the electronic nature of these moving images.
Let us now shift to a separate consideration: the quality and viewing experience of digital video versus the analog signal of film, which is generated by the continuous nature of the crystals’ reaction as a function of light frequency and luminance. Does that mean film has better quality?
It depends on what you mean by quality, and it is not always a function of quantity. In fact, video may objectively have better resolution and yet film may still provide a better visual experience, a more natural feel, which is a function of the way our visual system processes data and experiences the output in our mind. This is the reason why some movie directors still prefer to shoot on film, though, considering the one-time use of the medium and the number of frames involved, this is not realistically affordable for most people; it also takes time to develop, so film movie cameras are very much a professional tool nowadays.
You may find that digital pictures look sharper, and indeed they are when seen in their native resolution. This is the often-intended result of enhancing the contrast at edges within an image, something made all the easier by the fact that digital images are made of discrete pixels. That hard feel may or may not be preferred to the softer look of film, and it can become a drawback when the resolution is increased or decreased, because mathematical interpolation is required to decide which colour and luminance values should be displayed, which often creates a blocky, “pixelated” image rather than a smooth result.
Part of the increasing attraction of digital cameras over film, including for movie footage, is the continual improvement in resolution, the amount of information contained in each image. Image sensors have indeed packed an increasing number of pixels, and so have the devices displaying the content they record. That said, when it comes to movies and digital video, resolution is only part of the equation; the other key variable is the frame rate, which has also increased quite significantly, from 24 fps for film to 25 or 30 fps for standard video, with 50-60 fps (or Hz) also supported, though the human eye makes limited distinction and, as mentioned earlier, the higher rate is probably more important on the recording side than on the display side, to enable slow-motion sequences to feel fluid. Much more on the display side in Chapter 8 on the television.
e) Trivia – High dynamic range imaging
The capabilities of the human body are often stunning when going into the details and our sensory systems are certainly no exception. Among the many breath-taking characteristics is what is technically termed “dynamic range”. In audition, it would be the ratio between the maximum acoustic intensity we can withstand without pain and the minimum threshold for detection; this ratio is estimated to be 10¹³, which corresponds to 130 decibels. For vision, it is a still very respectable 1 billion times in terms of luminance, which corresponds to 90dB or about 30 stops (the base-2 equivalent of this ratio, since 2³⁰ ≈ 1.07 billion).
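A quick check of those conversions, with decibels computed as 10·log10 of the intensity ratio and stops as the base-2 logarithm:

```python
import math

def to_decibels(intensity_ratio: float) -> float:
    return 10 * math.log10(intensity_ratio)

def to_stops(ratio: float) -> float:
    return math.log2(ratio)

print(to_decibels(1e13))  # 130.0 dB - hearing
print(to_decibels(1e9))   # 90.0 dB  - vision (luminance)
print(to_stops(1e9))      # ~29.9 stops, i.e. about 30 (2**30 ≈ 1.07 billion)
```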
This order of magnitude means a camera will not be able to capture very high dynamic ranges with a single exposure (even though improvements in CMOS sensors are extending their native dynamic range): parts of the image would otherwise be completely overexposed, washed out in white, or on the contrary nearly pitch black. The trick, then, to go beyond the limits of the device, is to combine several different exposures, and this is what the HDR mode of your camera does – including the camera systems of smartphones.
For still images, it is easy enough to understand that this can be achieved by taking several pictures and then combining them, keeping the best-exposed values from each. For digital video the principle is actually the same, though not as many frames can be replicated with different exposures; typically, the actual frame rate is doubled and two frames are processed into one, so for a 30 Hz video, 60 frames will be shot every second.
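A toy sketch of the still-image case: two exposures of the same scene merged by keeping, for each pixel, the value farthest from being clipped. This crude rule is only a stand-in for the weighted merging and tone mapping real HDR pipelines perform, and the 8-bit values are made up.

```python
# Toy HDR merge of two exposures (made-up 8-bit luminance values): for each
# pixel, keep the value farthest from the clipping extremes 0 and 255.
short_exposure = [12, 40, 180, 250]   # dark frame: highlights preserved, shadows crushed
long_exposure = [90, 160, 255, 255]   # bright frame: shadows visible, highlights clipped

def better_exposed(a: int, b: int) -> int:
    distance_from_clipping = lambda v: min(v, 255 - v)
    return a if distance_from_clipping(a) >= distance_from_clipping(b) else b

merged = [better_exposed(s, l) for s, l in zip(short_exposure, long_exposure)]
print(merged)  # [90, 160, 180, 250]
```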
Like every technology, its impact is a function of how it is wielded: the added contrast can give allure to a picture, potentially reversing the gloomy effect an overcast day can have on a scene, but it can also feel unnatural, especially when applied to people’s skin and faces. It can even be utterly misleading, making rundown buildings look quite decent. I prefer not to use the mode and to play with the exposure myself.
f) Further reading (S4C7)
Suggested reads:
- Understanding Exposure, by Bryan Peterson (buy)
- Wikipedia on the Daguerreotype: https://en.wikipedia.org/wiki/Daguerreotype
- Wikipedia on Optical aberration: https://en.wikipedia.org/wiki/Optical_aberration
- Wikipedia on Video camera tube: https://en.wikipedia.org/wiki/Video_camera_tube
Disclaimer: the links to books are Amazon Affiliate links so if you click and then purchase one of them, this makes no difference to you but I will earn a small commission from Amazon. Thank you in advance.
Previous Chapter: The Internet
Next Chapter: The Television