File Formats and Why They Matter



There has been a lot of discussion about using RAW vs. JPG vs. TIF, and most of it is based on misinformation. To bring order out of the chaos, we need to understand some basic facts about file formats and what they do.

Computers only understand binary information, ones and zeros. Every computer file/image/document is nothing more than a long string of ones and zeros. Some of the ones and zeros are commands that allow the computer to know what to do and how to do it and some are data. The file format tells the computer which ones and zeros are commands and which are data.
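As a small illustration of how software decides which rules to apply, many formats begin with a fixed signature of a few bytes. The sketch below checks the well-known opening bytes of JPG, TIF, and PSD files. The function name sniff_format is made up for this example, and real programs look at far more than the first few bytes.

```python
# Leading byte signatures for three formats discussed in this article.
SIGNATURES = [
    (b"\xff\xd8\xff", "JPG"),  # JPEG start-of-image marker plus the next marker's 0xFF
    (b"II*\x00", "TIF"),       # TIFF, little-endian byte order
    (b"MM\x00*", "TIF"),       # TIFF, big-endian byte order
    (b"8BPS", "PSD"),          # Photoshop native format
]


def sniff_format(data: bytes) -> str:
    """Guess the file format from the first bytes of a file (hypothetical helper)."""
    for signature, name in SIGNATURES:
        if data.startswith(signature):
            return name
    return "unknown"


print(sniff_format(b"\xff\xd8\xff\xe0\x00\x10JFIF"))
print(sniff_format(b"II*\x00\x08\x00\x00\x00"))
```

Some RAW files reuse the TIF container and start with the same signature, so a check like this cannot tell them apart on its own.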

The sensor in a digital camera translates light intensity into ones and zeros. Firmware in the sensor translates that light intensity into color information. Software in the camera then turns the information coming out of the sensor into a digital file, and the file format of the resulting file drives that translation.

For RAW, the information is written into the file as it comes from the sensor, following the rules of a RAW file. The RAW format records only the single color value measured at each pixel, which makes for a relatively small file that still contains a complete data set for the image - approximately 7 MB for a Nikon NEF file out of the D1x.

For TIF, the information from the sensor is processed according to the settings in the camera for contrast, white balance, sharpness, etc. This translated information is written into a file according to the rules of a TIF file. The TIF format includes the processed information for the total color depth of each pixel, making it a very large file - 30 MB for 8-bit color or 60 MB for 16-bit color.
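The size difference is simple arithmetic: pixels times color samples per pixel times bytes per sample. A rough sketch follows. The pixel count of 10 million is an assumption, picked only so the results line up with the 30 MB and 60 MB quoted above, and headers and metadata are ignored.

```python
def uncompressed_bytes(pixels: int, samples_per_pixel: int, bits_per_sample: int) -> int:
    """Size of the image data alone, ignoring headers and metadata."""
    return pixels * samples_per_pixel * bits_per_sample // 8


PIXELS = 10_000_000  # assumed pixel count, chosen to match the sizes in the text

tif_8 = uncompressed_bytes(PIXELS, samples_per_pixel=3, bits_per_sample=8)
tif_16 = uncompressed_bytes(PIXELS, samples_per_pixel=3, bits_per_sample=16)
one_sample = uncompressed_bytes(PIXELS, samples_per_pixel=1, bits_per_sample=8)

# Sizes in MB (1 MB taken as 1,000,000 bytes): 30.0, 60.0 and 10.0
print(tif_8 / 1_000_000, tif_16 / 1_000_000, one_sample / 1_000_000)
```

Storing one color sample per pixel instead of three cuts the data to a third at the same bit depth, which is the reason a file holding unprocessed sensor data can be so much smaller than a TIF.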

For JPG, the information is processed according to the settings in the camera and then compressed. Typical compression replaces repeating patterns of ones and zeros with symbols derived by mathematical manipulation, which provide an approximation of the original information. The original ones and zeros are thrown away and replaced with the symbols. The smaller data set makes a smaller file. The actual file size depends on the amount of compression, perhaps in the 1-2 MB range.
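To see why an approximation cannot be undone, consider a toy compressor. This is not the JPG algorithm, only a minimal stand-in that rounds each value to the nearest multiple of a step size. The names and the step of 16 are assumptions for illustration.

```python
STEP = 16  # hypothetical step size; a larger step means smaller files and more loss


def compress(value: int) -> int:
    """Store only which multiple of STEP the value is closest to."""
    return round(value / STEP)


def decompress(stored: int) -> int:
    """Rebuild an approximation of the original value."""
    return stored * STEP


for original in (90, 100):
    print(original, "->", decompress(compress(original)))
```

Both 90 and 100 come back as 96. Once the file is written, nothing in it says which of the two the original was.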

So, when would it make sense to use which format?
  • RAW - you want all possible information and want to do the processing yourself.

  • TIF - you want all of the information, you don't mind the camera doing your processing, and you can deal with the longer download times and smaller number of images that will fit on a memory card.

  • JPG - you don't mind the detail loss, you want the camera to do the processing, and you need the faster download times and want to put the maximum number of images on a memory card.
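The memory card trade-off can be estimated from the approximate file sizes quoted earlier. In this sketch the 512 MB card is a hypothetical figure, and 2 MB is used for JPG as the upper end of the 1-2 MB range.

```python
CARD_MB = 512  # hypothetical card capacity in MB

# Approximate per-image sizes in MB, taken from the figures quoted above.
SIZE_MB = {"RAW": 7, "TIF (8-bit)": 30, "JPG": 2}

images_per_card = {fmt: CARD_MB // size for fmt, size in SIZE_MB.items()}

for fmt, count in images_per_card.items():
    print(f"{fmt}: about {count} images")
```

With these assumed numbers the card holds roughly 73 RAW files, 17 TIF files, or 256 JPG files.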

The Rest of the Story

When the RAW file is read by the RAW converter, the individual pixel colors are combined to produce a composite RGB value for each pixel. This information is read into RAM so it can be displayed and processed by PhotoShop. Once processed, it can be saved in any format available to PhotoShop.
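The combining step can be sketched in its simplest possible form. The code below assumes the sensor records one color per pixel in repeating 2x2 blocks of red, green, green, blue (an assumption; the layout is not specified here) and merges each block into a single RGB value. Real converters interpolate so that every pixel keeps its own RGB value, so treat this only as an illustration of the idea.

```python
def combine_rggb(mosaic):
    """Turn each 2x2 block (R G / G B) into one (R, G, B) tuple.

    `mosaic` is a list of rows with even width and height; the RGGB
    layout is an assumption made for this example.
    """
    rgb_rows = []
    for y in range(0, len(mosaic), 2):
        row = []
        for x in range(0, len(mosaic[y]), 2):
            red = mosaic[y][x]
            green = (mosaic[y][x + 1] + mosaic[y + 1][x]) / 2  # average the two greens
            blue = mosaic[y + 1][x + 1]
            row.append((red, green, blue))
        rgb_rows.append(row)
    return rgb_rows


sensor = [
    [200, 100],
    [120, 50],
]
print(combine_rggb(sensor))
```

The four single-color readings become one composite value with red 200, green 110.0, and blue 50.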

When the TIF file is read by PhotoShop, the information is read into RAM so it can be displayed and processed by PhotoShop. Once processed, it can be saved in any format available to PhotoShop.

When the JPG file is read by PhotoShop, it must first be decompressed using the JPG rules. This is a process of reading the symbols and reconstructing the image, making a best guess about the missing information. The reconstructed information is read into RAM so it can be displayed and processed by PhotoShop. The 'missing' information can NOT be regained. Once processed, it can be saved in any format available to PhotoShop.

Typically, files intended for use on the internet are saved as JPG. The loss of information in the translation to JPG is not a problem here because a computer monitor can reproduce only a subset of the detail and color in the image.

If the file is intended to be printed, a higher-resolution format is necessary because the print can reproduce more detail than a computer monitor. TIF will work, but creates very large files. PSD, the PhotoShop native format, is a good choice because it contains as much information as TIF but produces a smaller file.

JPG Compression Events

For most images, the loss of detail will be so small as to go unnoticed. Reading the same image multiple times will not increase the loss. Saving the file back into JPG, however, applies the compression process again, and opening that saved file causes another decompression, with more guesses being made about missing information. Depending on the amount of detail and the depth of compression, repeatedly opening, saving, and reopening the saved versions will eventually produce noticeable degradation, as depicted below.

Original Image: Taken in RAW and converted to JPG by "Save For Web" at 70%. The crops below were taken from the upper left part of the globe.

200% crop of Original: The original image at 200%. It was saved as a JPG at level five to generate V2. V2 was opened and saved to generate V3, and so on until V5. Some of the degradation was noticeable in V2. I continued the process to ensure enough degradation to be visible in these crops.

Fourth Generation JPG: The fourth generation. Note the degradation of the edges, the color artifacts in the brown, and the block structures in the yellow.
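The same open-and-save cycle can be imitated with a toy lossy codec. This is not JPG: the made-up toy_save keeps only the average of each pair of samples, and toy_open rebuilds two samples per stored average by interpolating between neighbors. It is a sketch of the principle only, using a four-sample "image" with one sharp edge.

```python
def toy_save(samples):
    """Lossy 'save': keep only the mean of each pair of samples."""
    return [(samples[i] + samples[i + 1]) / 2 for i in range(0, len(samples) - 1, 2)]


def toy_open(stored):
    """'Open': rebuild two samples per stored mean by linear interpolation."""
    out = []
    last = len(stored) - 1
    for i, mean in enumerate(stored):
        left = stored[max(i - 1, 0)]      # neighbor on the left (or itself at the edge)
        right = stored[min(i + 1, last)]  # neighbor on the right (or itself at the edge)
        out.append(0.75 * mean + 0.25 * left)
        out.append(0.75 * mean + 0.25 * right)
    return out


def total_error(original, rebuilt):
    """Sum of absolute differences between original and rebuilt samples."""
    return sum(abs(a - b) for a, b in zip(original, rebuilt))


original = [0, 0, 100, 100]  # a sharp edge

# Opening the same saved file twice gives the same result: reading adds no loss.
stored = toy_save(original)
same_on_reread = toy_open(stored) == toy_open(stored)

# Each open-and-save cycle, however, moves the data further from the original.
errors = []
current = original
for generation in range(4):
    current = toy_open(toy_save(current))
    errors.append(total_error(original, current))

print(same_on_reread, errors)
```

In this toy the error grows with every generation (50.0 after the first cycle, 87.5 after the second) while rereading a saved file changes nothing, which mirrors the behavior described above.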