Creating Sustainable Digital Files: What Archivists Need to Know

Margot Note

February 20, 2023

Interpolation is the process of estimating missing data. In imaging, it is used to create new pixels when an image is enlarged, or to choose which pixels to remove when an image is reduced, so that the resized image keeps its apparent resolution and does not pixelate.

Interpolation is also referred to as resampling. Increasing the number of pixels is called upsampling, and decreasing the number of pixels is called downsampling.
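To make the idea concrete, here is a minimal sketch of interpolation on a single row of grayscale pixel values. It uses linear interpolation, which is only one of several resampling methods, and the function name resample_row and the sample values are made up for illustration.

```python
def resample_row(row, new_len):
    """Resample a row of pixel values to new_len samples by linear interpolation."""
    if new_len == 1:
        return [row[0]]
    # How far to step through the original row for each new pixel.
    scale = (len(row) - 1) / (new_len - 1)
    out = []
    for i in range(new_len):
        pos = i * scale
        left = int(pos)
        right = min(left + 1, len(row) - 1)
        frac = pos - left
        # Blend the two nearest original pixels according to distance.
        out.append(round(row[left] * (1 - frac) + row[right] * frac))
    return out


row = [0, 100, 200]                       # three original pixels
upsampled = resample_row(row, 5)          # upsampling: new in-between pixels are created
downsampled = resample_row(upsampled, 3)  # downsampling: pixels are removed

print(upsampled)    # [0, 50, 100, 150, 200]
print(downsampled)  # [0, 100, 200]
```

The values 50 and 150 never existed in the original row. They are estimates, which is why upsampling cannot add real detail to an image.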


Compression algorithms reduce the number of bytes needed to represent data and, with it, the amount of storage an image requires. As a result, more image data can be sent online in the same amount of time, and more images fit in the same storage space. Compression relies on two main strategies: redundancy reduction and irrelevancy reduction. 

Redundancy reduction, the basis of lossless compression, searches for repeated patterns in the data and expresses them more efficiently. An image viewed after lossless compression will be identical to how it was before. The compressed file may still be too large for network dissemination. However, lossless compression supplies efficient storage when all the information stored in an image must be preserved for future use.
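As a small illustration of lossless compression, the sketch below compresses a highly repetitive run of pixel data with Python's standard zlib module and then restores it. The pixel data is invented for the example, and zlib stands in for whatever lossless scheme a given image format actually uses.

```python
import zlib

# Hypothetical image data: 1,000 identical white RGB pixels,
# a highly redundant pattern that lossless compression can exploit.
original = bytes([255, 255, 255]) * 1000

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), "bytes before compression")
print(len(compressed), "bytes after compression")
print("identical after round trip:", restored == original)
```

The restored bytes match the original exactly, which is the property that makes lossless compression suitable for files that must be preserved.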

Irrelevancy reduction, the basis of lossy compression, discards the least significant information to create smaller file sizes. Lossy compression trades image quality for storage savings, so it should not be used when image quality is essential, such as with archival digital files.
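The sketch below shows one very simple way to discard less significant information: zeroing the lowest bits of each 8-bit pixel value. Real lossy formats use far more sophisticated methods, so treat this as an illustration of the principle only. The function name drop_low_bits and the pixel values are hypothetical.

```python
def drop_low_bits(values, bits):
    """Zero the lowest `bits` bits of each 8-bit value (a simple lossy step)."""
    mask = 0xFF & ~((1 << bits) - 1)
    return [v & mask for v in values]


pixels = [200, 201, 202, 203, 57, 58]
reduced = drop_low_bits(pixels, 3)

print(reduced)  # [200, 200, 200, 200, 56, 56]
print(len(set(pixels)), "distinct values before,", len(set(reduced)), "after")
```

Fewer distinct values means more repetition, which compresses better. The cost is that the original values cannot be recovered from the reduced ones.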

Not all images respond to lossy compression similarly. A compressed image may show artifacts, or unintended visual effects. Some images hide them well, while others, such as those with text or line illustrations, show lossy compression artifacts more clearly. Artifacts may accumulate over generations as a file is repeatedly saved, especially when different compression schemes are used. Archivists should therefore keep uncompressed master files from which they generate derivative files. 
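The following sketch suggests why derivatives should come from the master rather than from other derivatives. It models two different lossy schemes as rounding pixel values to multiples of 10 and of 16, a deliberate simplification rather than how any real format works. The names quantize and max_error are invented for the example.

```python
def quantize(values, step):
    """Lossy step: snap each value to the nearest multiple of `step` (capped at 255)."""
    return [min(255, round(v / step) * step) for v in values]


def max_error(a, b):
    """Largest difference between corresponding pixel values."""
    return max(abs(x - y) for x, y in zip(a, b))


master = list(range(0, 256, 5))  # hypothetical master pixel values

gen1 = quantize(master, 10)      # first derivative, scheme A
gen2 = quantize(gen1, 16)        # derivative of a derivative, scheme B
direct = quantize(master, 16)    # scheme B applied to the master instead

print("error when derived from the master:    ", max_error(direct, master))
print("error when derived from a derivative:  ", max_error(gen2, master))
```

In this toy case, the second-generation file drifts further from the master than a file made directly from the master with the same scheme.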

Recommended Formats

File formats ensure that the data is stored so that other systems can access files. Archivists consider long-term usefulness and accessibility and choose non-proprietary standards. Well-chosen formats allow the maximum reuse of the images across projects and through time.

Despite the range of file formats, a few are recommended for image collections. The most common formats for digital imaging projects are TIFF (Tagged Image File Format) and JPEG (Joint Photographic Experts Group, usually stored in the JPEG File Interchange Format).

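The “tagged” in TIFF refers to the format’s structure, in which each piece of information about the image is stored in a numbered field, or tag. This structure allows for custom metadata fields without affecting compatibility, because software can skip tags it does not recognize. TIFF can also store image data uncompressed or with lossless compression. As a result, TIFF is widely recommended for preserving high-quality images.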
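For readers curious about what a tagged structure looks like, here is a rough sketch that builds the header and tag directory of a tiny little-endian TIFF in memory and reads the tags back. It handles only single SHORT values and contains no image data, so it is not a complete TIFF file. Tag 65000 is a hypothetical private tag added for illustration, and the function names are invented.

```python
import struct

# Names of a few baseline tags from the TIFF 6.0 specification.
KNOWN_TAGS = {256: "ImageWidth", 257: "ImageLength", 259: "Compression"}


def build_sample_directory():
    """Build a TIFF header plus one tag directory (no image data)."""
    entries = [
        (256, 3, 1, 640),   # ImageWidth, type 3 = SHORT
        (257, 3, 1, 480),   # ImageLength
        (259, 3, 1, 1),     # Compression: 1 means uncompressed
        (65000, 3, 1, 7),   # hypothetical private tag, not a real standard field
    ]
    # Header: byte order "II", magic number 42, offset of the first directory.
    data = struct.pack("<2sHI", b"II", 42, 8)
    data += struct.pack("<H", len(entries))
    for tag, field_type, count, value in entries:
        # Each entry is 12 bytes: tag, type, count, value (padded to 4 bytes).
        data += struct.pack("<HHIHH", tag, field_type, count, value, 0)
    data += struct.pack("<I", 0)  # offset of the next directory: 0 means none
    return data


def read_tags(data):
    """Return {tag number: value} for a directory of single SHORT values."""
    byte_order, magic, offset = struct.unpack_from("<2sHI", data, 0)
    if byte_order != b"II" or magic != 42:
        raise ValueError("not a little-endian TIFF header")
    (entry_count,) = struct.unpack_from("<H", data, offset)
    tags = {}
    for i in range(entry_count):
        tag, _type, _count, value = struct.unpack_from("<HHIH", data, offset + 2 + 12 * i)
        tags[tag] = value
    return tags


tags = read_tags(build_sample_directory())
for number, value in tags.items():
    print(number, KNOWN_TAGS.get(number, "unrecognized tag"), value)
```

A reader that does not know tag 65000 can still find the width, height, and compression setting, which is how custom fields can coexist with standard ones.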

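JPEG is a lossy compression format that compresses data by dividing the image into small blocks of pixels and storing an approximation of each block rather than the exact value of every pixel. The degree of compression can be controlled, but it always causes some deterioration, most noticeably around sharp edges such as text and line art, and as visible blocks in smooth gradients at high compression. Therefore, JPEG is best used with continuous-tone images, such as photographs, online or when storage space is limited.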

Some cameras capture RAW formats, which contain the original data captured by the sensor. The technician can adjust the white balance, exposure, and sharpness before saving the images in a non-proprietary format. RAW processing offers maximum flexibility with image brightness and white balance, and it removes the limitations of fixed in-camera processing, such as sharpening. RAW files usually have higher bit depths than JPEGs and 8-bit TIFFs. The RAW format is often considered a digital negative because the image has had little camera processing. Because no single RAW standard exists and formats vary by manufacturer, RAW files must be opened with an image editor capable of translating them. 
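A quick calculation shows what higher bit depth means in practice. The sketch assumes 8 bits per channel for a typical JPEG and 12 or 14 bits per channel for RAW files. The RAW figures are common values used here as an assumption, since actual bit depth varies by camera.

```python
def tonal_levels(bits_per_channel):
    """Number of distinct brightness levels one channel can record."""
    return 2 ** bits_per_channel


for label, bits in [("8-bit (typical JPEG)", 8), ("12-bit RAW", 12), ("14-bit RAW", 14)]:
    print(f"{label}: {tonal_levels(bits):,} levels per channel")
```

The extra levels are what give RAW files their latitude when brightness and white balance are adjusted after capture.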

Making Decisions

Converting physical information to electronic form draws on a range of knowledge, and each procedural choice carries its own implications. Consequently, the judgments archivists make during digitization projects involve not only an understanding of the fundamentals of the process but also the intellectual and physical nature of the materials, the current and potential users and uses, and how the resulting records will be described, delivered, and archived. 

If you’re interested in this topic and eager to learn more, please join us for “Digitization Fundamentals”, the second in Margot Note’s latest free webinar series. It’s on Wednesday, February 22, 2023 at 11 a.m. Pacific, 2 p.m. Eastern. (Can’t make it? Register anyway and we’ll send you a link to the recording and slides afterwards.) Register now or call 604-278-6717.
