Creating Sustainable Digital Files: What Archivists Need to Know
Interpolation is the process of estimating missing data. In imaging, it is used to create new pixels when an image is enlarged, or to choose which pixels to remove when an image is reduced, so that the resized image keeps its apparent detail and does not pixelate.
Interpolation is also referred to as resampling; the two specific cases are upsampling (increasing the number of pixels) and downsampling (decreasing the number of pixels).
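As a rough illustration of upsampling, the sketch below enlarges a tiny grayscale image with bilinear interpolation, one common method among several. The function name `resize_bilinear` and the sample values are assumptions made for this example, not part of any particular imaging tool.

```python
def resize_bilinear(pixels, new_w, new_h):
    """Resample a grayscale image (a list of rows) to new_w x new_h pixels."""
    old_h, old_w = len(pixels), len(pixels[0])
    out = []
    for y in range(new_h):
        # Map the output row back to a fractional row in the source image.
        sy = y * (old_h - 1) / (new_h - 1) if new_h > 1 else 0.0
        y0 = int(sy)
        y1 = min(y0 + 1, old_h - 1)
        fy = sy - y0
        row = []
        for x in range(new_w):
            sx = x * (old_w - 1) / (new_w - 1) if new_w > 1 else 0.0
            x0 = int(sx)
            x1 = min(x0 + 1, old_w - 1)
            fx = sx - x0
            # Blend the four nearest source pixels by their distance.
            top = pixels[y0][x0] * (1 - fx) + pixels[y0][x1] * fx
            bottom = pixels[y1][x0] * (1 - fx) + pixels[y1][x1] * fx
            row.append(round(top * (1 - fy) + bottom * fy))
        out.append(row)
    return out

small = [[0, 100],
         [100, 200]]
large = resize_bilinear(small, 3, 3)
for row in large:
    print(row)
# [0, 50, 100]
# [50, 100, 150]
# [100, 150, 200]
```

The new pixels (50 and 150, plus the 100 in the center) did not exist in the source; they are estimates derived from neighboring values. Downsampling in practice usually averages groups of pixels rather than sampling single points, which this sketch does not attempt.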
Compression algorithms reduce the number of bytes needed to represent data and, with it, the amount of memory required to store images. As a result, more data can be sent online in the same amount of time, and more images fit in the same storage space. Compression relies on two main strategies: redundancy reduction and irrelevancy reduction.
Redundancy reduction, the basis of lossless compression, searches for patterns that can be expressed more efficiently. An image viewed after lossless compression is identical to the original. The compressed file may still be too large for network dissemination, but lossless compression supplies efficient storage when all the information in an image must be preserved for future use.
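The lossless round trip can be demonstrated with a short sketch. It uses Python's standard `zlib` module (DEFLATE) as a stand-in for whatever lossless scheme an image format applies; the 64 x 64 test image is made up for the example.

```python
import zlib

# Hypothetical 64 x 64 8-bit grayscale image: flat background, one bright square.
width, height = 64, 64
pixels = bytearray([30] * (width * height))
for y in range(16, 32):
    for x in range(16, 32):
        pixels[y * width + x] = 220
original = bytes(pixels)

compressed = zlib.compress(original, 9)   # 9 = strongest compression level
restored = zlib.decompress(compressed)

print(len(original), "bytes before,", len(compressed), "bytes after")
print("identical after round trip:", restored == original)
```

Because the image consists of long runs of repeated values, it compresses to a small fraction of its original size, and every byte comes back unchanged. A photograph with fine detail would typically shrink far less.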
Irrelevancy reduction, the basis of lossy compression, discards the least significant information to create smaller files. Lossy compression trades image quality for storage space, so it should not be used when image quality is essential, such as with archival digital files.
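A toy model of irrelevancy reduction is sketched below. It assumes the "least significant information" is the low four bits of each 8-bit sample, which is a deliberate simplification: real lossy codecs decide what to discard in far more sophisticated ways. The helper `drop_low_bits` and the synthetic data are hypothetical.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the example is repeatable

# Hypothetical 8-bit samples: 16 broad tonal steps plus fine random variation.
original = bytes((i // 256) * 16 + rng.randrange(16) for i in range(4096))

def drop_low_bits(data, bits):
    """Discard the `bits` least significant bits of every sample."""
    mask = 0xFF & ~((1 << bits) - 1)
    return bytes(v & mask for v in data)

lossy = drop_low_bits(original, 4)

size_lossless = len(zlib.compress(original, 9))
size_lossy = len(zlib.compress(lossy, 9))
max_error = max(abs(a - b) for a, b in zip(original, lossy))

print("compressed size, nothing discarded:", size_lossless)
print("compressed size, low bits discarded:", size_lossy)
print("largest change to any sample:", max_error)
```

The second file is much smaller, but the discarded bits cannot be recovered from it. That irreversibility is the reason lossy compression is unsuitable for archival masters.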
Not all images respond to lossy compression in the same way. Compression may introduce artifacts, or unintended visual effects. Some images, such as those with text or line illustrations, show these artifacts more clearly than others. Artifacts may accumulate over generations of re-saving, especially when different compression schemes are used. Archivists should therefore keep uncompressed master files from which they generate derivative files.
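The effect of mixing schemes across generations can be hinted at with a deliberately simple model, in which each "scheme" is just rounding to multiples of a different step size. This is an assumption made for illustration only; real codecs behave less tidily, and the function names are hypothetical.

```python
def quantize(values, step):
    """Toy lossy step: snap each value to the nearest multiple of `step`."""
    return [step * ((v + step // 2) // step) for v in values]

def max_error(original, processed):
    """Largest difference between a processed value and the master value."""
    return max(abs(a - b) for a, b in zip(original, processed))

master = list(range(0, 201))          # hypothetical tonal values in a master file

once = quantize(master, 10)           # first lossy save
same_again = quantize(once, 10)       # derivative re-saved with the same scheme
other_scheme = quantize(once, 7)      # derivative re-saved with a different scheme

print("one generation:       ", max_error(master, once))          # 5
print("same scheme twice:    ", max_error(master, same_again))    # 5
print("two different schemes:", max_error(master, other_scheme))  # 8
```

In this model the second pass with a different step size adds its own rounding error on top of the first. Generating every derivative directly from the master avoids stacking errors in this way.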
File formats define how data is stored so that other systems can access the files. Archivists consider long-term usefulness and accessibility and choose non-proprietary standards. Such formats provide the maximum re-use of the images across projects and through time.
Despite the range of file formats, a few are recommended for image collections. The most common formats for digital imaging projects are TIFF (Tagged Image File Format) and JPEG (Joint Photographic Experts Group File Interchange Format).
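The “tagged” in TIFF refers to the format’s structure, in which each piece of information about the image is stored as a labeled (tagged) field. This structure allows for custom metadata fields without affecting compatibility. TIFF can also store image data uncompressed or with lossless compression. As a result, TIFF is widely regarded as the best file format for preserving high-quality images.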
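The tagged structure can be sketched by building and reading a TIFF header and a single image file directory (IFD) by hand. The tag numbers 256 to 259 are baseline TIFF tags; tag 65000 is a hypothetical custom field chosen from the private range (32768 and above). The bytes produced here are only a header fragment, not a complete, viewable TIFF file, and the reader handles only the two simplest field types.

```python
import struct

# A few baseline TIFF tag IDs and their names.
TAG_NAMES = {256: "ImageWidth", 257: "ImageLength",
             258: "BitsPerSample", 259: "Compression"}
SHORT, LONG = 3, 4  # TIFF field types: 16-bit and 32-bit unsigned integers

def build_header_and_ifd(entries):
    """Build a little-endian TIFF header followed by one IFD."""
    data = struct.pack("<2sHI", b"II", 42, 8)   # byte order, magic number, IFD offset
    data += struct.pack("<H", len(entries))     # number of tagged fields
    for tag, ftype, value in entries:
        # Each field is 12 bytes: tag, type, count, value. Packing a SHORT as a
        # little-endian 32-bit integer leaves it in the first two bytes.
        data += struct.pack("<HHII", tag, ftype, 1, value)
    return data + struct.pack("<I", 0)          # offset of next IFD (0 = none)

def read_tags(data):
    """Return {tag: value} for single-value SHORT and LONG fields."""
    order, magic, ifd_offset = struct.unpack_from("<2sHI", data, 0)
    if order != b"II" or magic != 42:
        raise ValueError("not a little-endian TIFF header")
    (count,) = struct.unpack_from("<H", data, ifd_offset)
    tags = {}
    for i in range(count):
        tag, ftype, n, raw = struct.unpack_from("<HHI4s", data, ifd_offset + 2 + 12 * i)
        if n == 1 and ftype == SHORT:
            tags[tag] = struct.unpack("<H", raw[:2])[0]
        elif n == 1 and ftype == LONG:
            tags[tag] = struct.unpack("<I", raw)[0]
        # Other types or multi-value fields are stored elsewhere; skipped here.
    return tags

fragment = build_header_and_ifd([
    (256, LONG, 4000),     # ImageWidth
    (257, LONG, 6000),     # ImageLength
    (258, SHORT, 16),      # BitsPerSample
    (259, SHORT, 1),       # Compression: 1 means uncompressed
    (65000, LONG, 2023),   # hypothetical custom field
])
for tag, value in read_tags(fragment).items():
    print(tag, TAG_NAMES.get(tag, "(unrecognized tag)"), value)
```

A reader that does not recognize tag 65000 can still read the fields it knows, which is how custom metadata coexists with compatibility.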
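JPEG is a lossy compression format that processes an image in blocks of pixels (8 × 8 in baseline JPEG) and discards fine detail within each block rather than storing every pixel value exactly. The degree of compression can be controlled, but the process causes deterioration, most noticeably around sharp edges and, at high compression levels, as visible blocks in smooth gradient areas. JPEG is therefore best used for continuous-tone images, such as photographs, that are delivered online or stored where space is limited.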
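The block-based approach can be sketched in one dimension. Baseline JPEG applies a discrete cosine transform (DCT) to each 8 x 8 block and then rounds the resulting coefficients; the sketch below does the same to a single row of eight samples. Using one rounding step for every coefficient is a simplification assumed here (JPEG uses a table of different steps), and the sample rows are invented.

```python
import math

N = 8  # one 8-sample row stands in for an 8 x 8 block

def dct(samples):
    """Orthonormal DCT-II of N samples."""
    out = []
    for k in range(N):
        scale = math.sqrt((1 if k == 0 else 2) / N)
        out.append(scale * sum(s * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                               for n, s in enumerate(samples)))
    return out

def idct(coeffs):
    """Inverse of dct(): rebuild the samples from the coefficients."""
    out = []
    for n in range(N):
        out.append(sum(math.sqrt((1 if k == 0 else 2) / N) * c
                       * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                       for k, c in enumerate(coeffs)))
    return out

def lossy_roundtrip(samples, step):
    """Round each DCT coefficient to a multiple of `step`, then reconstruct."""
    quantized = [step * round(c / step) for c in dct(samples)]
    return [min(255, max(0, round(v))) for v in idct(quantized)]

gradient = [100, 104, 108, 112, 116, 120, 124, 128]  # smooth tone
edge = [20, 20, 20, 20, 230, 230, 230, 230]          # sharp edge, as in text

for name, row in (("gradient", gradient), ("edge", edge)):
    result = lossy_roundtrip(row, 40)
    print(name, result, "max change:", max(abs(a - b) for a, b in zip(row, result)))
```

A larger `step` plays the role of a lower quality setting: more coefficients round to zero, the data becomes easier to compress, and the reconstructed samples drift further from the originals.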
Some cameras capture RAW formats, which contain the original data recorded by the sensor. The technician can adjust the white balance, exposure, and sharpness before saving the images in a non-proprietary format. RAW processing offers maximum flexibility with image brightness and white balance, and it removes the limitations of fixed in-camera processing, such as sharpening. RAW files usually have higher bit depths than 8-bit JPEGs and TIFFs. The RAW format is often considered a digital negative because the image has had little camera processing. Because RAW formats vary by manufacturer and no single standard exists, RAW files must be opened with an image editor capable of translating them.
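The practical meaning of bit depth is simple arithmetic: each additional bit doubles the number of tonal levels a channel can record. The bit depths listed below are assumed as typical values for illustration, since actual RAW bit depth varies by camera.

```python
def tonal_levels(bits_per_channel):
    """Number of distinct values one channel can record at a given bit depth."""
    return 2 ** bits_per_channel

for bits in (8, 12, 14, 16):
    print(f"{bits}-bit: {tonal_levels(bits):,} levels per channel")
# 8-bit: 256 levels per channel
# 12-bit: 4,096 levels per channel
# 14-bit: 16,384 levels per channel
# 16-bit: 65,536 levels per channel
```

The extra levels are what give the technician room to adjust brightness and white balance without visible banding.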
Converting physical information to electronic form draws on a range of knowledge, and each procedural choice carries its own implications. Consequently, the judgments archivists make during digitization projects involve not only an understanding of the fundamentals of the process but also the intellectual and physical nature of the materials, the current and potential users and uses, and how the resulting records will be described, delivered, and archived.