While gathering family history records over the years, you’ve probably been preserving them physically. So why consider preserving them digitally now? This paper discusses the benefits and challenges of using digital preservation to both augment and enhance the preservation of your family history records. It also presents solutions to the challenges, identifies what types of family history records are suitable for digital preservation, and summarizes what is required to get started archiving digital records.
WHY DIGITAL RECORDS?
There are compelling reasons for embracing digital records. To begin with, once a treasured but very fragile historical document has been captured digitally, you’ll no longer worry about damaging or soiling it from excessive handling. Nor will you ever grimace again over a photograph’s fading colors and diminishing contrast. You’ll also find that the 999th digital copy you make is just as bright and sharp as the original. And printing records will be simpler than ever before.
Furthermore, you’ll realize how much easier it is to organize, find, and use records by letting a computer do most of the clerical work for you. And you’ll be delighted to see how easy it is to share your family history records with others. A digital record can be emailed anywhere in the world in just seconds. If the computer file that contains the record is too large to email, you can get it to your intended destination almost as fast by using a free internet service.
Digital record storage has other advantages. For example, it doesn’t require much physical space, so you won’t need to consume your precious real estate storing bulky boxes and costly containers. And if you use the M-DISC recommended in this paper, you won’t have to worry about factors like mold and humidity destroying your historical records.
Accessing digital records is extremely easy, and doesn’t require climbing a rickety step ladder to hunt for dusty boxes in the attic or hobbling downstairs to a musty, dark basement and brushing away cobwebs to search for something you hope is there. In fact, all your family history records can literally be just a few clicks away—at any time of the day.
Perhaps more importantly, you’re probably facing the need to preserve digital family history records anyway, since most photographs and documents are now being created digitally. “Born-digital” is a term coined for records that originate in digital form, such as a personal history written with a computer or a family photograph taken with a digital camera.
DIGITAL RECORD REQUIREMENTS
Creating and preserving digital records requires (i) a computer with appropriate software, (ii) the means to digitize physical records (i.e., convert them from paper or analog to digital), and (iii) the ability to preserve the digitized records for posterity and extended family. If you plan to share your digital records with others, you will also need access to the internet.
Virtually all physical record types can be digitized—genealogies written in family Bibles, copies of vital records (birth, death, marriage), copies of census records, photographs, journals, written and typed documents, oral histories, newspaper clippings, family videos, handwritten or printed music, artwork, books, maps, etc. And all can be preserved digitally.
WHAT IS DIGITAL PRESERVATION?
It is NOT merely backing up your data! Backup only provides near-term protection of your digital records.
Rather, digital preservation is a process that involves storing digital records (i) with descriptive information, (ii) for a very long time, (iii) in multiple locations, and (iv) at the highest resolution you can afford. It also involves (v) periodically migrating the records to new storage media in order to prevent data loss or the inability to read the data, (vi) changing file formats before they become obsolete, and (vii) providing access to your digital collection now and in the future.
Going beyond backup, digital preservation provides long-term protection of your digital records.
While these steps may seem overwhelming at first glance, this paper discusses tools and techniques that can help to make digital preservation a straightforward and even enjoyable activity.
DIGITAL PRESERVATION CHALLENGES
Just as preservation of physical records has challenges, so does preservation of digital records, although the challenges are different.
You cannot feel or see digital information, which is written to a storage device that a computer can interpret. Digital information may be characterized as a string of 1’s and 0’s where each digit is called a “bit.” Digital bits are written in a predefined file format that computer software interprets so a digital record can be rendered on a computer screen or a printer.
Unfortunately, almost all computer storage media decompose over time, causing bit loss, which in turn causes data loss. Depending on the type of record, loss of a single bit could result in the corruption of a word (in a personal history, for example), or it may only cause a tiny flaw in an image that is not discernible to the human eye (in a photograph, for example).
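To make the effect of a single lost bit concrete, here is a minimal Python sketch. The sentence and the helper name flip_bit are made up for illustration; the sketch simply inverts one bit in a short piece of text and shows how the meaning changes.

```python
def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    byte_index, offset = divmod(bit_index, 8)
    damaged = bytearray(data)
    damaged[byte_index] ^= 1 << offset
    return bytes(damaged)


# Hypothetical line from a personal history, stored as ASCII text.
original = "Grandma Ruth was born in 1902.".encode("ascii")

# Locate the digit "0" and invert one bit inside that single character.
position = original.index(b"0")
damaged = flip_bit(original, position * 8 + 3)

print(original.decode("ascii"))  # Grandma Ruth was born in 1902.
print(damaged.decode("ascii"))   # Grandma Ruth was born in 1982.
```

Only one bit out of 240 changed, yet the birth year is now wrong by eighty years and nothing in the text signals that anything happened. The same single-bit change inside a large photograph would usually alter one pixel slightly.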
The random, unpredictable nature of bit loss caused by storage media degradation can complicate management of a digital archive if not addressed.
Another challenge of digital preservation is obsolescence, both of file formats and of storage technologies. Historically, new or enhanced storage technologies and file formats have been introduced periodically to improve functionality and/or lower costs. As these enhancements are embraced by the computer industry, their adoption can cause issues for digital preservation, because vendor support is eventually withdrawn for storage technologies and file formats that become obsolete as newer formats and technologies gain popularity.
As with bit loss caused by storage media degradation, file format and storage technology obsolescence can also complicate management of a digital archive if not addressed.
ARCHIVAL STORAGE MEDIA CHALLENGES AND SOLUTIONS
Industry experience with storage media used for digital preservation purposes has shown that external hard drives and commodity optical discs (writable CDs and DVDs) have an archival life as short as three to five years—meaning that digital bit loss can occur in just three years! And USB flash drives, which many people use to store their digital family history records today, have an even shorter archival life!
According to IT Director Rae Williams, “Flash drives are very handy for carrying files from place to place and computer to computer. However, they are relatively volatile storage, so you should never consider them a primary backup for your files. They fail much, much, much more quickly than CDs or hard drives.” 1
While such storage technologies are useful for short term backup, they do not provide an adequate solution for long term digital preservation because of the high probability that bit loss will occur in a relatively short time.
Happily, a practical and effective solution for storage media bit loss is now available—the M-DISC, a revolutionary optical disc technology developed by Millenniata, Inc.2
In effect, the M-DISC stores your family history records with permanent engraving—like etching stone (as with petroglyphs, which were the inspiration for developing the technology). Unlike other optical discs that use an organic dye for recording, the M-DISC uses a synthetic stone-like material to record your digital bits. A laser etches the bits into this synthetic stone, and they can be read by most quality DVD drives and virtually all Blu-ray drives.
In 2009, the U.S. Naval Air Warfare Center Weapons Division at China Lake, California tested four different brands of gold archive-grade DVDs and one commodity DVD along with the M-DISC. The project was an accelerated aging test that evaluated disc stability and readability after being exposed to elevated levels of light, heat, and humidity. A report of the testing results stated, “None of the Millenniata media suffered any data degradation at all. Every other brand tested showed large increases in data errors after the stress period. Many of the discs were so damaged that they could not be recognized as DVDs by the disc analyzer.” 3
Based on this and other internal testing, Millenniata claims that the M-DISC has an archival life measured in centuries (as long as a millennium!). And M-DISCs have no special storage requirements to achieve this remarkable archival data life.
Clearly, the M-DISC represents a breakthrough in personal archiving storage media!
Millenniata recently announced that it will offer Blu-ray M-DISCs starting in August 2013. The Blu-ray M-DISC increases both the storage capacity and the accessibility of the original M-DISC—which features a standard DVD format with 4.7 gigabytes of storage capacity.
The new Blu-ray M-DISCs will be writable and readable on any Blu-ray drive, and will offer 25 gigabytes of permanent storage capacity. RITEK Corporation, the world’s leading manufacturer of optical storage media, will produce the new Blu-ray M-DISC.
Millenniata is well on its way to making the M-DISC a de facto world standard for long-term digital preservation. For example, a partnership with Hitachi-LG Data Storage, Inc. (the world’s leading company in optical storage) enables LG to manufacture M-DISC compatible optical disc drives and market them through LG’s extensive sales channels.
Likewise, a marketing and distribution partnership with Imation Corporation, the leading worldwide distributor of data storage products, allows Imation to co-brand and distribute both the M-DISC and the Blu-ray M-DISC under Imation’s TDK, Memorex and Imation brands. RITEK has also signed a license agreement with Millenniata to distribute and co-brand both the DVD and Blu-ray M-DISCs through its established distribution and reseller channels.
Furthermore, LG, Dell, and Acer offer newer computers and laptops with M-DISC compatible DVD and Blu-ray drives.
The LG Super-Multi Drive is capable of reading and writing M-DISCs, Blu-ray M-DISCs, DVDs, and Blu-ray discs, offering the widest capabilities of any optical disc drive. More about LG M-DISC READY drives can be found at mdisc.com.4
From a phone call to Best Buy in Sandy, Utah, the author found that an internal M-DISC READY drive sells for $40 and a portable external version sells for $60. Also, a 10-pack of M-DISCs (DVD format) can be purchased at mdisc.com for $29.99. Pricing for Blu-ray M-DISCs will be announced when the discs are available (August 2013).
With the M-DISC, you can now preserve your digital family history records for generations without worrying about random, unpredictable bit loss and data loss that complicate the management of a digital archive. And, happily, the cost of this remarkable, breakthrough technology is affordable.
Although the partnerships identified above will undoubtedly create a significant consumer market for M-DISCs, the prospect of DVD and Blu-ray obsolescence remains a potential issue that is addressed next.
ADDRESSING THE CHALLENGE OF OBSOLETE STORAGE TECHNOLOGY
Reflecting on computer technology history, you might assume that the day will eventually come when DVD and Blu-ray drives are out of production. What will happen to your family history records written to M-DISCs then?
Fortunately, there is a straightforward solution to this potential predicament. The solution is referred to as a media refreshment migration in the digital preservation industry.
Such a migration involves copying your family history records to a newer storage medium that is about to replace DVD and Blu-ray technology. More specifically, you should copy your M-DISCs to the M-DISC replacement technology of the future, whatever it turns out to be.
Assuming the migration work is completed in a timely manner, media refreshment provides a viable solution for providing access to your digital record collection in the future.
However, the migration work may have to be performed by your posterity or your extended family, since you may not outlive the ability to read M-DISCs. Therefore, it behooves you to prepare your posterity and extended family for such migrations.
There are three software tools available for Windows that can help with media refreshment migrations. The first is TeraCopy,5 a high speed data copier. The second, Unstoppable Copier,6 can help recover data from scratched discs. Both can be downloaded over the internet free of charge for personal use. The third, IsoBuster,7 is a data recovery tool available for purchase that can rescue lost files from a bad DVD or Blu-ray disc.
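Whatever tool does the copying, the essential idea of a media refreshment is to copy every file and then confirm that each copy is identical to its source. The Python sketch below illustrates that idea only; it is not how any of the tools above work internally. The function names and the use of SHA-256 checksums for the comparison are assumptions made for this example.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute a checksum of a file, reading it in 1 MB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh_media(source: Path, destination: Path) -> list[Path]:
    """Copy every file under source to destination.

    Returns the source files whose copies do not match (ideally an empty list).
    """
    mismatches = []
    for src_file in sorted(p for p in source.rglob("*") if p.is_file()):
        dst_file = destination / src_file.relative_to(source)
        dst_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, dst_file)  # copy2 also keeps file timestamps
        if sha256_of(src_file) != sha256_of(dst_file):
            mismatches.append(src_file)
    return mismatches
```

A call such as refresh_media(Path("D:/"), Path("E:/family-archive")) would copy an old disc to a new location; both paths are hypothetical. Any file reported as a mismatch should be copied again, or recovered with a tool like those listed above.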
Some words of encouragement and direction are in order here. The goal of the partnerships between Millenniata and Hitachi-LG Data Storage, RITEK, and Imation is to create a new de facto standard for archive-grade storage media. Millenniata is also working with the International Organization for Standardization (ISO) Committee Group responsible for Blu-ray Disc and DVD media standards in order to promote the M-DISC as an international ISO standard. Representatives from Panasonic, Pioneer, Toshiba, Samsung, Hitachi, LG, and Sony are part of this ISO Committee Group.
If the M-DISC becomes a standard, whether an ISO standard or a de facto standard, the computer storage industry will be forced to accept and deal with the standard, which should significantly extend the life of M-DISC technology.
Perhaps more importantly, you have the opportunity to change the course of history regarding archival storage media!
Consider the chronicle of the long-playing (LP) phonograph record. It was introduced in 1931, gained tremendous popularity in the third quarter of the twentieth century, was superseded by the compact disc in 1982, and yet you can still buy needles and turntables today, some eighty years after its introduction. Why? Because a significant market for playing or digitizing LP records exists today. To illustrate, a recent Google search on “LP record turntable” provided about 4,350,000 results!
As more and more people like you invest in M-DISCs, a market is being created for readers and writers that can and will buck the historical computer technology trend. This is why you have the opportunity to change the course of history regarding archival storage. Carpe diem!
FILE FORMAT CHALLENGES AND SOLUTIONS
Another challenge of digital preservation has to do with file formats. Obsolescence is of particular concern.
To illustrate, in the early and mid-1980s, WordStar was the most popular DOS word processing software in the world. Today it is effectively “abandonware” (i.e., no longer developed or maintained).
Anyone who preserved a WordStar document in 1985 would undoubtedly have a difficult time getting his or her personal computer to read it today, even if the storage medium used at the time were still readable!
File size can also be of concern for digital preservation.
When preserving digital family history records, you should always preserve them at the highest resolution you can afford.
The reason is that a digital record’s resolution quality cannot easily be improved once the digital record is created. And since you don’t know how a record will be used in the future (either by you or by your posterity or extended family), low resolution can become problematic. For example, if the record is to be printed, print quality will reflect the resolution of the digital record at the time you archived it.
For photographs, TIFF (Tagged Image File Format) provides very high resolution, but it also creates large files that consume considerable amounts of archival storage capacity. Converting a TIFF file to the JPEG format will reduce the size of the file, but the reduction will come at the expense of resolution. That’s because JPEG processing does lossy compression of the digital bits—which means that many of them are discarded in order to achieve a significant reduction in file size.
JPEG decompression always results in altered file content compared to the original. Such JPEG images may be suitable for viewing on a website, but they may disappoint if you try to print them. (Note: JPEG stands for Joint Photographic Experts Group—originators of the JPEG standard.)
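The difference between lossless and lossy compression can be sketched in a few lines of Python. This is an illustration under stated assumptions, not real JPEG processing: the "image" is a made-up row of brightness values, Python's zlib module stands in for a lossless compressor, and discarding the four least significant bits of each value is a crude stand-in for the detail a lossy compressor throws away.

```python
import zlib

# Made-up 8-bit brightness values forming a smooth gradient (not a real photo).
pixels = bytes((i // 16) % 256 for i in range(4096))

# Lossless: compress, then decompress, and the original bytes come back exactly.
packed = zlib.compress(pixels, level=9)
restored = zlib.decompress(packed)

# Lossy stand-in: discard the 4 least significant bits of every value first.
quantized = bytes(value & 0xF0 for value in pixels)
packed_lossy = zlib.compress(quantized, level=9)

print(restored == pixels)   # True: nothing was lost
print(quantized == pixels)  # False: the discarded detail cannot be recovered
print(len(pixels), len(packed), len(packed_lossy))  # sizes in bytes
```

The lossy version should compress to fewer bytes because it contains less information, but no later processing can bring back the discarded bits. That is why converting an archival image to a lossy format is a one-way step.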
Fortunately, there are file formats that can help overcome both challenges described above.
The first is the PDF/A format (Portable Document Format for Archiving). Recognizing the impact of file format obsolescence on digital preservation, the International Organization for Standardization (ISO) defined in 2005 an “electronic document file format for long term preservation.” Based on the Adobe PDF 1.4 format, PDF/A provides a self-contained, self-describing format that is independent of external sources. For example, it embeds relevant fonts and color information with the content data so that future computer software will be able to render the document exactly as it can be rendered today. In effect, PDF/A uses a software archiving approach to digital preservation.
Combined with the M-DISC, PDF/A provides a breakthrough in personal archiving!
PDF/A can be used for most record types. Audio and video are exceptions. Also, PDF/A does not allow encryption and requires the use of standards-based metadata (i.e., descriptive information). Since fonts used in the document must be embedded with the content data, the resulting file size will be larger than a corresponding (regular) PDF file.
Nevertheless, PDF/A offers the promise of renderability well into the future.
For more information about PDF/A, see the REFERENCES section. Also note that a PDF/A file has the same file extension as a non-archival PDF file (i.e., .pdf)—therefore you cannot detect a PDF/A file without examining the metadata that describe it.
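As a rough illustration of what examining the metadata can look like, the Python sketch below searches a file for the PDF/A identification properties (pdfaid:part and pdfaid:conformance) that a PDF/A file carries in its embedded XMP metadata. The function names are hypothetical, and the sketch is only a heuristic: it reports what a file claims to be, does not verify that the file truly conforms to the standard, and assumes the metadata is stored as plain, uncompressed text.

```python
import re
from pathlib import Path

# The properties may appear as XML elements (<pdfaid:part>1</pdfaid:part>)
# or as XML attributes (pdfaid:part="1"); both forms are matched here.
PDFA_PART = re.compile(rb"pdfaid:part(?:>|=[\"'])\s*(\d)")
PDFA_CONFORMANCE = re.compile(rb"pdfaid:conformance(?:>|=[\"'])\s*([A-Za-z])")


def pdfa_claim(pdf_bytes: bytes):
    """Return a label such as 'PDF/A-1b' if the file claims PDF/A, else None."""
    part = PDFA_PART.search(pdf_bytes)
    if part is None:
        return None
    level = PDFA_CONFORMANCE.search(pdf_bytes)
    suffix = level.group(1).decode("ascii").lower() if level else ""
    return f"PDF/A-{part.group(1).decode('ascii')}{suffix}"


def pdfa_claim_for_file(path: Path):
    """Convenience wrapper that reads the whole file from disk."""
    return pdfa_claim(path.read_bytes())
```

For example, pdfa_claim_for_file(Path("personal-history.pdf")), with a hypothetical file name, would return None for an ordinary PDF. For a dependable answer, use a dedicated PDF/A validation tool.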
A partial list of PDF/A software for Windows is provided here—
- Adobe Acrobat (get Version 8.0 or later)
- soft Xpansion Perfect PDF Master (free for personal use)
- Nuance PDF Converter
- Solid PDF Creator
- Microsoft Office 2007 via its “Save as PDF” plugin (float your cursor over “Save As,” click on “PDF or XPS,” click the “Options…” button, then select “ISO 19005-1 compliant (PDF/A)” under “PDF Options”)
Mac OS PDF/A software includes Microsoft Office 2011, OpenOffice, Nuance PDF Converter, and Adobe Acrobat (get Version 8.0 or later).
Another file format worth noting is JPEG 2000. Like PDF/A, it is also an ISO standard, although it applies strictly to images.
As an improvement to the 1992 JPEG standard, JPEG 2000 provides both lossy and lossless compression. Lossless compression allows the exact original data to be reconstructed from the compressed data. And yet, lossless compression typically achieves 50% to 60% reduction in file size compared with source files—without sacrificing resolution quality in the conversion! For this reason, among others, JPEG 2000 is becoming more popular in the digital preservation industry. File extensions for JPEG 2000 files are .jp2 and .j2k.
Combined with the M-DISC, JPEG 2000 provides a breakthrough in personal archiving of images by simultaneously delivering the benefits of high resolution and reasonable file size!
PNG (Portable Network Graphics) is another ISO standard file format that provides lossless data compression. In some cases, such as images having areas with many pixels of the same color, PNG is even more space efficient than JPEG 2000. However, JPEG 2000 is more error resilient than PNG and is gaining a foothold in the digital preservation industry; hence the author’s focus on JPEG 2000 for general use.
JPEG 2000 software for Windows is identified in the following incomplete list of products—
- Adobe Acrobat and Adobe Photoshop
- FastStone Image Viewer (free for personal use)
- XnView (free for personal use)
- ACDSee Photo Editor
- Corel PaintShop Photo Pro
Mac OS JPEG 2000 software includes Apple Preview, GraphicConverter 7, XnView, ACDSee Pro, and the Adobe products mentioned above.
The author successfully tested most of the Windows software identified above for PDF/A and JPEG 2000 (the Adobe products were not tested). The following test results are worth noting—
- Using a 14 megabyte TIFF image as a source file, all tested JPEG 2000 products provided a lossless compression benefit of 61%.
- Solid PDF Converter Plus creates a PDF/A with JPEG 2000 lossless compression, thus combining the best of both formats. However, the author encountered a software bug with one test and reported it to Solid Documents. A commitment was received to fix the problem, but no time frame was given.
- soft Xpansion Perfect PDF Master (which is free for personal use) does not allow for the addition of descriptive information (metadata) when creating PDF/A files. You must purchase the business version of this product from soft Xpansion in order to get this capability, which is discussed below.
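To put the first test result in practical terms, the short Python calculation below estimates how many images of that size fit on one disc, before and after lossless JPEG 2000 compression. It is a back-of-the-envelope sketch: it assumes every image is the same size as the test image, treats a gigabyte as 1,000 megabytes, and ignores the small amount of space a disc's file system uses.

```python
TIFF_MB = 14      # size of the test image reported above
REDUCTION = 0.61  # lossless JPEG 2000 saving reported above

# Assumption: decimal megabytes, with no allowance for file-system overhead.
DISC_CAPACITY_MB = {
    "DVD M-DISC (4.7 GB)": 4_700,
    "Blu-ray M-DISC (25 GB)": 25_000,
}

jp2_mb = TIFF_MB * (1 - REDUCTION)  # about 5.46 MB per image


def images_per_disc(image_mb: float, capacity_mb: int) -> int:
    """Whole number of equally sized images that fit on one disc."""
    return int(capacity_mb // image_mb)


for name, capacity in DISC_CAPACITY_MB.items():
    as_tiff = images_per_disc(TIFF_MB, capacity)
    as_jp2 = images_per_disc(jp2_mb, capacity)
    print(f"{name}: {as_tiff} TIFF images or {as_jp2} JPEG 2000 images")
```

Under these assumptions a DVD M-DISC holds 335 such TIFF images or 860 lossless JPEG 2000 images, and a Blu-ray M-DISC holds 1,785 or 4,578.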
To preserve digital audio files, the Waveform Audio File Format (WAV) is recommended. Compatible with both Windows and Mac OS operating systems, WAV software is plentiful (search on “free WAV software”) and is expected to be used for many years to come. MP3 and SP2 should be avoided for preservation. One reason is that converting a WAV file to MP3 compresses the audio data as much as 91%, which contradicts the digital preservation principle of preserving at the highest resolution you can afford.
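The 91% figure can be checked with simple arithmetic. The Python sketch below assumes a CD-quality WAV file (44,100 samples per second, 16 bits per sample, two channels) and a common MP3 setting of 128 kilobits per second. Both settings are assumptions chosen for illustration, and other settings give different percentages.

```python
SAMPLE_RATE_HZ = 44_100        # assumption: CD-quality sampling rate
BITS_PER_SAMPLE = 16           # assumption: CD-quality sample size
CHANNELS = 2                   # assumption: stereo
MP3_BITS_PER_SECOND = 128_000  # assumption: a common MP3 bit rate

wav_bits_per_second = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS
reduction = 1 - MP3_BITS_PER_SECOND / wav_bits_per_second


def megabytes_per_hour(bits_per_second: int) -> float:
    """Storage needed for one hour of audio, in decimal megabytes."""
    return bits_per_second * 3600 / 8 / 1_000_000


print(f"{reduction:.1%}")                       # 90.9%
print(megabytes_per_hour(wav_bits_per_second))  # 635.04
print(megabytes_per_hour(MP3_BITS_PER_SECOND))  # 57.6
```

Under these assumptions an hour of oral history takes about 635 megabytes as WAV, so roughly seven hours fit on a 4.7 gigabyte M-DISC without discarding any audio data.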
Likewise, digital video should be preserved using QuickTime or the Audio Video Interleave (AVI) format. QuickTime runs with both Windows and Mac OS operating systems, but creating AVI files with Mac OS is not easily done. Flash, MPEG-2, and MPEG-4 should be avoided when preserving digital video (compression is a factor here also, as with digital audio).
ADDRESSING THE CHALLENGE OF OBSOLETE FILE FORMATS
The file format recommendations provided in this paper are intended to maximize the renderable life of your digital family history records. In the event that any of the file formats you use appear to be losing vendor support, you should promptly migrate the affected records by converting them to replacement file formats and writing the transformed files to M-DISCs.
This type of migration is called a transformation in the digital preservation industry.
To illustrate, if a new JPEG format is introduced in the future that enhances the JPEG 2000 format, vendors will undoubtedly provide software to convert JPEG 2000 files to the new format. This software will be available for a number of years as customers gradually transition to the new format, providing a window of opportunity for you to transform your affected images.
Since the converted images must be rewritten, such a migration might also provide a needed media refreshment—which is an example of reducing overall preservation workload by combining or overlapping tasks.
In order to ensure that you can transform digital records before their file formats become obsolete, it is critical that you stay abreast of digital preservation technology.
Once again, this kind of transformation work, as well as the technology monitoring it entails, may need to be performed by your posterity or extended family, since you may not outlive the file formats you choose to preserve your digital records. Therefore, it behooves you to prepare them for such transformation migrations and ongoing technology watching.
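One simple aid for technology watching is an inventory of the file formats in your archive, so that you or your family can see at a glance which records a given format change would affect. The Python sketch below counts files by extension. The function names are hypothetical, and the list of preferred extensions is an assumption based on the formats recommended in this paper (with .mov assumed for QuickTime); adjust it to your own choices.

```python
from collections import Counter
from pathlib import Path

# Assumption: extensions for the formats recommended in this paper.
PREFERRED = {".pdf", ".jp2", ".j2k", ".wav", ".mov", ".avi"}


def format_inventory(archive_root: Path) -> Counter:
    """Count the files under archive_root by lower-cased file extension."""
    return Counter(
        p.suffix.lower() or "(none)"
        for p in archive_root.rglob("*")
        if p.is_file()
    )


def needs_review(inventory: Counter) -> dict:
    """Extensions outside the preferred set, with the number of files of each."""
    return {ext: n for ext, n in sorted(inventory.items()) if ext not in PREFERRED}
```

Running needs_review(format_inventory(Path("E:/family-archive"))), with a hypothetical folder, lists the formats that may be candidates for a future transformation. Remember that the .pdf extension alone does not show whether a file is PDF/A.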
Recommended PDF/A links—
- www.aiim.org/documents/standards/19005-1_FAQ.PDF (frequently asked questions)
- www.adobe.com/enterprise/pdfs/pdfarchiving.pdf (white paper)
Disclaimer—the recommendations in this paper are the personal opinions of the author. Use them at your own risk. The author is not liable for any consequences resulting from using his recommendations.