Programming Course for Beginners - ZedLX



Recommended Data Storage Solutions

Updated: June 2020

It is easy to overlook the fact that data has value. The value may be personal, like a collection of digitized photographs of a person's birthday celebrations. It may be corporate, like a music publisher's catalog of digitized music.

Some data represents a person's work, like the text files of a writer's book or the image files of a digital graphics designer.

Hardware for safe and practical data storage is the topic of this article.

A computer without any data storage cannot even be started, because the operating system (OS) is itself data. At startup, the OS must first be loaded into main memory, usually from the computer's primary data storage device or devices.

An SSD for the Startup Drive

An SSD (i.e. Solid State Drive) is a fast, relatively reliable and relatively expensive data storage solution.

We recommend using an SSD as the boot (i.e. startup) drive, in order to speed up the computer's startup procedure. Similarly, all of the computer's software should be stored on the startup drive in order to speed up application launch times.

A capacity of 120 GB or more is a good choice for a new SSD. We do not recommend buying an SSD smaller than 120 GB.

For new purchases we recommend the Crucial MX500 series, because of its low price per GB and good performance. Alternatively, we recommend any SSD with a DRAM cache and a recently designed controller:

  • Crucial MX500 (avoid BX500 series)
  • Samsung 850 / 860 / 970
  • Samsung 850 / 860 / 970 Evo, Pro
  • Kingston UV500

Data Backup - Reliable, Simple and Inexpensive, Pick All Three

Data backup is an essential method for significantly reducing the chance of data loss.

In the simplest and least expensive setup, multiple external USB flash drives can be used for data backup. The minimum number of USB flash drives required for safe backup is two.

With two external backup drives it can be arranged that, at any point in time, at least one of the backup drives is disconnected from the computer. Such an arrangement protects against many adverse scenarios, like a lightning strike while a backup is in progress, a power loss during backup, or an accidental deletion during the backup procedure.

Here we repeat this important piece of advice: you should never have all of the backup drives plugged into a computer simultaneously.

Therefore, the minimal data backup hardware consists of something like two 16 GB USB flash drives. The total cost of such a system is only around 12 USD, while the benefits are enormous. It is usually a mistake to leave data backup hardware out of a working computer system.
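The two-drive routine described above can be sketched in Python. This is a minimal illustration rather than a complete backup tool: the function names, the drive labels, and the mount paths in the comments are all hypothetical, and it assumes Python 3.8 or newer. The sketch copies a folder to whichever drive is currently plugged in, then compares SHA-256 checksums to confirm that every copy matches its original.

```python
import hashlib
import shutil
from pathlib import Path


def file_sha256(path):
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_and_verify(source_dir, backup_dir):
    """Copy source_dir into backup_dir, then compare checksums of every file.

    Returns the relative paths of files whose copies are missing or differ.
    """
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    shutil.copytree(source_dir, backup_dir, dirs_exist_ok=True)
    mismatched = []
    for original in source_dir.rglob("*"):
        if original.is_file():
            relative = original.relative_to(source_dir)
            copy = backup_dir / relative
            if not copy.is_file() or file_sha256(original) != file_sha256(copy):
                mismatched.append(str(relative))
    return mismatched


def drive_for_run(run_number, drives):
    """Alternate between backup drives so only one is plugged in per run."""
    return drives[run_number % len(drives)]


# Example use (hypothetical paths; adjust to where the drive is mounted):
#   drive = drive_for_run(run_number=7, drives=["usb_a", "usb_b"])
#   problems = backup_and_verify("/home/user/documents", f"/media/{drive}/documents")
#   An empty list means every file was copied and verified.
```

Alternating by run number is only one way to keep the second drive unplugged; any habit that never has both drives connected at once serves the same purpose. Note that this sketch never deletes files from the backup, so files removed from the source remain on the drive.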

An HDD for Mass Storage

An HDD (i.e. Hard Disk Drive) is a relatively reliable data storage solution for inexpensive storage of larger amounts of data.

The primary disadvantage of HDDs is that they are approximately 10 times slower than SSDs.

HDDs are best used for storing large amounts of data, like movie collections, large music collections, and large collections of high quality photographs. For most users, an SSD with a capacity of 120 GB or 250 GB is plenty; therefore a common user does not need an HDD at all. If a user requires larger capacities, like 500 GB to 12000 GB (same as 0.5 TB to 12 TB), then an HDD is a much cheaper solution than SSDs.

Failure Rates of Storage Devices

There are various failure modes that can affect both SSDs and HDDs. Here is a list of the most common failure modes:

  • Sudden and complete failure, without any specific reason or prior warning. Rate: less than 5% per year for HDDs in the first 5 years of operation; less than 2% per year for quality SSDs in the first 10 years of operation.
  • Complete failure in the first 6 months after purchase due to undetectable manufacturing defects. Less than 5% of drives are commonly affected.
  • Failure to correctly read previously written data (uncorrectable read errors). On HDDs, this error mode is commonly caused by a bad sector. On SSDs, this error mode is at least several times more frequent than on HDDs; the common cause is a malfunctioning or overused flash memory block.

The rate of major failures for HDDs is around 3% per year in the first 5 years; after that it rises to approximately 10% per year until the drive fails.
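To see what these yearly rates mean over a drive's lifetime, the short Python sketch below compounds them into a cumulative failure probability. It assumes that each year's failure chance is independent of the previous years, which is a simplification; the function name is ours, and the 10-year horizon is chosen only for illustration.

```python
def cumulative_failure_probability(yearly_rates):
    """Return the probability that a drive has failed by the end of the given years.

    yearly_rates holds one failure probability per year of operation.
    Assumes each year's failure chance is independent of earlier years.
    """
    survival = 1.0
    for rate in yearly_rates:
        survival *= 1.0 - rate
    return 1.0 - survival


# Rates from the text: about 3% per year for years 1-5, about 10% per year after.
hdd_rates = [0.03] * 5 + [0.10] * 5

after_5_years = cumulative_failure_probability(hdd_rates[:5])
after_10_years = cumulative_failure_probability(hdd_rates)

print(f"HDD failed within 5 years:  {after_5_years:.1%}")
print(f"HDD failed within 10 years: {after_10_years:.1%}")
```

Under these assumptions, roughly 14% of HDDs fail within the first 5 years and roughly 49% within 10 years.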

The rate of major failures for SSDs is around 1% per year until the total amount of written data exceeds the drive's design limit. SSD manufacturers commonly state this design limit as TBW (Total Bytes Written).

For SSDs, the manufacturer's warranty is void once the drive exceeds the specified TBW limit. Past that limit, the probability of errors and the probability of failure increase quickly and considerably.
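A rough way to judge whether a TBW limit matters in practice is to divide it by the amount of data written per day. The Python sketch below does that arithmetic. The function name is ours, and the 100 TB limit and 20 GB per day are hypothetical figures that are not taken from any datasheet; check the specification of the actual drive.

```python
def years_until_tbw(tbw_terabytes, written_gb_per_day):
    """Estimate how many years pass before total writes reach the TBW limit."""
    if written_gb_per_day <= 0:
        raise ValueError("written_gb_per_day must be positive")
    total_gb = tbw_terabytes * 1000  # decimal units, as in the text: 1 TB = 1000 GB
    return total_gb / written_gb_per_day / 365


# Hypothetical example values, for illustration only.
example_tbw = 100          # TB
example_daily_writes = 20  # GB per day

print(f"{years_until_tbw(example_tbw, example_daily_writes):.1f} years")
```

With these example figures the limit is reached after about 13.7 years. A heavier write load shortens that time proportionally.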

RAID 1 to Increase System Reliability and Availability

RAID (Redundant Array of Inexpensive Disks, also expanded as Redundant Array of Independent Disks) can protect stored data from some drive failure modes and error modes. In a RAID 1 array, all data is always written to two separate storage devices; in other words, the data is intentionally stored redundantly (duplicated). If one of the devices fails, the other device can be used to recover all the data, thereby preventing any data loss.

When a drive in a RAID 1 array fails, the computer should notify the user of the failure. The user should then replace the failed drive with a new drive. The computer then initiates a RAID 1 rebuild procedure, which copies all the data to the new drive. This process is transparent to the user, meaning that the computer can be used normally while the RAID 1 array is being rebuilt. When the rebuild is completed, all the data has been duplicated again and is safe in case of a single drive failure.
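The mirror, fail, and rebuild cycle described above can be illustrated with a toy model in Python. This is only a sketch of the idea: the class and method names are hypothetical, each drive is modeled as an in-memory dictionary, and a real RAID 1 implementation works on disk blocks and handles many more cases.

```python
class MirrorArray:
    """Toy RAID 1 model: every block is written to both member drives."""

    def __init__(self):
        # Each member drive is modeled as a dict of block number -> bytes.
        # A failed drive is represented by None.
        self.members = [{}, {}]

    def write(self, block, data):
        """Write the same data to every drive that is still working."""
        for drive in self.members:
            if drive is not None:
                drive[block] = data

    def read(self, block):
        """Read from the first working drive."""
        for drive in self.members:
            if drive is not None:
                return drive[block]
        raise IOError("both drives have failed, the data is lost")

    def fail(self, index):
        """Simulate a complete failure of one drive."""
        self.members[index] = None

    def replace_and_rebuild(self, index):
        """Install a new drive at index and copy everything from the survivor."""
        survivor = self.members[1 - index]
        if survivor is None:
            raise IOError("no surviving drive to rebuild from")
        self.members[index] = dict(survivor)


array = MirrorArray()
array.write(0, b"photos")
array.fail(0)                  # first drive dies
print(array.read(0))           # data is still readable from the second drive
array.replace_and_rebuild(0)   # new drive installed, data copied back
array.fail(1)                  # second drive dies later
print(array.read(0))           # data survives thanks to the rebuild
```

The model also shows the limit of RAID 1: if the second drive fails before the rebuild has run, nothing is left to read from.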

We recommend using RAID 1 for all the data on SSDs and HDDs that are permanently attached to a computer.

There are many solutions that provide RAID 1 capabilities. Most computer motherboards have hardware-assisted RAID 1 built in and accessible through UEFI or BIOS. The user needs to set up RAID 1 manually, either through the UEFI or BIOS interface, or by using software provided by the motherboard manufacturer.

Alternatively, a RAID 1 software solution can be used. On a Linux-based computer we recommend using the btrfs file system, which provides RAID 1 capabilities through software. The user needs to set up RAID 1 on btrfs manually.
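As a hedged illustration of the manual setup, the Python sketch below only builds and prints the command line that creates a new two-device btrfs file system with both data (-d) and metadata (-m) stored in the raid1 profile. The helper name is ours, and /dev/sdX and /dev/sdY are placeholders for the real device names. The script deliberately does not run the command, because mkfs.btrfs destroys any data already on the given devices; consult the mkfs.btrfs manual before running it by hand.

```python
import shlex


def mkfs_btrfs_raid1_command(device_a, device_b):
    """Build (but do not run) an mkfs.btrfs command for a two-device RAID 1 volume."""
    if device_a == device_b:
        raise ValueError("RAID 1 needs two different devices")
    # -d sets the data profile, -m sets the metadata profile.
    return ["mkfs.btrfs", "-d", "raid1", "-m", "raid1", device_a, device_b]


# Placeholder device names; replace them with the actual drives.
command = mkfs_btrfs_raid1_command("/dev/sdX", "/dev/sdY")
print(shlex.join(command))
```

Printing the command first gives a chance to double-check the device names before anything is erased.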

Note: RAID 1 is NOT a backup. RAID 1 protects only against a subset of possible failure modes. For example, RAID 1 cannot protect against accidental deletion, viruses, or lightning strikes, while a regular backup can provide substantial protection in those circumstances. Making regular backups is more important than employing RAID 1.