The TSM Way of Backup

Users switching from traditional backup methods to TSM backup are often confused by the approach TSM takes to backup. In this article, we will highlight the TSM backup feature and how it differs from traditional backup methods.

In data backup, there are essentially three ways to back up a data area. The simplest, and the basis for the other two backup types, is the full backup: a complete copy of the data area on another medium. The second type is the differential backup, in which only the changes since the last full backup are copied to another medium. The third type is the incremental backup, in which only the changes since the last backup, whether full or incremental, are copied to another medium.

If we look at the backup types in the light of the fact that data backups are traditionally made on magnetic tapes, there are some advantages and disadvantages which we will outline below.

In the case of full backup, the serious disadvantage is that the complete data set must be copied each time, which places a heavy load on system resources. Furthermore, each full backup is independent of all previous backups. So if a file has not changed since the last backup, it will still be saved again. This means a waste of storage space. The advantage of a full backup, however, is that only a single backup needs to be copied back in order to restore the data. This means that a full backup is often the fastest solution for a restore.

With a differential backup, the difference from the last full backup grows every day, so more data has to be transferred to the backup medium each day, even data that has not changed since the previous differential backup. A new full backup should therefore be created at regular intervals to serve as the basis for further differential backups. Compared with a daily full backup, this method is much more resource-efficient, both in the data to be copied daily and in the total amount of backup data to be stored. However, it still uses more storage space than necessary, because successive differential backups overlap. For a restore, the last full backup is restored first and the last differential backup is then copied over it, so two steps are necessary.

The incremental backup is optimal with regard to the amount of data to be transferred and stored, since only the changes since the last backup are recorded, regardless of whether that backup was full or incremental. The big problem becomes apparent on restore. Without additional metadata of the kind TSM stores, you have to restore the last full backup first and then every incremental backup created since then, in order. A restore therefore takes one step for the full backup plus one step for each incremental backup created since.
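The restore effort of the three backup types can be made concrete with a short sketch. The function name `restore_chain` and the `(day, kind)` tuples are hypothetical and chosen only for illustration; the sketch models which backups have to be read, not how any real backup tool works.

```python
from typing import List, Tuple

# (day number, kind), where kind is "full", "diff" or "incr" (illustrative model)
Backup = Tuple[int, str]


def restore_chain(history: List[Backup]) -> List[Backup]:
    """Return the backups that must be read, in order, to restore the newest state.

    Raises ValueError if the history contains no full backup.
    """
    # Everything before the most recent full backup is irrelevant for the restore.
    last_full = max(i for i, (_, kind) in enumerate(history) if kind == "full")
    chain = [history[last_full]]
    later = history[last_full + 1:]

    # A differential backup contains all changes since the full backup,
    # so only the newest one is needed.
    diff_positions = [i for i, (_, kind) in enumerate(later) if kind == "diff"]
    if diff_positions:
        newest_diff = diff_positions[-1]
        chain.append(later[newest_diff])
        later = later[newest_diff + 1:]

    # Every remaining incremental backup has to be applied, oldest first.
    chain.extend(b for b in later if b[1] == "incr")
    return chain


differential_week = [(1, "full"), (2, "diff"), (3, "diff"), (4, "diff")]
incremental_week = [(1, "full"), (2, "incr"), (3, "incr"), (4, "incr")]

print(len(restore_chain(differential_week)))  # 2 steps
print(len(restore_chain(incremental_week)))   # 4 steps
```

With a differential schedule the chain always has two entries, while with an incremental schedule it grows by one entry per day until the next full backup.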

Based on the backup types described above, an infinite number of backup strategies can now be developed. One of the most common strategies is the so-called grandfather-father-son principle. To implement this, with a five-day week and a retention period of 6 months, you need the following number of media:

  • 4 Son-Media (Monday-Thursday)
  • 4 Father-Media (Friday)
  • 6 Grandfather-Media (last day of the month)

With this strategy, you can access the data as it was on each of the last four days, the last four Fridays and the last six month-ends. For this, however, you need at least 14 tapes, even if the individual backups are much smaller than the tape capacity.
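The rotation can be sketched in a few lines. This is a minimal illustration under stated assumptions: the names `medium_for` and `is_last_workday_of_month` are hypothetical, the monthly backup is assumed to run on the last working day of the month, and media within each generation are assumed to be reused in a simple round-robin order.

```python
from datetime import date, timedelta


def is_last_workday_of_month(d: date) -> bool:
    """True if the next working day (Monday-Friday) falls in another month."""
    nxt = d + timedelta(days=1)
    while nxt.weekday() >= 5:  # skip Saturday and Sunday
        nxt += timedelta(days=1)
    return nxt.month != d.month


def medium_for(d: date) -> str:
    """Return the label of the medium used for the backup on working day d."""
    if d.weekday() >= 5:
        raise ValueError("no backup on weekends in a five-day week")
    if is_last_workday_of_month(d):
        return f"grandfather-{(d.month - 1) % 6}"   # 6 monthly media
    if d.weekday() == 4:
        return f"father-{d.toordinal() // 7 % 4}"   # 4 Friday media
    return f"son-{d.weekday()}"                     # 4 media, Monday-Thursday


# Collect every medium label used during one year of working days.
labels = set()
day = date(2024, 1, 1)
while day.year == 2024:
    if day.weekday() < 5:
        labels.add(medium_for(day))
    day += timedelta(days=1)

print(medium_for(date(2024, 1, 31)))  # grandfather-0
print(len(labels))                    # 14
```

The count of 14 media is independent of how much data each backup actually contains, which is the utilization problem described above.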

TSM's backup strategy is fundamentally different from traditional backup strategies. The goal was to make backups as efficient as possible in terms of resource consumption while still providing a viable restore procedure. The TSM developers achieved this by enriching the incremental backup procedure with additional metadata, resulting in the so-called progressive incremental backup strategy. The metadata is stored in a DB2 database.

With the progressive incremental strategy, only incremental backups are made; the only exception is the first backup run, which is effectively a full backup. In addition, the metadata records on which tape, and at which position on it, each version of a file is located. The individual backups of a computer are simply written one after the other to a tape until it is full, and then the next tape is used, which also makes optimal use of the capacity of each tape. For a restore, TSM looks up the required versions of the data in the metadata, sorts them so that as few tape changes and positioning operations as possible are needed, and then assembles a synthetic full backup for the point in time requested by the user, which is sent to the client.
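The role of the metadata in a restore can be illustrated with a small model. This is only a sketch and not TSM's actual database schema: the `Version` record, its fields and the function `restore_plan` are hypothetical, and backup runs are simply numbered 1, 2, 3, and so on.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Version:
    path: str
    backed_up: int                     # backup run that stored this version
    tape: str                          # tape holding the copy
    position: int                      # position of the copy on that tape
    deactivated: Optional[int] = None  # run that marked it inactive, if any


def restore_plan(catalog: List[Version], point_in_time: int) -> List[Version]:
    """Select the versions valid at the given run, ordered for sequential tape reads."""
    wanted = [
        v for v in catalog
        if v.backed_up <= point_in_time
        and (v.deactivated is None or v.deactivated > point_in_time)
    ]
    # Sorting by tape and position keeps tape changes and repositioning low.
    return sorted(wanted, key=lambda v: (v.tape, v.position))


catalog = [
    Version("a.txt", 1, "T1", 0, deactivated=3),  # replaced in run 3
    Version("b.txt", 1, "T1", 1),                 # unchanged, still active
    Version("c.txt", 1, "T1", 2, deactivated=2),  # deleted before run 2
    Version("a.txt", 3, "T2", 0),                 # new version of a.txt
]

print([v.path for v in restore_plan(catalog, 2)])  # ['a.txt', 'b.txt']
print([v.path for v in restore_plan(catalog, 3)])  # ['b.txt', 'a.txt']
```

Although only changed files were written in runs 2 and 3, the catalog is enough to assemble the complete state of the data for any run whose versions are still retained.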

Unfortunately, however, even with progressive incremental strategies, it is often not possible in practice to simply keep all versions of a file for an indefinite period of time, since the cost of storage would be disproportionate to the benefit. Therefore, TSM provides the possibility to define retention policies with respect to the retention time and the number of retained versions of a file.

To understand how these retention policies work, the concept of active and inactive data is essential. For TSM, the backup copy of a file version that still exists on the source system is an active version. Active data is never deleted in TSM, which ensures that after a data loss the data can be completely restored to its state at the last backup run. If a file on the source system is deleted or replaced by a new version, TSM marks the previously active backup copy as inactive during the next backup run. If the file was replaced, the new version becomes the active backup copy. From the moment a backup copy is marked as inactive, it is subject to the retention policy: it is retained for at most the retention period defined in the policy, or until more newer versions of the file exist than the policy's maximum number of versions allows.

The LRZ standard retention policies work with a maximum retention period of 180 days and a maximum of 3 inactive versions. By explicitly selecting a so-called management class for individual files and directories, you can also have a maximum of 10 inactive versions retained. In our experience, these guidelines represent a good compromise between the cost of storing the backups on the one hand and the chance that data can still be found in the backup after a data loss on the other.
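The interaction of the two limits can be sketched as follows. This is a simplified model of the rule described above, not TSM's actual expiration logic: the function `split_by_retention` is hypothetical, time is counted in whole days, and treating a version that is exactly 180 days old as still retained is an assumption.

```python
from typing import List, Tuple


def split_by_retention(
    inactive_since: List[int],
    today: int,
    retention_days: int = 180,
    max_inactive: int = 3,
) -> Tuple[List[int], List[int]]:
    """Split the inactive versions of one file into (kept, expired).

    inactive_since holds, per inactive version, the day it was marked inactive.
    """
    kept, expired = [], []
    # Rank the inactive versions from newest to oldest.
    for rank, day in enumerate(sorted(inactive_since, reverse=True)):
        too_many = rank >= max_inactive          # version limit exceeded
        too_old = today - day > retention_days   # retention period exceeded
        (expired if too_many or too_old else kept).append(day)
    return kept, expired


# Five inactive versions; the two oldest exceed the limit of 3 versions
# although they are younger than 180 days.
print(split_by_retention([10, 50, 100, 150, 190], today=200))
# ([190, 150, 100], [50, 10])
```

A frequently changing file therefore loses its older versions well before 180 days have passed, which is why a management class with 10 inactive versions can be useful for such files.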

If we now compare the traditional generation strategy with the LRZ's progressive incremental strategy, the two methods have different advantages and disadvantages.

Pro progressive incremental strategy:

  • lower storage space consumption
  • better storage space utilization
  • lower resource requirements for backup
  • lower costs
  • Each deleted file can be restored to its last version for 180 days.
  • The last 3 or 10 versions of a file are kept.

Contra progressive incremental strategy:

  • Metadata database needed
  • Old versions of a file age out earlier than 180 days when the maximum number of versions is reached.
  • Higher effort for restore

Pro generational strategy:

  • no metadata database necessary
  • less effort for restore
  • If a file exists at the time of the monthly backup, it is guaranteed that it can still be restored up to 180 days later, regardless of how many versions of that file have existed.

Contra generational strategy:

  • higher storage space consumption
  • worse storage space utilization
  • higher resource requirements for backup
  • higher costs
  • Files that were deleted some time ago cannot be restored to their last version because only weekly or monthly backups exist.
  • The last 3 or 10 versions of a file are not kept, only the state of the file at the backup times.
  • Files that existed only between two monthly backups or two weekly backups, for example, cannot be restored.

For data that has to be kept for a longer period of time, TSM offers the so-called archiving function. With it, you can keep data in TSM for 10 years by default. On request, we also offer the possibility to store archive data without an expiration date. Please note, however, that the archive function does not work in progressive incremental mode but corresponds to a full backup. That is, if one and the same version of a file is archived 100 times, it will also be stored 100 times in TSM. Since this is a waste of storage resources and, in the end, of taxpayers' money, great care must be taken when selecting the files to be archived. For example, we observe time and again that users archive their entire file system at regular intervals, probably for reasons of convenience. This is not how the function is meant to be used. It leads to massive technical problems when retrieving the data and is also unfair to those users who carefully consider which data is worth archiving and which is not.
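The difference in storage consumption is simple arithmetic, sketched here with two hypothetical helper functions. The model ignores retention limits and any compression and only counts the copies described above.

```python
def archive_storage_gb(file_size_gb: float, archive_runs: int) -> float:
    """Archiving stores a complete copy on every run, changed or not."""
    return file_size_gb * archive_runs


def backup_storage_gb(file_size_gb: float, distinct_versions: int) -> float:
    """Progressive incremental backup stores each distinct version once."""
    return file_size_gb * distinct_versions


# An unchanged 2 GB file, archived 100 times versus backed up 100 times.
print(archive_storage_gb(2, 100))  # 200
print(backup_storage_gb(2, 1))     # 2
```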

In another article, we will discuss further semantic differences between the TSM backup and archive functions, their different areas of application, and which requirements each of them can fulfill and how.