9: Archive Application

This chapter provides information specific to configuring and operating Backup with the Archive Application. It also includes explanations that compare and contrast the different methods used for protecting data across a network.

The Backup archive server provides file archiving and retrieval services to a range of client machines. It is packaged as an add-on extension to existing Backup servers, and uses the same license mechanism as Backup.

The Backup archive client can be any machine on a network that employs archive services provided by the archive server. Clients may be enabled for backups, for archives, or for both.

Data archiving is the process of taking a snapshot of files or directories as they reside on primary media (usually disk) at a given point in time. The snapshot image is typically stored on removable media, such as tape or optical disc. Once the snapshot is safely stored on removable media, related files can optionally be deleted to conserve space on disk.

Navigating the Windows

The following sections explain the features and use of the various Backup Archive windows.

Clients Window

Before archiving can occur, you must configure Backup to recognize each archive client. Choose the Client Setup command from the Clients pull-down menu to open the Clients window.

The Archive services choices enable or disable archives for the currently selected client in the Clients list. To allow archives for the client, click the Enabled choice for Archive services. If archive services remain Disabled, the client will not be able to perform an archive.

Note - When you enable archive services for one client, you also enable other clients of the same name on that server.

Find the Archive users list at the bottom of the scrolling section of the Clients window. To allow users on the client to perform manual archives, enter their usernames into the Archive users list. To schedule an archive request of an entire workstation, root (or equivalent) must be on the Archive users list for that client, or root@client must be in the Administrator list for the server.
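The rule for scheduling an archive of an entire workstation can be sketched in a few lines of Python. This is only an illustration of the rule as stated above, not part of Backup: the function name and the list arguments are hypothetical, and accounts "equivalent" to root are not modeled.

```python
def can_schedule_full_archive(client, archive_users, server_administrators):
    """Return True if an archive of the whole workstation may be scheduled.

    client: hostname of the archive client (hypothetical example values below).
    archive_users: names on the client's Archive users list.
    server_administrators: entries on the server's Administrator list.
    """
    # Either root is an archive user on the client, or root@client is a
    # server administrator. Root-equivalent accounts are not modeled here.
    return "root" in archive_users or f"root@{client}" in server_administrators


print(can_schedule_full_archive("client1", ["alice"], ["root@client1"]))
```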

Note - For a complete description of the Clients window, see Chapter 3, "Configuring and Monitoring Clients."

Archive Requests Window

To schedule an archive on a one-time basis, choose Archive Requests from the Customize menu.

The Archive Requests window appears, as shown.

The Archive Requests scrolling list displays archives previously requested on this server. You can add new archive requests and delete old ones using the Create and Delete buttons.

The scrolling panel displays the following information:

Name - displays the archive name.

Annotation - a mixed-case comment string, limited to 1024 characters, which you provide for every archive request.

Status - indicates when to begin the archive. Start now means archiving begins when you click the Apply button. Start later means archiving begins as indicated in the Start time field.

Start time - gives the next time to begin, based on a 24-hour clock.

Client - displays the hostname for the archive client. To request an archive for the server, enter its name in the Client field.

Save set - specifies pathnames of directories or files to archive.

Directive - specifies the backup method, usually Unix standard or Unix with compression.

Archive pool - specifies the volume pool to which archives should be sorted. The default volume pool is Archive.

Verify - indicates whether or not to automatically check the integrity of data archived on tape.

Clone - indicates whether or not to automatically clone the archive volumes for extra security.

Archive clone pool - specifies the volume pool for archive clones. The default archive clone pool is Archive Clone.

Grooming - indicates whether or not client files and directories should be removed after archiving is complete.

Caution - If you use directives that include instructions to skip files, do not enable the Grooming option. Grooming occurs after a file has been archived. If a file is skipped, it cannot be groomed and will cause the archive to fail.
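If you prepare archive requests from a checklist or script, the caution above can be expressed as a simple pre-check. The following Python sketch is an illustration only: the function name is hypothetical, and whether a directive skips files is something you must determine yourself from the directive's contents.

```python
def grooming_is_safe(grooming, directive_skips_files):
    """Check an archive request against the grooming caution.

    grooming: "none" or "remove", as in the Grooming choice.
    directive_skips_files: True if the chosen directive tells Backup to
        skip any files (assumed to be known by the caller).
    """
    if grooming not in ("none", "remove"):
        raise ValueError("grooming must be 'none' or 'remove'")
    # A skipped file is never archived, so it cannot be groomed.
    return not (grooming == "remove" and directive_skips_files)
```

For example, grooming_is_safe("remove", True) returns False, which signals that the request should be changed before you click Apply.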

Archive completion - contains an optional command to execute after archiving is complete, for example /usr/ucb/mail.

Archive Request Control Window

To see the results of an archive request, or to start or disable a scheduled archive, choose Archive Request Control from the Server pull-down menu.

The Archive Request Control window appears, as shown.

The most-recently-run archive request is initially highlighted in the Archive list. Usually it is Disabled, since scheduled archives only run once. To see details about when this archive completed and how successfully it ran, click the Details button.

If you want details about, or control over, a different archive request, select the one you want from the Archive list.

To initiate an archive request immediately, click the Start button. This has the same effect as selecting a Status of Start now for that archive request in the Archive Requests window.

To reschedule an archive request, click the Schedule button and enter a new starting time. This has the same effect as selecting Start later and specifying a new Start time in the Archive Requests window.

To disable an archive request that you scheduled here or in the Archive Requests window, click the Disable button. To halt an archive in progress, click the Stop button.

Archive Request Details

To see the progress of a recent archive (the one currently selected), click the Details button inside the Archive Request Control window.

The Archive Request Details window appears, as shown.

This window provides information about the completion of an archive request. Some of the information in this window, such as annotation, archive clone pool, archive pool, client, clone, name, save set, start time, status, and verify, also appears in the Archive Requests window.

The completion time field displays when the archive finished. The duration of the archive is the difference between the completion time and the start time. The log field shows messages generated by the archive. The run status field shows the outcome of the archive request: completed, failed, or partial.
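As a worked example of the duration calculation, the following Python sketch subtracts a start time from a completion time. The dates and times are invented for illustration; they are not output from Backup.

```python
from datetime import datetime


def archive_duration(start_time, completion_time):
    """Return how long an archive ran, as a timedelta."""
    if completion_time < start_time:
        raise ValueError("completion time is earlier than start time")
    return completion_time - start_time


# Hypothetical values: an archive that started at 03:33 and ended at 05:10.
start = datetime(2024, 1, 10, 3, 33)
done = datetime(2024, 1, 10, 5, 10)
print(archive_duration(start, done))  # 1:37:00
```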

Note - For more information about failed archives, see the log file, usually /nsr/logs/daemon.log on the server.

Archive Example

Suppose you must shut down and remove the workstation (and hostname) of someone who has left the company. It would be wise to archive the system data first, in case its filesystems contain essential files you need to access later.

This section gives an example of how to schedule and run an archive request.

Creating an Archive Client

Only registered archive clients can use the archive facility. To create an archive client, follow these steps.

1. Choose Client Setup from the Clients menu. The Clients window appears, as shown on page 236.

2. Click the Create button; the Clients window changes.

3. Enter the hostname of the workstation in the client Name field.

4. Click Enabled after Archive services to allow archives for this client.

5. If you want to permit users on the archive client machine to use archive and retrieve, scroll to the bottom of the Clients window and add their user names to the Archive users field.

(It is unlikely that you would want to allow manual archives and retrieves on a workstation about to be shut down.)

The machine hostname is now a registered archive client. However, an archive will not take place until you request one.

Note - If you want to allow archives on the server, make sure that archives are enabled for the server as a client of itself.

Making an Archive Request

Valid archive users may request archives manually using the nwarchive command. However, a manual archive often takes a long time. To avoid overloading a busy network with an archive request during the day, schedule it late at night when the network has less traffic.

To make an archive request, follow these steps:

1. Choose Archive Requests from the Customize menu. The Archive Requests window appears, as shown on page 237.

2. Click the Create button; the Archive Requests window changes.

3. Enter the Name you want to assign to this archive request, and a brief Annotation to remind you of the purpose for the archive.

4. Click Start later for the Status choice to schedule this archive for that night.

5. Enter the Start time you want, or accept the default starting time of 3:33 a.m.

6. Enter the archive client machine hostname in the Client field, and the pathname(s) you want to archive in the Save set field.

7. Specify a custom Archive pool, or accept the default Archive volume pool.

8. Click Yes for the Verify choice to check that the archived data was saved correctly.

9. If you want to make a duplicate copy of this archive volume, click Yes for the Clone choice.

Accept the default Archive Clone pool.

10. To remove files and directories from disk after archiving them to tape, click remove for the Grooming choice.

11. To be notified when the archive completes, enter a command into the Archive completion field, for example:

------------------------------------------------

/usr/ucb/mail -s archive_request admin@titania

------------------------------------------------

12. Click the Apply button to activate your changes.

You have now requested an archive of the client machine hostname, to begin that night at 3:33 a.m.
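To see when a Start later request with a given Start time would run, you can work out the next occurrence of that time on the 24-hour clock. The Python sketch below illustrates the arithmetic only; it does not query Backup. It assumes that a Start time equal to the current time means the following day, which this manual does not specify.

```python
from datetime import datetime, timedelta


def next_start(now, start_time="3:33"):
    """Return the next datetime at which start_time (24-hour HH:MM) occurs."""
    hour_text, minute_text = start_time.split(":")
    hour, minute = int(hour_text), int(minute_text)
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("start time must be between 0:00 and 23:59")
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    # Assumption: a time that has already passed today means tomorrow.
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate
```

A request entered at 17:00 with the default Start time therefore runs at 3:33 the next morning, while a request entered at 02:00 runs the same day.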

Checking the Archive Request

The next morning, you should check the outcome of the archive. If you set up an Archive completion notice, look for an e-mail message containing a log of the archive request.

To check details of the archive, follow these steps:

1. Choose Archive Request Control from the Server menu. The Archive Request Control window appears, as shown on page 239.

2. Click the Details button. The Archive Request Details window appears, as shown on page 240.

3. If this window shows the archive completed successfully, you can safely reconfigure the ex-employee's machine.

If the archive failed, you can reschedule it (see the following section).

You may use the Archive Request Control window to start, schedule, disable, or stop another archive.

Rescheduling the Archive Request

Suppose that the archive of hostname did not complete last night because, for example, an energy-conscious employee turned off the computer. You decide to reschedule for the next night.

To reschedule the archive, follow these steps:

1. Choose Archive Request Control from the Server menu. The Archive Request Control window appears, as shown on page 239. Make sure that the archive request you want is highlighted.

2. Click the Schedule button. The Archive Request Schedule window appears, as shown.

3. Enter a new starting time in the Schedule Archive field, using the 24-hour clock, and click Ok.

The archive request executes again that night, at the time you specified. If you change your mind and want to discontinue the archive, click the Disable button in the Archive Request Control window.

Clone and Verify

Backup contains two preconfigured volume pools for use with archiving: Archive and Archive Clone. If you want to create a new volume pool for archives, follow these steps:

1. Choose Pools from the Media menu.

2. Click the Create button in the Pools window.

3. Fill out all the fields according to your needs.

Caution - Make sure the Pool type is set to Archive, and the Store index entries field is set to No. These two traits distinguish archive pools from backup pools.

4. Click the Apply button to activate the new volume pool.

Now when you schedule a new archive request, you may use the new archive volume pool you created. If you choose to clone archive data, you should also create a new archive clone pool. Backup will write archive data only to an archive volume and archive clone data only to an archive clone volume.
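If you keep your pool settings in a checklist, the two traits named in the caution above can be tested mechanically. The Python sketch below is an illustration only: the function name is hypothetical, and the field values are passed in as the strings shown in the Pools window.

```python
def check_archive_pool(pool_type, store_index_entries):
    """Return a list of problems that keep a pool from being an archive pool."""
    problems = []
    if pool_type != "Archive":
        problems.append("Pool type must be Archive")
    if store_index_entries != "No":
        problems.append("Store index entries must be No")
    return problems


# An empty list means both traits are set as the caution requires.
print(check_archive_pool("Archive", "No"))  # []
```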

If you have already made an archive and want to make a clone of it, follow these steps:

1. Choose Clone from the Save set menu.

2. Enter criteria for locating save sets in the Save Set Clone window.

Note - Click the More button and enter "Archive" in the Pool field.

3. Click the Query button to see save sets matching your criteria.

4. Select the save set you wish to clone and click the Clone button.

5. Click the Start button in the Save Set Status Clone window to activate the save set clone.

To verify data already archived on an archive volume, you have two alternatives:

Clone the archive data by choosing Clone from the Save set menu. During cloning, the original archive data will be verified as the save set is copied from one volume to another. When you are done, you may re-use the cloned volume.

Determine the save set ID of the archived data, for example by searching the Backup Archive Retrieve window. Then run the following command at the system prompt, substituting the save set ID of the archived data for ssid (this is how Backup verifies archived data):
```
# nsrretrieve -n -S ssid
```

Both alternatives verify the integrity of the data, but do not actually compare archived data with data on disk.
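If you need to check several save sets, you might wrap the nsrretrieve command shown above in a small script. The following Python sketch is one possible approach, not a supported tool. It uses only the -n and -S options given in this chapter. Treating an exit status of zero as success is an assumption, and the save set ID 12345 is a made-up example.

```python
import subprocess


def verify_command(ssid):
    """Build the verification command for one save set ID."""
    ssid = str(ssid).strip()
    if not ssid:
        raise ValueError("a save set ID is required")
    return ["nsrretrieve", "-n", "-S", ssid]


def verify_archive(ssid):
    """Run the command; requires Backup to be installed on this machine.

    Assumption: an exit status of 0 means the save set was read back
    without errors. Check the command's messages to be sure.
    """
    return subprocess.run(verify_command(ssid)).returncode == 0


print(" ".join(verify_command("12345")))  # nsrretrieve -n -S 12345
```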

Chapter 5 and Chapter 6 in this manual provide more information about volume pools and save set cloning.

Archiving Shortcut

To schedule and run an archive request, follow this general procedure:

1. Create and enable an archive client using the Clients window.

2. Using the Archive Requests window, fill in all or most of the fields with your preferences.

Name and Client are mandatory, as is the Start time if you select Start later. Save set defaults to empty, Directive and Archive pool to the default, Verify and Clone to No, and Grooming to none.

3. Check the archive status in the Archive Request Details window.

4. To reschedule an archive, bring up the Archive Request Schedule window.

a. To discontinue a scheduled archive request, click the Disable button.

b. To start and stop an archive request (for example, to test it), click the Start and Stop buttons.
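The mandatory fields and defaults listed in step 2 can be summarized as a small data model. The Python sketch below is for illustration and does not reflect how Backup stores archive requests. The initial Status value and the placeholder string for the default Directive are assumptions; the default Start time of 3:33 and the Archive pool name come from this chapter.

```python
from dataclasses import dataclass, field


@dataclass
class ArchiveRequest:
    name: str = ""                 # mandatory
    client: str = ""               # mandatory
    status: str = "Start later"    # assumption: initial choice is not documented
    start_time: str = "3:33"       # default starting time, 24-hour clock
    save_set: list = field(default_factory=list)  # defaults to empty
    directive: str = "default"     # placeholder for the default directive
    archive_pool: str = "Archive"  # default archive volume pool
    verify: bool = False           # defaults to No
    clone: bool = False            # defaults to No
    grooming: str = "none"         # defaults to none

    def missing_fields(self):
        """Return the names of mandatory fields that are still blank."""
        missing = []
        if not self.name:
            missing.append("Name")
        if not self.client:
            missing.append("Client")
        if self.status == "Start later" and not self.start_time:
            missing.append("Start time")
        return missing
```

A newly created ArchiveRequest() reports Name and Client as missing; clearing start_time on a Start later request adds Start time to that list.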

Understanding the Archive Feature

Archive save sets are similar to backup save sets. The principal difference is that archive save sets have no expiration date. Also, archives are always full - there are no differential or incremental levels.

Note - Archives are not recorded in the online file index, so they are not affected by the browse policy. This feature helps conserve disk space.

Retrieve is similar to recover, except that it works with archive save sets instead of backup save sets. Since archived files are not recorded in the file index, the user interface for retrieve is based on save sets, rather than on a directory hierarchy.

Users on the Backup administrator list have permission to configure archive services. These users, and users registered on the Archive users list in the Clients window, have permission to use the archiving and retrieval facilities. Registered users may archive any file for which they have read permission.

Anyone can browse archive save sets - that is, look at information in the media database. However, you may only retrieve files that you own, unless you are superuser, in which case you may retrieve any file.
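The ownership rule for retrieval can be sketched as a filter over the files in a save set. This Python example models only the rule in the preceding paragraph (owner or superuser); it does not model the Administrator or Archive users lists, and the file paths and user names are hypothetical.

```python
def retrievable(user, files, is_superuser=False):
    """Return the paths that user may retrieve.

    files: iterable of (path, owner) pairs from an archive save set.
    """
    return [path for path, owner in files if is_superuser or owner == user]


files = [("/proj/report.txt", "alice"), ("/proj/data.db", "bob")]
print(retrievable("alice", files))  # ['/proj/report.txt']
```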

Note - If you want to overwrite an archive tape, first make sure nobody will ever need the data again. Then simply relabel the volume, as you would a backup volume.

Archive Functions

The following sections describe the three categories of archive functions.

Data Archiving

Data archiving can be performed by end users or by the system administrator. Registered users can perform manual archives that start right away. System administrators can perform manual archives or schedule an archive to take place at any time during the next 24 hours. For example, the best time to perform a large archive might be in the middle of the night.

Both users and administrators can request an extra copy of their archive save set, called a clone. Backup Archive employs the volume pools feature to separate backup volumes from archive volumes and archive volumes from archive clone volumes.

After archiving is complete, users and administrators are given the option of deleting archived files and directories or leaving them in place. This option is called grooming. Grooming helps conserve disk space after a project is finished.

This chapter describes the Archive Application using the Backup GUI. For information on archiving using the command line interface, see the nsrarchive man page.

Data Verification

Since archived files are often deleted from the system, Backup provides an extra measure of security to make sure archived data is correct. Backup verifies data in two ways:

media verification - Backup checks the archive volume to ensure it is writable and contains no bad spots.

data verification - Backup reads data from the archive volume as if doing a retrieve, but does not actually write any archived data back to disk.

If a volume is suspect, or if there are problems with the data on the tape, Backup issues a warning and suspends grooming.

If you decide to groom files, we recommend you also select Verify or Clone to avoid deleting improperly archived files.
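The relationship between verification and grooming can be pictured as a simple decision: grooming goes ahead only if neither check finds a problem. The Python sketch below illustrates that logic; the function and the warning texts are hypothetical and are not Backup's actual messages.

```python
def grooming_decision(groom_requested, media_ok, data_ok):
    """Return (groom, warnings) for an archive that has been verified.

    media_ok: result of media verification (volume writable, no bad spots).
    data_ok: result of data verification (archived data could be read back).
    """
    warnings = []
    if not media_ok:
        warnings.append("archive volume is suspect")
    if not data_ok:
        warnings.append("problems with the data on the tape")
    # Any warning suspends grooming, so the original files stay on disk.
    groom = groom_requested and not warnings
    return groom, warnings
```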

Data Retrieval

When you use Backup Retrieve, the Backup Retrieve window displays archived save sets for the selected server, listed by client name. You can only retrieve a save set if you have administrator or archive user privileges for that server, are the owner of files in the save set, or are root.

It is possible to search for specific archives and to alter the sort order of archive save sets in the viewing list. See Chapter 5, "Archiving and Retrieving Files" in the User's Guide for more details on retrieving archived files.

When the user picks an archive save set to retrieve, and the administrator ensures that the relevant archive volume (or a clone of that volume) is mounted on the Backup server, retrieval can begin.

Retrieved save sets can be relocated, renamed, or allowed to overwrite existing files of the same name, as with the Backup recover feature.

Methods for Protecting Data

Data backup is the process of storing copies of files and directories from local disk onto removable media, usually tapes. These copies can be recovered in case the original files are lost or damaged. The system administrator usually schedules backups on a daily basis. Any new files, or files that changed since the last backup, are copied to tape so they can be restored on disk if necessary.

Archiving is normally performed on data associated with specific projects, rather than on an entire system. Unlike backups, which the administrator schedules, archives are usually performed by end users on an as-needed basis, so a network-wide archiving policy is not needed. Archives, unlike backups, are not associated with a level (full, differential, or incremental).

When users archive project data, they can choose to automatically delete the files from the system disk to conserve space. In this case, archived files need to be placed on long-lasting archive media.

Hierarchical storage management (HSM) is a data management strategy where data is automatically migrated from one storage medium to another, based on a set of rules. The rule most often employed is access rate - the longer a file is inactive, the more likely it is to migrate.

Storage hierarchy is usually governed by the cost of storage for each medium. The benefit of HSM is that it provides users with a seemingly infinite storage capacity, at the lowest possible cost.
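To make the access-rate rule concrete, the following Python sketch picks migration candidates by how long each file has been inactive. It is a generic illustration of the idea and does not describe any particular HSM product. The fixed idle threshold, the file paths, and the dates are assumptions chosen for the example.

```python
from datetime import datetime, timedelta


def migration_candidates(last_access, now, idle_threshold):
    """Return paths idle for at least idle_threshold, longest idle first.

    last_access: dict mapping a file path to its last access time.
    """
    idle = [
        (now - accessed, path)
        for path, accessed in last_access.items()
        if now - accessed >= idle_threshold
    ]
    idle.sort(reverse=True)  # the longest-inactive files migrate first
    return [path for _, path in idle]


now = datetime(2024, 1, 31)
files = {
    "/proj/a": datetime(2024, 1, 30),
    "/proj/b": datetime(2023, 6, 1),
    "/proj/c": datetime(2023, 12, 1),
}
print(migration_candidates(files, now, timedelta(days=30)))
```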

The principal goals of backup, archiving, and HSM are as follows:

The goal of backup is to protect data against accidental loss or damage. Backups should be reliable and efficient.

The goal of archiving is to conserve online storage space. Storage media must be durable, safe, and reliable.

The goal of HSM is to conserve network storage resources. Migration and recall must be automatic and reliable.