This document describes the process necessary to merge two or more Netbackup 3.4 master
servers. The first section is an overview of the Netbackup (NBU) database structure.
The second section is an explanation of how to merge NBU master servers.
This process requires a small amount of Perl knowledge, and a strong background
in Veritas Netbackup. It is not an easy process, and should not be attempted
without a thorough knowledge of your environment, and common Netbackup processes
and procedures. In short, you’re on your own. Don’t do this if you don’t feel
comfortable with the process. This is a relatively rough explanation of an involved
process.
Veritas Netbackup is a multi-tier backup solution capable of backing up large
enterprises to centralized backup hardware. A master server schedules and maintains
backup information for a given set of systems. The master server keeps track
of which backup is on which volume (tape), and what volume pool the volume belongs
to.
As a backup environment grows, multiple NBU master servers may exist. Multiple
master servers are difficult to maintain. Centralized management, reporting,
and maintenance are the benefits of working in a centralized Netbackup environment.
Once a master server has been established, it is possible to merge its databases
with another master server, giving control over its set of server backups to
the new master server.
1. Netbackup Internal Databases
a. Image Database
· Manages images on tapes
· Flat text files in /usr/openv/netbackup/db/images directory tree
· Each client stores its image database in a directory /usr/openv/netbackup/db/images/[name
of client]
A typical image database looks like:
# ls -l /usr/openv/netbackup/db/images/wood01p-e03
total 76
-rw------- 1 root sys 0 Oct 30 2000 .lck
drw-r-xr-x 2 root sys 1024 May 22 12:45 1012000000
drw-r-xr-x 2 root sys 1024 May 22 12:47 1014000000
drw-r-xr-x 2 root sys 1024 May 22 12:47 1017000000
drw-r-xr-x 2 root sys 2048 Jun 9 07:41 1020000000
drw-r-xr-x 2 root sys 2048 Jul 3 09:36 1022000000
drw-r-xr-x 2 root sys 2048 Aug 2 06:24 1025000000
drw-r-xr-x 2 root sys 2048 Jul 18 04:18 1026000000
drw-r-xr-x 2 root sys 2048 Jul 29 03:58 1027000000
drw-r-xr-x 2 root sys 1024 Aug 2 04:02 1028000000
drw-r-xr-x 2 root sys 22528 Aug 2 06:24 INDEX
-rw-rw-rw- 1 root sys 2 Oct 30 2000 INDEXLEVEL
-rw------- 1 root sys 0 Aug 2 06:24 STREAMS
-rw------- 1 root sys 17 Aug 2 06:24 STREAMS.lck
Each directory in the client image database has a function. The numbered directories:
drw-r-xr-x 2 root sys 1024 May 22 12:45 1012000000
hold the actual image data. The number "1012000000" is a Unix timestamp
(the number of seconds since 00:00:00 UTC on January 1, 1970). Inside each of these
numbered directories is the actual image data:
# ls -l /usr/openv/netbackup/db/images/wood01p-e03/1012000000
-rw-r--r-- 1 root sys 901 Jan 29 2002 SharedOracle-wood01p-monthly_1012317586_FULL
-rw------- 1 root sys 45366094 Jan 29 2002 SharedOracle-wood01p-monthly_1012317586_FULL.f
-rw-r--r-- 1 root sys 906 Feb 1 2002 SharedOracle-wood01p-monthly_1012580494_FULL
-rw------- 1 root sys 45391014 Feb 1 2002 SharedOracle-wood01p-monthly_1012580494_FULL.f
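To check what a numbered directory or an image timestamp corresponds to, you can convert it to a calendar date. The following is a minimal Perl sketch, not part of Netbackup; the helper name epoch_to_utc is made up for illustration. It uses only core Perl (gmtime and POSIX strftime) and the two numbers from the listings above:

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Convert a Unix timestamp (seconds since 1970-01-01 00:00:00 UTC)
# into a readable UTC date string.
sub epoch_to_utc {
    my ($epoch) = @_;
    return strftime('%Y-%m-%d %H:%M:%S', gmtime($epoch));
}

# The directory name and the first image timestamp from the listings above.
for my $stamp (1012000000, 1012317586) {
    printf "%d => %s UTC\n", $stamp, epoch_to_utc($stamp);
}
```

Run as-is, this prints 2002-01-25 23:06:40 UTC for the directory name and 2002-01-29 15:19:46 UTC for the image, which is consistent with the Jan 29 2002 file date in the listing.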
The files in this directory contain flat text data about what is included in
the image. Each image is described in two files. The first, [classname]_[unixtimestamp]_[type],
contains image metadata:
# cat /usr/openv/netbackup/db/images/wood01p-e03/1012000000/SharedOracle-wood01p-monthly_1012317586_FULL
KBYTES 38651107
NUM_FRAGMENTS 2
COPIES 1
VERSION 3
CLIENT_TYPE 0
RETENTION_LEVEL 8
SCHEDULE_TYPE 0
COMPRESSION 0
ENCRYPTION 0
FILES_FILE_COMPRESSED 0
MPX 1
TIR_INFO 0
TIR_EXPIRATION 0
PRIMARY_COPY 1
IMAGE_TYPE 0
ELAPSED 15418
EXPIRATION 1043853586
NUM_FILES 242982
EXTENDED_SECURITY_INFO 0
REQUEST_PID 0
IND_FILE_RESTORE_FROM_RAW 0
IMAGE_DUMP_LEVEL 0
FILE_SYSTEM_ONLY 0
PREV_BLOCK_INCR_TIME 0
BLOCK_INCR_FULL_TIME 0
STREAM_NUMBER 0
CATARC 0
BACKUP_STATUS 0
# FRAG: c# f# K rem mt den fn id/path host bs off md dwo f_flags f_unused1 exp mpx u3 u2 u1
FRAGMENT 1 1 23606730 0 2 6 1 X01120 wood01p-e03 64512 2 1012317586 1 0 NULL 1043853586 1 0 0 0
FRAGMENT 1 2 15044377 0 2 6 1 X00739 wood01p-e03 64512 2 1012317586 0 0 NULL 0 0 0 0 0
BACKUP_ID wood01p-e03_1012317586
CREATOR root
SCHED_LABEL monthly
FILES_FILE SharedOracle-wood01p-monthly_1012317586_FULL.f
HISTO_INFO -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
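Because the metadata file is plain "KEY VALUE" text, it is easy to read from a script. The following Perl sketch is an illustration only, not a Netbackup tool: the function names parse_image_name and parse_image_header are made up, and the reading of the "id/path" column as a media ID is an assumption based on the "# FRAG:" legend line. The sample lines are copied from the file shown above.

```perl
use strict;
use warnings;

# Split an image file name into class, Unix timestamp, and type.
# Matching from the right lets the class name contain underscores.
sub parse_image_name {
    my ($name) = @_;
    my ($class, $stamp, $type) = $name =~ /^(.+)_(\d+)_([^_]+)$/
        or return;
    return { class => $class, timestamp => $stamp, type => $type };
}

# Read "KEY VALUE" lines into a hash; FRAGMENT lines are collected
# separately because the key repeats once per fragment.
sub parse_image_header {
    my (@lines) = @_;
    my (%meta, @fragments);
    for my $line (@lines) {
        chomp $line;
        next if $line =~ /^\s*(?:#|$)/;    # skip comments and blank lines
        my ($key, $value) = split ' ', $line, 2;
        if ($key eq 'FRAGMENT') {
            push @fragments, [ split ' ', $value ];
        } else {
            $meta{$key} = $value;
        }
    }
    return (\%meta, \@fragments);
}

my @sample = (
    'KBYTES 38651107',
    'NUM_FRAGMENTS 2',
    'EXPIRATION 1043853586',
    '# FRAG: c# f# K rem mt den fn id/path host bs off md dwo f_flags f_unused1 exp mpx u3 u2 u1',
    'FRAGMENT 1 1 23606730 0 2 6 1 X01120 wood01p-e03 64512 2 1012317586 1 0 NULL 1043853586 1 0 0 0',
    'FRAGMENT 1 2 15044377 0 2 6 1 X00739 wood01p-e03 64512 2 1012317586 0 0 NULL 0 0 0 0 0',
    'BACKUP_ID wood01p-e03_1012317586',
);
my ($meta, $frags) = parse_image_header(@sample);
my $name = parse_image_name('SharedOracle-wood01p-monthly_1012317586_FULL');

print "class $name->{class}, type $name->{type}\n";
print "backup id $meta->{BACKUP_ID}, $meta->{KBYTES} KB\n";
# Index 7 is the "id/path" column of the legend (assumed to be the media ID).
print "media: ", join(', ', map { $_->[7] } @$frags), "\n";
```

A listing like this is a quick way to see which tapes an image depends on before you start moving volume records around.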
The second file, [classname]_[unixtimestamp]_[type].f, contains a list of the files
contained in the image:
# cat /usr/openv/netbackup/db/images/wood01p-e03/1012000000/SharedOracle-wood01p-monthly_1012317586_FULL.f
0 0 1 50 0 0 0 0 36175872 / 16877 root root 0 1012316747 1012316006 1012316006
0 0 6 48 1 0 0 0 36175872 /chris/ 16877 root root 0 1012316631 975705477 975969924
1 0 10 51 2 1 0 0 36251872 /chris/app/ 16877 chrisndm chris 0 1012301999 978530459 978530459
2 0 21 48 3 1 0 0 36251872 /chris/app/lost+found/ 16877 root root 0 1012287065 976225664 976225664
3 0 27 49 4 1 0 0 36251872 /chris/app/lost+found/.fsadm 33152 root other 0 976225664 976225664 1012287065
4 0 15 53 5 1 0 0 36251872 /chris/app/msgs/ 16888 chrisndm chris 0 1012287156 1012301199 1012301199
{truncated for brevity}
b. Volume Database (voldb)
· Contains volume (tape) information, what pools tapes belong to, when
the tapes were assigned, mount information, robot locations, etc.
· Located at /usr/openv/volmgr/database/voldb
· Binary format, can’t be directly edited.
To see what’s inside the database, use vmquery:
# vmquery -rn 0
media ID: Y00239
media type: 1/2" cartridge tape (6)
barcode: Y00239
description: Added by Media Manager
volume pool: NetBackup (1)
robot type: TLD - Tape Library DLT (8)
robot number: 1
robot slot: 586
robot host: grain05p
volume group: 00_001_TLD
created: Tue Mar 26 16:06:02 2002
assigned: Fri May 03 09:47:51 2002
last mounted: Fri Aug 09 05:13:35 2002
first mount: Fri May 03 13:57:53 2002
expiration date: ---
number of mounts: 226
max mounts allowed: ---
status: 0x1
Each tape has a record in the voldb that is similar to the above. All volume
pool information is defined in /usr/openv/volmgr/database/poolDB:
# cat /usr/openv/volmgr/database/poolDB
0 21 None ANYHOST -1 -2 the None pool (for anyone)
1 21 NetBackup ANYHOST 0 -2 the NetBackup pool
2 21 old3 ANYHOST -1 -2 These are controlled by gts3
3 21 grain01d-nightly ANYHOST -1 -2 ----
4 21 Scratch ANYHOST -1 -2 ----
5 21 grain01d-monthly ANYHOST -1 -2 ----
6 21 grain02p-nightly ANYHOST -1 -2 ----
7 21 grain02p-monthly ANYHOST -1 -2 ----
8 21 grain06p-monthly ANYHOST -1 -2 ----
9 21 grain06p-nightly ANYHOST -1 -2 ----
10 21 noodle-nightly ANYHOST -1 -2 ----
11 21 grain06p-genesys ANYHOST -1 -2 ----
12 21 sunray ANYHOST -1 -2 ----
13 21 chrisp-grain06p ANYHOST -1 -2 ----
14 21 chrisp-monthly ANYHOST -1 -2 ----
15 21 grain02p-15th ANYHOST -1 -2 ----
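Since poolDB is plain text, a few lines of Perl are enough to list pool numbers against pool names. This is a rough sketch under assumptions: the function name parse_pooldb is made up, and the field layout (pool number, a second numeric field, name, host, two more numeric fields, then a free-text description) is inferred from the listing above rather than from any Veritas documentation. Only the number and the name are relied on.

```perl
use strict;
use warnings;

# Parse poolDB-style lines into a list of pool records.
# The seventh field takes the rest of the line, so descriptions
# containing spaces stay in one piece.
sub parse_pooldb {
    my (@lines) = @_;
    my @pools;
    for my $line (@lines) {
        chomp $line;
        next unless $line =~ /^\s*\d+\s/;    # skip anything that is not a pool line
        my ($num, $f2, $name, $host, $f5, $f6, $desc) = split ' ', $line, 7;
        push @pools, {
            number      => $num,
            name        => $name,
            host        => $host,
            description => defined $desc ? $desc : '',
        };
    }
    return @pools;
}

my @pools = parse_pooldb(
    '0 21 None ANYHOST -1 -2 the None pool (for anyone)',
    '1 21 NetBackup ANYHOST 0 -2 the NetBackup pool',
    '4 21 Scratch ANYHOST -1 -2 ----',
);
printf "%2d  %s\n", $_->{number}, $_->{name} for @pools;
```

Running the same function over the poolDB of each master server gives you two number-to-name tables to compare side by side.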
2. Merging Master Servers
NOTE: Merging master servers is not a simple process. It is unsupported by
Veritas, and should be done only after a complete backup of the affected master
servers is completed.
a. Overview
The Netbackup master server databases are mostly flat text files, so they’re
easy to manipulate. To merge master servers, you need to copy the image db information
to the merged server. Because the volDB is binary, it is more difficult to merge,
but not impossible. Using vmquery and vmchange, the volDB can be migrated.
b. Merging the NBU Image Database
As discussed, information on NBU images is kept in a series of directories
on the master server. They are plain text, and can be copied directly from one
NBU master server to another:
# cd /usr/openv/netbackup/db
# tar cvf /tmp/[masterserver name]_images.tar images
FTP the tar bundle to the new server
On the new server:
# mv /[wherever]/[masterserver name]_images.tar /usr/openv/netbackup/db
# tar xvf [masterserver name]_images.tar
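Before extracting over a live images directory, it may be worth knowing whether any client has a directory on both servers, since tar will write into an existing client directory. The following Perl sketch is one way to check, assuming you first extract the old server's tar bundle into a scratch directory; the function names and the /tmp/old_images path in the usage comment are hypothetical.

```perl
use strict;
use warnings;

# Return the names of the client directories directly under an images tree.
sub client_dirs {
    my ($images_dir) = @_;
    opendir(my $dh, $images_dir) or die "Cannot open $images_dir: $!";
    my @clients = grep { !/^\.\.?$/ && -d "$images_dir/$_" } readdir $dh;
    closedir $dh;
    return sort @clients;
}

# Return the client names that appear in both trees.
sub common_clients {
    my ($dir_a, $dir_b) = @_;
    my %in_a = map { $_ => 1 } client_dirs($dir_a);
    return grep { $in_a{$_} } client_dirs($dir_b);
}

# Usage (paths are examples only):
#   perl check_clients.pl /tmp/old_images /usr/openv/netbackup/db/images
if (@ARGV == 2) {
    my @both = common_clients(@ARGV);
    print @both ? "Clients on both servers: @both\n"
                : "No client directories in common.\n";
}
```

An overlap is not necessarily a problem, but it is something you want to know about before the two trees are mixed together.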
You can then either re-create the class and storage unit information by hand
on the new server, or copy directly from the old server to the new server:
# cd /usr/openv/netbackup/db
# tar cvf /tmp/[masterserver name]_classes.tar class
FTP the tar bundle to the new server
On the new server:
# mv /[wherever]/[masterserver name]_classes.tar /usr/openv/netbackup/db
# tar xvf [masterserver name]_classes.tar
c. Relocating Robotic Control, Volume Database Host, and Storage Units
Depending on your pre-merge environment, you’ll most likely need to redefine
robotic control. The new master server will (again, most likely, depending on
your environment) have some sort of robotic control. You’ll also need to relocate
your Volume Database Host.
This section is difficult to write in a general sense. Instead, we’ll discuss
the general concepts. You’ll need to take your environment into consideration
when completing these steps.
1. Create records (using tpconfig) for any new robotic controllers your new
merged master server will either control or reference. This may change the robot
number on any migrated robots. For instance, in our environment, we recently
merged two master servers. The first server (A) had two robots attached to it,
the second server (B) had one robot attached to it. When we moved B’s responsibilities
to A, we had to renumber B’s robot from robot 1 to robot 2. We then had to revise
the robotic definitions on each server using this robot. It can be an involved
process, but isn’t out of the scope of general Netbackup knowledge.
2. Define the new master server as the volume database host for the newly added
robot(s).
3. Update the volume inventory
4. Move the storage unit information from your old master server to your new
master server. The file /usr/openv/netbackup/db/config/storage_units
contains the storage unit definitions for the master server. The following is
an example line from that file:
STUNIT ralph-TLD1 2 ralph-e0.lasalle.na.abnamro.com 8 0 6 2 0 NULL 0 1 0 NULL
The sixth field, in this case "0", defines the robot number that this
storage unit will use for backups. If you needed to renumber your robot scheme,
you’ll have to change this field before you cut and paste your storage unit
definitions to the new server.
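If you have more than a handful of storage units, the sixth-field change can be scripted instead of edited by hand. The following is a minimal Perl sketch, assuming the fields are separated by whitespace and that the file accepts single spaces between fields when rewritten; the function name renumber_stunit and the values in %robot_map are examples, not anything Netbackup defines.

```perl
use strict;
use warnings;

# Old robot number => new robot number (example values only).
my %robot_map = ( 0 => 2 );

# Rewrite the sixth whitespace-separated field of a STUNIT line.
# Lines that are not STUNIT records, or whose robot number is not
# in the map, are returned unchanged.
sub renumber_stunit {
    my ($line, $map) = @_;
    my @fields = split ' ', $line;
    return $line unless @fields >= 6 && $fields[0] eq 'STUNIT';
    return $line unless exists $map->{ $fields[5] };
    $fields[5] = $map->{ $fields[5] };
    return join ' ', @fields;
}

my $old = 'STUNIT ralph-TLD1 2 ralph-e0.lasalle.na.abnamro.com 8 0 6 2 0 NULL 0 1 0 NULL';
print renumber_stunit($old, \%robot_map), "\n";
```

With the example map, the line above comes out with "2" in the sixth field and everything else untouched. Check the result by eye before pasting it into the new server's storage_units file.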
d. Merging the NBU Volume Database
Because the NBU volume database is a binary database, we can’t copy it directly
from one server to another. However, using NBU commands, we can recreate the
important information from the volDB on the new server.
1. Merge the volume pool information: cat /usr/openv/volmgr/database/poolDB
on both machines. Each file contains a list of the volume pools for that machine.
The first two pools (0 and 1) are typically None and NetBackup. Other than
those two lines, copy the rest of the file from the original master to the
new master.
2. Renumber the new poolDB: The new (merged) poolDB will have duplicate pool
numbers. Renumber the pools copied from the original master so that every pool
number is unique.
3. Create the poolDB hash: The next step involves a Perl hash map of the old
volume pool numbers, mapped to the new volume pool numbers. Create a data structure
like the following (in plain text), where the original pool number points
to the new number:
my %poolDBmap = (
    '3' => '15',
    '4' => '16',
    '5' => '17',
    '6' => '18',
    '7' => '19',
    '8' => '20',
    '9' => '21',
    '10' => '22',
    '11' => '23',
    '12' => '24',
    '13' => '25',
    '14' => '26',
    '15' => '27',
    '16' => '28');
4. Add the robot definitions for the old server’s robots to the new server,
including creating records for the added robots’ tapes.
5. Run the following script on the original master server. (Notice where we’ve
put the %poolDBmap created in step 3. Also, you need the Class::Date module,
found here: http://search.cpan.org/author/DLUX/Class-Date-1.1.0/ ):
#!/usr/local/bin/perl
# Chris McAvoy mcavoy76@hotmail.com
# v1.0
use strict;
use Class::Date;

chomp (my $date = `date`);
my @robots = qw/1/;

# Old pool number => new pool number, as created in step 3.
my %poolDBmap = (
    '3' => '15',
    '4' => '16',
    '5' => '17',
    '6' => '18',
    '7' => '19',
    '8' => '20',
    '9' => '21',
    '10' => '22',
    '11' => '23',
    '12' => '24',
    '13' => '25',
    '14' => '26',
    '15' => '27',
    '16' => '28');

# Month abbreviation => month number, for the dates vmquery prints.
my %month_conv = (
    'Jan' => '1',
    'Feb' => '2',
    'Mar' => '3',
    'Apr' => '4',
    'May' => '5',
    'Jun' => '6',
    'Jul' => '7',
    'Aug' => '8',
    'Sep' => '9',
    'Oct' => '10',
    'Nov' => '11',
    'Dec' => '12' );

# A log file will be kept in /var/tmp/scratch.log
open(LOG, ">>/var/tmp/scratch.log") or die "Can't open LOGFILE!";

######################################################################
## !McAvoy's Generic vmquery engine! #################
######################################################################
# This function loads a hash of hashes (%vm) with data from vmquery. #
# It can be used for other vmquery related functionality. #
######################################################################
######################################################################
sub load_hash_from_vmquery {
    my $robot = $_[0];
    open(VMQUERY, "/usr/openv/volmgr/bin/vmquery -rn $robot | grep -v = |")
        or die "Cannot open a pipe from vmquery!";
    my %vm;
    my $tape;
    while (defined(my $line = <VMQUERY>)) {
        chomp $line;
        # Split "field name: value" at the first colon, so the colons in
        # time values such as 09:47:51 stay in the value.
        if ($line =~ /^\s*([^:]+):\s*(\S.*?)\s*$/) {
            my $thing = $1;
            my $value = $2;
            $thing =~ s/\s+/_/;    # "volume pool" becomes "volume_pool"
            if ($line =~ /media ID/) { $tape = $value; }
            # Keep only the pool number from a value like "NetBackup (1)".
            elsif ($line =~ /volume pool/) { $value = $1 if $value =~ /\((.+)\)/; }
            $vm{$tape}{$thing} = $value;
        }
    }
    close(VMQUERY);
    return \%vm;
}
#####################################################################
#####################################################################
my $data;
#foreach my $rob_num (@robots) {
#    $data[$rob_num] = &load_hash_from_vmquery($rob_num);
#};
$data = &load_hash_from_vmquery(1);

open(SCRIPT, ">voldb_merge.ksh") or die "Can't open voldb_merge.ksh!";
print SCRIPT "#!/bin/ksh\n";
foreach my $tape (keys %$data) {
    # Unassigned tapes show "---" in the assigned field.
    if ( !($data->{$tape}{assigned} =~ /---/) ) {
        # The assigned date looks like "Fri May 03 09:47:51 2002".
        my @date_str = split (' ', $data->{$tape}{assigned});
        my @time_str = split (/:/, $date_str[3]);
        my $year = $date_str[4];
        my $origmonth = $date_str[1];
        my $month = $month_conv{$origmonth};
        my $day = $date_str[2];
        my $hour = $time_str[0];
        my $minute = $time_str[1];
        my $second = $time_str[2];
        my $assign_date = Class::Date->new([$year,$month,$day,$hour,$minute,$second]);
        my $newdate = $assign_date->epoch;
        my $origdate = $assign_date->string;
        my $pool = $$data{$tape}{volume_pool};
        # Pools missing from the map (such as 0 and 1) keep their number.
        my $newpool = exists $poolDBmap{$pool} ? $poolDBmap{$pool} : $pool;
        print SCRIPT "#changing $tape\n";
        print SCRIPT "vmchange -p $newpool -m $tape\n";
        print SCRIPT "vmquery -assignbyid $tape hcart $newpool 0x0 $newdate\n";
    }
    else { print SCRIPT "#$tape unchanged\n" };
};
close(SCRIPT);
close(LOG);
6. The script from step 5, when run, creates a file called voldb_merge.ksh.
This file is a series of Netbackup commands that will recreate the volume database
on the new server. The contents of the file look like this:
#changing Y00201
vmchange -p 20 -m Y00201
vmquery -assignbyid Y00201 hcart 20 0x0 1022918416
#changing Y00198
vmchange -p 27 -m Y00198
vmquery -assignbyid Y00198 hcart 27 0x0 1010505652
#changing Y00362
vmchange -p 21 -m Y00362
vmquery -assignbyid Y00362 hcart 21 0x0 1027843543
#Y00279 unchanged
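The renumbering in step 2 and the hash in step 3 can also be generated rather than typed. The following Perl sketch is one possible approach, not the method used above: the function name build_pool_map is made up, the pool names on the new master are placeholders, and it assumes that pools 0 and 1 are the same on both servers and need no mapping. It does not check for two pools that share a name.

```perl
use strict;
use warnings;

# Given the poolDB lines of the new master and of the original master,
# give each pool copied from the original master the next free number.
# Returns the old => new map and the renumbered lines to append.
sub build_pool_map {
    my ($new_lines, $old_lines) = @_;
    my $next = 0;
    for my $line (@$new_lines) {
        next unless $line =~ /^\s*\d+\s/;
        my ($num) = split ' ', $line;
        $next = $num + 1 if $num >= $next;
    }
    my (%map, @merged);
    for my $line (@$old_lines) {
        next unless $line =~ /^\s*\d+\s/;
        my ($num, $rest) = split ' ', $line, 2;
        next if $num <= 1;    # pools 0 and 1 are assumed to exist on both servers
        $map{$num} = $next++;
        push @merged, "$map{$num} $rest";
    }
    return (\%map, \@merged);
}

# Example input: a new master that already uses pool numbers 0 to 14.
my @new_master = map { "$_ 21 pool$_ ANYHOST -1 -2 ----" } 0 .. 14;
my @old_master = (
    '0 21 None ANYHOST -1 -2 the None pool (for anyone)',
    '1 21 NetBackup ANYHOST 0 -2 the NetBackup pool',
    '3 21 grain01d-nightly ANYHOST -1 -2 ----',
    '4 21 Scratch ANYHOST -1 -2 ----',
);
my ($map, $merged) = build_pool_map(\@new_master, \@old_master);

# Lines to append to the new master's poolDB.
print "$_\n" for @$merged;

# The hash to paste into the merge script.
print "my %poolDBmap = (\n";
printf "    '%s' => '%s',\n", $_, $map->{$_} for sort { $a <=> $b } keys %$map;
print ");\n";
```

With this example input, old pools 3 and 4 map to 15 and 16. Review the generated lines before touching the real poolDB, and keep a copy of the original file.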
3. Conclusion
Although not all information is merged at this point, enough has been transferred
from one master server to another that you will be able to restore from the
new server. I’ve tested this method on a large environment, and had no problems.
If you have something to add, please either email me chris@lonelylion.com,
or add a comment below. Good luck.