Josh-Daniel S. Davis (joshdavis) wrote,

GPFS cheatsheet (General Parallel File System)

Top Level GPFS goodies and notes:
Pick Library:
Pick GA22-7968-02 GPFS V2.3 Concepts, Planning, and Installation Guide:
   Chapter 5, Migration, coexistence and compatibility.

NOTE: mmexportfs came about in GPFS 2.2.1

NOTE: Sometimes, especially on larger clusters, the mount will have
to wait for mmstartup to propagate.  If automount is set, just give it
a few minutes.  If you try to mount it manually too early, you may get:
   mount: 0506-324 Cannot mount /dev/gpfs on /gpfs: There is an input
   or output error.
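Since the mount may just need time for mmstartup to propagate, a small retry loop saves re-typing the mount by hand. This is a generic sketch (the MAX_TRIES/DELAY defaults are arbitrary), meant to be used as, e.g., "retry mount /gpfs":

```shell
#!/bin/sh
# retry CMD ARGS...: re-run a command until it succeeds, up to
# MAX_TRIES attempts, sleeping DELAY seconds between tries.
# Generic sketch; defaults are arbitrary, override via environment.
MAX_TRIES=${MAX_TRIES:-10}
DELAY=${DELAY:-30}
retry() {
    i=1
    while [ "$i" -le "$MAX_TRIES" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $i of $MAX_TRIES failed; retrying in ${DELAY}s" >&2
        sleep "$DELAY"
        i=$((i + 1))
    done
    return 1
}
```

Then "retry mount /gpfs" will keep trying until the 0506-324 error clears or the attempts run out.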
Upgrading from GPFS 2.2 to 2.2.1
Make sure all disks are ready and up
   # mmlsdisk
Shutdown apps
Make sure data is backed up
Unmount any gpfs filesystems
Shut down GPFS:
   # mmshutdown -a
Install the new code
Reboot the nodes
Make sure everything works.
After you're 100% sure you won't be reverting:
   # mmchfs <device> -V
Upgrading from 2.2.1 to 2.3
Make sure all disks are ready and up
   # mmlsdisk
Shutdown apps
Make sure data is backed up
Unmount any gpfs filesystems
Shut down GPFS:
   # mmshutdown -a
Export your nodeset definitions:
   # mmexportfs all -o /tmp/outfile
Delete all nodes from each nodeset
   # mmdelnode -a (nodesetId)
For non SP clusters, delete the MMFS Cluster:
   # mmdelcluster -a
Make sure if there are any new nodes that they are properly attached
If this is an RPD cluster and you're not using VSDs:
   # rmrpdomain gpfsRPD
Uninstall GPFS and install the new version
   # mmcrcluster -C clustername -n NodeFile -p primary -s secondary -A
Import the cluster config
   # mmimportfs all -i /tmp/outfile
Start GPFS
   # mmstartup -a
After you're 100% sure you won't be reverting:
   # mmchfs <device> -V
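The 2.2.1 to 2.3 sequence above can be wrapped in a sketch script. This is not a turnkey migration: it echoes each step by default (DRY_RUN=1) so you can review the order first, and the cluster name, node file, and primary/secondary names are placeholders:

```shell
#!/bin/sh
# Sketch of the GPFS 2.2.1 -> 2.3 migration order from the steps above.
# DRY_RUN=1 (the default) prints each step instead of executing it.
# clustername/NodeFile/primary/secondary are placeholders.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" -eq 1 ]; then
        echo "WOULD RUN: $*"
    else
        "$@"
    fi
}
run mmshutdown -a
run mmexportfs all -o /tmp/outfile
run mmdelnode -a                 # repeat per nodeset
run mmdelcluster -a              # non-SP clusters only
# ... uninstall the old code, install 2.3, reboot ...
run mmcrcluster -C clustername -n NodeFile -p primary -s secondary -A
run mmimportfs all -i /tmp/outfile
run mmstartup -a
```

Set DRY_RUN=0 only after the printed order looks right, and still hold off on "mmchfs -V" until you're sure you won't revert.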
Uninstalling GPFS 2.2
unmount all gpfs filesystems from all nodes
mmdelfs to delete all GPFSes
mmshutdown -a
installp -u
rm -r /var/mmfs /usr/lpp/mmfs /var/adm/ras/mm*
Creating GPFS 2.2 and HA cluster on a disk
# smitty clstart
(info on making the HA cluster not included)
# mmcrcluster -t hacmp -n nodelist -p gandalf -s frodo
# mmlscluster shows it there
# mmconfig -n nodelist -A -C set1 -D /tmp/mmfs -p 80 -V no -U yes
# mmlsconfig
   all looks good
# echo hdisk21:::: | mmcrlv -F-
If you try to use a preexisting LV, you'll get:
   mmcrfs: 6027-1909 There are no available free disks.
    Disks must be prepared prior to invoking mmcrfs.
    Define the disks using the mmcrnsd command.
If mmcrlv doesn't like what's already there, you'll see:
   mkvg: 0516-1254   changing the pvid in the odm
   0516-1207 mkvg An invalid physical volume identifier has been
     detected on hdisk21
   0516-862 mkvg unable to create volume group
   6027-1306 /usr/lpp/mmfs/bin/mmvsdhelper -c -C -b -n -d hdisk21 -g
      gpfs10vg -l gpfs10lv     failed with return code 1
If you get this:
   The volume group cannot be varied on because there are no good
   copies of the descriptor area.
Then run this:
   # chdev -l hdisk21 -a pv=clear
# mmstartup -a
# mount /gpfs
If you are not using disks for which GPFS supports persistent reserve,
the mount will fail with:
      GPFS: 6027-701 Some file system data are inaccessible at this time
   Set disk leasing with:
      # mmchconfig useDiskLease=yes
      NOTE: This will disable single node quorum.
There are some SSA fence ID issues I left out, since you're on Fibre Channel.
Creating GPFS 2.2 on RPD or LC cluster
For the manual pages:
   # export MANPATH=$MANPATH:/usr/lpp/mmfs/gpfsdocs/man/aix
   # catman -w
From every node in the cluster:
   # preprpnode (all nodes, space separated)
From one node:
   # mkrpdomain gpfsRPD node1 node2
   # startrpdomain gpfsRPD
Wait a few minutes for it to come online
   lsrpdomain on each node will show its status.
   lsrpnode should show all nodes online when it's done.
If the RSCT version doesn't match what's installed:
   # runact -c IBM.PeerDomain CompleteMigration Options=0
Set up the actual GPFS portion
   # vi nodelist
   # mmcrcluster -t rpd -n nodelist -p primary -s secondary
   # mmconfig -n nodelist -A -C set1 -D /tmp/mmfs -p 80M -V no -U yes
   # for i in 2 3 4 5 6; do
      chdev -l hdisk$i -a pv=clear
      echo hdisk$i:::: | mmcrlv -F-
     done
# mmlsgpfsdisk -F
   should show free disks
# mmcrfs /gpfs /dev/gpfs -F /var/mmfs/etc/diskdsc -B 512K -C set1 -A yes
# mmstartup -a
If you ever get this:
   The current RSCT peer domain node number is 7. GPFS expects 1.
   Thu  9 Dec 16:03:27 2004 runmmfs: 6027-1242 GPFS is waiting for the
   RSCT peer domain node number mismatch to be corrected
   runmmfs: 6027-1127 There is a discrepancy in the RSCT node numbers
   for lc. (or rpd)
Then run these commands:
   # /usr/lpp/mmfs/bin/mmshutdown -a
   # /usr/lpp/mmfs/bin/mmcommon recoverPeerDomain
      I'm not sure, but you might need to stoprpdomain first.
   # /usr/lpp/mmfs/bin/mmstartup -a
# mount /gpfs
Creating GPFS 2.3 cluster on hdisks
NOTE: this configuration is not necessarily supported
NOTE: node names should be what comes back from "hostname", and these
   names should have proper name resolution
NOTE: GPFS will use rcp, so make sure that rsh/rcp work between all
   nodes; see the GPFS manual for this.
Create the nodelist file, one hostname per line
   # vi /tmp/nodelist
Make the cluster itself
   # /usr/lpp/mmfs/bin/mmcrcluster -n /tmp/nodelist \
      -p primary_node -s secondary_node
Set a larger block size if needed
   # mmchconfig maxblocksize=512K
Set to autostart, single node quorum, 80M cache, don't verify disks
   # mmconfig -n nodelist -A -p 80M -v no -U yes
   NOTE:  There's a change in single-node quorum called tie-breaker
   disks which I haven't looked into yet.
You should see
   mmconfig: Command successfully completed
   mmconfig: 6027-1371 Propagating the changes to all affected nodes.
   This is an asynchronous process
At this point, you should be able to start gpfs
   # mmstartup -a
Now, create your disk list to be used for NSDs.
   # echo hdiskpower10::::1 > /tmp/disklist
NOTE: you can specify primary and secondary nodes if you don't have
fibre to all of the nodes who will access these disks.
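The disk descriptor fields are DiskName:PrimaryServer:BackupServer:DiskUsage:FailureGroup, with empty fields taking the defaults. A quick loop builds the list; node1/node2 below are hypothetical NSD server names:

```shell
#!/bin/sh
# Build an NSD disk descriptor file. Fields:
#   DiskName:PrimaryServer:BackupServer:DiskUsage:FailureGroup
# Empty fields keep the defaults. node1/node2 are hypothetical
# NSD servers; drop them for direct-attached disks.
DISKLIST=/tmp/disklist
: > "$DISKLIST"
for d in hdiskpower10 hdiskpower11; do
    # serve each disk through node1 (primary) and node2 (backup),
    # all in failure group 1
    echo "${d}:node1:node2::1" >> "$DISKLIST"
done
cat "$DISKLIST"
```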
Create the Network Shared Disk
   # mmcrnsd -F /tmp/disklist
List the NSDs to make sure it's there
# mmlsnsd
You should see:
   File system          NSD name       Primary Node         Backup Node
   (free disk)          gpfs01nsd
Create the file system now
   We're choosing a 256K block size, replicated metadata, one LV
   NOTE: If you want this to automount on boot, use a -A flag as well.
   If this fails about missing nodeset, add -C gpfs1
   # mmcrfs $mountpt $fsname -F /tmp/disklist  -B 256K -M 2 -m 1 \
     -n 1 -s balancedRandom
List the NSDs again to make sure it shows in use
   # mmlsnsd
You should see:
   File system          NSD name       Primary Node        Backup Node
   fsname               gpfs
List the disk to make sure it shows ready and available
   # mmlsdisk $fsname
You should see:
   disk   driver  sector  failure  holds      holds
   name   type      size    group  metadata    data   status  avail
   -----  ------ -------  -------  ---------  -----  -------  -----
   gpfs   disk       512        1  yes          yes    ready     up
List the filesystem to make sure it exists
   # mmlsfs $fsname
Mount the filesystem for the first time
   # mount $mountpt
Reboot to make sure everything autostarts
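After the reboot, a quick scripted check that the filesystem actually came back is handy. This sketch just parses "df -k"-style output for the mount point (use it as "df -k | is_mounted /gpfs"):

```shell
#!/bin/sh
# is_mounted MOUNTPOINT: read "df -k"-style output on stdin and
# succeed if the mount point appears as the last column of a line.
# Post-reboot sanity check sketch: df -k | is_mounted /gpfs
is_mounted() {
    awk -v mp="$1" '$NF == mp { found = 1 } END { exit !found }'
}
```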
To add a new node to GPFS
   Create the node file
      echo nodename:manager:nonquorum > /tmp/nodefile
      where nodename is the hostname, and manager and nonquorum are 0 or 1
   Add the nodes
      # mmaddnode  -n /tmp/nodefile
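Building the node file for several nodes at once, following the nodename:manager:nonquorum form above (the hostnames here are hypothetical; newnode1 is a manager, both are quorum nodes):

```shell
#!/bin/sh
# Generate /tmp/nodefile in the nodename:manager:nonquorum form used
# above (manager and nonquorum are 0 or 1). newnode1/newnode2 are
# hypothetical hostnames.
NODEFILE=/tmp/nodefile
: > "$NODEFILE"
echo "newnode1:1:0" >> "$NODEFILE"
echo "newnode2:0:0" >> "$NODEFILE"
cat "$NODEFILE"
```

Then feed it to mmaddnode -n /tmp/nodefile as shown above.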
Other useful commands:
   Debugging:       export DEBUG=1
   Stop/Start:      mmshutdown -a ; mmstartup -a
   Remove a node:   mmdelnode ; mmdelcluster
   Removing a disk: mmdeldisk
   Removing an fs:  mmdelfs
   Removing an NSD: mmdelnsd   (only for free NSDs)
   Ex/Import:       mmexportfs all -o (filename)
                    mmimportfs all -i (filename)
   Logs:            /var/adm/ras/mmfs.log.LATEST
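For the logs, pulling the 6027- message numbers out of mmfs.log is a quick way to see what to look up in the GPFS messages documentation. A sketch (note: grep -o is a GNU extension and may not exist on stock AIX grep; the default log path is just the one from above):

```shell
#!/bin/sh
# Summarize GPFS message numbers (6027-xxxx) from an mmfs log, most
# frequent first, so they can be looked up in the GPFS docs.
# Reads the file given as $1, defaulting to the log path above.
# NOTE: grep -o is a GNU extension, not guaranteed on AIX.
LOG=${1:-/var/adm/ras/mmfs.log.latest}
if [ -r "$LOG" ]; then
    grep -o '6027-[0-9]*' "$LOG" | sort | uniq -c | sort -rn
fi
```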
Creating GPFS 2.3 on Logical Volumes
This is what I tested in the lab:
Make the VG and LV:
   node1# chdev -l hdisk# -a pv=yes
   othernodes# rmdev -l hdisk#; mkdev -l hdisk#
   node1# mkvg -n -f -s <ppsize> -c -x -y <vgname> hdisk###
   node1# varyonvg -c <vgname>
   node1# mklv -y <lvname> -t raw <vgname> <numPPs>
   othernodes# importvg -y <vgname> -c -x hdisk##
   othernodes# varyonvg -c <vgname>
   NOTE: I named my LVs jdtest050418a and b on only one system.
   NOTE: It is questionable whether it's OK to have more than 1 LV per
   PV.  It's been made to work, but the documentation seems to indicate
   that you shouldn't do this.
Make the nodelist:
   # vi /tmp/nodelist
Make the NSDs (the disk descriptor file here is named "1")
   # mmcrnsd -F 1
      mmcrnsd: Processing disk jdtest050418a
      mmcrnsd: Processing disk jdtest050418b
      mmcrnsd: 6027-1371 Propagating the changes to all affected nodes.
      This is an asynchronous process.
View the disk descriptor files:
   # cat 1
View the NSDs
   # mmlsnsd -a
   File system   Disk name    Primary node             Backup node
   homie         gpfs1nsd
   (free disk)   gpfs4nsd
   (free disk)   gpfs5nsd
Create the filesystem:
   # mmcrfs /jdtest050418 jdtest050418 -F 1 -B 256K -M2 -m1 -n 1
   GPFS: 6027-531 The following disks of jdtest050418 will be formatted
   on node
      gpfs4nsd: size 81920 KB
      gpfs5nsd: size 81920 KB
   GPFS: 6027-540 Formatting file system ...
   Creating Inode File
   Creating Allocation Maps
   Clearing Inode Allocation Map
   Clearing Block Allocation Map
   Flushing Allocation Maps
   GPFS: 6027-535 Disks up to size 207 MB can be added to this file system.
   GPFS: 6027-572 Completed creation of file system /dev/jdtest050418.
   mmcrfs: 6027-1371 Propagating the changes to all affected nodes.
   This is an asynchronous process.
Mount the filesystem:
   # mount /jdtest050418
Make sure it's there:
   # df -k /jdtest050418
   /dev/jdtest050418   163328    148224   10%    9     1% /jdtest050418
Tags: clusters, ibm, storage