LUN locking problems on VIO servers

I had this problem and I think it is due to the order in which the VIOS servers were powered up and claimed the LUNs. The biggest clue is that if you run “lspv -size” on your VIOS servers you get different answers for the sizes and PVIDs.

Here is an example of querying and releasing locks where required.

1. Run # lsattr -El hdisk0 -a reserve_policy for each of your disks and ensure that only the VIOS internal disks are set to “reserve_policy single_path”. All the others should be “no_reserve”. If they need to be changed, use chdev with the “-P” option, then reboot and check again.

2. Once all the disks are shown as shared, check their reservation status as follows:

# devrsrv -c query -l hdisk11

3. If any are still reserved attempt to break the locks as follows:

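Use either of the following, depending on which host holds the reservation on this LUN. The first releases a reservation held by the host you run it on; the second forcibly clears it.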

# devrsrv -c release -l hdisk11
# devrsrv -f -l hdisk11

Please check this link for disk reservation release:

4. Check the system error log using “errpt” and the path status using “lspath”.

5. Check that all your FC adapters are set with “dynamic tracking” (dyntrk=yes) and “fast fail” error recovery (fc_err_recov=fast_fail), e.g.

# lsattr -El fscsi0
attach       switch    How this adapter is CONNECTED         False
dyntrk       yes       Dynamic Tracking of FC Devices        True
fc_err_recov fast_fail FC Fabric Event Error RECOVERY Policy True
scsi_id      0xc80100  Adapter SCSI ID                       False
sw_fc_class  3         FC Class for Fabric                   True

6. Log in to each of the nodes and attempt to rediscover the virtual disks and VSCSI adapters.

7. Ensure that the VSCSIs have “heartbeat checking” activated.

Note: You may have to delete and rediscover your devices a couple of times and do various reboots before you get this absolutely right.

Once you think everything is working OK, clear all your error logs and power down all the clients and VIOS servers, then power everything up in the normal order and check that all the locks, paths, etc. have taken as expected, and that there are no further errors.
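
The checks described above can be tedious to repeat by hand after each reboot, so here is a rough sketch of how they might be scripted. It is only an illustration: the function names (attr_value, bad_paths, run_checks) are my own and are not part of AIX, and it assumes you are root on AIX (on a VIOS, after oem_setup_env). It only reports; it changes nothing.

```shell
#!/bin/sh
# Sketch only: report disks, FC adapters and paths that look wrong.
# The function names are hypothetical helpers, not AIX commands.

# Read "lsattr -El <device>" output on stdin and print the value
# (second column) of the attribute named in $1.
attr_value() {
    awk -v name="$1" '$1 == name { print $2 }'
}

# Read "lspath" output on stdin and print every path whose status
# (first column) is anything other than Enabled.
bad_paths() {
    awk 'NF >= 3 && $1 != "Enabled"'
}

# Run the checks against the live system.
run_checks() {
    for disk in $(lspv | awk '{ print $1 }'); do
        policy=$(lsattr -El "$disk" | attr_value reserve_policy)
        # Internal VIOS disks are expected to show single_path here;
        # anything else that is not no_reserve needs a closer look.
        if [ -n "$policy" ] && [ "$policy" != "no_reserve" ]; then
            echo "$disk: reserve_policy=$policy"
        fi
    done

    for fscsi in $(lsdev -C -F name | grep '^fscsi'); do
        attrs=$(lsattr -El "$fscsi")
        [ "$(echo "$attrs" | attr_value dyntrk)" = "yes" ] ||
            echo "$fscsi: dyntrk is not yes"
        [ "$(echo "$attrs" | attr_value fc_err_recov)" = "fast_fail" ] ||
            echo "$fscsi: fc_err_recov is not fast_fail"
    done

    lspath | bad_paths
}
```

To use it, source the file and call run_checks. No output means nothing was flagged; it does not replace looking at errpt and devrsrv yourself.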



IBM Announces Elastic Storage

IBM has been busy lately developing its own cloud products as it attempts to re-invent itself. In the past few days they announced an extension to their GPFS (General Parallel File System), and rebranded it as Elastic Storage:

It will be interesting to see how this fares against Red Hat’s Gluster-based Open Storage offerings, however the future direction is now clearly away from traditional dedicated SAN and NAS devices. Having said that, I think there will be a role for both, because Gluster was not designed for multiple high-speed communications as SAN was, but it is the clear winner on cost and flexibility.




Viewing the contents of a disk without varying on the VG

It can be very dangerous to attempt to import or vary on a disk that has been used as a boot disk, because if it has the same logical volume names as those in your rootvg, it will render the system unbootable.

To view a disk without varying it on:

# lqueryvg -p hdisk0 -L
00c9b8fb00004c000000013a8c97f698.1   backup_lv 1
00c9b8fb00004c000000013a8c97f698.2   loglv01 1
00c9b8fb00004c000000013a8c97f698.3   fslv19 1
00c9b8fb00004c000000013a8c97f698.4   was70bkp 1
00c9b8fb00004c000000013a8c97f698.5   paging00 1
00c9b8fb00004c000000013a8c97f698.6   fixes_lv 1

The above disk would not be a problem, however if it contained a rootvg you should not vary it on:

# lqueryvg -p hdisk2 -L
00c9b8fb00004c0000000132da76b3ae.1   hd5 1
00c9b8fb00004c0000000132da76b3ae.2   hd6 1
00c9b8fb00004c0000000132da76b3ae.3   hd8 1
00c9b8fb00004c0000000132da76b3ae.4   hd4 1
00c9b8fb00004c0000000132da76b3ae.5   hd2 1
00c9b8fb00004c0000000132da76b3ae.6   hd9v
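
If you have a lot of disks to look at, the same check could be scripted. This is only a sketch: the function names are my own invention, and the list of names is simply the usual rootvg logical volumes (hd5, hd6, hd8, hd4, hd2, hd9var, hd3, hd1, hd10opt, hd11admin), so a disk with non-standard names would not be caught.

```shell
#!/bin/sh
# Sketch only: spot rootvg-style logical volumes in lqueryvg output.
# The function names are hypothetical helpers, not AIX commands.

# Read "lqueryvg -p <disk> -L" output on stdin and print any LV name
# (second column) that matches a standard rootvg logical volume.
rootvg_lvs() {
    awk '$2 ~ /^(hd5|hd6|hd8|hd4|hd2|hd9var|hd3|hd1|hd10opt|hd11admin)$/ { print $2 }'
}

# Succeed (exit 0) when the lqueryvg output on stdin contains at
# least one rootvg-style LV name.
looks_like_rootvg() {
    [ -n "$(rootvg_lvs)" ]
}
```

For example: lqueryvg -p hdisk2 -L | looks_like_rootvg && echo "hdisk2 looks like a rootvg disk, do not vary it on"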




The 10 best commands in AIX?

This seems to be a very popular subject on the Internet at present, and got me thinking. Whilst I did not get to ten, a better question would be: “which commands differentiate AIX from other UX’s”, and my answers are as follows:

  1. mksysb (Make system backup) – The ability to create a bootable build-tape, file, or DVD should not be underestimated, and in my opinion sets AIX apart from its rivals. Mksysb is both powerful and versatile, and can be used (in conjunction with NIM) to clone systems, rescue and rebuild them, as well as to migrate to a new platform.
  2. smit (System Management Interface Tool) – For anyone used to Linux or Solaris this tool can come as a real surprise. Whilst it may not look perfect, and is maligned by some experts, it can be valuable as both a teaching aid and an auditing tool, as it maintains a log of everything you do and provides a wealth of interactive help. The logs can also be used to review the commands used by the OS, and to create your own scripts and procedures.
  3. cfgmgr (Configuration Manager) – AIX is a plug-and-play operating system that supports hot-plugging and cfgmgr enables you to instantly create and update device drivers, and to start using them. This reduces almost to zero the need to reboot your system when hardware changes are made.
  4. importvg/exportvg (import and export volume group) – The AIX logical volume manager is, in my opinion, way ahead of all other OS’s. It enables you to dynamically add and remove disks and LUNs, and to move storage to other systems with the minimum of effort. The thing that sets it apart is the close to zero configuration that is required for AIX to import storage to a new system.
  5. errpt (Error Reporting) – All versions of Linux and Unix provide syslog(ng) etc, but all lack the sophistication of the AIX error daemon. It provides an almost mainframe-like error messaging service that requires little or no configuration, and works out of the box.
  6. alog (Advanced logging) – The alog files are circular log files which hold the output of things such as the virtual console during boot, NIM operations, console alert messages, etc, and can be vital when debugging system issues.

Whilst I am a huge fan of AIX I do see that there are also some areas where it is weak and those are:

  1. Lack of Linux YUM-style repositories, which could aid Linux integration and affinity.
  2. Old and poorly maintained versions of commands such as RPM and Samba.
  3. Lack of tools to aid cross-platform development and the inevitable migration to Linux.
  4. AIX Licencing – In my opinion AIX could become a lot more popular if they provided developer licences and Red Hat-like self-help subscriptions to encourage a community to develop.

Andrew Cowan
Solution Architect at SystemScan AIX



How can you better manage your AIX systems?

We all know the mantra “What gets measured gets done”, and as corny as it may sound, it is actually very true when it comes to managing your computer systems. If you do not know exactly what is on your system, how it was built, and how it differs from any other in your estate, you can never be sure that it is secure, or running at its best.

It all starts the moment the OS is installed. Most are content to accept the installation defaults and to install everything without testing and documenting, whereas the best practice is actually to install only the things you actually require, and to tune the system to be as secure as possible.

Manufacturers face a difficult balancing act between using their OS as an advertisement of what they are capable of by enabling every possible option, or satisfying customers who require more security. AIX and IBM are no different in this regard and many unnecessary packages and services are installed.

Should you take the time to remove unwanted packages and to disable all but the services that are necessary to support your workload, you will not only increase your security, but also increase performance and make it easier for you to achieve and maintain compliance with many common legal and industry standards.

Maintaining an accurate record of your build procedure and an up-to-date journal detailing all configuration changes not only helps to improve security, and to drive-up standards, it also makes it easier to manage change and to reduce the chance of failures and mistakes.

Managers and auditors will also find it easier to prove and maintain compliance if you can clearly demonstrate that your practices match documentation, and that you know exactly what is installed.

All this can be done manually, however it takes time and can be prone to simple mistakes and omissions, and this is why we created systemscanaix. The idea is that administrators, managers, and auditors can use the tool:

  • As a teaching aid
  • To reduce mistakes and differences between systems
  • To enable managers and auditors to monitor and improve quality and compliance.

Systemscan consists entirely of shell-scripts, made up of standard AIX commands, and is installed as a single RPM. The reason we did this was that we wanted to be able to give the software to a (prospective) customer, enabling them to examine both the code and results before deciding what, if anything, to share with us. This not only saves a huge amount of time but means we do not have to wait for access or follow complex procedures.

The scripts carry out almost 800 tests in minutes that would take days to run manually. The results are then presented as clear and concise reports and accompanied by a risks and issues report. Each test result also includes a help file which can be used to explain the findings, and to offer a potential remedy.

Systemscanaix is a living product that is being continually enhanced and improved. I would welcome the thoughts and suggestions of my fellow professionals, and I hope to build a vibrant community of users and perhaps, in time, to provide a complete open-source version.

Extra tests can already be added at will, and kept private, or shared with the rest of the community. Please take the time to look at what I have done, apply for a free trial version, and email any comments to