Wednesday, May 11, 2016

Restore iSCSI configuration for Cinder / Nova

In some cases (e.g. after a cinder-volume crash), Cinder volumes can no longer be accessed by a VM (I/O errors) but are still shown as attached by the cinder and nova CLIs. Looking at the hypervisor's logs, you may see:
May 11 13:26:45 cloudhyp1 iscsid: conn 0 login rejected: target error (03/01)
May 11 13:26:45 cloudhyp1 iscsid: conn 0 login rejected: initiator failed authorization with target
May 11 13:26:45 cloudhyp1 iscsid: conn 0 login rejected: initiator failed authorization with target


On the cinder-volume host, check the configuration of iSCSI target:
[root@controller ~]# targetcli ls
o- / ......................................................................................................................... [...]
  o- backstores .............................................................................................................. [...]
  | o- block .................................................................................................. [Storage Objects: 1]
  | | o- iqn.2010-10.org.openstack:volume-6e95e5b6-83e1-4958-a5e1-ba5afc94559e  [/dev/cinder-volumes/volume-6e95e5b6-83e1-4958-a5e1-ba5afc94559e (20.0GiB) write-thru activated]
  | o- fileio ................................................................................................. [Storage Objects: 0]
  | o- pscsi .................................................................................................. [Storage Objects: 0]
  | o- ramdisk ................................................................................................ [Storage Objects: 0]
  o- iscsi ............................................................................................................ [Targets: 7]
  | o- iqn.2010-10.org.openstack:volume-6e95e5b6-83e1-4958-a5e1-ba5afc94559e ............................................. [TPGs: 1]
  | | o- tpg1 .......................................................................................... [no-gen-acls, auth per-acl]
  | |   o- acls .......................................................................................................... [ACLs: 0]
  | |   o- luns .......................................................................................................... [LUNs: 1]
  | |   | o- lun0  [block/iqn.2010-10.org.openstack:volume-6e95e5b6-83e1-4958-a5e1-ba5afc94559e (/dev/cinder-volumes/volume-6e95e5b6-83e1-4958-a5e1-ba5afc94559e)]
  | |   o- portals .................................................................................................... [Portals: 1]
  | |     o- 192.168.1.1:3260 ................................................................................................. [OK]
  o- loopback ......................................................................................................... [Targets: 0]


In this case, cloudhyp1 cannot connect to the target because no ACLs are defined ([ACLs: 0]).

You have to set up the ACL manually:

[root@controller ~]# mysql -u cinder -p -e "select provider_auth from volumes where id='6e95e5b6-83e1-4958-a5e1-ba5afc94559e'" cinder
Enter password:
+------------------------------------------------+
| provider_auth                                  |
+------------------------------------------------+
| CHAP xjrFIwOQ66ktkxjrFIwO vr2twXxoDww7wvr2twXx |
+------------------------------------------------+


The first entry is the CHAP username and the second the password. You can check that the hypervisor has the same values (1, 2):
[root@cloudhyp1 ~]# grep node.session.auth /var/lib/iscsi/nodes/iqn.2010-10.org.openstack:volume-6e95e5b6-83e1-4958-a5e1-ba5afc94559e/192.168.1.1,3260,1/default
node.session.auth.authmethod = CHAP
node.session.auth.username = xjrFIwOQ66ktkxjrFIwO
node.session.auth.password = vr2twXxoDww7wvr2twXx
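
Splitting provider_auth into its CHAP fields can also be done mechanically; a minimal shell sketch using the values from this example (awk assumed available):

```shell
# provider_auth is "CHAP <username> <password>"; split it on whitespace with awk.
provider_auth='CHAP xjrFIwOQ66ktkxjrFIwO vr2twXxoDww7wvr2twXx'
chap_user=$(echo "$provider_auth" | awk '{print $2}')
chap_pass=$(echo "$provider_auth" | awk '{print $3}')
echo "userid=$chap_user"
echo "password=$chap_pass"
```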


On the hypervisor, you also need to get the initiator IQN (3):

[root@cloudhyp1 ~]# cat /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.1994-05.com.redhat:1abc12d345e6
 
To update the ACL, first save the targetcli configuration:
[root@controller ~]# targetctl save
[root@controller ~]# cp /etc/target/saveconfig.json /etc/target/saveconfig.old

In /etc/target/saveconfig.json, locate the right volume (iqn.2010-10.org.openstack:volume-6e95e5b6-83e1-4958-a5e1-ba5afc94559e in our case) and replace:
          "node_acls": []

with:
          "node_acls": [
            {
              "attributes": {
                "dataout_timeout": 3,
                "dataout_timeout_retries": 5,
                "default_erl": 0,
                "nopin_response_timeout": 30,
                "nopin_timeout": 15,
                "random_datain_pdu_offsets": 0,
                "random_datain_seq_offsets": 0,
                "random_r2t_offsets": 0
              },
              "chap_password": "
vr2twXxoDww7wvr2twXx",
              "chap_userid": "
xjrFIwOQ66ktkxjrFIwO",
              "mapped_luns": [
                {
                  "index": 0,
                  "tpg_lun": 0,
                  "write_protect": false
                }
              ],
              "node_wwn": "iqn.1994-05.com.redhat:1abc12d345e6"
            }

          ]

Replace chap_userid, chap_password, and node_wwn with the values obtained in steps 1, 2, and 3 respectively.
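Instead of hand-editing, the ACL entry can be injected with jq. This is only a sketch, not the method above: it assumes jq is installed and that saveconfig.json keeps each target under a top-level "targets" array with "wwn" and "tpgs" fields, as in the excerpt above. It runs here against a minimal sample file; point it at the real /etc/target/saveconfig.json only after taking a backup.

```shell
# Minimal sample mirroring the assumed layout of /etc/target/saveconfig.json.
cat > /tmp/saveconfig.json <<'EOF'
{"targets": [{"wwn": "iqn.2010-10.org.openstack:volume-6e95e5b6-83e1-4958-a5e1-ba5afc94559e",
              "tpgs": [{"tag": 1, "node_acls": []}]}]}
EOF

# Fill node_acls for the matching target with the CHAP credentials
# (steps 1 and 2) and the initiator IQN (step 3).
jq --arg t 'iqn.2010-10.org.openstack:volume-6e95e5b6-83e1-4958-a5e1-ba5afc94559e' \
   --arg i 'iqn.1994-05.com.redhat:1abc12d345e6' \
   --arg u 'xjrFIwOQ66ktkxjrFIwO' \
   --arg p 'vr2twXxoDww7wvr2twXx' \
   '(.targets[] | select(.wwn == $t) | .tpgs[0].node_acls) = [{
      chap_userid: $u, chap_password: $p, node_wwn: $i,
      mapped_luns: [{index: 0, tpg_lun: 0, write_protect: false}]
    }]' /tmp/saveconfig.json > /tmp/saveconfig.new
```

Alternatively, targetcli can usually create the ACL interactively (/iscsi/<target-iqn>/tpg1/acls create <initiator-iqn>, then set auth userid=... password=... on the new ACL node), which avoids editing the JSON at all; check the exact syntax against your targetcli version.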

Then check and load the configuration:
[root@controller ~]# cat /etc/target/saveconfig.json | json_verify
JSON is valid
[root@controller ~]# targetctl restore
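
For the validation step, if json_verify is not installed, Python's standard library json.tool module works too; a self-contained sketch against a throwaway sample file (substitute /etc/target/saveconfig.json on the real host):

```shell
# Validate a JSON file with python3's stdlib json.tool; exit status is
# non-zero on invalid JSON, so && only fires when the file parses.
printf '{"targets": []}\n' > /tmp/sample.json
python3 -m json.tool /tmp/sample.json > /dev/null && echo "JSON is valid"
```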

You can now reconnect to the iSCSI target from the hypervisor:
[root@cloudhyp1 ~]# iscsiadm -m node -T iqn.2010-10.org.openstack:volume-6e95e5b6-83e1-4958-a5e1-ba5afc94559e -l