Understanding and analyzing group membership
Group membership is one of the key troubleshooting tools for clusters at scale, where the group composition may change frequently as drives and other components degrade and are SmartFailed out. As such, an understanding of OneFS group membership reporting allows you to determine the current health of a cluster. It also enables a cluster's history to be reconstructed when triaging and troubleshooting issues that involve cluster stability, network health, and data integrity.
Under normal operating conditions, every node and its requisite disks are part of the current group. This can be viewed by running the ‘sysctl efs.gmp.group’ CLI command from any healthy node of the cluster.
A OneFS group consists of three parts:
Item | Description | Example
Sequence number | Identifies the group | <2,288>
Membership list | Lists the participating nodes and their drives | ‘1-3:0-11’
Protocol list | Shows which nodes are supporting which protocol services | ‘smb: 1-3, nfs: 1-3, hdfs: 1-3, all_enabled_protocols: 1-3, isi_cbind_d: 1-3, lsass: 1-3, s3: 1-3’
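Assembled from the example values in the table above, a full group report would resemble the following (illustrative; the node name prompt is hypothetical and exact formatting varies between OneFS releases):
X210-1# sysctl efs.gmp.group
efs.gmp.group: <2,288>: { 1-3:0-11, smb: 1-3, nfs: 1-3, hdfs: 1-3, all_enabled_protocols: 1-3, isi_cbind_d: 1-3, lsass: 1-3, s3: 1-3 }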
For ease of reading, the protocol information has been removed from the group strings in the following examples.
If more detail is needed, the ‘sysctl efs.gmp.current_info’ CLI command reports extensive current GMP information.
Consider the membership list { 1-3:0-11 }. This represents a healthy three-node X210 cluster, with Node IDs 1 through 3. Each node contains 12 hard drives, numbered 0 through 11.
The numbers before the colon in the group membership string are the participating Array IDs, and the numbers after the colon are the Drive IDs; both are covered in more detail below.
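This range notation is straightforward to expand programmatically. Below is a minimal Python sketch, not any supported OneFS API, that parses a membership list of the form shown in this article (it assumes node entries are separated by ‘, ’, as in these examples):
import re

# Matches a comma-separated list of IDs and ID ranges, e.g. '0-1,3-12'.
RANGES = r'\d+(?:-\d+)?(?:,\d+(?:-\d+)?)*'

def expand(spec):
    """Expand an ID range string such as '0-1,3-12' into a set of ints."""
    ids = set()
    for part in spec.split(','):
        lo, _, hi = part.partition('-')
        ids.update(range(int(lo), int(hi or lo) + 1))
    return ids

def parse_group(membership):
    """Map each Array ID to its set of Drive IDs (Lnums)."""
    nodes = {}
    for node_spec, drive_spec in re.findall(rf'({RANGES}):({RANGES})', membership):
        for node in expand(node_spec):
            nodes[node] = expand(drive_spec)
    return nodes

nodes = parse_group('{ 1-3:0-11 }')
print(sorted(nodes))     # [1, 2, 3]
print(sorted(nodes[1]))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]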
Each node’s info is maintained in the /etc/ifs/array.xml file. For example, the entry for node 2 of the X210 cluster above reads:
<device>
    <port>5019</port>
    <array_id>2</array_id>
    <array_lnn>2</array_lnn>
    <guid>0007430857d489899a57f2042f0b8b409a0c</guid>
    <onefs_version>0x800005000100083</onefs_version>
    <ondisk_onefs_version>0x800005000100083</ondisk_onefs_version>
    <ipaddress name="int-a">192.168.76.77</ipaddress>
    <status>ok</status>
    <soft_fail>0</soft_fail>
    <read_only>0x0</read_only>
    <type>storage</type>
</device>
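Since array.xml is plain XML, its contents can be summarized with a few lines of Python. This is purely an illustrative sketch assuming one <device> element per node with the child tags shown above; ‘isi_nodes’ remains the supported way to retrieve this information:
import xml.etree.ElementTree as ET

tree = ET.parse('/etc/ifs/array.xml')
for dev in tree.iter('device'):
    # Pull the per-node fields shown in the example entry above.
    print(f"Array ID {dev.findtext('array_id')}, "
          f"LNN {dev.findtext('array_lnn')}, "
          f"int-a {dev.findtext('ipaddress')}, "
          f"status {dev.findtext('status')}")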
It is worth noting that Array IDs (also often referred to as Node IDs) differ from a cluster’s Logical Node Numbers (LNNs). LNNs are the node numbers that appear within node names, as displayed by ‘isi stat’, for example.
Fortunately, the ‘isi_nodes’ command provides a useful cross-reference of both LNNs and Array IDs:
F200-1# isi_nodes "%{name}: LNN %{lnn}, Array ID %{id}"
F200-1: LNN 1, Array ID 1
F200-2: LNN 2, Array ID 2
F200-3: LNN 3, Array ID 3
LNNs can be reused within a cluster, whereas Array IDs are never recycled. For example, if node 1 is removed from the cluster and a new node is added in its place:
F200-1: LNN 1, Array ID 4
The replacement node keeps LNN 1, but it receives the next unused Array ID, ‘4’. Regardless of how many nodes are replaced, Array IDs will never be reused.
A node’s LNN, by contrast, is determined by the relative position of its primary backend IP address within the allotted subnet range.
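As an illustration of that mapping, the LNN is simply the IP’s offset within the int-a range. The range start below is hypothetical; the node address is the int-a value from the array.xml entry above:
from ipaddress import ip_address

range_start = ip_address('192.168.76.76')   # hypothetical first address of the int-a range
node_ip = ip_address('192.168.76.77')       # node 2's int-a address from array.xml above
print(int(node_ip) - int(range_start) + 1)  # 2, i.e. LNN 2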
The numerals following the colon in the group membership string represent drive IDs that, like Array IDs, are also not recycled. If a drive is failed, the node will identify the replacement drive with the next unused number in sequence.
Unlike Array IDs though, Drive IDs (or Lnums, as they are sometimes known) begin at 0 rather than at 1 and do not typically have a corresponding ‘logical’ drive number.
For example:
F200-3# isi devices drive list
Lnn Location Device Lnum State Serial
-----------------------------------------------------
3 Bay 1 /dev/da1 12 HEALTHY PN1234P9H6GPEX
3 Bay 2 /dev/da2 10 HEALTHY PN1234P9H6GL8X
3 Bay 3 /dev/da3 9 HEALTHY PN1234P9H676HX
3 Bay 4 /dev/da4 8 HEALTHY PN1234P9H66P4X
3 Bay 5 /dev/da5 7 HEALTHY PN1234P9H6GPRX
3 Bay 6 /dev/da6 6 HEALTHY PN1234P9H6DHPX
3 Bay 7 /dev/da7 5 HEALTHY PN1234P9H6DJAX
3 Bay 8 /dev/da8 4 HEALTHY PN1234P9H64MSX
3 Bay 9 /dev/da9 3 HEALTHY PN1234P9H66PEX
3 Bay 10 /dev/da10 2 HEALTHY PN1234P9H5VMPX
3 Bay 11 /dev/da11 1 HEALTHY PN1234P9H64LHX
3 Bay 12 /dev/da12 0 HEALTHY PN1234P9H66P2X
-----------------------------------------------------
Total: 12
Note that the drive in Bay 5 has an Lnum, or Drive ID, of 7, the number by which it will be represented in a group statement.
Drive bays and device names may refer to different drives at different points in time, and either could be considered a ‘logical’ drive ID. While the best practice is to avoid moving drives between the bays of a node, if this does happen, OneFS correctly identifies the relocated drives by Drive ID and thereby prevents data loss.
Depending on device availability, the ‘/dev/da*’ device names may change when a node comes up, so they cannot be relied upon to refer to the same device across reboots. Drive IDs and drive bay numbers, however, do provide consistent drive identification.
Each drive’s status info is kept in the /etc/ifs/drives.xml file on its node. For example, the entry for drive Lnum 0 on LNN 3 appears as:
<logicaldrive number="0" seqno="0" active="1" soft-fail="0" ssd="0" purpose="0">66b60c9f1cd8ce1e57ad0ede0004f446</logicaldrive>
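The active drive set for a node can likewise be pulled from this file. A minimal sketch, assuming drives.xml holds one <logicaldrive> element per drive with the attributes shown above (for supported tooling, use ‘isi devices drive list’):
import xml.etree.ElementTree as ET

tree = ET.parse('/etc/ifs/drives.xml')
# Collect the Lnums of drives that are active and not soft-failed.
lnums = sorted(
    int(d.get('number'))
    for d in tree.iter('logicaldrive')
    if d.get('active') == '1' and d.get('soft-fail') == '0'
)
print(lnums)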
Group messages condense these per-node and per-drive lists by collapsing runs of consecutive IDs into dash-separated ranges, which makes reporting more compact and easier to read. Consider the following list, for example: { 1-3:0-11 }.
However, if drive Lnum 2 on node 2 fails and a replacement disk (Lnum 12) is added, the list becomes:
{ 1:0-11, 2:0-1,3-12, 3:0-11 }.
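Going the other direction, producing the condensed notation from a set of Drive IDs, is equally simple. A small Python sketch (illustrative only, not how GMP itself formats groups):
def condense(ids):
    """Collapse a set of IDs into range notation, e.g. {0,1,3..12} -> '0-1,3-12'."""
    ids = sorted(ids)
    runs = []
    start = prev = ids[0]
    for i in ids[1:]:
        if i != prev + 1:  # gap found: close the current run
            runs.append((start, prev))
            start = i
        prev = i
    runs.append((start, prev))
    return ','.join(f'{a}-{b}' if a != b else str(a) for a, b in runs)

# Node 2 above: original drives 0-11, minus failed Lnum 2, plus replacement Lnum 12.
print(condense(set(range(12)) - {2} | {12}))  # 0-1,3-12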
Unfortunately, changes like these can make cluster groups more challenging to parse.
For example:
{ 1:0-23, 2:0-5,7-10,12-25, 3:0-23, 4:0-7,9-36, 5:0-35, 6:0-9,11-36 }
This describes a cluster with two node pools. Nodes 1 to 3 each have 24 drives, and nodes 4 through 6 each have 36 drives. Nodes 1, 3, and 5 contain all their original drives, whereas node 2 has lost drives 6 and 11, node 4 has lost drive 8, and node 6 is missing drive 10. Because replacement drives receive the next unused Lnums, node 2’s list now extends to 25, and the lists for nodes 4 and 6 each extend to 36.
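For strings like this, the parse_group sketch from earlier in this article can do the bookkeeping, flagging any gaps in each node’s drive numbering:
group = '{ 1:0-23, 2:0-5,7-10,12-25, 3:0-23, 4:0-7,9-36, 5:0-35, 6:0-9,11-36 }'
for node, drives in sorted(parse_group(group).items()):
    # Any Lnum below the node's highest that is absent indicates a failed drive.
    gaps = sorted(set(range(max(drives))) - drives)
    if gaps:
        print(f'node {node}: missing drive(s) {gaps}')
# node 2: missing drive(s) [6, 11]
# node 4: missing drive(s) [8]
# node 6: missing drive(s) [10]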