Constructing an event timeline
When investigating a cluster issue, it can be helpful to build a human-readable timeline of what occurred. This is especially useful when an incident involves multiple, non-simultaneous group changes. The timeline should record which nodes came up or went down, and can be interleaved with panic stack summaries to describe an event.
As such, a timeline of the cluster event above could read:
The next step is to review the logs from the other nodes in the cluster for the same time period and construct a similar timeline for each. Once done, these can be distilled into one comprehensive, cluster-wide timeline.
Before triangulating log events across a cluster, it is important to ensure that the constituent nodes' clocks are all synchronized. To check this, run the 'isi_for_array -q date' command and confirm that the times reported for all nodes match. If they do not, apply each affected node's time offset to the timestamps in its log files.
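The offset correction and the merge into a cluster-wide timeline can be sketched in a few lines of Python. This is a minimal illustration, not a OneFS tool: the function names, the per-node data layout, and the sample events are all hypothetical. The offset is assumed to be the number of seconds to add to a node's timestamps to bring them in line with the reference clock, so a node whose clock runs 30 seconds fast has an offset of -30.

```python
from datetime import datetime, timedelta


def normalize(events, offset_seconds):
    """Shift one node's (timestamp, message) events by its clock offset."""
    delta = timedelta(seconds=offset_seconds)
    return [(ts + delta, msg) for ts, msg in events]


def merge_timelines(per_node):
    """Merge corrected per-node events into one time-ordered list.

    per_node maps a node number to (events, offset_seconds).
    """
    merged = []
    for node, (events, offset_seconds) in per_node.items():
        for ts, msg in normalize(events, offset_seconds):
            merged.append((ts, node, msg))
    return sorted(merged)


# Hypothetical sample data: node 2's clock is assumed to run 30 seconds fast.
per_node = {
    1: ([(datetime(2024, 4, 15, 18, 1, 17), "event seen on node 1")], 0),
    2: ([(datetime(2024, 4, 15, 18, 1, 40), "event seen on node 2")], -30),
}

for ts, node, msg in merge_timelines(per_node):
    print(ts.isoformat(), f"node {node}:", msg)
```

With the correction applied, the node 2 event sorts ahead of the node 1 event even though its raw timestamp is later, which is exactly the kind of ordering error that unsynchronized clocks introduce.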
Here is another example of how to interpret a series of group events in a cluster. Consider the following group info excerpt from the logs on node 1 of the cluster:
2024-04-15-T18:01:17 -04:00 <0.4> tme-1(id1) /boot/kernel.amd64/kernel: [gmp_info.c:1863] (pid 5681="kt: gmp-config") new group: <1,270>: { 1:0-11, down: 2, 6-11, diskless: 6-8 }
2024-04-15-T18:02:05 -04:00 <0.4> tme-1(id1) /boot/kernel.amd64/kernel: [gmp_info.c:1863] (pid 5681="kt: gmp-config") new group: <1,271>: { 1-2:0-11, 6-8, 9-11:0-11, soft_failed: 11, diskless: 6-8 }
2024-04-15-T18:08:56 -04:00 <0.4> tme-1(id1) /boot/kernel.amd64/kernel: [gmp_info.c:1863] (pid 10899="kt: gmp-split") new group: <1,272>: { 1-2:0-11, 6-8, 9-10:0-11, down: 11, soft_failed: 11, diskless: 6-8 }
2024-04-15-T18:08:56 -04:00 <0.4> tme-1(id1) /boot/kernel.amd64/kernel: [gmp_info.c:1863] (pid 10899="kt: gmp-config") new group: <1,273>: { 1-2:0-11, 6-8, 9-10:0-11, diskless: 6-8}
2024-04-15-T18:09:49 -04:00 <0.4> tme-1(id1) /boot/kernel.amd64/kernel: [gmp_info.c:1863] (pid 10998="kt: gmp-config") new group: <1,274>: { 1-2:0-11, 6-8, 9-10:0-11, soft_failed: 10, diskless: 6-8 }
2024-04-15-T18:15:34 -04:00 <0.4> tme-1(id1) /boot/kernel.amd64/kernel: [gmp_info.c:1863] (pid 12863="kt: gmp-split") new group: <1,275>: { 1-2:0-11, 6-8, 9:0-11, down: 10, soft_failed: 10, diskless: 6-8 }
2024-04-15-T18:15:34 -04:00 <0.4> tme-1(id1) /boot/kernel.amd64/kernel: [gmp_info.c:1863] (pid 12863="kt: gmp-config") new group: <1,276>: { 1-2:0-11, 6-8, 9:0-11, diskless: 6-8 }
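When there are many group changes to review, lines like these can be reduced to a list of node arrivals and departures programmatically. The Python sketch below is one possible approach, not a OneFS utility. It assumes that entries listed before any label are nodes that are up (any ':drives' suffix is ignored) and that a label such as 'down:' applies to the bare entries that follow it until the next label. The names GROUP_RE, expand, parse_group, and timeline are hypothetical, and the sample input is the first three log lines from the excerpt above.

```python
import re

# Matches the "new group" portion of a group-change log line.
GROUP_RE = re.compile(r"new group: (?P<gid><\d+,\d+>): \{(?P<body>[^}]*)\}")


def expand(spec):
    """Expand a node spec such as '6-8' or '2' into a set of node numbers."""
    lo, _, hi = spec.partition("-")
    return set(range(int(lo), int(hi or lo) + 1))


def parse_group(body):
    """Split a group body into sets of node numbers keyed by section."""
    sections = {"up": set()}
    current = "up"
    for token in body.split(","):
        token = token.strip()
        if token and token[0].isalpha():
            # A label such as 'down: 2' starts a new section.
            label, _, token = token.partition(":")
            current = label.strip()
            sections.setdefault(current, set())
            token = token.strip()
        if token:
            # Keep the node part only; drop any ':drives' suffix.
            sections[current] |= expand(token.split(":")[0])
    return sections


def timeline(lines):
    """Return (timestamp, group id, joined, left) for each group change."""
    entries, prev_up = [], None
    for line in lines:
        match = GROUP_RE.search(line)
        if not match:
            continue
        up = parse_group(match.group("body"))["up"]
        if prev_up is not None:
            entries.append((line.split()[0], match.group("gid"),
                            sorted(up - prev_up), sorted(prev_up - up)))
        prev_up = up
    return entries


LOG_LINES = [
    '2024-04-15-T18:01:17 -04:00 <0.4> tme-1(id1) /boot/kernel.amd64/kernel: [gmp_info.c:1863] (pid 5681="kt: gmp-config") new group: <1,270>: { 1:0-11, down: 2, 6-11, diskless: 6-8 }',
    '2024-04-15-T18:02:05 -04:00 <0.4> tme-1(id1) /boot/kernel.amd64/kernel: [gmp_info.c:1863] (pid 5681="kt: gmp-config") new group: <1,271>: { 1-2:0-11, 6-8, 9-11:0-11, soft_failed: 11, diskless: 6-8 }',
    '2024-04-15-T18:08:56 -04:00 <0.4> tme-1(id1) /boot/kernel.amd64/kernel: [gmp_info.c:1863] (pid 10899="kt: gmp-split") new group: <1,272>: { 1-2:0-11, 6-8, 9-10:0-11, down: 11, soft_failed: 11, diskless: 6-8 }',
]

for stamp, gid, joined, left in timeline(LOG_LINES):
    print(stamp, gid, "joined:", joined, "left:", left)
```

The first group in the input serves only as the baseline, so the output starts with the second group change. Comparing only the set of up nodes keeps the sketch short; the soft_failed and diskless sections are parsed by parse_group and could be reported in the same way.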
The timeline of events here can be interpreted as follows:
Because group changes document the cluster's actual configuration from OneFS' perspective, they are a vital tool for understanding which devices the cluster considers available, and which it considers failed, at a specific point in time. This information, when combined with other data from cluster logs, can provide a succinct but detailed cluster history, simplifying both debugging and failure analysis.