Hey Checkyourlogs Fans,

Dave Kawula here with a practical tip for monitoring S2D (Storage Spaces Direct) and Azure Stack HCI environments. Anyone who operates these solutions knows how hard it can be to spot a potential failure early and deal with it efficiently. I recently built an S2D solution in a completely isolated environment, sans internet access. In this blog post, I’ll walk you through a simple PowerShell method for monitoring clusters without any third-party monitoring solution.

With no internet connectivity to work with, I turned to PowerShell for the monitoring strategy. SCOM and Windows Admin Center are both viable options, but PowerShell gave me the flexibility to tailor the checks to exactly what we needed. The first iteration of the monitoring script had four objectives:

  1. Identify S2D Clusters: The script enumerates every failover cluster in the Active Directory domain with Get-Cluster -Domain, so no cluster names have to be typed in by hand. As written it does not filter for S2D, so it assumes every cluster it finds is an S2D cluster.
  2. Monitor Virtual Disks: The script uses Get-VirtualDisk to retrieve the operational status and health status of each virtual disk, which is a reliable indicator of overall cluster health.
  3. Track Storage Jobs: The script runs Get-StorageJob on every pass. Storage jobs normally appear only after events such as node reboots, patching, or failures, so an active job tells the operator that something is happening or has recently happened on the cluster.
  4. Detect Noisy Neighbors with Storage QoS Flow: The most challenging part, identifying noisy neighbors, is handled with Get-StorageQosFlow. It reports IOPS, latency, and bandwidth per flow, with InitiatorName identifying the VM, which shows which workloads are consuming the storage and where performance bottlenecks may be building.
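
As a rough illustration of the fourth objective, the sketch below ranks flows by IOPS and keeps only those above a threshold. Get-NoisyNeighbor, its parameters, the 5000 IOPS default, and the sample values are all assumptions made for this example and are not part of the original script. The sample objects stand in for Get-StorageQosFlow output so the snippet can run on any machine.

```powershell
# Hypothetical helper (not in the original script): returns the busiest flows
# at or above an IOPS threshold, highest first.
function Get-NoisyNeighbor {
    param(
        [object[]]$Flow = @(),        # objects shaped like Get-StorageQosFlow output
        [int]$IopsThreshold = 5000,   # assumed value; tune it for your environment
        [int]$Top = 5                 # how many flows to report at most
    )

    $Flow |
        Where-Object { $_.InitiatorIOPS -ge $IopsThreshold } |
        Sort-Object -Property InitiatorIOPS -Descending |
        Select-Object -First $Top -Property InitiatorName, InitiatorIOPS, InitiatorLatency, InitiatorBandwidth
}

# Made-up sample data standing in for Get-StorageQosFlow output.
$sampleFlows = @(
    [PSCustomObject]@{ InitiatorName = 'VM-A'; InitiatorIOPS = 12000; InitiatorLatency = 4.2; InitiatorBandwidth = 900000 }
    [PSCustomObject]@{ InitiatorName = 'VM-B'; InitiatorIOPS = 300;   InitiatorLatency = 0.8; InitiatorBandwidth = 20000 }
    [PSCustomObject]@{ InitiatorName = 'VM-C'; InitiatorIOPS = 7500;  InitiatorLatency = 2.1; InitiatorBandwidth = 450000 }
)

$noisy = @(Get-NoisyNeighbor -Flow $sampleFlows -IopsThreshold 5000)
$noisy | Format-Table -AutoSize
```

On a real cluster the same function could be fed with the output of Get-StorageQosFlow instead of $sampleFlows.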

 

To keep the script tidy, these checks are wrapped in a PowerShell function named Get-DiskInfo. The function uses Invoke-Command to run the queries remotely against a cluster and returns the virtual disks, physical disks, storage jobs, and storage QoS flow information as a single object. The main loop then calls it for every cluster and prints the results every 10 seconds.

# Function to get virtual disks, physical disks, storage jobs, and storage QoS flow information
function Get-DiskInfo {
    param(
        [string]$ClusterName
    )

    # Get disks, storage jobs, and storage QoS flow information remotely
    $disksInfo = Invoke-Command -ComputerName $ClusterName -ScriptBlock {
        $virtualDisks = Get-VirtualDisk | Select-Object FriendlyName, OperationalStatus, HealthStatus
        # Get-PhysicalDisk exposes DeviceId; Number is a Get-Disk property
        $physicalDisks = Get-PhysicalDisk | Select-Object DeviceId, FriendlyName, CanPool, OperationalStatus, HealthStatus
        $storageJobs = Get-StorageJob
        $storageQosFlow = Get-StorageQosFlow | Select-Object InitiatorName, InitiatorIOPS, InitiatorLatency, InitiatorBandwidth

        # Bundle the four result sets into one object for the caller
        return [PSCustomObject]@{
            VirtualDisks   = $virtualDisks
            PhysicalDisks  = $physicalDisks
            StorageJobs    = $storageJobs
            StorageQosFlow = $storageQosFlow
        }
    }

    # Return disks, storage jobs, and storage QoS flow information
    return $disksInfo
}

# Get the domain name
$domainName = $env:USERDOMAIN

# Infinite loop
while ($true) {
    # Get all failover clusters in the Active Directory domain
    $clusters = Get-Cluster -Domain $domainName

    # Iterate through each cluster
    foreach ($cluster in $clusters) {
        $clusterName = $cluster.Name

        # Retrieve virtual disks, physical disks, storage jobs, and storage QoS flow information for the current cluster
        $disksInfo = Get-DiskInfo -ClusterName $clusterName

        # Output virtual disks information
        Write-Host "Virtual Disks information for Cluster: $clusterName"
        $disksInfo.VirtualDisks | Format-Table -AutoSize

        # Output physical disks information
        Write-Host "Physical Disks information for Cluster: $clusterName"
        $disksInfo.PhysicalDisks | Format-Table -AutoSize

        # Output storage jobs information
        Write-Host "Storage Jobs information for Cluster: $clusterName"
        $disksInfo.StorageJobs | Format-Table -AutoSize

        # Output storage QoS flow information
        Write-Host "Storage QoS Flow information for Cluster: $clusterName"
        $disksInfo.StorageQosFlow | Format-Table -AutoSize
    }

    # Wait for 10 seconds before the next iteration
    Start-Sleep -Seconds 10
}
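
The loop above prints everything on every pass. One possible next step, sketched here purely as an illustration, is to reduce each result to a short list of items that need attention. Get-ClusterAlert, its output shape, and the sample data are assumptions for this example and not part of the original script. The sketch also assumes the status values arrive from the remote session as the strings 'Healthy' and 'OK', which is worth verifying in your own environment.

```powershell
# Hypothetical helper (not in the original script): turns one Get-DiskInfo
# style result into a flat list of alert records.
function Get-ClusterAlert {
    param(
        [Parameter(Mandatory)] [string]$ClusterName,
        [Parameter(Mandatory)] $DiskInfo
    )

    # Assumption: HealthStatus is 'Healthy' and OperationalStatus is 'OK' when all is well.
    foreach ($vd in $DiskInfo.VirtualDisks) {
        if ($vd.HealthStatus -ne 'Healthy' -or $vd.OperationalStatus -ne 'OK') {
            [PSCustomObject]@{
                Cluster = $ClusterName
                Source  = 'VirtualDisk'
                Name    = $vd.FriendlyName
                Detail  = "$($vd.OperationalStatus) / $($vd.HealthStatus)"
            }
        }
    }

    foreach ($pd in $DiskInfo.PhysicalDisks) {
        if ($pd.HealthStatus -ne 'Healthy' -or $pd.OperationalStatus -ne 'OK') {
            [PSCustomObject]@{
                Cluster = $ClusterName
                Source  = 'PhysicalDisk'
                Name    = $pd.FriendlyName
                Detail  = "$($pd.OperationalStatus) / $($pd.HealthStatus)"
            }
        }
    }

    # Any storage job at all is worth a look, so every job becomes an alert.
    foreach ($job in $DiskInfo.StorageJobs) {
        [PSCustomObject]@{
            Cluster = $ClusterName
            Source  = 'StorageJob'
            Name    = $job.Name
            Detail  = "$($job.JobState), $($job.PercentComplete)% complete"
        }
    }
}

# Made-up sample shaped like the object Get-DiskInfo returns.
$sample = [PSCustomObject]@{
    VirtualDisks  = @(
        [PSCustomObject]@{ FriendlyName = 'Volume01'; OperationalStatus = 'OK';       HealthStatus = 'Healthy' }
        [PSCustomObject]@{ FriendlyName = 'Volume02'; OperationalStatus = 'Degraded'; HealthStatus = 'Warning' }
    )
    PhysicalDisks = @(
        [PSCustomObject]@{ DeviceId = '1001'; FriendlyName = 'Disk A'; OperationalStatus = 'OK'; HealthStatus = 'Healthy' }
    )
    StorageJobs   = @(
        [PSCustomObject]@{ Name = 'Volume02-Repair'; JobState = 'Running'; PercentComplete = 40 }
    )
}

$alerts = @(Get-ClusterAlert -ClusterName 'CLUSTER01' -DiskInfo $sample)
$alerts | Format-Table -AutoSize
```

Inside the monitoring loop, the same call could take $disksInfo in place of $sample, and an empty result would mean nothing needs attention on that pass.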


 

In short, a few lines of PowerShell give operators a simple way to spot and act on issues in S2D and Azure Stack HCI environments early, even with no internet access and no third-party tooling. Stay tuned for more insights and tips as we continue to explore Microsoft technologies.

Stay vigilant, stay empowered, and stay tuned for the next installment!

Cheers,

Dave Kawula