Troubleshoot bare metal machine issues using the az networkcloud baremetalmachine run-data-extract command
There might be situations where a user needs to investigate and resolve issues with an on-premises bare metal machine. Azure Operator Nexus provides a prescribed set of data extract commands via az networkcloud baremetalmachine run-data-extract. These commands enable users to get diagnostic data from a bare metal machine.
The command produces an output file containing the results of the data extract. By default, the data is sent to the Cluster Manager storage account. There's also a preview method where users can configure the Cluster resource with a storage account and identity that has access to the storage account to receive the output.
Prerequisites
- This article assumes that the Azure command line interface and the networkcloud command line interface extension are installed. For more information, see How to Install CLI Extensions.
- The target bare metal machine is on and ready.
- The syntax for these commands is based on the 0.3.0+ version of the az networkcloud CLI.
- Get the Cluster Managed Resource group name (cluster_MRG) that you created for Cluster resource.
Send command output to a user specified Storage Account
To configure the Storage Account and container to which command output is sent, see Azure Operator Nexus Cluster support for managed identities and user provided resources.
To access the output of a command, users need the appropriate access to the storage blob, including both having the necessary Azure role assignments and ensuring that any networking restrictions are properly configured.
For role assignments, a user must have the following role assignments on the blob container or its Storage Account:
- A data access role, such as Storage Blob Data Reader or Storage Blob Data Contributor
- The Azure Resource Manager Reader role, at a minimum
For information on assigning roles to storage accounts, see Assign an Azure role for access to blob data.
For networking restrictions, if the Storage Account allows public endpoint access via a firewall, the firewall must be configured with a networking rule to allow that user's IP address through. If it allows only private endpoint access, a user must be part of a network that has access to the private endpoint.
For information on allowing access through the storage account firewall using networking rules or private endpoints, see the respective documentation.
Verify access to the specified Storage Account
Before running commands, you might wish to verify you have access to the specified Storage Account:
- From the Azure portal, navigate to the Storage Account.
- In the Storage Account details, select Storage browser from the navigation menu on the left side.
- In the Storage browser details, select Blob containers.
- Find the container to which command output is to be sent and select it.
- If you encounter errors while accessing the Storage Account or container, the user you're using might need a role assignment for the Storage Account or container. Alternatively, the Storage Account’s firewall settings might need to be updated to include your IP address.
Execute a run-data-extract command
The run data extract command executes one or more predefined scripts to extract data from a bare metal machine.
Warning
Microsoft doesn't provide or support any Operator Nexus API calls that expect plaintext username and/or password to be supplied. Note any values sent are logged and are considered exposed secrets, which should be rotated and revoked. The Microsoft documented method for securely using secrets is to store them in an Azure Key Vault. If you have specific questions or concerns, submit a request via the Azure portal.
The following commands are currently supported:
- SupportAssist/TSR collection for Dell troubleshooting
  Command Name: hardware-support-data-collection
  Arguments: Type of logs requested
  - SysInfo - System Information
  - TTYLog - Storage TTYLog data
  - Debug - debug logs
Warning
As of the v20250701preview API version and above, the mde-agent-information command is no longer supported by the non-restricted run-data-extract command. To run mde-agent-information, see Executing a run-data-extracts-restricted Command.
- Collect Microsoft Defender for Endpoint (MDE) agent information
  Command Name: mde-agent-information
  Arguments: None
- Collect MDE diagnostic support logs
  Command Name: mde-support-diagnostics
  Arguments: None
- Collect Dell Hardware Rollup Status
  Command Name: hardware-rollup-status
  Arguments: None
Warning
As of the v20250701preview API version and above, the cluster-cve-report command is no longer supported by the non-restricted run-data-extract command. To run cluster-cve-report, see Executing a run-data-extracts-restricted Command.
- Generate Cluster Common Vulnerabilities and Exposures (CVE) Report
  Command Name: cluster-cve-report
  Arguments: None
- Collect Helm Releases
  Command Name: collect-helm-releases
  Arguments: None
- Collect systemctl status Output
  Command Name: platform-services-status
  Arguments: None
- Collect System Diagnostics
  Command Name: collect-system-diagnostics
  Arguments: None
The command syntax is:
az networkcloud baremetalmachine run-data-extract --name "<machine-name>" \
--resource-group "<cluster_MRG>" \
--subscription "<subscription>" \
--commands '[{"arguments":["<arg1>","<arg2>"],"command":"<command1>"}]' \
--limit-time-seconds "<timeout>"
Specify multiple commands using JSON format in the --commands option. Each entry specifies a command and its arguments. For a command with multiple arguments, provide them as a list to the arguments parameter. See Azure CLI Shorthand for instructions on constructing the --commands structure.
These commands can be long running, so the recommendation is to set --limit-time-seconds to at least 600 seconds (10 minutes). The Debug option or running multiple extracts might take longer than 10 minutes.
The operation runs asynchronously, and the response returns an HTTP status code of 202. See the How to view the full output of a command in the associated Storage Account section for details on how to track command completion and view the output file.
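The --commands structure described above can be assembled programmatically instead of by hand. The following is a minimal Python sketch, not an official tool: the function names build_commands_payload and build_cli_args are hypothetical, and the SUPPORTED_COMMANDS table is transcribed from the command list in this article, so it needs to be updated if that list changes. The sketch only builds and prints the command line; it doesn't run it.

```python
import json
import shlex

# Command names and allowed arguments, transcribed from the list in this article.
# Assumption: a command with an empty set takes no arguments.
SUPPORTED_COMMANDS = {
    "hardware-support-data-collection": {"SysInfo", "TTYLog", "Debug"},
    "mde-agent-information": set(),
    "mde-support-diagnostics": set(),
    "hardware-rollup-status": set(),
    "cluster-cve-report": set(),
    "collect-helm-releases": set(),
    "platform-services-status": set(),
    "collect-system-diagnostics": set(),
}


def build_commands_payload(requests):
    """Return the JSON string for --commands from (command, arguments) pairs."""
    payload = []
    for command, arguments in requests:
        if command not in SUPPORTED_COMMANDS:
            raise ValueError(f"unsupported command: {command}")
        unknown = set(arguments) - SUPPORTED_COMMANDS[command]
        if unknown:
            raise ValueError(f"unsupported arguments for {command}: {sorted(unknown)}")
        entry = {"command": command}
        if arguments:
            entry["arguments"] = list(arguments)
        payload.append(entry)
    return json.dumps(payload)


def build_cli_args(machine, resource_group, subscription, requests, limit_seconds=600):
    """Assemble the argument list for run-data-extract (hypothetical helper)."""
    return [
        "az", "networkcloud", "baremetalmachine", "run-data-extract",
        "--name", machine,
        "--resource-group", resource_group,
        "--subscription", subscription,
        "--commands", build_commands_payload(requests),
        "--limit-time-seconds", str(limit_seconds),
    ]


# Placeholder names, matching the examples in this article.
args = build_cli_args(
    "bareMetalMachineName",
    "cluster_MRG",
    "subscription",
    [("hardware-support-data-collection", ["SysInfo", "TTYLog"]),
     ("hardware-rollup-status", [])],
)
# shlex.join quotes the JSON payload so the line can be pasted into a POSIX shell.
print(shlex.join(args))
```

Building the payload with json.dumps avoids quoting mistakes in hand-written JSON, and validating the names up front catches typos before the asynchronous operation starts.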
Hardware Support Data Collection
This example executes the hardware-support-data-collection command and gets SysInfo and TTYLog logs from the Dell Server. The script executes a racadm supportassist collect command on the designated bare metal machine. The resulting tar.gz file contains the zipped extract command file outputs in hardware-support-data-<timestamp>.zip.
az networkcloud baremetalmachine run-data-extract --name "bareMetalMachineName" \
--resource-group "cluster_MRG" \
--subscription "subscription" \
--commands '[{"arguments":["SysInfo", "TTYLog"],"command":"hardware-support-data-collection"}]' \
--limit-time-seconds 600
hardware-support-data-collection Output
====Action Command Output====
Executing hardware-support-data-collection command
Getting following hardware support logs: SysInfo,TTYLog
Job JID_814372800396 is running, waiting for it to complete ...
Job JID_814372800396 Completed.
---------------------------- JOB -------------------------
[Job ID=JID_814372800396]
Job Name=SupportAssist Collection
Status=Completed
Scheduled Start Time=[Not Applicable]
Expiration Time=[Not Applicable]
Actual Start Time=[Thu, 13 Apr 2023 20:54:40]
Actual Completion Time=[Thu, 13 Apr 2023 20:59:51]
Message=[SRV088: The SupportAssist Collection Operation is completed successfully.]
Percent Complete=[100]
----------------------------------------------------------
Deleting Job JID_814372800396
Collection successfully exported to /hostfs/tmp/runcommand/hardware-support-data-2023-04-13T21:00:01.zip
================================
Script execution result can be found in storage account:
https://cm2p9bctvhxnst.blob.core.windows.net/bmm-run-command-output/dd84df50-7b02-4d10-a2be-46782cbf4eef-action-bmmdataextcmd.tar.gz?se=2023-04-14T01%3A00%3A15Z&sig=ZJcsNoBzvOkUNL0IQ3XGtbJSaZxYqmtd%2BM6rmxDFqXE%3D&sp=r&spr=https&sr=b&st=2023-04-13T21%3A00%3A15Z&sv=2019-12-12
Example list of hardware support files collected
Archive: TSR20240227164024_FM56PK3.pl.zip
creating: tsr/hardware/
creating: tsr/hardware/spd/
creating: tsr/hardware/sysinfo/
creating: tsr/hardware/sysinfo/inventory/
inflating: tsr/hardware/sysinfo/inventory/sysinfo_CIM_BIOSAttribute.xml
inflating: tsr/hardware/sysinfo/inventory/sysinfo_CIM_Sensor.xml
inflating: tsr/hardware/sysinfo/inventory/sysinfo_DCIM_View.xml
inflating: tsr/hardware/sysinfo/inventory/sysinfo_DCIM_SoftwareIdentity.xml
inflating: tsr/hardware/sysinfo/inventory/sysinfo_CIM_Capabilities.xml
inflating: tsr/hardware/sysinfo/inventory/sysinfo_CIM_StatisticalData.xml
creating: tsr/hardware/sysinfo/lcfiles/
inflating: tsr/hardware/sysinfo/lcfiles/lclog_0.xml.gz
inflating: tsr/hardware/sysinfo/lcfiles/curr_lclog.xml
creating: tsr/hardware/psu/
creating: tsr/hardware/idracstateinfo/
inflating: tsr/hardware/idracstateinfo/avc.log
extracting: tsr/hardware/idracstateinfo/avc.log.persistent.1
[..snip..]
Collect MDE Agent Information
Data is collected with the mde-agent-information command and formatted as JSON
to /hostfs/tmp/runcommand/mde-agent-information.json. The JSON file is found
in the data extract zip file located in the storage account. The script executes a
sequence of mdatp commands on the designated bare metal machine.
This example executes the mde-agent-information command without arguments.
az networkcloud baremetalmachine run-data-extract --name "bareMetalMachineName" \
--resource-group "cluster_MRG" \
--subscription "subscription" \
--commands '[{"command":"mde-agent-information"}]' \
--limit-time-seconds 600
mde-agent-information Output
====Action Command Output====
Executing mde-agent-information command
MDE agent is running, proceeding with data extract
Getting MDE agent information for bareMetalMachine
Writing to /hostfs/tmp/runcommand
================================
Script execution result can be found in storage account:
https://cmzhnh6bdsfsdwpbst.blob.core.windows.net/bmm-run-command-output/f5962f18-2228-450b-8cf7-cb8344fdss63b0-action-bmmdataextcmd.tar.gz?se=2023-07-26T19%3A07%3A22Z&sig=X9K3VoNWRFP78OKqFjvYoxubp65BbNTq%2BGnlHclI9Og%3D&sp=r&spr=https&sr=b&st=2023-07-26T15%3A07%3A22Z&sv=2019-12-12
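The download link in the output summary is a time-limited SAS URL, so a failed download can simply mean the link has expired. The sketch below, using only the Python standard library, splits such a URL into its container, blob name, permissions, and validity window. The function names describe_sas_url and is_expired are hypothetical, and the sketch assumes the se and st values use the UTC format shown in the sample output.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit


def describe_sas_url(url):
    """Split a blob SAS URL into the pieces useful for troubleshooting downloads."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)  # also decodes %3A back to ":"

    def first(key):
        values = query.get(key)
        return values[0] if values else None

    def parse_time(value):
        # Assumption: times look like 2023-07-26T19:07:22Z, as in the sample output.
        if value is None:
            return None
        return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)

    container, _, blob = parts.path.lstrip("/").partition("/")
    return {
        "account_host": parts.netloc,
        "container": container,
        "blob": blob,
        "permissions": first("sp"),
        "starts": parse_time(first("st")),
        "expires": parse_time(first("se")),
    }


def is_expired(info, now=None):
    """Return True when the link's expiry time has passed."""
    now = now or datetime.now(timezone.utc)
    return info["expires"] is not None and now >= info["expires"]


# The sample link from the mde-agent-information output above.
sample_url = (
    "https://cmzhnh6bdsfsdwpbst.blob.core.windows.net/bmm-run-command-output/"
    "f5962f18-2228-450b-8cf7-cb8344fdss63b0-action-bmmdataextcmd.tar.gz"
    "?se=2023-07-26T19%3A07%3A22Z&sig=X9K3VoNWRFP78OKqFjvYoxubp65BbNTq%2BGnlHclI9Og%3D"
    "&sp=r&spr=https&sr=b&st=2023-07-26T15%3A07%3A22Z&sv=2019-12-12"
)
info = describe_sas_url(sample_url)
print(info["container"], info["permissions"], info["expires"] - info["starts"])
```

For the sample link, the window between st and se is four hours and the permission is read-only (sp=r).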
Example JSON object collected
{
"diagnosticInformation": {
"realTimeProtectionStats": $real_time_protection_stats,
"eventProviderStats": $event_provider_stats
},
"mdeDefinitions": $mde_definitions,
"generalHealth": $general_health,
"mdeConfiguration": $mde_config,
"scanList": $scan_list,
"threatInformation": {
"list": $threat_info_list,
"quarantineList": $threat_info_quarantine_list
}
}
Collect MDE Support Diagnostics
Data collected from the mde-support-diagnostics command uses the MDE Client Analyzer tool to bundle information from mdatp commands and relevant log files. The storage account tgz file contains a zip file named mde-support-diagnostics-<hostname>.zip. The zip should be sent along with any support requests to ensure the supporting teams can use the logs for troubleshooting and root cause analysis, if needed.
This example executes the mde-support-diagnostics command without arguments.
az networkcloud baremetalmachine run-data-extract --name "bareMetalMachineName" \
--resource-group "cluster_MRG" \
--subscription "subscription" \
--commands '[{"command":"mde-support-diagnostics"}]' \
--limit-time-seconds 600
mde-support-diagnostics Output
====Action Command Output====
Executing mde-support-diagnostics command
[2024-01-23 16:07:37.588][INFO] XMDEClientAnalyzer Version: 1.3.2
[2024-01-23 16:07:38.367][INFO] Top Command output: [/tmp/top_output_2024_01_23_16_07_37mel0nue0.txt]
[2024-01-23 16:07:38.367][INFO] Top Command Summary: [/tmp/top_summary_2024_01_23_16_07_370zh7dkqn.txt]
[2024-01-23 16:07:38.367][INFO] Top Command Outliers: [/tmp/top_outlier_2024_01_23_16_07_37aypcfidh.txt]
[2024-01-23 16:07:38.368][INFO] [MDE Diagnostic]
[2024-01-23 16:07:38.368][INFO] Collecting MDE Diagnostic
[2024-01-23 16:07:38.613][WARNING] mde is not running
[2024-01-23 16:07:41.343][INFO] [SLEEP] [3sec] waiting for agent to create diagnostic package
[2024-01-23 16:07:44.347][INFO] diagnostic package path: /var/opt/microsoft/mdatp/wdavdiag/5b1edef9-3b2a-45c1-a45d-9e7e4b6b869e.zip
[2024-01-23 16:07:44.347][INFO] Successfully created MDE diagnostic zip
[2024-01-23 16:07:44.348][INFO] Adding mde_diagnostic.zip to report directory
[2024-01-23 16:07:44.348][INFO] Collecting MDE Health
[...snip...]
================================
Script execution result can be found in storage account:
https://cmmj627vvrzkst.blob.core.windows.net/bmm-run-command-output/7c5557b9-b6b6-a4a4-97ea-752c38918ded-action-bmmdataextcmd.tar.gz?se=2024-01-23T20%3A11%3A32Z&sig=9h20XlZO87J7fCr0S1234xcyu%2Fl%2BVuaDh1BE0J6Yfl8%3D&sp=r&spr=https&sr=b&st=2024-01-23T16%3A11%3A32Z&sv=2019-12-12
After you download the execution result file, the support files can be unzipped for analysis.
Example list of information collected by the MDE Client Analyzer
Archive: mde-support-diagnostics-rack1compute02.zip
inflating: mde_diagnostic.zip
inflating: process_information.txt
inflating: auditd_info.txt
inflating: auditd_log_analysis.txt
inflating: auditd_logs.zip
inflating: ebpf_kernel_config.txt
inflating: ebpf_enabled_func.txt
inflating: ebpf_syscalls.zip
inflating: ebpf_raw_syscalls.zip
inflating: messagess.zip
inflating: conflicting_processes_information.txt
[...snip...]
Hardware Rollup Status
Data is collected with the hardware-rollup-status command and formatted as JSON to /hostfs/tmp/runcommand/rollupStatus.json. The JSON file is found
in the data extract zip file located in the storage account. The data collected shows the health of the machine subsystems.
This example executes the hardware-rollup-status command without arguments.
az networkcloud baremetalmachine run-data-extract --name "bareMetalMachineName" \
--resource-group "cluster_MRG" \
--subscription "subscription" \
--commands '[{"command":"hardware-rollup-status"}]' \
--limit-time-seconds 600
hardware-rollup-status Output
====Action Command Output====
Executing hardware-rollup-status command
Getting rollup status logs for b37dev03a1c002
Writing to /hostfs/tmp/runcommand
================================
Script execution result can be found in storage account:
https://cmkfjft8twwpst.blob.core.windows.net/bmm-run-command-output/20b217b5-ea38-4394-9db1-21a0d392eff0-action-bmmdataextcmd.tar.gz?se=2023-09-19T18%3A47%3A17Z&sig=ZJcsNoBzvOkUNL0IQ3XGtbJSaZxYqmtd%3D&sp=r&spr=https&sr=b&st=2023-09-19T14%3A47%3A17Z&sv=2019-12-12
Example JSON Collected
{
"@odata.context" : "/redfish/v1/$metadata#DellRollupStatusCollection.DellRollupStatusCollection",
"@odata.id" : "/redfish/v1/Systems/System.Embedded.1/Oem/Dell/DellRollupStatus",
"@odata.type" : "#DellRollupStatusCollection.DellRollupStatusCollection",
"Description" : "A collection of DellRollupStatus resource",
"Members" :
[
{
"@odata.context" : "/redfish/v1/$metadata#DellRollupStatus.DellRollupStatus",
"@odata.id" : "/redfish/v1/Systems/System.Embedded.1/Oem/Dell/DellRollupStatus/iDRAC.Embedded.1_0x23_SubSystem.1_0x23_Current",
"@odata.type" : "#DellRollupStatus.v1_0_0.DellRollupStatus",
"CollectionName" : "CurrentRollupStatus",
"Description" : "Represents the subcomponent roll-up statuses.",
"Id" : "iDRAC.Embedded.1_0x23_SubSystem.1_0x23_Current",
"InstanceID" : "iDRAC.Embedded.1#SubSystem.1#Current",
"Name" : "DellRollupStatus",
"RollupStatus" : "Ok",
"SubSystem" : "Current"
},
{
"@odata.context" : "/redfish/v1/$metadata#DellRollupStatus.DellRollupStatus",
"@odata.id" : "/redfish/v1/Systems/System.Embedded.1/Oem/Dell/DellRollupStatus/iDRAC.Embedded.1_0x23_SubSystem.1_0x23_Voltage",
"@odata.type" : "#DellRollupStatus.v1_0_0.DellRollupStatus",
"CollectionName" : "VoltageRollupStatus",
"Description" : "Represents the subcomponent roll-up statuses.",
"Id" : "iDRAC.Embedded.1_0x23_SubSystem.1_0x23_Voltage",
"InstanceID" : "iDRAC.Embedded.1#SubSystem.1#Voltage",
"Name" : "DellRollupStatus",
"RollupStatus" : "Ok",
"SubSystem" : "Voltage"
},
[..snip..]
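Once rollupStatus.json is extracted, the subsystems that need attention can be picked out with a few lines of Python. This is a sketch under stated assumptions: the function name unhealthy_subsystems is hypothetical, it relies only on the Members, SubSystem, and RollupStatus fields shown in the sample above, and the "Degraded" value in the sample data is made up for illustration.

```python
import json


def unhealthy_subsystems(rollup):
    """Return (SubSystem, RollupStatus) pairs whose status is not "Ok"."""
    return [
        (member.get("SubSystem"), member.get("RollupStatus"))
        for member in rollup.get("Members", [])
        if member.get("RollupStatus") != "Ok"
    ]


# Trimmed sample in the shape shown above; the "Degraded" status is illustrative.
sample = json.loads("""
{
  "Members": [
    {"Name": "DellRollupStatus", "RollupStatus": "Ok", "SubSystem": "Current"},
    {"Name": "DellRollupStatus", "RollupStatus": "Degraded", "SubSystem": "Voltage"}
  ]
}
""")

print(unhealthy_subsystems(sample))  # prints [('Voltage', 'Degraded')]
```

With a real extract, load the file with json.load instead of the inline sample.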
Generate Cluster CVE Report
Vulnerability data is collected with the cluster-cve-report command and formatted as JSON to {year}-{month}-{day}-nexus-cluster-vulnerability-report.json. The JSON file is found in the data extract zip file located in the storage account. The data collected includes vulnerability data per container image in the cluster.
This example executes the cluster-cve-report command without arguments.
Note
The target machine must be a control-plane node or the action doesn't execute.
az networkcloud baremetalmachine run-data-extract --name "bareMetalMachineName" \
--resource-group "cluster_MRG" \
--subscription "subscription" \
--commands '[{"command":"cluster-cve-report"}]' \
--limit-time-seconds 600
cluster-cve-report Output
====Action Command Output====
Nexus cluster vulnerability report saved.
================================
Script execution result can be found in storage account:
https://cmkfjft8twwpst.blob.core.windows.net/bmm-run-command-output/20b217b5-ea38-4394-9db1-21a0d392eff0-action-bmmdataextcmd.tar.gz?se=2023-09-19T18%3A47%3A17Z&sig=ZJcsNoBzvOkUNL0IQ3XGtbJSaZxYqmtd%3D&sp=r&spr=https&sr=b&st=2023-09-19T14%3A47%3A17Z&sv=2019-12-12
CVE Report Schema
{
"$schema": "http://json-schema.org/draft-07/schema#",
"title": "Vulnerability Report",
"type": "object",
"properties": {
"metadata": {
"type": "object",
"properties": {
"dateRetrieved": {
"type": "string",
"format": "date-time",
"description": "The date and time when the data was retrieved."
},
"platform": {
"type": "string",
"description": "The name of the platform."
},
"resource": {
"type": "string",
"description": "The name of the resource."
},
"clusterId": {
"type": "string",
"description": "The resource ID of the cluster."
},
"runtimeVersion": {
"type": "string",
"description": "The version of the runtime."
},
"managementVersion": {
"type": "string",
"description": "The version of the management software."
},
"vulnerabilitySummary": {
"type": "object",
"properties": {
"uniqueVulnerabilities": {
"type": "object",
"properties": {
"critical": { "type": "integer" },
"high": { "type": "integer" },
"medium": { "type": "integer" },
"low": { "type": "integer" },
"unknown": { "type": "integer" }
},
"required": ["critical", "high", "medium", "low", "unknown"]
},
"totalVulnerabilities": {
"type": "object",
"properties": {
"critical": { "type": "integer" },
"high": { "type": "integer" },
"medium": { "type": "integer" },
"low": { "type": "integer" },
"unknown": { "type": "integer" }
},
"required": ["critical", "high", "medium", "low", "unknown"]
}
},
"required": ["uniqueVulnerabilities", "totalVulnerabilities"]
}
},
"required": [
"dateRetrieved",
"platform",
"resource",
"clusterId",
"runtimeVersion",
"managementVersion",
"vulnerabilitySummary"
]
},
"containers": {
"type": "object",
"additionalProperties": {
"type": "array",
"items": {
"type": "object",
"properties": {
"namespace": {
"type": "array",
"description": "The namespaces where the container image is in-use.",
"items": { "type": "string" }
},
"digest": {
"type": "string",
"description": "The digest of the container image."
},
"observedCount": {
"type": "integer",
"description": "The number of times the container image has been observed."
},
"os": {
"type": "object",
"properties": {
"family": {
"type": "string",
"description": "The family of the operating system."
}
},
"required": ["family"]
},
"vulnerabilities": {
"type": "array",
"items": {
"type": "object",
"properties": {
"title": { "type": "string" },
"vulnerabilityID": { "type": "string" },
"fixedVersion": { "type": "string" },
"installedVersion": { "type": "string" },
"referenceLink": { "type": "string", "format": "uri" },
"publishedDate": { "type": "string", "format": "date-time" },
"dataSource": { "type": "string" },
"score": { "type": "number" },
"severity": { "type": "string" },
"severitySource": { "type": "string" },
"resource": { "type": "string" },
"target": { "type": "string" },
"packageType": { "type": "string" },
"exploitAvailable": { "type": "boolean" }
},
"required": [
"title",
"vulnerabilityID",
"fixedVersion",
"installedVersion",
"referenceLink",
"publishedDate",
"dataSource",
"score",
"severity",
"severitySource",
"resource",
"target",
"packageType",
"exploitAvailable"
]
}
}
},
"required": ["namespace", "digest", "os", "observedCount", "vulnerabilities"]
}
}
}
},
"required": ["metadata", "containers"]
}
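As an illustration of how a report that follows this schema might be consumed, the Python sketch below tallies vulnerabilities by severity for each key under containers and lists the entries flagged with exploitAvailable. The function names are hypothetical. The sketch assumes each key under containers identifies a container image, and the sample image name, CVE IDs, and severity strings are invented placeholders rather than real report data.

```python
from collections import Counter


def severity_counts_by_image(report):
    """Count vulnerabilities per severity for each key under "containers"."""
    counts = {}
    for image, entries in report.get("containers", {}).items():
        tally = Counter()
        for entry in entries:
            for vuln in entry.get("vulnerabilities", []):
                tally[vuln["severity"]] += 1
        counts[image] = dict(tally)
    return counts


def exploitable(report):
    """Yield (image, vulnerabilityID, fixedVersion) where exploitAvailable is true."""
    for image, entries in report.get("containers", {}).items():
        for entry in entries:
            for vuln in entry.get("vulnerabilities", []):
                if vuln.get("exploitAvailable"):
                    yield image, vuln["vulnerabilityID"], vuln["fixedVersion"]


# Hypothetical, heavily trimmed report; real entries carry every required field.
sample_report = {
    "metadata": {},
    "containers": {
        "registry.example/app:1.0": [
            {
                "namespace": ["default"],
                "digest": "sha256:0000",
                "observedCount": 1,
                "os": {"family": "example"},
                "vulnerabilities": [
                    {"vulnerabilityID": "CVE-0000-0001", "severity": "HIGH",
                     "fixedVersion": "1.2.3", "exploitAvailable": True},
                    {"vulnerabilityID": "CVE-0000-0002", "severity": "LOW",
                     "fixedVersion": "", "exploitAvailable": False},
                ],
            }
        ]
    },
}

print(severity_counts_by_image(sample_report))
print(list(exploitable(sample_report)))
```

The severity values are counted exactly as they appear in the report, so the sketch doesn't depend on a particular spelling or casing.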
CVE Data Details
The CVE data is refreshed per container image every 24 hours or when there's a change to the Kubernetes resource referencing the image.
Collect Helm Releases
Helm release data is collected with the collect-helm-releases command and formatted as JSON to {year}-{month}-{day}-helm-releases.json. The JSON file is found in the data extract zip file located in the storage account. The data collected includes all Helm release information from the Cluster, which consists of the standard data returned when running the command helm list.
This example executes the collect-helm-releases command without arguments.
Note
The target machine must be a control-plane node or the action doesn't execute.
az networkcloud baremetalmachine run-data-extract --name "bareMetalMachineName" \
--resource-group "cluster_MRG" \
--subscription "subscription" \
--commands '[{"command":"collect-helm-releases"}]' \
--limit-time-seconds 600
collect-helm-releases Output
====Action Command Output====
Helm releases report saved.
================================
Script execution result can be found in storage account:
https://cmcr5xp3mbn7st.blob.core.windows.net/bmm-run-command-output/a29dcbdb-5524-4172-8b55-88e0e5ec93ff-action-bmmdataextcmd.tar.gz?se=2024-10-30T02%3A09%3A54Z&sig=v6cjiIDBP9viEijs%2B%2BwJDrHIAbLEmuiVmCEEDHEi%2FEc%3D&sp=r&spr=https&sr=b&st=2024-10-29T22%3A09%3A54Z&sv=2023-11-03
Helm Release Schema
{
"$schema": "http://json-schema.org/schema#",
"type": "object",
"properties": {
"metadata": {
"type": "object",
"properties": {
"dateRetrieved": {
"type": "string"
},
"platform": {
"type": "string"
},
"resource": {
"type": "string"
},
"clusterId": {
"type": "string"
},
"runtimeVersion": {
"type": "string"
},
"managementVersion": {
"type": "string"
}
},
"required": [
"clusterId",
"dateRetrieved",
"managementVersion",
"platform",
"resource",
"runtimeVersion"
]
},
"helmReleases": {
"type": "array",
"items": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"namespace": {
"type": "string"
},
"revision": {
"type": "string"
},
"updated": {
"type": "string"
},
"status": {
"type": "string"
},
"chart": {
"type": "string"
},
"app_version": {
"type": "string"
}
},
"required": [
"app_version",
"chart",
"name",
"namespace",
"revision",
"status",
"updated"
]
}
}
},
"required": [
"helmReleases",
"metadata"
]
}
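A report that follows this schema can be filtered for releases that aren't healthy. The sketch below is one possible approach in Python: the function name releases_needing_attention is hypothetical, the sample release names are invented, and it assumes the status field carries the values that helm list prints, where deployed indicates a healthy release.

```python
def releases_needing_attention(report, healthy_status="deployed"):
    """Return (namespace, name, status) for releases not in the healthy status."""
    return [
        (release["namespace"], release["name"], release["status"])
        for release in report.get("helmReleases", [])
        if release["status"] != healthy_status
    ]


# Hypothetical sample in the shape of the schema above (metadata trimmed).
sample_releases = {
    "metadata": {},
    "helmReleases": [
        {"name": "example-a", "namespace": "ns-one", "revision": "3",
         "updated": "2024-10-29 22:00:00", "status": "deployed",
         "chart": "example-a-1.0.0", "app_version": "1.0.0"},
        {"name": "example-b", "namespace": "ns-two", "revision": "1",
         "updated": "2024-10-29 22:05:00", "status": "failed",
         "chart": "example-b-0.2.0", "app_version": "0.2.0"},
    ],
}

print(releases_needing_attention(sample_releases))
# prints [('ns-two', 'example-b', 'failed')]
```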
Collect Systemctl Status Output
Service status is collected with the platform-services-status command. The output is in plain text format and
returns an overview of the status of the services on the host and the systemctl status for each found service.
This example executes the platform-services-status command without arguments.
az networkcloud baremetalmachine run-data-extract --name "bareMetalMachineName" \
--resource-group "cluster_MRG" \
--subscription "subscription" \
--commands '[{"command":"platform-services-status"}]' \
--limit-time-seconds 600 \
--output-directory "/path/to/local/directory"
platform-services-status Output
====Action Command Output====
UNIT LOAD ACTIVE SUB DESCRIPTION
aods-infra-vf-config.service not-found inactive dead aods-infra-vf-config.service
aods-pnic-config-infra.service not-found inactive dead aods-pnic-config-infra.service
aods-pnic-config-workload.service not-found inactive dead aods-pnic-config-workload.service
arc-unenroll-file-semaphore.service loaded active exited Arc-unenrollment upon shutdown service
atop-rotate.service loaded inactive dead Restart atop daemon to rotate logs
atop.service loaded active running Atop advanced performance monitor
atopacct.service loaded active running Atop process accounting daemon
audit.service loaded inactive dead Audit service
auditd.service loaded active running Security Auditing Service
azurelinux-sysinfo.service loaded inactive dead Azure Linux Sysinfo Service
blk-availability.service loaded inactive dead Availability of block devices
[..snip..]
-------
● arc-unenroll-file-semaphore.service - Arc-unenrollment upon shutdown service
Loaded: loaded (/etc/systemd/system/arc-unenroll-file-semaphore.service; enabled; vendor preset: enabled)
Active: active (exited) since Tue 2024-11-12 06:33:40 UTC; 11h ago
Main PID: 11663 (code=exited, status=0/SUCCESS)
CPU: 5ms
Nov 12 06:33:39 rack1compute01 systemd[1]: Starting Arc-unenrollment upon shutdown service...
Nov 12 06:33:40 rack1compute01 systemd[1]: Finished Arc-unenrollment upon shutdown service.
-------
○ atop-rotate.service - Restart atop daemon to rotate logs
Loaded: loaded (/usr/lib/systemd/system/atop-rotate.service; static)
Active: inactive (dead)
TriggeredBy: ● atop-rotate.timer
[..snip..]
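The overview table at the top of this output is regular enough to parse when you want to filter services by state. The following Python sketch assumes the table keeps the five whitespace-separated columns shown above (UNIT, LOAD, ACTIVE, SUB, DESCRIPTION) and that the first line of dashes marks where the per-service status sections begin. The function name parse_unit_table is hypothetical.

```python
def parse_unit_table(text):
    """Parse the overview table into dictionaries, one per unit."""
    units = []
    for line in text.splitlines():
        # Drop surrounding whitespace and any leading status bullet.
        line = line.strip().lstrip("●").strip()
        if line.startswith("---"):
            break  # the per-service `systemctl status` sections follow
        fields = line.split(None, 4)
        if len(fields) < 4 or fields[0] == "UNIT":
            continue  # skip the header, blank lines, and snip markers
        unit, load, active, sub = fields[:4]
        description = fields[4] if len(fields) > 4 else ""
        units.append({"unit": unit, "load": load, "active": active,
                      "sub": sub, "description": description})
    return units


# A few lines copied from the sample output above.
sample_output = """\
UNIT LOAD ACTIVE SUB DESCRIPTION
aods-infra-vf-config.service not-found inactive dead aods-infra-vf-config.service
arc-unenroll-file-semaphore.service loaded active exited Arc-unenrollment upon shutdown service
atop-rotate.service loaded inactive dead Restart atop daemon to rotate logs
atop.service loaded active running Atop advanced performance monitor
auditd.service loaded active running Security Auditing Service
[..snip..]
-------
● arc-unenroll-file-semaphore.service - Arc-unenrollment upon shutdown service
"""

units = parse_unit_table(sample_output)
running = [u["unit"] for u in units if u["sub"] == "running"]
print(running)  # prints ['atop.service', 'auditd.service']
```

The same list can be filtered on other columns, for example u["load"] == "not-found" to list units that are referenced but not installed.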
Collect System Diagnostics
System diagnostics logs are collected with the collect-system-diagnostics command, which retrieves the logs needed for deeper visibility into the bare metal machine.
This example executes the collect-system-diagnostics command without arguments. The command collects the following types of diagnostic data:
- System and kernel diagnostics
  - Kernel information: Logs, human-readable messages, version, and architecture, for in-depth kernel diagnostics.
  - Operating System Logs: Essential logs detailing system activity and container logs for system services.
- Hardware and resource usage
  - CPU and IO throttled processes: Identifies throttling issues, providing insights into performance bottlenecks.
  - Network Interface Statistics: Detailed statistics for network interfaces to diagnose errors and drops.
- Software and services
  - Installed packages: A list of all installed packages, vital for understanding the system's software environment.
  - Active system services: Information on active services, process snapshots, and detailed system and process statistics.
  - Container runtime and Kubernetes components logs: Logs for Kubernetes components and other vital services for cluster diagnostics.
- Networking and connectivity
  - Network connection tracking information: Conntrack statistics and connection lists for firewall diagnostics.
  - Network configuration and interface details: Interface configurations, IP routing, addresses, and neighbor information.
  - Any additional interface configuration and logs: Logs related to the configuration of all interfaces inside the Node.
  - Network connectivity tests: Tests external network connectivity and Kubernetes API server communication.
  - DNS resolution configuration: DNS resolver configuration for diagnosing domain name resolution issues.
  - Networking configuration and logs: Comprehensive networking data including connection tracking and interface configurations.
  - Container network interface (CNI) configuration: Configuration of CNI for container networking diagnostics.
- Security and compliance
  - SELinux status: Reports the SELinux mode to understand access control and security contexts.
  - IPtables rules: Configuration of IPtables rulesets for insights into firewall settings.
- Storage and filesystems
  - Mount points and volume information: Detailed information on mount points, volumes, disk usage, and filesystem specifics.
- Azure Arc azcmagent logs
  - Collects log files for the Azure connected machine agent and extensions into a ZIP archive.
- Configuration and management
  - System configuration: Sysctl parameters for a comprehensive view of kernel runtime configuration.
  - Kubernetes configuration and health: Kubernetes setup details, including configurations and service listings.
  - Container runtime information: Configuration, version information, and details on running containers.
  - Container runtime interface (CRI) information: Operations data for container runtime interface, aiding in container orchestration diagnostics.
az networkcloud baremetalmachine run-data-extract --name "bareMetalMachineName" \
--resource-group "cluster_MRG" \
--subscription "subscription" \
--commands '[{"command":"collect-system-diagnostics"}]' \
--limit-time-seconds 900
collect-system-diagnostics Output
====Action Command Output====
Trying to check for root...
Trying to check for required utilities...
Trying to create required directories...
Trying to check for disk space...
Trying to start collecting logs... Trying to collect common operating system logs...
Trying to collect mount points and volume information...
Trying to collect SELinux status...
Trying to collect Containerd daemon information...
Trying to collect Containerd running information...
Trying to collect Container Runtime Interface (CRI) information... Trying to collect CRI information...
Trying to collect kubelet information...
Trying to collect Multus logs if they exist...
Trying to collect azcmagent logs... time="2025-09-09T15:21:55Z" level=info msg="Adding directory /var/opt/azcmagent/log to zip"
time="2025-09-09T15:21:55Z" level=info msg="Adding directory /var/lib/GuestConfig/arc_policy_logs to zip"
time="2025-09-09T15:21:57Z" level=info msg="Adding directory /var/lib/GuestConfig/ext_mgr_logs to zip"
time="2025-09-09T15:21:57Z" level=info msg="Adding directory /var/lib/GuestConfig/extension_logs to zip"
time="2025-09-09T15:21:57Z" level=info msg="Adding directory /var/lib/GuestConfig/extension_reports to zip"
time="2025-09-09T15:21:57Z" level=info msg="Adding directory /var/lib/GuestConfig/gc_agent_logs to zip"
time="2025-09-09T15:21:57Z" level=info msg="Diagnostic logs have been saved to /tmp/azcmagent-logs-3765466.zip."
Collecting System logs
Trying to collect kernel logs...
Trying to collect installed packages...
Trying to collect active system services...
Trying to collect sysctls information...
Trying to collect CPU Throttled Process Information...
Trying to collect IO Throttled Process Information...
Trying to collect conntrack information... conntrack v1.4.8 (conntrack-tools): 1917 flow entries have been shown.
Trying to collect ipvsadm information...
Trying to collect kernel command line...
Trying to collect configuration files... Collecting Networking logs
Trying to collect networking information... conntrack v1.4.8 (conntrack-tools): 1916 flow entries have been shown.
Trying to collect CNI configuration information...
Trying to collect iptables information...
Trying to archive gathered information...
Finishing up...
Done... your bundled logs are located in /hostfs/tmp/runcommand/system_diagnostics_bareMetalMachineName_2025-09-09_1519-UTC.tar.gz
================================
Script execution result can be downloaded from storage account using the command:
az storage blob download --blob-url https://simdev4003469vm1sa.blob.core.windows.net/command-output-blob/runcommand-output-7d601db8-75b7-4af2-94dd-f4f49ee0b0b7.tar.gz --file runcommand-output-7d601db8-75b7-4af2-94dd-f4f49ee0b0b7.tar.gz --auth-mode login > /dev/null 2>&1
How to view the full output of a command in the associated Storage Account
To access the output of a command, users need the Azure role assignments and network access described in the Send command output to a user specified Storage Account section.
With the necessary permissions and access configured, you can then use the link or command from the output summary to download the zipped output file (tar.gz).
You can also download it via the Azure portal:
- From the Azure portal, navigate to the Storage Account.
- In the Storage account details, select Storage browser from the navigation menu on the left side.
- In the Storage browser details, select Blob containers.
- Select the blob container.
- Select the output file from the command. The file name can be identified from the output summary. Additionally, the Last modified timestamp aligns with when the command was executed.
- You can manage and download the output file from the Overview pop-out.
The downloaded tar.gz file contains the full output and the zipped extract command file outputs.
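As a command-line alternative, the download command from the output summary can be combined with `tar` to unpack the archive. The following is a sketch under stated assumptions: the function names are hypothetical, the blob URL is whatever the output summary printed, and the signed-in identity is assumed to already have the role assignments and network access described earlier.

```shell
# Derive the blob file name from a blob URL (everything after the last "/").
blob_name_from_url() {
  printf '%s\n' "${1##*/}"
}

# Extract a downloaded tar.gz into a target directory, creating it if needed.
extract_output() {
  archive="$1"
  dest="$2"
  mkdir -p "$dest" && tar -xzf "$archive" -C "$dest"
}

# Download the output blob with the signed-in identity, then extract it.
# Uses the same az storage blob download arguments shown in the output summary.
download_and_extract() {
  blob_url="$1"
  dest="$2"
  file="$(blob_name_from_url "$blob_url")"
  az storage blob download --blob-url "$blob_url" --file "$file" --auth-mode login > /dev/null &&
    extract_output "$file" "$dest"
}

# Example call (not executed here):
# download_and_extract "<blob-url-from-output-summary>" ./runcommand-output
```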
The command provides a link (when using Cluster Manager storage) or another command (when using user-provided storage) to download the full output. To download the output file from the storage blob to a local directory, specify the directory path in the optional --output-directory argument.
Warning
Using the --output-directory argument overwrites any files in the local directory that have the same name as the new files being created.
Note
The Storage Account could be locked, resulting in a 403 This request is not authorized to perform this operation. error due to networking or firewall restrictions. Refer to the user managed storage sections for procedures to verify access.
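One way to start investigating a 403 is to inspect the Storage Account firewall's default action. This is a sketch, not a complete diagnostic: the helper names are hypothetical, the account and resource group names are placeholders, and the hints printed by `explain_default_action` are illustrative only.

```shell
# Print the storage account firewall default action ("Allow" or "Deny").
# Arguments (placeholders): storage account name, resource group.
get_default_action() {
  az storage account show --name "$1" --resource-group "$2" \
    --query "networkRuleSet.defaultAction" --output tsv
}

# Turn the default action into a short, illustrative hint.
explain_default_action() {
  case "$1" in
    Deny)  echo "firewall restricts access" ;;
    Allow) echo "firewall allows all networks" ;;
    *)     echo "unknown default action" ;;
  esac
}

# Allow a single public IP address or CIDR range through the firewall.
# Arguments (placeholders): storage account name, resource group, IP address.
allow_ip() {
  az storage account network-rule add --account-name "$1" --resource-group "$2" --ip-address "$3"
}

# Example calls (not executed here):
# explain_default_action "$(get_default_action "<storageAccountName>" "<resourceGroup>")"
# allow_ip "<storageAccountName>" "<resourceGroup>" "<your-public-ip>"
```

If the default action is Deny and the account only permits private endpoint access, adding an IP rule does not help; the user must connect from a network that can reach the private endpoint.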
Executing a run-data-extracts-restricted Command
Prerequisites
- Minimum supported API version of v20250701preview or v20250901 and above.
- Storage Blob Container has been configured.
- The target bare metal machine is on and ready.
- Required az networkcloud CLI extension version 4.0.0b1+.
- Get the Cluster Managed Resource group name (cluster_MRG) that you created for Cluster resource.
The run-data-extracts-restricted command mirrors the functionality of the non-restricted run-data-extract command and adds fine-grained access control via RBAC (Role-Based Access Control). It allows customers to run sensitive data extraction operations on bare metal machines with elevated privileges.
The run-data-extracts-restricted command is implemented as a new and separate API action. The action is introduced in the v20250701preview and v20250901 GA API versions and is designed to mirror the behavior of the original command, but with access restricted to specific sub-commands. The following list contains the allowed sub-commands for run-data-extracts-restricted:
- Collect Microsoft Defender for Endpoints (MDE) agent information
  Command Name: mde-agent-information
  Arguments: None
- Generate Cluster Common Vulnerabilities and Exposures (CVE) Report
  Command Name: cluster-cve-report
  Arguments: None
Run the command with az networkcloud baremetalmachine run-data-extracts-restricted. It accepts arguments in the same way as run-data-extract.
Example
az networkcloud baremetalmachine run-data-extracts-restricted --name "<machine-name>" \
--resource-group "<cluster_MRG>" \
--subscription "<subscriptionID>" \
--commands '[{"arguments":["--min-severity=8"],"command":"cluster-cve-report"}]' \
--limit-time-seconds "600" \
--output-directory ~/path/to/my/output/directory
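For a sub-command that takes no arguments, such as mde-agent-information, the invocation might look like the following sketch. It assumes the `--commands` payload can omit the `arguments` key when there are none; `commands_payload` and `run_mde_agent_information` are hypothetical helper names, and the machine name, resource group, and subscription are placeholders.

```shell
# Build a --commands payload for a sub-command that takes no arguments.
# Assumption: the "arguments" key may be omitted when a sub-command has none.
commands_payload() {
  printf '[{"command":"%s"}]\n' "$1"
}

# Run the mde-agent-information restricted data extract.
# Arguments (placeholders): machine name, cluster managed resource group, subscription ID.
run_mde_agent_information() {
  az networkcloud baremetalmachine run-data-extracts-restricted \
    --name "$1" \
    --resource-group "$2" \
    --subscription "$3" \
    --commands "$(commands_payload mde-agent-information)" \
    --limit-time-seconds "600"
}

# Example call (not executed here):
# run_mde_agent_information "<machine-name>" "<cluster_MRG>" "<subscriptionID>"
```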
Storage and Output
Output from run command executions is stored by default in the blob container defined by the commandOutputSettings. The commandOutputSettings value can be overridden per command output type (for example, BareMetalMachineRunDataExtractsRestricted). For how to specify the commandOutputSettings override for run commands, see Azure Operator Nexus Cluster support for managed identities and user provided resources.