Topology aware flow analytics with NVIDIA NetQ
NVIDIA Cumulus Linux 5.11 for AI / ML describes how NVIDIA 400/800G Spectrum-X switches combined with the latest Cumulus Linux release deliver enhanced real-time telemetry that is particularly relevant...
Replay pcap files using sflowtool
It can be very useful to capture sFlow telemetry from production networks so that it can be replayed later to perform off-line analysis, or to develop or evaluate sFlow collection tools. sudo tcpdump...
AI Metrics
AI Metrics is available on GitHub. The application provides performance metrics for AI/ML RoCEv2 network traffic, for example, large scale CUDA compute tasks using NVIDIA Collective Communication...
Capture to pcap file using sflowtool
Replay pcap files using sflowtool describes how to capture sFlow datagrams using tcpdump and replay them in real time using sflowtool. However, using tcpdump for the capture has the downside of...
Comparing AI / ML activity from two production networks
AI Metrics describes how to deploy the open source ai-metrics application. The application provides performance metrics for AI/ML RoCEv2 network traffic, for example, large scale CUDA compute tasks...
Dropped packet notifications with Cisco 8000 Series Routers
Cisco IOS XR Release 25.1.1 brings sFlow dropped packet notification support to Cisco 8000 series routers, making it easy to capture and analyze packets dropped at router...
AI Metrics with Prometheus and Grafana
The Grafana AI Metrics dashboard shown above tracks performance metrics for AI/ML RoCEv2 network traffic, for example, large scale CUDA compute tasks using NVIDIA Collective Communication Library...