Anuket Project

This wiki page is a work in progress (WIP). Please feel free to add relevant information.

This page lists the metrics and events that need to be collected for the NFVI. Some information about the monitoring process itself (the process that collects these metrics and events) is also required.

 

This list should be developed in conjunction with the Doctor (Faults) and VES projects in OPNFV.

 

Monitoring Process information:

  • A Unique Process identifier.
  • Heartbeat/ping to check liveness.
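
The two items above could be modelled roughly as follows. This is a minimal sketch, not part of any Anuket/OPNFV component: the class name `Heartbeat`, its fields, and the timeout handling are assumptions made for illustration only.

```python
import time
import uuid
from dataclasses import dataclass, field


@dataclass
class Heartbeat:
    """Liveness record for one monitoring process (hypothetical structure)."""

    # Unique process identifier; a random UUID is an assumption, any
    # identifier that is unique across the deployment would do.
    process_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    # Monotonic timestamp (seconds) of the most recent heartbeat.
    last_seen: float = field(default_factory=time.monotonic)

    def beat(self, now=None):
        """Record that the process reported in at time `now`."""
        self.last_seen = time.monotonic() if now is None else now

    def is_alive(self, timeout_s, now=None):
        """Return True if a heartbeat arrived within the last `timeout_s` seconds."""
        now = time.monotonic() if now is None else now
        return (now - self.last_seen) <= timeout_s
```

A supervisor would call `beat()` whenever a heartbeat message arrives and poll `is_alive()` with a timeout of a few heartbeat intervals.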

NFVI Events

What about entire node and switch failures? In terms of service-affecting priority, host and switch failures rank highest because they can affect the most VMs, containers, and VNFs.

While the status of switches and hosts might be the domain of services that have a system-wide view, a host-resident component might be part of the monitoring functionality.

Compute

At a minimum the following events should be monitored:

  • Machine check exceptions (System, Processor, Memory...) [TODO: Break this down further]
    • DIMM corrected and uncorrected Errors
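
On Linux hosts, corrected and uncorrected memory error counts are typically exposed by the EDAC subsystem under sysfs. The sketch below assumes that layout (`mc*/ce_count` and `mc*/ue_count`); the function name and the returned dictionary shape are hypothetical, and other platforms or collectors may expose this information differently.

```python
from pathlib import Path


def read_edac_counts(edac_root="/sys/devices/system/edac/mc"):
    """Return corrected/uncorrected error counts per memory controller.

    Assumes the Linux EDAC sysfs layout; returns an empty dict when no
    memory controller directories are found.
    """
    counts = {}
    for mc in sorted(Path(edac_root).glob("mc[0-9]*")):
        try:
            counts[mc.name] = {
                "corrected": int((mc / "ce_count").read_text()),
                "uncorrected": int((mc / "ue_count").read_text()),
            }
        except (OSError, ValueError):
            # Skip controllers whose counters are missing or unreadable.
            continue
    return counts
```

An event could then be raised whenever either counter increases between two polls.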

Networking

At a minimum the following events should be monitored for a Networking interface:

  • Link Status
  • Dropped Receive Packets: an increasing count could indicate the failure or service interruption of an upstream process.
  • vSwitch liveness
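
Link status changes and an increasing receive-drop count can both be detected by comparing two successive samples. The sketch below assumes a Linux host where `/sys/class/net/<iface>/operstate` and `/sys/class/net/<iface>/statistics/rx_dropped` are available; the function names, the event strings, and the `drop_threshold` parameter are illustrative assumptions.

```python
from pathlib import Path


def read_iface_state(iface, sysfs_root="/sys/class/net"):
    """Sample link state and the RX drop counter for one interface."""
    base = Path(sysfs_root) / iface
    return {
        "operstate": (base / "operstate").read_text().strip(),
        "rx_dropped": int((base / "statistics" / "rx_dropped").read_text()),
    }


def detect_events(prev, curr, drop_threshold=0):
    """Compare two samples and return a list of event descriptions."""
    events = []
    if prev["operstate"] != curr["operstate"]:
        events.append(f"link {prev['operstate']} -> {curr['operstate']}")
    delta = curr["rx_dropped"] - prev["rx_dropped"]
    # A negative delta (counter reset) is ignored rather than reported.
    if delta > drop_threshold:
        events.append(f"rx_dropped increased by {delta}")
    return events
```

For example, a sample of `up` with 10 drops followed by `down` with 25 drops yields two events: a link transition and an increase of 15 dropped packets.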

Storage

NFVI Metrics

Compute

At a minimum the following metrics should be collected:

  • CPU utilization [TODO: Break this down further]
  • vCPU utilization [TODO: Break this down further]
  • Memory utilization [TODO: Break this down further]
  • vMemory utilization [TODO: Break this down further]
  • Cache utilization
    • Hits
    • Misses
    • Instructions per clock (IPC)
    • Last level cache utilization
    • Memory Bandwidth utilization
  • Platform Metrics (thermals, fan-speed) [TODO: Break this down further]
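
Two of the metrics above are simple ratios of counters. As a rough sketch, host CPU utilization can be derived from two samples of the aggregate `cpu` line in Linux `/proc/stat`, and IPC is retired instructions divided by elapsed cycles. The function names are hypothetical, and the instruction and cycle counts are assumed to come from a hardware performance counter source that is not shown here.

```python
def parse_cpu_line(line):
    """Return (total_jiffies, idle_jiffies) from a /proc/stat 'cpu' line.

    Fields after the label are: user nice system idle iowait irq softirq
    steal guest guest_nice. Guest time is already included in user time,
    so only the first eight fields are summed.
    """
    values = [int(v) for v in line.split()[1:9]]
    idle = values[3] + (values[4] if len(values) > 4 else 0)  # idle + iowait
    return sum(values), idle


def cpu_utilization(prev_line, curr_line):
    """Fraction of non-idle CPU time between two samples (0.0 to 1.0)."""
    total0, idle0 = parse_cpu_line(prev_line)
    total1, idle1 = parse_cpu_line(curr_line)
    elapsed = total1 - total0
    if elapsed <= 0:
        return 0.0
    return 1.0 - (idle1 - idle0) / elapsed


def instructions_per_clock(instructions, cycles):
    """IPC over an interval; returns 0.0 when no cycles elapsed."""
    return instructions / cycles if cycles else 0.0
```

Whether iowait should count as idle time is a design choice; it is treated as idle here.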

 

Networking

[TODO] Add a note on the vSwitch and add vSwitch-specific metrics

At a minimum the following metrics should be collected for a Networking interface:

  • Total Packets received and transmitted
  • Dropped packets (TX and RX)
  • Error packets (TX and RX) [TODO: Break this down further]
  • Broadcast Packets (TX and RX)
  • Multicast Packets (TX and RX)

Other Metrics that should be collected for a Networking interface (if possible):

  • Average bitrate
  • Average latency
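
Average bitrate is usually not read directly but derived from the change in a byte counter over a sampling interval. The sketch below illustrates that calculation; the function name and the `counter_bits` parameter are assumptions, and the byte counts could come from any source (for example the interface's `rx_bytes` or `tx_bytes` counters).

```python
def average_bitrate(bytes_prev, bytes_curr, t_prev, t_curr, counter_bits=64):
    """Average bits per second between two byte-counter samples.

    The modulo handles a single counter wrap-around. A counter reset
    (for example after a driver reload) would be misread as a wrap, so
    callers may want to discard implausibly large results.
    """
    elapsed = t_curr - t_prev
    if elapsed <= 0:
        raise ValueError("samples must be taken at increasing times")
    delta_bytes = (bytes_curr - bytes_prev) % (1 << counter_bits)
    return delta_bytes * 8 / elapsed
```

For instance, 1000 bytes transferred over 2 seconds gives an average of 4000 bits per second.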

 

Storage

Disk Utilization

NFVI Other/Additional Information

Compute

BIOS information

Networking

Storage

 
