Statistics in collectd consist of a value list. A value list includes:
Value list | Description | Example | Comment
---|---|---|---
Values | | 99.8999 | percentage
Value length | The number of values in the data set. | |
Time | Timestamp at which the value was collected. | 1475837857 | epoch
Interval | Interval at which to expect a new value. | 10 | interval
Host | Used to identify the host. | localhost | Can be a UUID for a VM or host, or a name given to the host.
Plugin | Used to identify the plugin. | cpu |
Plugin instance (optional) | Used to group a set of values together, e.g. values belonging to a DPDK interface. | 0 |
Type | Unit used to measure a value; in other words, used to refer to a data set. | percent |
Type instance (optional) | Used to distinguish between values that have an identical type. | user |
Meta data | An opaque data structure that enables the passing of additional information about a value list. “Meta data in the global cache can be used to store arbitrary information about an identifier.” | |
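As a rough illustration of how these fields fit together, the sketch below models a value list as a Python dataclass and builds the `host/plugin-plugin_instance/type-type_instance` identifier that collectd uses to name a data set. The class name `ValueList` and its method names are hypothetical and chosen for this example only; they are not collectd's own API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ValueList:
    """Illustrative stand-in for a collectd value list (hypothetical class)."""
    values: List[float]
    time: float            # epoch timestamp
    interval: float        # seconds until the next value is expected
    host: str
    plugin: str
    type: str
    plugin_instance: str = ""   # optional
    type_instance: str = ""     # optional
    meta: Dict[str, Any] = field(default_factory=dict)

    @property
    def values_len(self) -> int:
        # "Value length" is simply the number of values carried.
        return len(self.values)

    def identifier(self) -> str:
        # Optional instances are appended with a dash only when present.
        plugin = self.plugin
        if self.plugin_instance:
            plugin += "-" + self.plugin_instance
        type_ = self.type
        if self.type_instance:
            type_ += "-" + self.type_instance
        return f"{self.host}/{plugin}/{type_}"


# Values taken from the Example column of the table above.
vl = ValueList(values=[99.8999], time=1475837857, interval=10,
               host="localhost", plugin="cpu", plugin_instance="0",
               type="percent", type_instance="user")
print(vl.identifier())
```

With the example values from the table, the identifier comes out as `localhost/cpu-0/percent-user`.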
Notifications:
Notifications in collectd are generic messages containing:
An associated severity, which can be one of OKAY, WARNING, or FAILURE.
A time.
A message.
A host.
A plugin.
A plugin instance (optional).
A type.
A type instance (optional).
Meta-data.
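The field list above can be sketched as a small data structure. The example below is a minimal illustration only: the `Notification` and `Severity` names, the numeric ordering of the severities, and the `at_least` helper are assumptions made for this sketch and do not mirror collectd's internal definitions.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Any, Dict, List


class Severity(IntEnum):
    # The numeric order is an assumption used for filtering here;
    # these are not collectd's internal constants.
    OKAY = 0
    WARNING = 1
    FAILURE = 2


@dataclass
class Notification:
    """Illustrative stand-in for a collectd notification (hypothetical class)."""
    severity: Severity
    time: float
    message: str
    host: str
    plugin: str
    type: str
    plugin_instance: str = ""   # optional
    type_instance: str = ""     # optional
    meta: Dict[str, Any] = field(default_factory=dict)


def at_least(notifications: List[Notification], minimum: Severity) -> List[Notification]:
    """Keep only notifications whose severity is `minimum` or worse."""
    return [n for n in notifications if n.severity >= minimum]


# Hypothetical sample data for the usage example.
events = [
    Notification(Severity.OKAY, 1472552200.0, "link is up",
                 "pod3-node1", "dpdkevents", "gauge", "dpdk0", "link_status"),
    Notification(Severity.FAILURE, 1472552207.385, "link is down",
                 "pod3-node1", "dpdkevents", "gauge", "dpdk0", "link_status"),
]
for n in at_least(events, Severity.WARNING):
    print(n.severity.name, n.host, n.message)
```

Filtering by severity like this is one common reason to carry the severity as an ordered value rather than a free-form string.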
Example notification:
Severity:FAILURE
Time:1472552207.385
Host:pod3-node1
Plugin:dpdkevents
PluginInstance:dpdk0
Type:gauge
TypeInstance:link_status
DataSource:value
CurrentValue:1.000000e+00
WarningMin:nan
WarningMax:nan
FailureMin:2.000000e+00
FailureMax:nan
Host pod3-node1, plugin dpdkevents (instance dpdk0) type gauge (instance link_status): Data source "value" is currently 1.000000. That is below the failure threshold of 2.000000.
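To make the example easier to follow, the sketch below parses the `Key:Value` header lines of the notification above and re-derives its severity from the threshold fields. It assumes a simple min/max comparison in which `nan` means "not set"; collectd's actual threshold handling has further options that are not modelled here, and the function names are hypothetical.

```python
import re

EXAMPLE = """\
Severity:FAILURE
Time:1472552207.385
Host:pod3-node1
Plugin:dpdkevents
PluginInstance:dpdk0
Type:gauge
TypeInstance:link_status
DataSource:value
CurrentValue:1.000000e+00
WarningMin:nan
WarningMax:nan
FailureMin:2.000000e+00
FailureMax:nan
Host pod3-node1, plugin dpdkevents (instance dpdk0) type gauge (instance link_status): Data source "value" is currently 1.000000. That is below the failure threshold of 2.000000.
"""

# A header line is a single word immediately followed by a colon.
HEADER = re.compile(r"^(\w+):(.*)$")


def parse_notification(text):
    """Split the text into a dict of header fields and the free-form message."""
    fields, message = {}, []
    for line in text.splitlines():
        match = HEADER.match(line)
        if match and not message:
            fields[match.group(1)] = match.group(2).strip()
        elif line.strip():
            message.append(line.strip())
    return fields, " ".join(message)


def classify(value, warn_min, warn_max, fail_min, fail_max):
    """Simplified threshold check (assumption: plain min/max bounds only)."""
    # Comparisons against NaN are always False, so unset bounds never trigger.
    if value < fail_min or value > fail_max:
        return "FAILURE"
    if value < warn_min or value > warn_max:
        return "WARNING"
    return "OKAY"


fields, message = parse_notification(EXAMPLE)
severity = classify(*(float(fields[key]) for key in (
    "CurrentValue", "WarningMin", "WarningMax", "FailureMin", "FailureMax")))
print(severity, "==", fields["Severity"])
```

Here the current value 1.0 is below `FailureMin` 2.0, so the sketch arrives at the same FAILURE severity that the notification reports.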
Supported Metrics and Events
Full list can be found here: https://github.com/collectd/collectd/blob/master/src/types.db
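Each entry in types.db names a type followed by one or more data sources in the form `name:kind:min:max`, where `U` marks an unbounded limit. The sketch below parses such a line; the helper names are hypothetical, the sample lines are illustrative entries written in that format, and the treatment of `#` as a comment marker is an assumption of this sketch.

```python
from typing import List, NamedTuple, Optional, Tuple


class DataSource(NamedTuple):
    name: str
    kind: str                 # e.g. GAUGE or DERIVE
    minimum: Optional[float]  # None when the bound is "U" (unbounded)
    maximum: Optional[float]


def _bound(text: str) -> Optional[float]:
    return None if text == "U" else float(text)


def parse_types_db_line(line: str) -> Optional[Tuple[str, List[DataSource]]]:
    """Parse one types.db-style line; returns None for blank or comment lines."""
    line = line.split("#", 1)[0].strip()  # assumption: '#' starts a comment
    if not line:
        return None
    type_name, spec = line.split(None, 1)
    sources = []
    # Data sources may be separated by commas and/or whitespace.
    for part in spec.replace(",", " ").split():
        name, kind, low, high = part.split(":")
        sources.append(DataSource(name, kind, _bound(low), _bound(high)))
    return type_name, sources


# Illustrative lines in the types.db format.
for sample in ("percent    value:GAUGE:0:100.1",
               "if_octets  rx:DERIVE:0:U, tx:DERIVE:0:U"):
    print(parse_types_db_line(sample))
```

A type with two data sources, such as `if_octets`, is why a value list carries a list of values rather than a single number.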
The table below maps the "base" plugins that would run on the host or the guest.
Where collectd is running | Plugin | Type | Type Instance | Description | Comment
---|---|---|---|---|---
Host/guest | CPU | percent/jiffies | idle | Time the CPU spends idle. | Can be per CPU or aggregated across all CPUs. For more info, see http://man7.org/linux/man-pages/man1/top.1.html and http://blog.scoutapp.com/articles/2015/02/24/understanding-linuxs-cpu-stats
Host/guest | CPU | percent/jiffies | nice | Time the CPU spends running user space processes that have been niced. The priority level of a user space process can be tweaked by adjusting its niceness. |
Host/guest | CPU | percent/jiffies | interrupt | Time the CPU spends servicing interrupts. |
Host/guest | CPU | percent/jiffies | softirq | |
Host/guest | CPU | percent/jiffies | steal | CPU steal is a measure of the fraction of time that a machine is in a state of “involuntary wait.” It is time for which the kernel cannot otherwise account in one of the traditional classifications like user, system, or idle. It is time that went missing, from the perspective of the kernel. http://www.stackdriver.com/understanding-cpu-steal-experiment/ |
Host/guest | CPU | percent/jiffies | system | Time the CPU spends running the kernel. |
Host/guest | CPU | percent/jiffies | user | Time the CPU spends running un-niced user space processes. |
Host/guest | CPU | percent/jiffies | wait | Time the CPU spends idle while waiting for an I/O operation to complete. |
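Host/guest | Interface | if_dropped | in | |
Host/guest | Interface | if_errors | in | |
Host/guest | Interface | if_octets | in | |
Host/guest | Interface | if_packets | in | |
Host/guest | Interface | if_dropped | out | |
Host/guest | Interface | if_errors | out | |
Host/guest | Interface | if_octets | out | |
Host/guest | Interface | if_packets | out | |
Host/guest | Memory | memory | buffered | |
Host/guest | Memory | memory | cached | |
Host/guest | Memory | memory | free | |
Host/guest | Memory | memory | slab_recl | |
Host/guest | Memory | memory | slab_unrecl | |
Host/guest | Memory | memory | used | |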
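Host/guest | disk | disk_io_time | io_time | |
Host/guest | disk | disk_io_time | weighted_io_time | |
Host/guest | disk | disk_merged | read | |
Host/guest | disk | disk_merged | write | |
Host/guest | disk | disk_octets | read | |
Host/guest | disk | disk_octets | write | |
Host/guest | disk | disk_ops | read | |
Host/guest | disk | disk_ops | write | |
Host/guest | disk | disk_time | read | |
Host/guest | disk | disk_time | write | |
Host/guest | disk | pending_operations | | |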
Host/guest | Ping | ping | | Latency |
Host/guest | Ping | ping_droprate | | droprate |
Host/guest | Ping | ping_stddev | | standard deviation |
Host/guest | load | load | shortterm | |
Host/guest | load | load | midterm | |
Host/guest | load | load | longterm | |
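Host/guest | OVS events | gauge | link_status | |
Host/guest | OVS Stats | collisions | | | per interface
Host/guest | OVS Stats | rx_bytes | | |
Host/guest | OVS Stats | rx_crc_err | | |
Host/guest | OVS Stats | rx_dropped | | |
Host/guest | OVS Stats | rx_errors | | |
Host/guest | OVS Stats | rx_frame_err | | |
Host/guest | OVS Stats | rx_over_err | | |
Host/guest | OVS Stats | rx_packets | | |
Host/guest | OVS Stats | tx_bytes | | |
Host/guest | OVS Stats | tx_dropped | | |
Host/guest | OVS Stats | tx_errors | | |
Host/guest | OVS Stats | tx_packets | | |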
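Host/guest | Hugepages | bytes | used | | total/pernode/both
Host/guest | Hugepages | bytes | free | |
Host/guest | Hugepages | vmpage_number | used | |
Host/guest | Hugepages | vmpage_number | free | |
Host/guest | Hugepages | percent | used | |
Host/guest | Hugepages | percent | free | |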
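Host/guest | processes | fork_rate | | |
Host/guest | processes | ps_state | blocked | |
Host/guest | processes | ps_state | paging | |
Host/guest | processes | ps_state | running | |
Host/guest | processes | ps_state | sleeping | |
Host/guest | processes | ps_state | stopped | |
Host/guest | processes | ps_state | zombies | |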
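Host only | Libvirt | disk_octets | read | |
Host only | Libvirt | disk_octets | write | |
Host only | Libvirt | disk_ops | read | |
Host only | Libvirt | disk_ops | write | |
Host only | Libvirt | if_dropped | in | |
Host only | Libvirt | if_dropped | out | |
Host only | Libvirt | if_errors | in | |
Host only | Libvirt | if_errors | out | |
Host only | Libvirt | if_octets | in | |
Host only | Libvirt | if_octets | out | |
Host only | Libvirt | if_packets | in | |
Host only | Libvirt | if_packets | out | |
Host only | Libvirt | memory | actual | |
Host only | Libvirt | memory | balloon | |
Host only | Libvirt | memory | rss | |
Host only | Libvirt | memory | swap_in | |
Host only | Libvirt | memory | total | |
Host only | Libvirt | virt_cpu_total | | | This is in jiffies!
Host only | Libvirt | virt_vcpu | | | This is in jiffies!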
Host only | RDT | ipc | | | per core group
Host only | RDT | memory_bandwidth | local | |
Host only | RDT | memory_bandwidth | remote | |
Host only | RDT | bytes | llc | |
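Host only | dpdkstats | derive | rx_l3_l4_xsum_error | | Plugin is compatible with DPDK 16.04 (based on ixgbe; vhost support will be enabled in DPDK 16.11; patch support being upgraded to DPDK 16.07 is in progress)
Host only | dpdkstats | errors | flow_director_filter_add_errors | |
Host only | dpdkstats | errors | flow_director_filter_remove_errors | |
Host only | dpdkstats | errors | mac_local_errors | |
Host only | dpdkstats | errors | mac_remote_errors | |
Host only | dpdkstats | if_rx_dropped | rx_fcoe_dropped | |
Host only | dpdkstats | if_rx_dropped | rx_mac_short_packet_dropped | |
Host only | dpdkstats | if_rx_dropped | rx_management_dropped | |
Host only | dpdkstats | if_rx_dropped | rx_priorityX_dropped | | where X is 0 to 7
Host only | dpdkstats | if_rx_errors | rx_crc_errors | |
Host only | dpdkstats | if_rx_errors | rx_errors | |
Host only | dpdkstats | if_rx_errors | rx_fcoe_crc_errors | |
Host only | dpdkstats | if_rx_errors | rx_fcoe_mbuf_allocation_errors | |
Host only | dpdkstats | if_rx_errors | rx_fcoe_no_direct_data_placement | |
Host only | dpdkstats | if_rx_errors | rx_fcoe_no_direct_data_placement_ext_buff | |
Host only | dpdkstats | if_rx_errors | rx_fragment_errors | |
Host only | dpdkstats | if_rx_errors | rx_illegal_byte_errors | |
Host only | dpdkstats | if_rx_errors | rx_jabber_errors | |
Host only | dpdkstats | if_rx_errors | rx_length_errors | |
Host only | dpdkstats | if_rx_errors | rx_mbuf_allocation_errors | |
Host only | dpdkstats | if_rx_errors | rx_oversize_errors | |
Host only | dpdkstats | if_rx_errors | rx_priorityX_mbuf_allocation_errors | | where X is 0 to 7
Host only | dpdkstats | if_rx_errors | rx_q0_errors | | If more queues are allocated, errors are reported per queue.
Host only | dpdkstats | if_rx_errors | rx_undersize_errors | |
Host only | dpdkstats | if_rx_octets | rx_error_bytes | | bug - will move this to errors
Host only | dpdkstats | if_rx_octets | rx_fcoe_bytes | |
Host only | dpdkstats | if_rx_octets | rx_good_bytes | |
Host only | dpdkstats | if_rx_octets | rx_q0_bytes | | per queue
Host only | dpdkstats | if_rx_octets | rx_total_bytes | |
Host only | dpdkstats | if_rx_packets | rx_broadcast_packets | |
Host only | dpdkstats | if_rx_packets | rx_fcoe_packets | |
Host only | dpdkstats | if_rx_packets | rx_flow_control_xoff_packets | |
Host only | dpdkstats | if_rx_packets | rx_flow_control_xon_packets | |
Host only | dpdkstats | if_rx_packets | rx_good_packets | |
Host only | dpdkstats | if_rx_packets | rx_management_packets | |
Host only | dpdkstats | if_rx_packets | rx_multicast_packets | |
Host only | dpdkstats | if_rx_packets | rx_priorityX_xoff_packets | | where X is 0 to 7
Host only | dpdkstats | if_rx_packets | rx_priorityX_xon_packets | | where X is 0 to 7
Host only | dpdkstats | if_rx_packets | rx_q0_packets | | per queue
Host only | dpdkstats | if_rx_packets | rx_size_1024_to_max_packets | |
Host only | dpdkstats | if_rx_packets | rx_size_128_to_255_packets | |
Host only | dpdkstats | if_rx_packets | rx_size_256_to_511_packets | |
Host only | dpdkstats | if_rx_packets | rx_size_512_to_1023_packets | |
Host only | dpdkstats | if_rx_packets | rx_size_64_packets | |
Host only | dpdkstats | if_rx_packets | rx_size_65_to_127_packets | |
Host only | dpdkstats | if_rx_packets | rx_total_missed_packets | |
Host only | dpdkstats | if_rx_packets | rx_total_packets | |
Host only | dpdkstats | if_rx_packets | rx_xoff_packets | |
Host only | dpdkstats | if_rx_packets | rx_xon_packets | |
Host only | dpdkstats | if_tx_errors | tx_errors | |
Host only | dpdkstats | if_tx_octets | tx_fcoe_bytes | |
Host only | dpdkstats | if_tx_octets | tx_good_bytes | |
Host only | dpdkstats | if_tx_octets | tx_q0_bytes | | per queue
Host only | dpdkstats | if_tx_packets | tx_broadcast_packets | |
Host only | dpdkstats | if_tx_packets | tx_fcoe_packets | |
Host only | dpdkstats | if_tx_packets | tx_flow_control_xoff_packets | |
Host only | dpdkstats | if_tx_packets | tx_flow_control_xon_packets | |
Host only | dpdkstats | if_tx_packets | tx_good_packets | |
Host only | dpdkstats | if_tx_packets | tx_management_packets | |
Host only | dpdkstats | if_tx_packets | tx_multicast_packets | |
Host only | dpdkstats | if_tx_packets | tx_priorityX_xoff_packets | | where X is 0 to 7
Host only | dpdkstats | if_tx_packets | tx_priorityX_xon_packets | | where X is 0 to 7
Host only | dpdkstats | if_tx_packets | tx_q0_packets | | per queue
Host only | dpdkstats | if_tx_packets | tx_size_1024_to_max_packets | |
Host only | dpdkstats | if_tx_packets | tx_size_128_to_255_packets | |
Host only | dpdkstats | if_tx_packets | tx_size_256_to_511_packets | |
Host only | dpdkstats | if_tx_packets | tx_size_512_to_1023_packets | |
Host only | dpdkstats | if_tx_packets | tx_size_64_packets | |
Host only | dpdkstats | if_tx_packets | tx_size_65_to_127_packets | |
Host only | dpdkstats | if_tx_packets | tx_total_packets | |
Host only | dpdkstats | if_tx_packets | tx_xoff_packets | |
Host only | dpdkstats | if_tx_packets | tx_xon_packets | |
Host only | dpdkstats | operations | flow_director_added_filters | |
Host only | dpdkstats | operations | flow_director_matched_filters | |
Host only | dpdkstats | operations | flow_director_missed_filters | |
Host only | dpdkstats | operations | flow_director_removed_filters | |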
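The CPU rows in the table above list both percent and jiffies as possible types. As a rough illustration of how the two relate, the sketch below converts two samples of cumulative per-state jiffies counters into percentages over the sampling interval. It assumes the counters are monotonically increasing tick counts, as in Linux's /proc/stat; the function name and the sample numbers are hypothetical.

```python
from typing import Dict


def cpu_percentages(previous: Dict[str, int], current: Dict[str, int]) -> Dict[str, float]:
    """Share of each CPU state over the interval between two jiffies samples."""
    deltas = {state: current[state] - previous[state] for state in current}
    total = sum(deltas.values())
    if total <= 0:
        # No ticks elapsed (or a counter reset): report zero for every state.
        return {state: 0.0 for state in deltas}
    return {state: 100.0 * delta / total for state, delta in deltas.items()}


# Hypothetical cumulative jiffies per state at two consecutive collections.
prev = {"user": 1000, "nice": 0, "system": 500, "idle": 8000,
        "wait": 100, "interrupt": 0, "softirq": 50, "steal": 0}
curr = {"user": 1060, "nice": 0, "system": 520, "idle": 8110,
        "wait": 110, "interrupt": 0, "softirq": 50, "steal": 0}

for state, share in cpu_percentages(prev, curr).items():
    print(f"{state}: {share:.1f}%")
```

In this made-up sample 200 ticks elapse between the two collections, of which 60 are user time, so the user share is 30%.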