...
1.0 | Use the Linux perf interface to collect data about performance events on a per-core basis |
2.0 | Use the jevents library (PMU tools) |
3.0 | Report hardware cache events, kernel PMU events, software events, and hardware-specific events |
4.0 | Should have a configurable interval |
5.0 | Should have a configurable list of hardware-specific events |
6.0 | Provide SNMP support for any collectd values through a PMU MIB |
7.0 | Provide support for multi-PMU uncore events |
8.0 | Provide an option to choose all events from the JSON event list file |
Overview
Performance counters are CPU hardware registers that count hardware events such as instructions executed, cache misses suffered, or branches mispredicted. They form a basis for profiling applications to trace dynamic control flow and identify hotspots. The Linux perf interface provides rich, generalized abstractions over hardware-specific capabilities.
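As an illustration of how raw counts become profiling signals, the sketch below derives instructions per cycle and a cache miss ratio from two consecutive samples of cumulative counters. The helper name `derived_metrics` and the sample values are hypothetical; only the event names come from this page.

```python
def derived_metrics(prev, curr):
    """Hypothetical helper: derive ratios from two cumulative counter samples.

    `prev` and `curr` map event names to monotonically increasing counts
    taken at the start and end of one collection interval.
    """
    # Counters are cumulative, so the per-interval activity is the difference.
    delta = {name: curr[name] - prev[name] for name in curr}

    cycles = delta["cpu-cycles"]
    refs = delta["cache-references"]
    return {
        # Instructions retired per CPU cycle; None if no cycles elapsed.
        "ipc": delta["instructions"] / cycles if cycles else None,
        # Fraction of cache references that missed; None if there were none.
        "cache_miss_ratio": delta["cache-misses"] / refs if refs else None,
    }


# Made-up sample values for one core over one interval.
before = {"cpu-cycles": 5000, "instructions": 9000,
          "cache-references": 600, "cache-misses": 50}
after = {"cpu-cycles": 6000, "instructions": 11000,
         "cache-references": 1000, "cache-misses": 150}

print(derived_metrics(before, after))
```

A low instructions-per-cycle value combined with a high miss ratio is one common hint that a workload is memory-bound, although interpreting these ratios depends on the CPU and the workload.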
...
The intel_pmu plugin collects information provided by the Linux perf interface. It does not access perf directly, but goes through the jevents API. Using this interface, the intel_pmu plugin should collect the following metrics:
...
Name | Type | Type Instance | Description |
cpu-cycles | counter | cpu-cycles | |
instructions | counter | instructions | |
cache-references | counter | cache-references | |
cache-misses | counter | cache-misses | |
branches | counter | branches | |
branch-misses | counter | branch-misses | |
bus-cycles | counter | bus-cycles | |
L1-dcache-loads | counter | L1-dcache-loads | |
L1-dcache-load-misses | counter | L1-dcache-load-misses | |
L1-dcache-stores | counter | L1-dcache-stores | |
L1-dcache-store-misses | counter | L1-dcache-store-misses | |
L1-dcache-prefetches | counter | L1-dcache-prefetches | |
L1-dcache-prefetch-misses | counter | L1-dcache-prefetch-misses | |
L1-icache-loads | counter | L1-icache-loads | |
L1-icache-load-misses | counter | L1-icache-load-misses | |
L1-icache-prefetches | counter | L1-icache-prefetches | |
L1-icache-prefetch-misses | counter | L1-icache-prefetch-misses | |
LLC-loads | counter | LLC-loads | |
LLC-load-misses | counter | LLC-load-misses | |
LLC-stores | counter | LLC-stores | |
LLC-store-misses | counter | LLC-store-misses | |
LLC-prefetches | counter | LLC-prefetches | |
LLC-prefetch-misses | counter | LLC-prefetch-misses | |
dTLB-loads | counter | dTLB-loads | |
dTLB-load-misses | counter | dTLB-load-misses | |
dTLB-stores | counter | dTLB-stores | |
dTLB-store-misses | counter | dTLB-store-misses | |
dTLB-prefetches | counter | dTLB-prefetches | |
dTLB-prefetch-misses | counter | dTLB-prefetch-misses | |
iTLB-loads | counter | iTLB-loads | |
iTLB-load-misses | counter | iTLB-load-misses | |
branch-loads | counter | branch-loads | |
branch-load-misses | counter | branch-load-misses | |
cpu-clock | counter | cpu-clock | |
task-clock | counter | task-clock | |
context-switches | counter | context-switches | |
cpu-migrations | counter | cpu-migrations | |
page-faults | counter | page-faults | |
minor-faults | counter | minor-faults | |
major-faults | counter | major-faults | |
alignment-faults | counter | alignment-faults | |
emulation-faults | counter | emulation-faults | |
In addition, the plugin should collect hardware-specific metrics defined in the event list file, which should contain definitions of PMU events. The list of events to monitor is configurable.
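The hardware cache event names in the table above follow a regular cache/operation/result pattern. The sketch below shows how the perf_event_open(2) ABI encodes such an event as a single config value, namely cache id | (operation id << 8) | (result id << 16). The helper `cache_event_config` and the lookup tables are hypothetical illustrations; this is not necessarily how the plugin or jevents resolves event names.

```python
# Constants from the perf_event_open(2) ABI (linux/perf_event.h).
PERF_TYPE_HW_CACHE = 3

CACHE_IDS = {
    "L1-dcache": 0,  # PERF_COUNT_HW_CACHE_L1D
    "L1-icache": 1,  # PERF_COUNT_HW_CACHE_L1I
    "LLC": 2,        # PERF_COUNT_HW_CACHE_LL
    "dTLB": 3,       # PERF_COUNT_HW_CACHE_DTLB
    "iTLB": 4,       # PERF_COUNT_HW_CACHE_ITLB
    "branch": 5,     # PERF_COUNT_HW_CACHE_BPU
}

# Name suffix -> (operation id, result id).
# Operations: 0 = read, 1 = write, 2 = prefetch. Results: 0 = access, 1 = miss.
SUFFIXES = {
    "loads": (0, 0), "load-misses": (0, 1),
    "stores": (1, 0), "store-misses": (1, 1),
    "prefetches": (2, 0), "prefetch-misses": (2, 1),
}


def cache_event_config(name):
    """Hypothetical helper: map a cache event name to its perf config value."""
    for cache, cache_id in CACHE_IDS.items():
        prefix = cache + "-"
        suffix = name[len(prefix):]
        if name.startswith(prefix) and suffix in SUFFIXES:
            op_id, result_id = SUFFIXES[suffix]
            return cache_id | (op_id << 8) | (result_id << 16)
    raise ValueError(f"not a hardware cache event name: {name}")


for event in ("L1-dcache-load-misses", "LLC-store-misses", "dTLB-prefetches"):
    print(event, hex(cache_event_config(event)))
```

Names such as branch-misses or cpu-cycles are rejected by this helper because they are generic hardware events rather than hardware cache events, and the kernel identifies them with a different event type.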
Plugin configuration
The following configuration options should be supported by the intel_pmu collectd plugin:
Name | Description | Comment |
Interval | The interval, in seconds, at which statistics on monitored events are retrieved | The Interval option is supported by collectd and is defined in the <LoadPlugin> block. No additional functionality should be developed in the intel_pmu plugin to support this option. |
ReportHardwareCacheEvents | Enable/disable monitoring of hardware cache events | |
ReportKernelPMUEvents | Enable/disable monitoring of kernel PMU events | |
ReportSoftwareEvents | Enable/disable monitoring of software events | |
EventList | Path to the hardware events list file for the current CPU | The file can be downloaded by the event_download.py script, which is part of the pmu-tools package. |
HardwareEvents | String containing a comma-separated list of hardware-specific events to monitor | |
Cores | Core groups definition. Monitored metrics are reported only for the configured cores. If this option is omitted, all available cores are monitored. If a group is enclosed in square brackets, each core is added individually to a separate group (that is, statistics are not aggregated). | Allowed formats: |
DispatchMultiPmu | Enable/disable dispatching of cloned multi-PMU counters for uncore events. If disabled, only the total sum is dispatched as a single event. If enabled, a separate metric is dispatched for every counter. | Uncore event example: UNC_CHA_DIR_LOOKUP.NO_SNP. If enabled, information about the event type is added to type_instance, e.g. "UNC_CHA_DIR_LOOKUP.NO_SNP:type=30". This makes it possible to distinguish between multiple counters for one event. |
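The two DispatchMultiPmu behaviours can be sketched as follows. This is a minimal illustration of the rule described in the table, not the plugin's implementation: the function `dispatch_uncore`, its arguments, the counter values, and the second PMU type (31) are all hypothetical. Only the event name and the ":type=30" suffix format come from this page.

```python
def dispatch_uncore(event, counters, dispatch_multi_pmu):
    """Hypothetical helper: return {type_instance: value} for one uncore event.

    `counters` is a list of (pmu_type, value) pairs, one pair per cloned
    counter that backs the same uncore event.
    """
    if not dispatch_multi_pmu:
        # Disabled: only the total sum is dispatched, as a single metric.
        return {event: sum(value for _, value in counters)}
    # Enabled: one metric per counter, distinguished by its PMU type.
    return {f"{event}:type={pmu_type}": value for pmu_type, value in counters}


# Made-up readings for two cloned counters of the same event.
readings = [(30, 100), (31, 250)]

print(dispatch_uncore("UNC_CHA_DIR_LOOKUP.NO_SNP", readings, False))
print(dispatch_uncore("UNC_CHA_DIR_LOOKUP.NO_SNP", readings, True))
```

Putting the PMU type into type_instance keeps the per-counter metrics from colliding under a single name, which is the distinction the option is meant to provide.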
...
Here is an example of the plugin configuration section of the collectd.conf file:
<Plugin intel_pmu>
ReportHardwareCacheEvents true
ReportKernelPMUEvents true
ReportSoftwareEvents true
EventList "/var/cache/pmu/GenuineIntel-6-55-core.json"
...