
1.0

Use the Linux perf interface to collect data about performance events on a per-core basis


2.0

Use jevents library (PMU tools)


3.0

Report hardware cache events, kernel PMU events, software events, and hardware-specific events

4.0

Should have a configurable interval


5.0

Should have a configurable list of hardware-specific events


6.0

Provide SNMP support for any collectd values, through a PMU MIB


7.0

Provide support for multi PMU uncore events


8.0

Provide an option to choose all the events from the JSON event list file


Overview

Performance counters are CPU hardware registers that count hardware events such as instructions executed, cache misses suffered, or branches mispredicted. They form a basis for profiling applications to trace dynamic control flow and identify hotspots. The Linux perf interface provides rich, generalized abstractions over hardware-specific capabilities.
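Because these counters only ever increase, they become useful for profiling once two readings are compared. The following is a minimal sketch, not part of the plugin, of how derived figures such as instructions per cycle and cache miss ratio could be computed from two samples. The function names, the 64-bit wrap-around assumption, and the sample values are all hypothetical; only the counter names come from the metric list in this document.

```python
def counter_delta(previous, current, width_bits=64):
    """Difference between two readings of a monotonically increasing counter.

    Assumes the counter wrapped at most once and is width_bits wide.
    """
    if current >= previous:
        return current - previous
    return current + (1 << width_bits) - previous


def derived_metrics(prev, curr):
    """Compute IPC and cache miss ratio from two samples keyed by counter name."""
    cycles = counter_delta(prev["cpu-cycles"], curr["cpu-cycles"])
    instructions = counter_delta(prev["instructions"], curr["instructions"])
    references = counter_delta(prev["cache-references"], curr["cache-references"])
    misses = counter_delta(prev["cache-misses"], curr["cache-misses"])
    return {
        "ipc": instructions / cycles if cycles else 0.0,
        "cache_miss_ratio": misses / references if references else 0.0,
    }


# Made-up readings taken one interval apart.
prev = {"cpu-cycles": 1_000_000, "instructions": 800_000,
        "cache-references": 50_000, "cache-misses": 5_000}
curr = {"cpu-cycles": 3_000_000, "instructions": 3_800_000,
        "cache-references": 90_000, "cache-misses": 7_000}
result = derived_metrics(prev, curr)
```

With these made-up readings, 3,000,000 instructions retire over 2,000,000 cycles, and 2,000 of 40,000 cache references miss.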

...

The intel_pmu plugin collects information provided by the Linux perf interface. It does not do so directly, but through the jevents API. Using this interface, the intel_pmu plugin collects the following metrics:

...

Name

...

Type

...

Type Instance

...

Description

...

cpu-cycles

...

counter

...

cpu-cycles

...

instructions

...

counter

...

instructions

...

cache-references

...

counter

...

cache-references

...

cache-misses

...

counter

...

cache-misses

...

branches

...

counter

...

branches

...

branch-misses

...

counter

...

branch-misses

...

bus-cycles

...

counter

...

bus-cycles

...

L1-dcache-loads

...

counter

...

L1-dcache-loads

...

L1-dcache-load-misses

...

counter

...

L1-dcache-load-misses

...

L1-dcache-stores

...

counter

...

L1-dcache-stores

...

L1-dcache-store-misses

...

counter

...

L1-dcache-store-misses

...

L1-dcache-prefetches

...

counter

...

L1-dcache-prefetches

...

L1-dcache-prefetch-misses

...

counter

...

L1-dcache-prefetch-misses

...

L1-icache-loads

...

counter

...

L1-icache-loads

...

L1-icache-load-misses

...

counter

...

L1-icache-load-misses

Hardware-specific metrics are defined in the event list file, which should contain definitions of PMU events. The list of events to monitor is configurable.

...

L1-icache-prefetches

...

counter

...

L1-icache-prefetches

...

L1-icache-prefetch-misses

...

counter

...

L1-icache-prefetch-misses

...

LLC-loads

...

counter

...

LLC-loads

...

LLC-load-misses

...

counter

...

LLC-load-misses

...

LLC-stores

...

counter

...

LLC-stores

...

LLC-store-misses

...

counter

...

LLC-store-misses

...

LLC-prefetches

...

counter

...

LLC-prefetches

...

LLC-prefetch-misses

...

counter

...

LLC-prefetch-misses

...

dTLB-loads

...

counter

...

dTLB-loads

...

dTLB-load-misses

...

counter

...

dTLB-load-misses

...

dTLB-stores

...

counter

...

dTLB-stores

...

dTLB-store-misses

...

counter

...

dTLB-store-misses

...

dTLB-prefetches

...

counter

...

dTLB-prefetches

...

dTLB-prefetch-misses

...

counter

...

dTLB-prefetch-misses

...

iTLB-loads

...

counter

...

iTLB-loads

...

iTLB-load-misses

...

counter

...

iTLB-load-misses

...

branch-loads

...

counter

...

branch-loads

...

branch-load-misses

...

counter

...

branch-load-misses

...

cpu-clock

...

counter

...

cpu-clock

...

task-clock

...

counter

...

task-clock

...

context-switches

...

counter

...

context-switches

...

cpu-migrations

...

counter

...

cpu-migrations

...

page-faults

...

counter

...

page-faults

...

minor-faults

...

counter

...

minor-faults

...

major-faults

...

counter

...

major-faults

...

alignment-faults

...

counter

...

alignment-faults

...

emulation-faults

...

counter

...

emulation-faults

Plugin configuration

The intel_pmu collectd plugin should support the following configuration options:

Name

Description

Comment

Interval

The interval, in seconds, at which statistics on monitored events are retrieved

The Interval option is supported by collectd and is defined in the <LoadPlugin> block. No additional functionality should be developed in the intel_pmu plugin to support this option.

ReportHardwareCacheEvents

Enable/disable monitoring of hardware cache events

ReportKernelPMUEvents

Enable/disable monitoring of kernel PMU events

ReportSoftwareEvents

Enable/disable monitoring of software events

EventList

Path to the hardware events list file for the current CPU. The file can be downloaded with the event_download.py script, which is part of the pmu-tools package.

HardwareEvents

String containing a comma-separated list of hardware-specific events to monitor
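To illustrate how EventList and HardwareEvents relate, the sketch below checks a comma-separated HardwareEvents string against the event names found in a JSON event list. It is not the plugin's implementation (the plugin relies on jevents for this). It assumes the file is either a top-level array of event objects or an object holding them under an "Events" key, and that each object carries an "EventName" field. The event names in the sample are hypothetical placeholders.

```python
import json

# Hypothetical sample; a real list would be read from the EventList path.
SAMPLE_EVENT_LIST = """
[
  {"EventName": "EXAMPLE_EVENT.A"},
  {"EventName": "EXAMPLE_EVENT.B"}
]
"""


def event_names(data):
    """Return the set of event names in parsed JSON event list data."""
    # Assumption: either a bare array, or an object with an "Events" array.
    events = data.get("Events", []) if isinstance(data, dict) else data
    return {entry["EventName"] for entry in events if "EventName" in entry}


def select_events(hardware_events, available):
    """Split a comma-separated event string into known and unknown names."""
    requested = [name.strip() for name in hardware_events.split(",") if name.strip()]
    known = [name for name in requested if name in available]
    unknown = [name for name in requested if name not in available]
    return known, unknown


available = event_names(json.loads(SAMPLE_EVENT_LIST))
known, unknown = select_events("EXAMPLE_EVENT.A, EXAMPLE_EVENT.C", available)
```

Here `known` holds the names that can be monitored and `unknown` holds names that could be reported as a configuration error.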


Cores

Core groups definition. Monitored metrics are reported only for configured cores. If this option is omitted, all available cores are monitored.

If a group is enclosed in square brackets, each core is added individually to a separate group (that is, statistics are not aggregated).

Allowed formats:
"0,1,2,3"
"0-3"
"[0-3]"
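The three formats can be read as follows. This is a minimal sketch under the rules stated above, not the plugin's actual parser, and the function name `parse_core_group` is hypothetical. It returns a list of groups: a plain list or range yields one aggregated group, while a bracketed value yields one group per core.

```python
def parse_core_group(spec):
    """Parse one Cores value into a list of core groups.

    "0,1,2,3" and "0-3" give one aggregated group; "[0-3]" gives one
    single-core group per core, so statistics are not aggregated.
    """
    text = spec.strip()
    individual = text.startswith("[") and text.endswith("]")
    if individual:
        text = text[1:-1]

    cores = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = (int(value) for value in part.split("-", 1))
            if first > last:
                raise ValueError(f"invalid core range: {part!r}")
            cores.extend(range(first, last + 1))
        else:
            cores.append(int(part))

    if not cores:
        raise ValueError(f"no cores in group definition: {spec!r}")
    return [[core] for core in cores] if individual else [cores]


aggregated = parse_core_group("0-3")
separate = parse_core_group("[0-3]")
```

In this sketch `aggregated` contains a single group of four cores, and `separate` contains four groups of one core each.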

DispatchMultiPmu

Enable/disable dispatching of cloned multi PMU for uncore events. If disabled, only the total sum is dispatched as a single event. If enabled, a separate metric is dispatched for every counter.

Uncore event example: UNC_CHA_DIR_LOOKUP.NO_SNP.

If enabled, information about the event type is added to type_instance, e.g. "UNC_CHA_DIR_LOOKUP.NO_SNP:type=30". This makes it possible to distinguish between multiple counters for one event.
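The difference between the two modes can be sketched as follows. This is an illustration of the behaviour described above, not the plugin's code. The function name `dispatch_values`, the (PMU type, value) input shape, and the counter values are assumptions; the ":type=N" suffix follows the example given in this document.

```python
def dispatch_values(event_name, counters, dispatch_multi_pmu):
    """Return (type_instance, value) pairs for one uncore event.

    counters is a list of (pmu_type, value) pairs, one per cloned PMU counter.
    """
    if dispatch_multi_pmu:
        # One metric per counter, with the PMU type appended to type_instance.
        return [(f"{event_name}:type={pmu_type}", value)
                for pmu_type, value in counters]
    # Otherwise only the total sum is dispatched as a single event.
    return [(event_name, sum(value for _, value in counters))]


# Made-up counter values for two cloned PMUs.
counters = [(30, 100), (31, 250)]
per_counter = dispatch_values("UNC_CHA_DIR_LOOKUP.NO_SNP", counters, True)
total_only = dispatch_values("UNC_CHA_DIR_LOOKUP.NO_SNP", counters, False)
```

With the option enabled, two metrics are produced, one per PMU type. With it disabled, a single metric carries the sum of both counters.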

...

Here is an example of the plugin configuration section of the collectd.conf file:

  <Plugin intel_pmu>
    ReportHardwareCacheEvents true
    ReportKernelPMUEvents true
    ReportSoftwareEvents true

    EventList "/var/cache/pmu/GenuineIntel-6-55-core.json"

...