Anuket Project



Data



Failure Type

Failure parameter

Failure Event

Infrastructure Metrics

Comments
Links

Link Down.

Link removed




VM

Deployment/Start Failures:

  1. Failed to start*
  2. Failed to boot*

Post-Deployment/Start failures:

  1. Shutdown
  2. Crash
  3. Hang
  4. Panic

nova-compute.log

nova-api.log

nova-scheduler.log

libvirt.log

qemu/$vm.log

neutron-server.log

glance/cinder - 

flavor

Node and Core-mapping


cpu: per-core utilization

memory

Interfaces statistics - sent, recv, drops

Disk Read/Write


If possible, infrastructure metrics and syslogs from within the VM should be collected.

Deployment/Start failures can be addressed as the first step.
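The per-core CPU and interface metrics listed for the VM could be gathered in many ways. The following is only a minimal sketch, assuming a Linux host or guest that exposes counters in the `/proc/stat` and `/proc/net/dev` text formats. The function names (`parse_cpu_times`, `per_core_utilization`, `parse_interface_stats`) are hypothetical and not part of any tool named on this page. Memory and disk read/write counters are left out for brevity.

```python
from typing import Dict, Tuple


def parse_cpu_times(proc_stat_text: str) -> Dict[str, Tuple[int, int]]:
    """Return {core: (busy, total)} jiffies from /proc/stat-style text."""
    times: Dict[str, Tuple[int, int]] = {}
    for line in proc_stat_text.splitlines():
        fields = line.split()
        # Keep per-core lines ("cpu0", "cpu1", ...); skip the aggregate "cpu" line.
        if not fields or fields[0] == "cpu" or not fields[0].startswith("cpu"):
            continue
        values = [int(v) for v in fields[1:9]]  # user .. steal
        total = sum(values)
        idle = sum(values[3:5])  # idle + iowait
        times[fields[0]] = (total - idle, total)
    return times


def per_core_utilization(before: str, after: str) -> Dict[str, float]:
    """Percent busy per core between two /proc/stat snapshots."""
    first, second = parse_cpu_times(before), parse_cpu_times(after)
    utilization: Dict[str, float] = {}
    for core, (busy_after, total_after) in second.items():
        busy_before, total_before = first.get(core, (0, 0))
        elapsed = total_after - total_before
        utilization[core] = (
            100.0 * (busy_after - busy_before) / elapsed if elapsed > 0 else 0.0
        )
    return utilization


def parse_interface_stats(proc_net_dev_text: str) -> Dict[str, Dict[str, int]]:
    """Return sent/recv/drop counters per interface from /proc/net/dev-style text."""
    stats: Dict[str, Dict[str, int]] = {}
    for line in proc_net_dev_text.splitlines():
        if ":" not in line:
            continue  # the two header lines carry no interface name
        name, counters = line.split(":", 1)
        c = counters.split()
        if len(c) < 16:
            continue  # not a complete counter row
        stats[name.strip()] = {
            "recv_bytes": int(c[0]),
            "recv_drops": int(c[3]),
            "sent_bytes": int(c[8]),
            "sent_drops": int(c[11]),
        }
    return stats
```

On a live system the inputs would be the contents of `/proc/stat` read twice, a few seconds apart, and the contents of `/proc/net/dev`.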


Container

Deployment/Start Failures:

  1. Failed to start*
  2. Failed to boot*

Post-Deployment/Start failures:

  1. Shutdown
  2. Crash
  3. Hang
  4. Panic

cpu: per-core utilization

memory

Interfaces statistics - sent, recv, drops

Disk Read/Write


Node

A node failure (hardware failure, OS crash, etc.)


C) Fabric component failure -- N/A, assuming redundant/highly available configuration

  1. ZK
  2. DB
  3. RPC

D) Failure of other OpenStack services -- N/A, assuming redundant/highly available configuration

  1. Glance
  2. Keystone

A) node network connectivity failure

  1. management network
  2. VM communication network
  3. storage network

B) nova service failure (e.g., process crashed) -- detected and restarted by a local watchdog process

  1. compute
  2. volume
  3. network
  4. scheduler
  5. api
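Item B mentions a local watchdog that detects a crashed nova service and restarts it. One pass of such a watchdog might look roughly like the sketch below. The `is_alive` and `restart` callables are hypothetical placeholders for whatever process check and restart mechanism the node actually uses. The service names come from the list above.

```python
from typing import Callable, Iterable, List

# Service names taken from the list above.
NOVA_SERVICES = ("compute", "volume", "network", "scheduler", "api")


def watchdog_pass(
    services: Iterable[str],
    is_alive: Callable[[str], bool],
    restart: Callable[[str], None],
) -> List[str]:
    """Check each service once; restart the dead ones and report them."""
    restarted: List[str] = []
    for name in services:
        if not is_alive(name):
            restart(name)
            restarted.append(name)  # each entry is a failure event worth logging
    return restarted


# Demonstration with in-memory stand-ins instead of real processes.
state = {name: True for name in NOVA_SERVICES}
state["scheduler"] = False  # simulate a crashed process


def fake_restart(name: str) -> None:
    state[name] = True


events = watchdog_pass(NOVA_SERVICES, lambda n: state[n], fake_restart)
print(events)
```

Returning the list of restarted services, rather than only restarting them, keeps a record of each failure event that can be stored alongside the logs and metrics.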


Application

Crash/Connectivity/Non-Functional

Application log, e.g., the Apache logs if the application is Apache

Packet Drops, Latency, Throughput, Saturation, Resource Usage

Deploy Collectd within the application and collect both application logs and infrastructure metrics
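To illustrate how some of the application-level metrics above (drops, latency, throughput) could be derived from raw observations, here is a small sketch. It assumes per-request samples are already available. The `RequestSample` type and `summarize` function are hypothetical and do not belong to Collectd or any other tool named on this page.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RequestSample:
    """One observed request; latency_ms is None when the request was dropped."""
    latency_ms: Optional[float]
    size_bytes: int = 0


def summarize(samples: List[RequestSample], window_s: float) -> dict:
    """Derive drop ratio, mean latency and throughput for one time window."""
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    served = [s for s in samples if s.latency_ms is not None]
    dropped = len(samples) - len(served)
    return {
        "drop_ratio": dropped / len(samples) if samples else 0.0,
        "mean_latency_ms": (
            sum(s.latency_ms for s in served) / len(served) if served else None
        ),
        # Bits per second carried by the requests that were actually served.
        "throughput_bps": 8 * sum(s.size_bytes for s in served) / window_s,
    }
```

Saturation and resource usage are not covered here; they would come from the infrastructure metrics collected inside the application environment.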
Middleware Services



Models



Gaps



Enhancements
