Old Version 1: changes made by Sridhar Rao, saved on Jun 18, 2021

compared with

New Version 2: changes made by Girish L, saved on Jun 22, 2021


Survey is completed.
Testbed is assigned - Pod12-Jump
Framework: Acumos (too many issues).
Problem Domain - Failure Prediction (FP)
Clear Definition of Failure Prediction - Ongoing.
Existing FP Models - ARIMA or RNN - Used for deployment and testing.
Enhancement of Existing Work on FP - Not yet started.
Data Gathering (important):
1. Publicly Available: Search in progress.
2. Collecting from existing testbeds: WIP
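As a rough illustration of the ARIMA-style approach listed above, the sketch below fits an AR(1) model, i.e. ARIMA(1,0,0), which is the simplest member of the family, to a CPU-utilisation series and raises an alarm when the forecast crosses a threshold. It is not the model used by the team: the function names, the sample values, and the 90% threshold are all assumptions made for the example, and a real experiment would more likely use a library implementation and real testbed data.

```python
# Minimal sketch: AR(1) forecast of CPU utilisation with a threshold alarm.
# Function names, sample data, and the 90% threshold are illustrative assumptions.

def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    prev, curr = series[:-1], series[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    cov = sum((p - mean_prev) * (q - mean_curr) for p, q in zip(prev, curr))
    var = sum((p - mean_prev) ** 2 for p in prev)
    phi = cov / var if var else 0.0  # constant series: fall back to the mean
    c = mean_curr - phi * mean_prev
    return c, phi


def forecast(series, steps):
    """Iterate the fitted recurrence `steps` times from the last observation."""
    c, phi = fit_ar1(series)
    out, last = [], series[-1]
    for _ in range(steps):
        last = c + phi * last
        out.append(last)
    return out


def predicts_failure(series, steps=5, threshold=90.0):
    """Flag a likely failure if any forecast value reaches the threshold."""
    return any(value >= threshold for value in forecast(series, steps))


if __name__ == "__main__":
    cpu = [50, 55, 60, 65, 70, 75, 80, 85]  # hypothetical CPU % samples
    print(forecast(cpu, 3))
    print(predicts_failure(cpu))
```

Treating "forecast exceeds a threshold" as the failure signal is only one possible definition. It would need to be replaced once the definition of failure prediction is settled.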

Sl. No.

Topic

Presenter

Notes

1

Framework Deployment Status

Rohit Singh Rathaur

Acumos - Container/K8S based approach.

Vanilla deployment - Failed to deploy with both approaches (with and without cluster deployment).

Work on Acumos on Pod18 - Existing Cluster - Girish
Work on Other framework on Pod12-Jump - Rohit. Decision on 'other' framework by EoW.

2

Survey - Implementation details - Status

Rohit Singh Rathaur

Completed

https://docs.google.com/spreadsheets/d/15XRdrWvbSCPsg1zZ9PfT9yvnElq21AvB/edit#gid=971676644

3

Model Deployment Status

Rohit Singh Rathaur

Waiting for the framework to be up before running on the testbed.

Currently running locally - Google Colab (Jupyter notebooks).

Data: CPU consumption.

Failure: VM.

4

Publicly Available Data

Rohit Singh Rathaur

To be added by Girish/Rohit:

Dataset	Description	Experiments/Papers Published	Link

5

Failure Prediction Definition - Status

Rohit Singh Rathaur

Existing works:

Mostly VM and Application Failures.
Failure - Crash and Connectivity

Gaps:

Hardware, Containers
Other failure types aren't considered

How to collect Data:

Take advantage of chaos engineering projects - Litmus, Pumba, Blockade, etc.
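One way the data from such fault-injection runs could be prepared for training is sketched below. It assumes that the start and end time of each injected fault is recorded somewhere (by hand or from the tool's own logs) and that metric samples carry timestamps in seconds. The function name, the record layout, and the 10-second horizon are hypothetical and do not come from Litmus, Pumba, or Blockade.

```python
# Sketch: turn chaos-experiment records into labels for failure prediction.
# The data layout (timestamps in seconds, (start, end) fault windows) is assumed.

def label_samples(timestamps, fault_windows, horizon):
    """Label each sample 1 if a fault is in progress or starts within
    `horizon` seconds after the sample; otherwise label it 0."""
    labels = []
    for t in timestamps:
        hit = any(start - horizon <= t <= end for start, end in fault_windows)
        labels.append(1 if hit else 0)
    return labels


if __name__ == "__main__":
    sample_times = [0, 10, 20, 30, 40, 50, 60]  # hypothetical metric timestamps
    injected = [(35, 45)]                        # hypothetical fault window
    print(label_samples(sample_times, injected, horizon=10))
```

Labelling the samples shortly before a fault as positive turns detection into prediction. The right horizon depends on how much warning the operator needs.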