TimedFailureMonitor

java.lang.Object
- com.mesosphere.sdk.scheduler.recovery.monitor.DefaultFailureMonitor
- - com.mesosphere.sdk.scheduler.recovery.monitor.TimedFailureMonitor

All Implemented Interfaces:

FailureMonitor
```
public class TimedFailureMonitor
extends DefaultFailureMonitor
```
Implements a FailureMonitor with a time-based policy.
Note that, for safety reasons, this only sets a lower bound on when a task is determined to have failed. System clocks can be accidentally misconfigured during an outage (for instance, when new nodes are added), so we cannot rely on system time, since we might underestimate the wait. Instead, the clock must restart from zero whenever the framework restarts. Unfortunately, this means that if the framework is itself restarted frequently, this detector may never trigger. A monotonic clock built on ZooKeeper could solve this by recording each passing second, so that we would only need to rely on the clock advancing at 1 second per second, rather than on clocks being synchronized across machines.
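
The persisted-clock idea mentioned above could look roughly like the following. This is only an illustrative sketch, not part of the SDK: the class `PersistedTickClock`, the key name, and the use of a plain `Map` as a stand-in for a durable store such as ZooKeeper are all assumptions made for the example.

```java
import java.util.Map;

/** Hypothetical sketch: elapsed seconds are kept in an external store, not in process memory. */
final class PersistedTickClock {
    // Hypothetical key under which the running total is stored.
    private static final String KEY = "elapsed-seconds";

    // Stand-in for a durable store such as ZooKeeper (assumption for this sketch).
    private final Map<String, Long> store;

    PersistedTickClock(Map<String, Long> store) {
        this.store = store;
    }

    /** Intended to be called once per second of uptime; records one more elapsed second. */
    void tick() {
        store.merge(KEY, 1L, Long::sum);
    }

    /** Total seconds recorded so far, including seconds recorded before any restart. */
    long elapsedSeconds() {
        return store.getOrDefault(KEY, 0L);
    }
}
```

Because the count lives in the store, a new `PersistedTickClock` created over the same store after a restart continues from the previous total instead of starting again from zero. Time spent while the process is down is not counted, so the total still only gives a lower bound.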

Constructor Summary

Constructors
Constructor and Description
`TimedFailureMonitor(java.time.Duration durationUntilFailed, StateStore stateStore, ConfigStore<ServiceSpec> configStore)` Creates a new `FailureMonitor` that waits for at least a specified duration before deciding that the task has failed.

Method Summary

Modifier and Type	Method and Description
`boolean`	`hasFailed(TaskInfo terminatedTask)` Determines whether the given task has failed, by tracking the time delta between the first observed failure and the current time.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - TimedFailureMonitor
```
public TimedFailureMonitor(java.time.Duration durationUntilFailed,
                           StateStore stateStore,
                           ConfigStore<ServiceSpec> configStore)
```
    Creates a new FailureMonitor that waits for at least a specified duration before deciding that the task has failed.
    
    Parameters:
    
    durationUntilFailed - The minimum amount of time which must pass before a stopped Task can be considered failed.
- Method Detail
  - hasFailed
```
public boolean hasFailed(TaskInfo terminatedTask)
```
    Determines whether the given task has failed, by tracking the time delta between the first observed failure and the current time.
    The first time a task is observed to have failed, that time is recorded in a map keyed by the task's TaskID. The method then returns true once at least the configured amount of time has passed since that first observation.
    
    Specified by:
    
    hasFailed in interface FailureMonitor
    
    Overrides:
    
    hasFailed in class DefaultFailureMonitor
    
    Parameters:
    
    terminatedTask - The task that stopped and might have failed
    
    Returns:
    
    true if the task has been stopped for at least the configured interval
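
The timing logic described for `hasFailed` can be sketched as follows. This is a simplified illustration under stated assumptions, not the SDK's implementation: `TimedFailureSketch` is a hypothetical class, task IDs are plain strings rather than Mesos `TaskInfo`/`TaskID` objects, and the clock is injected as a `LongSupplier` so the behavior can be exercised without waiting.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical stand-in for the time-based policy; not the SDK's class. */
final class TimedFailureSketch {
    private final long thresholdNanos;

    // Source of elapsed time in nanoseconds (assumption: injected for testability).
    private final LongSupplier nanoClock;

    // First time each task was seen as stopped. Held in memory only, so it is
    // lost on restart and the wait begins again from zero.
    private final Map<String, Long> firstFailureNanos = new HashMap<>();

    TimedFailureSketch(Duration durationUntilFailed, LongSupplier nanoClock) {
        this.thresholdNanos = durationUntilFailed.toNanos();
        this.nanoClock = nanoClock;
    }

    /** Returns true once the task has been seen as stopped for at least the threshold. */
    boolean hasFailed(String taskId) {
        long now = nanoClock.getAsLong();
        // putIfAbsent returns null when this is the first observation of the task.
        Long first = firstFailureNanos.putIfAbsent(taskId, now);
        long start = (first == null) ? now : first;
        return now - start >= thresholdNanos;
    }
}
```

Outside of tests, the clock argument could be `System::nanoTime`, which measures elapsed time within one JVM and is not tied to wall-clock time. That choice matches the lower-bound behavior described for this class: the measurement cannot be skewed by a misconfigured system clock, but it also cannot survive a restart.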
