SCTP's multihoming failure detection time depends on three tunable parameters:
RTO.min (minimum retransmission timeout),
RTO.max (maximum retransmission timeout), and
Path.Max.Retrans (the number of consecutive timeouts that must be exceeded before the path is declared failed).
RFC 2960 recommends these values:
RTO.min - 1 second
RTO.max - 60 seconds
Path.Max.Retrans - 5 attempts per destination address
On timer expiry, the RFC says: if the timer expires for the destination address, set RTO = RTO * 2 ("back off the timer").
The maximum value (RTO.max) may be used to provide an upper bound to this doubling operation.
Since the Path.Max.Retrans threshold of 5 must be exceeded, six consecutive timeouts are needed. Starting from RTO.min, this translates to a failure detection time of at least 63 seconds (1 + 2 + 4 + 8 + 16 + 32).
In the worst-case scenario, where the RTO is already at the 60-second maximum, the failure detection time is 360 seconds (6 * 60).
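The doubling rule can be sketched in a few lines of Python. This is only an illustration, not code from any SCTP stack: the function name backoff_schedule and its arguments are made up for this post, and it assumes that every transmission to the destination times out and that the RTO starts at the value passed in.

```python
def backoff_schedule(rto_start, rto_max, path_max_retrans):
    """Return the successive timeout values (same unit as the inputs)
    until the error count exceeds Path.Max.Retrans.

    Assumptions (not from the RFC text): every transmission times out,
    and the RTO starts at rto_start.
    """
    timeouts = []
    rto = rto_start
    # The threshold must be exceeded, so Path.Max.Retrans + 1 timeouts occur.
    for _ in range(path_max_retrans + 1):
        timeouts.append(rto)
        rto = min(rto * 2, rto_max)  # back off the timer, capped at RTO.max
    return timeouts


# RFC 2960 defaults, in seconds
best = backoff_schedule(1, 60, 5)
worst = backoff_schedule(60, 60, 5)
print(best, sum(best))
print(worst, sum(worst))
```

With the default values the first call reproduces the 1 + 2 + 4 + 8 + 16 + 32 sequence, and the second gives six timeouts of 60 seconds each.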
As another example, take the following parameters:
RTO.min - 100ms
RTO.max - 400ms
Path.Max.Retrans - 4 attempts
Then, writing PMR for Path.Max.Retrans:
Max. failure detection time = (1 + PMR) * RTO.max = 5 * 400 = 2000ms
Min. failure detection time = 100 + 200 + 400 + 400 + 400 = 1500ms
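Both bounds can be computed for any parameter set with a small helper. Again this is a rough sketch under the same simplifications used in this post: the name detection_time_bounds is hypothetical, the best case assumes the RTO sits at RTO.min when the failure occurs, the worst case assumes it already sits at RTO.max, and heartbeat intervals are not modelled.

```python
def detection_time_bounds(rto_min, rto_max, path_max_retrans):
    """Return (shortest, longest) failure detection time.

    Units follow the inputs. Simplified model: only consecutive
    timeouts are counted, nothing else contributes to the total.
    """
    attempts = path_max_retrans + 1  # the threshold must be exceeded

    # Best case: start at RTO.min and double until capped by RTO.max.
    shortest = 0
    rto = rto_min
    for _ in range(attempts):
        shortest += rto
        rto = min(rto * 2, rto_max)

    # Worst case: every timeout already lasts RTO.max.
    longest = attempts * rto_max
    return shortest, longest


print(detection_time_bounds(100, 400, 4))  # values in ms
print(detection_time_bounds(1, 60, 5))     # RFC 2960 defaults, in seconds
```

The first call matches the 1500ms and 2000ms figures above, and the second matches the 63 and 360 seconds obtained for the RFC defaults.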
7 years ago
Hi Rimmon. I've been working on a few calculations with regard to SCTP failure/trigger and have a couple of questions. You state that the calculation only takes minRTO, maxRTO and Pathmax. I was under the impression that initRTO and heartbeats were used as well? Here's what I'm currently working with:
associationMaxRtx = 20
pathMaxRtx = 10
minimumRto = 100ms
initialRto = 200ms
maximumRto = 600ms
What would be the time for the path failure?
Thanks,
Matt