The RTPS protocol, defined in the DDSI specification, typically uses UDP transport. Because UDP is not reliable, RTPS defines its own mechanism to ensure reliable transmission of data from RTPS writers to RTPS readers when needed. This mechanism can be tuned to improve latency, throughput or resource consumption, depending on the underlying network characteristics and the needs of the system. In particular, the DDSI standard suggests three configurable values, all of which can also be configured in Vortex Café:
- The heartbeat period: ddsi.data.writers.heartbeat.period
- The heartbeat response delay: ddsi.data.readers.heartbeat.responseDelay
- The nack response delay: ddsi.data.writers.acknack.responseDelay
The DDSI reliability protocol basics
Before looking at what the options above mean and how they affect the protocol, we need to understand how the DDSI reliability protocol works.
In order to send data reliably, RTPS writers and RTPS readers exchange three types of RTPS SubMessages:
- DATA SubMessages are sent by writers to readers and contain data samples.
- HEARTBEAT SubMessages are sent by writers to readers and indicate which data samples are available in the writer and so should have been received by the matching readers.
- ACKNACK SubMessages are sent by readers to writers and indicate which data samples have been received by the reader and which ones the reader is asking to be resent.
(This article does not cover GAP SubMessages or fragmentation, that is, DATA_FRAG, HEARTBEAT_FRAG and NACK_FRAG SubMessages.)
Each time a data sample is written, the RTPS Writer sends the sample to matching RTPS Readers in a DATA SubMessage. Each data sample is identified by a sequence number. These messages are typically sent over multicast if several remote RTPS Readers match the local RTPS Writer.
The RTPS Writer also starts sending HEARTBEAT SubMessages periodically (at the heartbeat period) to matching RTPS Readers, if it is not already doing so, and continues until all matching RTPS Readers have positively acknowledged reception of all available data samples. HEARTBEAT SubMessages contain a range of sequence numbers that tells RTPS Readers which data samples are available in the RTPS Writer. HEARTBEAT SubMessages are also typically sent over multicast. Note also that, when it sends DATA SubMessages, the Vortex Café DDSI implementation packs a HEARTBEAT SubMessage into the same RTPS Message (and so the same UDP datagram) if possible.
Each time an RTPS Reader receives a HEARTBEAT SubMessage, it must reply with an ACKNACK SubMessage indicating which data samples (sequence numbers) it has received and which data samples the RTPS Writer should resend. RTPS Readers do not send the ACKNACK SubMessage immediately but wait for a given delay (the heartbeat response delay) before replying.
Each time an RTPS Writer receives an ACKNACK SubMessage asking for data samples to be resent, the RTPS Writer must resend the requested data samples to the requesting RTPS Reader in DATA SubMessages. RTPS Writers do not resend the requested data samples immediately but wait for a given delay (the nack response delay) before doing so.
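The exchange described above can be sketched in a few lines of Python. This is only an illustrative model, not the Vortex Café implementation: the class and function names (Heartbeat, AckNack, build_acknack, run_exchange) are hypothetical, timing is ignored, and the ACKNACK is modelled as a plain list of missing sequence numbers rather than the sequence number set used on the wire.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Heartbeat:
    first_sn: int  # lowest sequence number available in the writer
    last_sn: int   # highest sequence number available in the writer


@dataclass(frozen=True)
class AckNack:
    acked_up_to: int  # all samples up to this sequence number were received
    missing: tuple    # sequence numbers the reader asks to be resent


def build_acknack(received: set, hb: Heartbeat) -> AckNack:
    """Compute the reader's reply to a HEARTBEAT from what it has received."""
    missing = tuple(
        sn for sn in range(hb.first_sn, hb.last_sn + 1) if sn not in received
    )
    # Advance over the contiguous prefix of received samples.
    acked = hb.first_sn - 1
    while acked + 1 <= hb.last_sn and (acked + 1) in received:
        acked += 1
    return AckNack(acked_up_to=acked, missing=missing)


def run_exchange(samples_written: int, lost: set) -> tuple:
    """Writer sends samples 1..n; those in `lost` never reach the reader.

    Returns the reader's ACKNACK before and after the writer's repair.
    """
    received = {sn for sn in range(1, samples_written + 1) if sn not in lost}
    hb = Heartbeat(first_sn=1, last_sn=samples_written)
    first = build_acknack(received, hb)
    # The writer resends the requested samples (assumed to arrive this time).
    received.update(first.missing)
    final = build_acknack(received, hb)
    return first, final
```

For example, run_exchange(5, {2, 4}) models a writer that sent five samples of which the second and fourth were lost: the first ACKNACK requests samples 2 and 4, and the ACKNACK sent after the repair acknowledges everything.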
Heartbeat period
The heartbeat period sets the rate at which HEARTBEAT SubMessages are emitted.
- There is no benefit in setting a heartbeat period smaller than the smallest round-trip time between a writer and a reader in the system plus the heartbeat response delay: sending another heartbeat before any remote reader has had the chance to answer the previous one achieves nothing.
heartbeat period > round-trip time + heartbeat response delay
- A small heartbeat period ensures that a lost data sample is resent as soon as possible, which lowers latency, but it consumes more bandwidth and may cause unnecessary retransmissions, so throughput may not be optimal.
- A larger heartbeat period may increase latency but consumes less bandwidth and avoids unnecessary retransmissions, which improves throughput.
- An excessively large heartbeat period may cause the reliability protocol to block waiting for retransmissions once the queues and buffers fill up, leading to both bad latency and bad throughput.
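The lower bound on the heartbeat period can be checked with a small helper. This is a minimal sketch: the function names are hypothetical, and the use of milliseconds is an assumption made for the example only, not a statement about the unit Vortex Café expects for these settings.

```python
def min_useful_heartbeat_period_ms(round_trip_ms: int,
                                   heartbeat_response_delay_ms: int) -> int:
    """Smallest interval after which a reader can have answered a heartbeat."""
    return round_trip_ms + heartbeat_response_delay_ms


def is_heartbeat_period_useful(heartbeat_period_ms: int,
                               round_trip_ms: int,
                               heartbeat_response_delay_ms: int) -> bool:
    """True if: heartbeat period > round-trip time + heartbeat response delay."""
    bound = min_useful_heartbeat_period_ms(round_trip_ms,
                                           heartbeat_response_delay_ms)
    return heartbeat_period_ms > bound
```

With an illustrative round-trip time of 2 ms and a heartbeat response delay of 50 ms, a heartbeat period of 100 ms satisfies the bound, whereas a period of 10 ms would only repeat heartbeats that no reader has yet had time to answer.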
Heartbeat response delay
The heartbeat response delay indicates how long a reliable RTPS Reader should wait before replying to a received HEARTBEAT SubMessage with an ACKNACK SubMessage. There are two main reasons to delay the reply:
- The data samples announced as available in a HEARTBEAT SubMessage may reach the RTPS Reader just after the HEARTBEAT SubMessage itself (even if they were sent before it, since UDP does not guarantee ordering). Delaying the ACKNACK reply may avoid sending a negative ACKNACK SubMessage and so avoid an unneeded retransmission.
- Delaying the reply may make it possible to answer multiple HEARTBEAT SubMessages with a single ACKNACK SubMessage and so save resources.
As with the heartbeat period:
- A zero or small heartbeat response delay ensures that missed data samples are resent as soon as possible, which lowers latency.
- A larger heartbeat response delay consumes less bandwidth and avoids unnecessary retransmissions, which improves throughput.
- An excessively large heartbeat response delay degrades both latency and throughput.
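The way a response delay lets one ACKNACK answer several HEARTBEAT SubMessages can be illustrated with a simple timer model. This sketch assumes one plausible policy (the first heartbeat arms a timer, and heartbeats arriving while the timer is pending are covered by the same reply); the function name is hypothetical and the actual Vortex Café behaviour may differ.

```python
def acknack_send_times(heartbeat_arrivals, response_delay):
    """Return the times at which the reader sends an ACKNACK.

    heartbeat_arrivals: arrival times of HEARTBEAT SubMessages (any time unit).
    response_delay: heartbeat response delay, in the same unit.
    """
    send_times = []
    pending_until = None  # time at which the currently armed reply fires
    for t in sorted(heartbeat_arrivals):
        if pending_until is None or t > pending_until:
            # No reply pending: arm a new timer for this heartbeat.
            pending_until = t + response_delay
            send_times.append(pending_until)
        # Otherwise the pending reply also answers this heartbeat.
    return send_times
```

With heartbeats arriving at times 0, 10, 20 and 100, a zero delay yields four ACKNACK SubMessages, while a delay of 25 yields only two (at 25 and 125), at the cost of answering the first heartbeat 25 time units later.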
Nack response delay
The nack response delay indicates how long a reliable RTPS Writer should wait before replying to a negative ACKNACK SubMessage with a repair DATA SubMessage. There are two main reasons to delay the repair:
- Several matching remote RTPS Readers may send ACKNACK SubMessages asking for the same missing data samples. In this case the RTPS Writer can send a single repair DATA SubMessage over multicast to all requesting RTPS Readers at once, saving a lot of bandwidth.
- A negative ACKNACK SubMessage may be followed by a positive one for the same data sample(s). Delaying the repair may avoid unnecessary retransmissions.
As with the other two settings:
- A zero or small nack response delay ensures that missed data samples are resent as soon as possible, which lowers latency.
- A larger nack response delay makes it possible to group retransmissions and may avoid unnecessary ones, which improves throughput.
- An excessively large nack response delay degrades both latency and throughput.
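The grouping effect of the nack response delay can be sketched as follows. This is an assumed, simplified policy (the first negative ACKNACK opens a window of length equal to the delay, and every request arriving within that window is served by one multicast repair); the function name and the tuple layout are hypothetical and do not describe the Vortex Café internals.

```python
def plan_repairs(nacks, nack_response_delay):
    """Group negative ACKNACKs into repair transmissions.

    nacks: list of (arrival_time, reader_id, missing_sequence_numbers).
    Returns a list of (send_time, sequence_numbers, reader_ids), one entry
    per repair transmission, in chronological order.
    """
    repairs = []
    window_end = None  # time at which the currently open window closes
    for t, reader, missing in sorted(nacks, key=lambda n: n[0]):
        if window_end is None or t > window_end:
            # No window open: this request starts a new repair.
            window_end = t + nack_response_delay
            repairs.append((window_end, set(), set()))
        _, sns, readers = repairs[-1]
        sns.update(missing)   # duplicates across readers collapse here
        readers.add(reader)
    return [(t, sorted(sns), sorted(readers)) for t, sns, readers in repairs]
```

For instance, if reader r1 asks for sample 3 at time 0 and reader r2 asks for samples 3 and 4 at time 2, a delay of 5 produces a single repair carrying samples 3 and 4 for both readers, whereas a zero delay produces two repairs and sends sample 3 twice.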