Fixed Bugs and Changes in Vortex OpenSplice 6.11.x

This page lists all the fixed bugs and changes in the Vortex OpenSplice 6.11.x releases.

Regular releases of Vortex OpenSplice, containing fixed bugs, changes to supported platforms and new features, are made available on a regular basis.

There are two types of release: major releases and minor releases. Upgrading Vortex OpenSplice contains more information about the differences between these releases and the impact of upgrading. We advise customers to move to the most recent release in order to take advantage of these changes. This page details all the fixed bugs and changes between different Vortex OpenSplice releases. There is also a page which details the new features in the different Vortex OpenSplice releases.

There are two different types of changes: bug fixes and changes that do not affect the API, and bug fixes and changes that may affect the API. These are documented in separate tables.

Fixed Bugs and Changes in Vortex OpenSplice 6.11.x

Vortex OpenSplice 6.11.0

Report ID. Description
OSPL-13920 / 00020795 Node.js DCPS: Errors when importing IDL for topics using typedef references

In Node.js DCPS, an importIDL API is provided to import topic types defined in IDL. The importIDL API generates XML using idlpp, then processes the XML. If the IDL included typedef references to other typedef references, the end user would see errors, because the processing of the idlpp-generated XML did not handle typedef references to other typedefs.

Solution: The OSPL Node.js DCPS code has been fixed to handle the cases where topics defined in IDL include typedef references to other typedefs.
OSPL-13943 / 20692 Durability alignment is not consistent among several nodes when using a REPLACE policy.

When the durability service applies a REPLACE alignment policy, the corresponding instances, selected based on the timestamp of the alignment, are first wiped from the transient store before the aligned samples are injected. If, in the meantime, the spliced daemon handles a dispose of a DCPSPublication corresponding to some of the aligned data, these instances may be placed on a purge list before the aligned samples are injected. In that case the injection of the samples incorrectly did not remove these instances from the purge list.

Solution: An instance is always removed from the empty purge list when a sample is injected and the instance becomes not empty.
OSPL-13909 Durability should wait for the presence of remote durability protocol readers when using ddsi.

When the ddsi service is being used and the durability service detects a fellow durability service, it should wait with sending messages to that fellow until it has detected the remote durability readers. Due to some configuration parameter changes this function had mistakenly been disabled.

Solution: The check for the presence of remote durability readers is enabled when the ddsi service is used as networking service.
OSPL-13724 / 00020481 The Vortex.idlImportSlWithIncludePath function call of the Simulink Vortex DDS blockset caused an error on the Windows platform.

On the Windows platform, the Vortex.idlImportSlWithIncludePath function call of the Simulink Vortex DDS blockset caused an error when the includedirpaths argument was passed, because the function passed the arguments to the IDLPP tool in the wrong order.

Solution: The bug is now fixed. The Vortex.idlImportSlWithIncludePath function has been updated to pass the arguments in the correct order to the IDLPP tool.
OSPL-12485 Possible incomplete transaction when aligned by durability

It was possible for a transaction to be incomplete when aligned by durability, because all transactional samples were treated as EOT. All transactional samples were compared as if they were EOTs, which could lead to transactional samples being discarded as duplicates and not aligned.

Solution: Made sure only EOT messages are compared.
OSPL-12877 Alignment may stall when a new local group is created while a merge request to acquire the data via a merge for the same group is being scheduled

When a durability service learns about a partition/topic combination, it may have to acquire data for this group by sending a request for samples to its master. When at the same time a merge conflict with the fellow is being handled, this may also lead to the sending of a request for samples to the same fellow. Both paths are decoupled for performance reasons, so there is a window of opportunity that may lead to two identical requests for the same data to the same fellow. Since the requests are identical, only one of them is answered. The remaining one never gets answered, which may stall conflict resolution.

Solution: The requests are distinguished so that they are never identical.
OSPL-13307 / 00019125 When running mmstat -M some of the numbers created are incorrect

The variables which are created by mmstat that represent a difference are output as unsigned long int. This means that negative numbers are incorrectly output.

Solution: Changed the data type of variables that represent a difference from unsigned long int to signed long int to avoid incorrect output in mmstat -M.
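
The effect described for mmstat -M can be illustrated with a minimal Python sketch. It is not the mmstat source code: the function names are hypothetical, and fixed 64-bit types are assumed for illustration, since the actual width of unsigned long int depends on the platform.

```python
import ctypes


def diff_as_unsigned(current: int, previous: int) -> int:
    """Store a difference in an unsigned 64-bit variable (assumed width).

    A negative difference wraps around to a very large positive number.
    """
    return ctypes.c_uint64(current - previous).value


def diff_as_signed(current: int, previous: int) -> int:
    """Store the same difference in a signed 64-bit variable."""
    return ctypes.c_int64(current - previous).value


if __name__ == "__main__":
    # Memory usage dropped by 24 units between two measurements.
    print(diff_as_unsigned(1000, 1024))  # wrapped-around value
    print(diff_as_signed(1000, 1024))    # negative value, as intended
```

With the unsigned variant a decrease shows up as a huge positive number, which is the kind of incorrect output the fix avoids.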
OSPL-13532 Stalling alignment when an asymmetrical disconnect occurs during the durability handshake

When durability services discover each other, they must complete a handshake before data alignment can start. The handshake consists of several stages:
1. Whenever a durability service discovers another durability service, it pushes a so-called Capability message to the discovered durability service. A precondition for this to happen is that the Capability reader of the remote durability service must have been discovered, otherwise the Capability message may get lost. These Capability messages are essential to detect, and recover from, asymmetric disconnects.
2. Once a Capability message has been received from a remote durability service, it is possible to request its namespaces by sending a so-called nameSpacesRequest message (of course, after having discovered the reader for this message on the remote durability service). This should trigger the remote durability service to send its namespaces, after which the handshake is completed.
There are two problems with the handshake. First of all, when a durability service sends its request for namespaces to the remote durability service, there is no guarantee that the remote durability service has discovered its namespaces reader at the time the namespaces are being published, so they can get lost. Secondly, and more likely, when an asymmetric disconnect occurs while establishing the handshake, it is no longer possible to detect that an asymmetric disconnect has occurred, and therefore it is no longer possible to recover from this situation. This effectively leads to a situation where the handshake is not completed, and therefore alignment is stalled.

Solution: The solution has two ingredients:
1. When a Capability is published to a remote durability service, ALL its relevant readers must have been discovered, instead of only the Capability reader.
2. To resolve the stalling handshake due to asymmetric disconnects occurring during the handshake, the Capability message and the nameSpacesRequest message are republished when the handshake takes too long to complete. This can be controlled using two environment variables:
   - The environment variable OSPL_DURABILITY_CAPABILITY_RECEIVED_PERIOD specifies the time (in seconds) after which to republish a capability. The default is 3.0 seconds.
   - The environment variable OSPL_DURABILITY_NAMESPACES_RECEIVED_PERIOD specifies the time after which to resend a nameSpacesRequest. The default is 30.0 seconds.
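
As a minimal sketch of how the two environment variables could be set for a process that hosts the durability service, the Python helper below builds an environment dictionary. The helper name durability_env is hypothetical, and it is an assumption that both values are given in seconds as decimal numbers (the note states the unit only for the first variable).

```python
import os

CAPABILITY_VAR = "OSPL_DURABILITY_CAPABILITY_RECEIVED_PERIOD"
NAMESPACES_VAR = "OSPL_DURABILITY_NAMESPACES_RECEIVED_PERIOD"


def durability_env(capability_period=3.0, namespaces_period=30.0, base=None):
    """Return a copy of the environment with both republish periods set.

    The defaults mirror the documented defaults (3.0 and 30.0).
    `base` defaults to the current process environment.
    """
    env = dict(os.environ if base is None else base)
    env[CAPABILITY_VAR] = str(float(capability_period))
    env[NAMESPACES_VAR] = str(float(namespaces_period))
    return env


if __name__ == "__main__":
    env = durability_env(capability_period=1.5, namespaces_period=10)
    print(env[CAPABILITY_VAR], env[NAMESPACES_VAR])
```

The resulting dictionary can be passed as the env argument of subprocess.Popen when starting the process in question.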
OSPL-13692 / 00020677 Networking throttling may slow down communication too long.

When a receiver experiences high packet loss, the backlog will increase, which is communicated to the sender. In that case the sender will apply throttling to reduce the load on the receiver. However, throttling is also applied to the resending of the lost packets. As a result the backlog at the receiver may decrease at a low rate, causing the throttling to be applied longer than necessary.

Solution: A parameter (ResendThrottleLimit) is added which sets the lower throttling limit for resending lost packets. Furthermore, when the sender detects that there are gaps in the received acknowledgements, resends are performed earlier.
OSPL-13698 When the node causing a master conflict is disconnected before the conflict is resolved, the master conflict may remain unresolved.

When legacy master selection is used and a master conflict is detected because another node has selected a different master, the current master is directly set to pending. If the node that caused the master conflict is disconnected before the conflict is resolved, the master conflict could be dropped although it still exists, and because the master is set to pending no new master conflict is raised.

Solution: A master conflict is now always resolved, regardless of whether the node that caused the master conflict has been disconnected and removed from the durability administration.
OSPL-13705 / 00020691 Provide option to the RT networking service to optimize durability traffic by using point-to-point communication.

For the RT networking service, the durability service is just another application. Protocol messages sent by the durability service will therefore be sent to all the nodes in the system. The protocol messages sent by the durability service are addressed either to all, to a subset or to only one fellow durability service. To limit the networking load caused by the durability service, it would be beneficial if the networking service had some knowledge of the durability protocol and sent durability messages that are addressed to one fellow point-to-point. This requires that the capability to send messages point-to-point is added to the RT networking service.

Solution: Support for point-to-point communication for durability messages addressed to one fellow has been added.
OSPL-13748 / 00020708 The RT networking service can run out of buffers when receive socket is overloaded.

To limit the chance of packet loss occurring in the receive socket, the networking receive thread tries to read as many packets as possible from the receive socket before processing these packets further. However, when the receive socket remains full, the number of received packets that are waiting to be processed keeps increasing, which may cause the networking service to run out of buffers.

Solution: When reading packets from the receive socket, it is checked whether the number of packets waiting to be processed exceeds a threshold. When the threshold is reached, the networking receive thread will first process some waiting packets before attending to the receive socket.
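
The threshold check can be sketched roughly as follows. This is an illustrative Python model, not the networking service implementation: the function name, the use of in-memory queues in place of a real socket, and the batch parameter (how many waiting packets count as "some") are all assumptions.

```python
from collections import deque


def receive_loop(socket_queue, threshold, batch, handle):
    """Drain `socket_queue` while keeping the backlog bounded.

    Packets are read from the (simulated) socket only while fewer than
    `threshold` packets are waiting; then up to `batch` waiting packets
    are processed before the socket is attended again.
    Requires threshold >= 1 and batch >= 1.
    Returns the largest backlog that was observed.
    """
    pending = deque()
    max_pending = 0
    while socket_queue or pending:
        # Read as many packets as the threshold allows.
        while socket_queue and len(pending) < threshold:
            pending.append(socket_queue.popleft())
        max_pending = max(max_pending, len(pending))
        # Process some waiting packets before returning to the socket.
        for _ in range(min(batch, len(pending))):
            handle(pending.popleft())
    return max_pending


if __name__ == "__main__":
    processed = []
    backlog = receive_loop(deque(range(100)), threshold=8, batch=4,
                           handle=processed.append)
    print(backlog, len(processed))
```

The backlog never grows beyond the threshold, so the number of buffers in use stays bounded even when the socket remains full.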
OSPL-13753 / 00020714 When installing a Visual Studio 2005 version silently a popup window appears and stops the installation

The installers now ensure that the Visual Studio redistributables will not force a reboot of Windows before the main installer has completed, by using an optional parameter when running the redistributable. This created a problem for Visual Studio 2005 versions, as this parameter is illegal there and an error message is produced.

Solution: An additional page has been included in the installer to allow users to not install the Visual Studio redistributable. This option can also be used when installing silently and allows a customer to skip the redistributable which creates the error condition.
OSPL-13756 Provide the option to have the RT networking service perform the distribution of the builtin topics.

When using the RT networking service, the durability service is responsible for alignment of the builtin topics, which is not the case when the DDSI service is used. In a large system the number of builtin topics can become very large. When the networking service is made responsible for aligning the builtin topics, only the own builtin topics of a node have to be aligned when two nodes detect each other. Especially when a disconnect/reconnect occurs, this reduces the number of builtin topics that have to be aligned.

Solution: The ability to align the builtin topics by RT networking has been added, together with a configuration parameter ManageBuiltinTopics by which this ability can be enabled. Note that, to maintain similarity with the DDSI service, this applies to: DCPSParticipant, DCPSPublication, DCPSSubscription, DCPSTopic and the CM related builtin topics.
OSPL-13771 / 00020719 Python API: 'Out Of Resources' exceptions when using conditions and shared memory.

A memory leak was introduced in 6.10.4p1. In the python class Condition, dealloc was removed, resulting in improper cleanup. This change was introduced as a fix for OSPL-13503 Cleanup error: dds_condition_delete: Bad parameter Not a proper condition.

Solution: The change to remove the Condition dealloc was for a minor logging issue. This OSPL-13503 change was rolled back in order to fix the more serious Out of Resources exceptions. With this rollback, extra error messages may be logged. The memory leak for Condition is fixed.
OSPL-13773 The durability service may send an alignment request to a not confirmed master.

When, during master selection, a master is proposed but not yet confirmed and in parallel the need to align the data of a topic/partition is triggered, an alignment request may be sent to this not yet confirmed master, which may not become the actual selected master.

Solution: During initial alignment, requesting alignment of data is delayed until master selection has finished and the master is confirmed.
OSPL-13781 Allow setting the master selection protocol on each durability namespace independently.

Either legacy master selection or master selection based on master priorities can be configured for the durability namespaces. When master selection based on master priority was used, it had to be configured on all the namespaces. However, it should be allowed to configure the master selection protocol on each namespace independently.

Solution: The global setting of the master selection protocol is removed and the master selection protocol configured for each namespace is applied when selecting a master for that namespace.
OSPL-13784 / 00020725 In the RT networking service, the synchronization on the first expected fragment from a newly detected fellow could fail.

When the networking service receives a first packet from another node, it has to determine the first sequence number that is used by both the sending node and the receiving node as the starting sequence number of the reliable communication. This first sequence number is determined either from the offset present in the first received packet or from the expected packet number indicated by the sync message when SyncMessageExchange has been enabled. The sender will then resend the packets from that agreed sequence number up to the sequence number of the packet already received. In this case packets with lower sequence numbers than the sequence number of the first received packet should be accepted. However, this may fail when the first sync message is lost, which may cause packets to be rejected by the receiver although they have already been acknowledged. In that case the receiver will not receive the expected packet.

Solution: While waiting for the sync message which sets the expected sequence number, packets received with a sequence number lower than the sequence number of the first received packet are accepted and placed on the out-of-order list.
OSPL-13791 / 00020706 Potential memory leak in Java5 DataReader

The Java5 DataReader offers two different modes of reading data: it either returns an Iterator holding all requested samples, or you preallocate an ArrayList and pass that as input parameter to your read/take call. The latter is more efficient if you want to benefit from recycling samples allocated in the previous invocation of your read/take calls. For this purpose, the DataReader keeps track of a MAP containing, for each previous ArrayList, all the relevant recyclable intermediate objects. However, if you keep feeding new ArrayList objects to each subsequent read/take call, the MAP will grow indefinitely and leak away all your previous data. Although invoking the preallocated read/take calls this way is against their intended usage and design, some examples were doing exactly that.

Solution: Examples have been modified not to feed new ArrayList objects to every subsequent read/take call. Also, the MAP that keeps track of all previous ArrayLists and their corresponding intermediate objects has been replaced with a one-place buffer. That means you can still benefit from recycling intermediate data if you use the same ArrayList over and over, but anything related to any prior ArrayList you passed to the read/take calls will be garbage collected.
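
The idea of a one-place buffer can be sketched in Python as below. This is a conceptual illustration only, not the Java5 DataReader code: the class name OnePlaceRecycler and the scratch list standing in for the "recyclable intermediate objects" are hypothetical.

```python
class OnePlaceRecycler:
    """Remembers recyclable objects for the most recently used list only."""

    def __init__(self):
        self._last_list = None
        self._intermediates = None

    def intermediates_for(self, samples):
        """Return the scratch objects that belong to `samples`.

        If `samples` is the same list object as in the previous call, the
        previous scratch objects are reused. Otherwise the old ones are
        dropped (and become garbage) and fresh ones are created.
        """
        if samples is not self._last_list:
            self._last_list = samples
            self._intermediates = []  # fresh, empty scratch storage
        return self._intermediates


if __name__ == "__main__":
    recycler = OnePlaceRecycler()
    reused = []
    first = recycler.intermediates_for(reused)
    print(recycler.intermediates_for(reused) is first)  # same list: recycled
    print(recycler.intermediates_for([]) is first)      # new list: not recycled
```

Unlike a map keyed by every list ever passed in, this keeps at most one entry alive, so feeding a new list on each call no longer accumulates memory.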
OSPL-13795 / 00020698 Issues during termination of spliced when configuration specifies thread attributes for heartbeat-manager

When the configuration file specifies attributes such as stack size or scheduling class for the heartbeat-manager in spliced (//OpenSplice/Domain/Daemon/Heartbeat), termination fails and triggers an error report "Failed to join thread (null):0x0 (os_resultSuccess)".

Solution: The code was changed to cover a specific path where after stopping the thread an invalid return code was propagated leading to failed termination.
OSPL-13797 Unnecessary alignment may occur when a node with a namespace with aligner=false (temporarily) chooses a different master for this namespace. This unnecessarily increases network load.

When a node with aligner=false for a namespace enters a network, this node starts looking for an aligner. If there are potentially multiple aligners but not all of them have been discovered yet, then it could happen that this node chooses a different master for the namespace than somebody else. When the nodes that chose a different master for the namespace detect each other, a master conflict is generated. Resolving this master conflict leads to alignment. Although functionally there is nothing wrong, the unfortunate situation in this scenario is that the alignment for nodes with aligner=false is not necessary, because by definition of aligner=false this node will not provide any alignment data to the master (whichever one is chosen). Still, the master bumps its state and causes all slaves to align from the master again. These alignments are superfluous.

Solution: The situation where a node with aligner=false has (temporarily) chosen a different master is not considered a valid reason to start alignment.
OSPL-13812 Trying to unregister a non-existent instance leads to an entry in ospl-error.log. This is incorrect.

Trying to unregister a non-existent instance is a legitimate application action that should return PRECONDITION_NOT_MET. However, as a side effect also an error message would appear in ospl-error.log.

Solution: The error message is not generated anymore when a non-existent instance gets unregistered.
OSPL-13844 Spliced will crash during shutdown if builtin topics have been disabled.

There is a bug in the spliced that causes it to crash during shutdown when you configured OpenSplice not to communicate the builtin topics. This was caused by spliced forgetting to set the Writers for the builtin topics to NULL in that case, which during shutdown would result in the spliced attempting to release dangling random pointers.

Solution: The writers for the builtin topics are now properly set to NULL when you disabled the builtin topics, and therefore spliced will not attempt to clean them up during shutdown.
OSPL-13868 Configuration files for NetworkPartitions example were incorrect

The example configuration files included a ddsi2 service and not ddsi2e, so additional values were not visible in the configuration tool and would not be used in OpenSplice. Additionally, a number of the elements were incorrectly cased.

Solution: Updated Example files have been included.
OSPL-13888 The durability service leaks memory when handling a received namespace message.

When a namespace message from a fellow is received and that namespace message is a duplicate of an earlier received namespace message, the allocated namespace leaks.

Solution: The duplicate namespace is freed.
OSPL-13892 Potential backlog in the processing of builtin topics by spliced

The spliced is responsible for processing incoming builtin topic samples. This processing is needed to for example modify the local notion of the liveliness of remote entities and the instances they have written. Having the wrong notion of the liveliness of a remote entity could result in instances being marked ALIVE, while they should have been marked NOT_ALIVE or vice-versa. Also, the failure to notice the deletion of a remote entity could result in extended waiting times in case of for example the synchronous write, where a writer is still waiting for acknowledgments of a Reader that already left the system. Due to a bug in the spliced code, the spliced could under certain conditions postpone processing of builtin topics for potentially long time intervals, resulting in incorrect liveliness representations during this interval, which in turn might cause extended waiting times in case of a synchronous write call.

Solution: The spliced now no longer postpones the processing of builtin topics, causing the representation of the liveliness of entities and instances to be up to date, and avoiding unnecessary waiting times in the synchronous write call for readers that have already been deleted.
OSPL-13923 MATLAB Query and Reader failure with waitsetTimeout()

The MATLAB Vortex.Query class would throw on calls to take() or read() if a non-zero value had previously been provided to waitsetTimeout(). BAD_PARAMETER messages would be written to ospl-error.log. In a similar situation, a Vortex.Reader class instance would appear to succeed, but ospl-error.log would still contain BAD_PARAMETER messages, and a DDS entity would be leaked with each call to read() or take().

Solution: The problems have been fixed. Uninstall the currently installed Vortex_DDS_MATLAB_API toolbox and install the new toolbox distributed with this release. (The toolbox is located under tools/matlab in the OpenSplice installation directory.)
OSPL-13929 / 00020814 Alignment of DCPSPublication may cause that instances that were explicitly unregistered and disposed are not purged and leak from shared memory.

When a disconnect of a node is detected, the instances written by writers on that disconnected node are unregistered. When the same node reconnects, alignment of DCPSPublication will indicate which writers are still alive. These DCPSPublication samples will then be used to update the liveliness of the corresponding instances. However, explicitly unregistered instances are also updated, which causes them to be removed from the purge list, resulting in a memory leak.

Solution: When handling the re-alignment of a DCPSPublication the corresponding instances that were explicitly unregistered are ignored.
OSPL-13931 Potential alignment issue for unions in generic copy routines for C

The generic copy routines for the C API may potentially misalign union attributes, causing the fields following the union to contain bogus values.

Solution: The algorithm to determine proper alignment for unions has been corrected.
OSPL-13937 Enable or disable tracing for Dlite

In some situations users want to disable tracing in production environments and enable tracing in testing environments. So far, there has not been an easy way to do this other than commenting out the tracing section in the configuration. This is cumbersome.

Solution: An attribute //OpenSplice/Dlite/Tracing[@enabled] is added that can be used to enable/disable tracing for Dlite.
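
As a minimal sketch of toggling this attribute from a deployment script, the Python function below rewrites the attribute in a configuration document. The sample configuration is an assumption made for illustration: only the path //OpenSplice/Dlite/Tracing[@enabled] comes from this note, while the child element OutputFile, the name attribute and the literal values "true"/"false" are assumed.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily reduced configuration used only for this example.
SAMPLE_CONFIG = """<OpenSplice>
  <Dlite name="dlite">
    <Tracing enabled="true">
      <OutputFile>dlite.log</OutputFile>
    </Tracing>
  </Dlite>
</OpenSplice>"""


def set_dlite_tracing(config_xml: str, enabled: bool) -> str:
    """Return `config_xml` with //OpenSplice/Dlite/Tracing[@enabled] set."""
    root = ET.fromstring(config_xml)
    if root.tag != "OpenSplice":
        raise ValueError("root element is not OpenSplice")
    tracing = root.find("./Dlite/Tracing")
    if tracing is None:
        raise ValueError("no //OpenSplice/Dlite/Tracing element found")
    # Assumed value format: the strings "true" and "false".
    tracing.set("enabled", "true" if enabled else "false")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(set_dlite_tracing(SAMPLE_CONFIG, enabled=False))
```

This keeps the tracing section intact, so the same configuration file can be switched between testing and production without commenting anything out.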
OSPL-13803 Possible crash at termination of NodeJS with DDS Security

The DDS Security implementation relies on a certain termination path to clean up all its resources, part of which depends on an exit handler. This exit handler does not run reliably at the same moment, e.g. before or after certain threads are joined, depending on context, such as a single-process deployment running in NodeJS.

Solution: The cleanup was changed to work regardless of the exact moment when the exit handler is executed.
OSPL-13799 / 00020745 Generate a logging message for dispose_all_event

The invocation of the dispose_all() function on a topic is an important event, that should appear in the ospl-info.log file.

Solution: A message is written into the ospl-info.log by the node that invokes the dispose_all() function. Note: although all other nodes respond by also disposing their corresponding topic data, they don't mention this event in their ospl-info.log.
OSPL-13870 Add a parameter to RT networking to allow the independent setting of the acknowledgement interval.

A reliable channel uses acknowledgement messages to notify the sender that a packet has been successfully received. To limit the rate at which acknowledgements are sent, the acknowledgements are accumulated during the acknowledgement interval. Currently the acknowledgement interval is set to the configured resolution interval of the channel. However, it could be useful to have an independent parameter which specifies the acknowledgement interval.

Solution: The AckResolution parameter has been added to the RT networking configuration. When set it will determine the interval at which acknowledgements are sent. When not set the acknowledgement interval is set to the resolution of the reliable channel.
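
The fallback rule and the accumulation of acknowledgements can be modelled with a short Python sketch. It is a simplified illustration, not the RT networking implementation: the class name, the integer time units and the flush-on-packet behaviour are assumptions.

```python
class AckAccumulator:
    """Collects acknowledgements and releases them once per interval."""

    def __init__(self, resolution, ack_resolution=None):
        # When AckResolution is not set, fall back to the channel resolution.
        self.interval = resolution if ack_resolution is None else ack_resolution
        self._pending = []
        self._last_flush = 0

    def on_packet(self, sequence_number, now):
        """Record a received packet; return the acks to send, or None.

        `now` is a monotonically increasing time in the same (assumed)
        unit as the configured interval.
        """
        self._pending.append(sequence_number)
        if now - self._last_flush >= self.interval:
            acks, self._pending = self._pending, []
            self._last_flush = now
            return acks
        return None


if __name__ == "__main__":
    acc = AckAccumulator(resolution=10, ack_resolution=50)
    for seq, now in [(1, 10), (2, 49), (3, 50)]:
        print(now, acc.on_packet(seq, now))
```

With a resolution of 10 and an AckResolution of 50, acknowledgements are only released every 50 time units, which lowers the acknowledgement rate independently of the channel resolution.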
OSPL-13871 Add a configuration parameter to RT networking to disable the sync exchange at initial detection of a node.

When the SyncMessageExchange is disabled, reliable communication with another node is started when receiving the first acknowledgement from that node. When the SyncMessageExchange is enabled, reliable communication will also start when receiving a first message from a node. The sync message will communicate the sequence number from which reliable communication is provided. However, this may cause a very high backlog of packets that have to be resent to the newly detected node, especially when initial latencies are large. Therefore an option should be provided to enable the SyncMessageExchange only on receiving the first acknowledgement, which will reduce this initial backlog of resends.

Solution: A mode attribute is added to the SyncMessageExchange parameter which indicates whether the reliable synchronization should occur on the initial received packet or on the first received acknowledgement.
OSPL-13875 Add option to RT networking to disable the initial sequence number offset.

To establish reliable communication between a sender and receiver, both have to agree on the initial packet sequence number from which reliable communication is established. This sequence number is based on the first sequence number that is acknowledged minus a small offset which is included in each packet sent. The initial sequence number is then the first acknowledged sequence number minus the offset. This offset then determines the number of packets that have to be resent immediately. To reduce this initial backlog, a configuration parameter has to be added to disable this offset.

Solution: The configuration parameter OffsetEnabled is added to allow disabling the offset calculation.
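
The arithmetic described above can be written down as a small Python sketch. The function names are hypothetical, and the sketch only models the relation stated in this note (initial sequence number equals first acknowledged sequence number minus the offset), not the actual protocol code.

```python
def initial_sequence_number(first_acked: int, offset: int,
                            offset_enabled: bool = True) -> int:
    """Sequence number from which reliable communication starts."""
    return first_acked - offset if offset_enabled else first_acked


def immediate_resends(first_acked: int, offset: int,
                      offset_enabled: bool = True) -> int:
    """Number of packets that have to be resent immediately."""
    return first_acked - initial_sequence_number(first_acked, offset,
                                                 offset_enabled)


if __name__ == "__main__":
    print(initial_sequence_number(100, 8), immediate_resends(100, 8))
    print(initial_sequence_number(100, 8, offset_enabled=False),
          immediate_resends(100, 8, offset_enabled=False))
```

With the offset disabled, reliable communication starts at the first acknowledged sequence number and no packets need to be resent immediately.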
OSPL-13939 / 00020822 Looser type-checks for the Python API

It is no longer required to use python built-ins as long as the cast is defined. For string types only support for encoding and length determination is needed. For sequences and arrays iteration and length determination are the only requirements.

Solution: Loosened type requirements on integers, bools, strings and floats.
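
The kind of duck typing this refers to can be illustrated with plain Python, without calling the OpenSplice Python API. The classes below are hypothetical examples of values that are not built-ins but define the relevant casts, or support encoding and length determination.

```python
class Celsius:
    """Not a built-in number, but the casts to float and int are defined."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __float__(self):
        return float(self.degrees)

    def __int__(self):
        return int(self.degrees)


class Label:
    """String-like object: supports encoding and length determination."""

    def __init__(self, text):
        self.text = text

    def encode(self, encoding="utf-8"):
        return self.text.encode(encoding)

    def __len__(self):
        return len(self.text)


def castable_to_float(value) -> bool:
    """True if float(value) is defined for this value."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


if __name__ == "__main__":
    print(float(Celsius(21)), int(Celsius(21.7)))
    print(Label("abc").encode(), len(Label("abc")))
    print(castable_to_float(Celsius(1)), castable_to_float(object()))
```

Under the loosened checks described above, objects like these would be expected to be acceptable where a float, integer or string field is written, although this sketch does not verify that against the API itself.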

Fixed Bugs and Changes affecting API/behavior in Vortex OpenSplice 6.11.x

Vortex OpenSplice 6.11.0

Report ID. Description
OSPL-12968 / 00019801 DCPS Python API: Out of scope entities aren't closed, causing memory leak

When a Python object holding a DDSEntity loses all references, the underlying DDS entity is not deleted, leaking the resource. In a domain where many entities are created dynamically but not closed explicitly with close(), this will eventually result in an Out of Resources error.

Solution: Python objects are automatically garbage collected when all references to the object are gone. There was code in the DCPS Python API that deletes the underlying DDS entity when this occurs, but this never triggered because there were strong references to all DDSEntity Python objects held in a persistent dictionary in the dds module. To remedy this, this dictionary was changed to store only weak references, so now the entity is deleted when the Python object is garbage collected.
There is an important thing for developers using this API to note. Before this change, it was possible to create a DDSEntity (typically with a listener) and just let it go out of scope, relying on the listener to just do its thing, and keeping only a parent entity (like the participant) as the main means to control the lifecycle. This is no longer possible. Python code must keep a reference to a DDSEntity object to keep it active. However, note that DDSEntity still maintains a strong reference to its parent entity, meaning that once a reader or writer reference is made, one can let go of a participant, publisher, and/or subscriber reference without having it be garbage collected. Only once the last reference to the reader/writer is gone is the entire chain of entities deleted.
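
The lifecycle rule can be demonstrated with a small stand-alone Python model. It does not use the dds module: the Entity class, the registry and the create function are hypothetical stand-ins for DDSEntity objects and the module-level dictionary mentioned above.

```python
import gc
import weakref


class Entity:
    """Stand-in for a DDSEntity: keeps a strong reference to its parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


# Stand-in for the module-level dictionary; it holds weak references only.
registry = weakref.WeakValueDictionary()


def create(name, parent=None):
    entity = Entity(name, parent)
    registry[name] = entity
    return entity


def demo():
    participant = create("participant")
    reader = create("reader", parent=participant)

    # Dropping the participant reference is fine: the reader keeps it alive.
    del participant
    gc.collect()
    parent_kept_alive = "participant" in registry

    # Dropping the last reader reference releases the whole chain.
    del reader
    gc.collect()
    chain_released = len(registry) == 0
    return parent_kept_alive, chain_released


if __name__ == "__main__":
    print(demo())
```

In application code the consequence is that a reader or writer with a listener must be stored somewhere (for example as an attribute of a long-lived object) for as long as it is expected to deliver data.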