Problems when writing to publisher


14 replies to this topic

#1 slewman

slewman

    Advanced Member

  • Members
  • 51 posts

Posted 02 November 2010 - 08:41 PM

OpenSplice hangs on this function call:

    DDS_ReturnCode_t ret(Typedefs<T>::write(mDataWriter, &data));

where mDataWriter is created with this call:

    mTopic = DDS_DomainParticipant_create_topic(dp,
                                                name.c_str(), 
                                                Typedefs<T>::typeName(ts),
                                                topicQos,
                                                0,
                                                DDS_STATUS_MASK_NONE);
  
    //data writer
    mDataWriter = DDS_Publisher_create_datawriter(publisher,
                                                  mTopic,
                                                  DDS_DATAWRITER_QOS_USE_TOPIC_QOS,
                                                  0,
                                                  DDS_STATUS_MASK_NONE);
  
    //data reader
    mDataReader = DDS_Subscriber_create_datareader(subscriber,
                                                   mTopic,
                                                   DDS_DATAREADER_QOS_USE_TOPIC_QOS,
                                                   0,
                                                   DDS_STATUS_MASK_NONE);
  
    //data listener
    mDataListener = DDS_DataReaderListener__alloc();

    mDataListener->on_data_available = onDataAvailable<T>;
    mDataListener->on_requested_deadline_missed = 0;
    mDataListener->on_requested_incompatible_qos = 0;
    mDataListener->on_sample_rejected = 0;
    mDataListener->on_liveliness_changed = 0;
    mDataListener->on_subscription_matched = 0;
    mDataListener->on_sample_lost = 0;
    mDataListener->listener_data = &mListenerData;
    mListenerData.topic = this;

Any ideas? I don't really know where else to look. I have been sifting through docs all day, and it seems like everything is right, yet it just doesn't want to work. Any help would be greatly appreciated.

#2 James Butcher

James Butcher

    Advanced Member

  • Members
  • 61 posts

Posted 02 November 2010 - 10:02 PM

Hi

A few questions that would help our understanding of the problem: What sort of data are you trying to write (i.e. how is it represented in IDL)? How do you call register_type for the type in question? Are there any clues in the ospl-info.log file? Is there an ospl-error.log file?

I notice you have some template code here. Is it possible for you to test this with the standard C API calls to track down whether the templates are the issue, and build up from there? See the PingPong and/or Tutorial examples if required.

Also are you checking the return values of each call?
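Checking every return value, as suggested above, might look roughly like the sketch below. It is a self-contained illustration rather than OpenSplice code: the `ReturnCode` enum is a local stand-in whose values are assumed to follow the DDS specification, and `retcodeName` and `checkRetcode` are hypothetical helpers. Real code would use `DDS_ReturnCode_t` and the `DDS_RETCODE_*` constants from the OpenSplice headers.

```cpp
#include <stdexcept>
#include <string>

// Stand-in for DDS_ReturnCode_t. Assumption: the numeric values follow the DDS
// specification; real code should use the DDS_RETCODE_* constants from the
// OpenSplice headers instead of this enum.
enum ReturnCode {
  RETCODE_OK = 0,
  RETCODE_ERROR = 1,
  RETCODE_BAD_PARAMETER = 3,
  RETCODE_PRECONDITION_NOT_MET = 4,
  RETCODE_OUT_OF_RESOURCES = 5,
  RETCODE_TIMEOUT = 10,
  RETCODE_NO_DATA = 11
};

// Hypothetical helper: map a return code to a readable name.
const char* retcodeName(int code) {
  switch (code) {
    case RETCODE_OK:                   return "OK";
    case RETCODE_ERROR:                return "ERROR";
    case RETCODE_BAD_PARAMETER:        return "BAD_PARAMETER";
    case RETCODE_PRECONDITION_NOT_MET: return "PRECONDITION_NOT_MET";
    case RETCODE_OUT_OF_RESOURCES:     return "OUT_OF_RESOURCES";
    case RETCODE_TIMEOUT:              return "TIMEOUT";
    case RETCODE_NO_DATA:              return "NO_DATA";
    default:                           return "UNKNOWN";
  }
}

// Hypothetical helper: throw with context unless the call succeeded.
void checkRetcode(int code, const std::string& what) {
  if (code != RETCODE_OK) {
    throw std::runtime_error(what + " failed: " + retcodeName(code) +
                             " (" + std::to_string(code) + ")");
  }
}
```

Wrapping each DDS call this way (register_type, write, read, and so on) makes the first failing call visible instead of letting a later call misbehave.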

Failing that, have you tried attaching a debugger to find where the hang is happening?

Thanks,
James

#3 slewman

Posted 03 November 2010 - 01:22 PM

Hey James, thanks for the reply!!

I do not have an error log, and the info.log is a little too long to post here, but I can give you a few of the messages I am seeing.

Report : INFO
Date : Tue Nov 02 16:32:02 2010
Description : service 'durability': Locking disabled
Node : mycomp
Process : durability <3836>
Thread : 2176
Internals : V5.3.0OSS/lockPages/u_service.c/170/0/330774933
========================================================================================
Report : INFO
Date : Tue Nov 02 16:32:02 2010
Description : service 'networking': Locking disabled
Node : mycomp
Process : networking <2904>
Thread : 4440
Internals : V5.3.0OSS/lockPages/u_service.c/170/0/365160000
========================================================================================
Report : INFO
Date : Tue Nov 02 16:32:03 2010
Description : Durability identification is: 285669896
Node : mycomp
Process : durability <3836>
Thread : 2176
Internals : V5.3.0OSS/DurabilityService/d_durability.c/459/0/64308106
========================================================================================
Report : WARNING
Date : Tue Nov 02 16:32:09 2010
Description : Persistence not enabled!
Node : mycomp
Process : durability <3836>
Thread : 2176
Internals : V5.3.0OSS/DurabilityService/d_groupLocalListener.c/2727/0/590158738



My IDL file looks like this:
module Generic
{
  struct Str
  {
    string text;
  };
  #pragma keylist Str
};

The return codes seem to be fine, but I can't figure out why I am getting stuck in that write call. I'll take a look at the PingPong tutorial. Where can I find it?

#4 James Butcher

Posted 03 November 2010 - 09:53 PM

Well the ospl-info.log seems ok - the messages shown are to do with the networking and durability services rather than an issue with your topic. By the way messages are appended to that log file each time you run so I usually delete it before each new run. If you'd like to simplify the scenario until the write call is working you could temporarily remove those services from the ospl.xml.
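Since the log files are appended to on every run, clearing them first makes it easier to see which messages belong to the current run. A minimal sketch, assuming the two log file names mentioned in this thread and that the caller knows which directory they are written to; `clearOsplLogs` is a hypothetical helper, not part of OpenSplice:

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: delete ospl-info.log and ospl-error.log in `dir` so the
// next run starts with empty logs. Returns the number of files removed.
int clearOsplLogs(const std::string& dir) {
  const char* names[] = {"ospl-info.log", "ospl-error.log"};
  int removed = 0;
  for (const char* name : names) {
    // Build "<dir>/<name>", or just "<name>" when no directory is given.
    const std::string path =
        dir.empty() ? std::string(name) : dir + "/" + name;
    // std::remove returns 0 on success, non-zero if the file is absent
    // or cannot be deleted.
    if (std::remove(path.c_str()) == 0) {
      ++removed;
    }
  }
  return removed;
}
```

Calling `clearOsplLogs(".")` before starting the application would clear logs in the current working directory; a missing file is simply not counted.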

The IDL looks fine too, so from the code posted above I suspect it is either an issue with the templates or a type registration problem. What are the values of the two string parameters to the create_topic call above? Maybe you could post a bit more code, including that for register_type. Are you able to test writing a sample with a C call such as:

DDS_ReturnCode_t result = Generic_StrDataWriter_write (mDataWriter, &data, DDS_HANDLE_NIL);

The PingPong and Tutorial examples are located under examples/dcps/standalone/C in an OpenSplice distribution. If you are using a dev version, you'll need to run "make install" to produce the distribution.

#5 slewman

Posted 04 November 2010 - 05:27 PM

Here is what I have for register_type in my typdefs.h file:

    static DDS_ReturnCode_t registerType(TypeSupport ts, DDS_DomainParticipant dp) \
    { \
      return TYPE##TypeSupport_register_type(ts, dp, 0); \
    } \

And here is how I create the topic:

    mTopic = DDS_DomainParticipant_create_topic(dp,
                                                name.c_str(), 
                                                Typedefs<T>::typeName(ts),
                                                topicQos,
                                                0,
                                                DDS_STATUS_MASK_NONE);

If it's easier, I can just send you all of the files I am using via email. This is code I did not write but am debugging, so I don't know all of the terminology that you do. Sorry if I didn't give you exactly the code you were looking for.

Again, thanks for your help.

#6 James Butcher

Posted 04 November 2010 - 09:56 PM

So you are using nil for the 3rd parameter to register_type, which means that the default type name is used for the registration. This must match the 3rd parameter to create_topic. Could you check that this is the case? In your scenario it should be "Generic::Str". I suppose that must be correct though, otherwise you would get errors at that stage. You already said the return values were OK, so my guess is still that the template code is adding the complexity.

Can you gather together the code that shows the problem in its simplest form that could be easily tested, simple C where possible?
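The type-name check described above can be made explicit with a small comparison before the topic is created. This is only a sketch: `typeNameMismatch` is a hypothetical helper, and the two strings are assumed to come from the type support (the name the type was registered under) and from the argument about to be passed to create_topic.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: compare the name a type was registered under with the
// type name passed to create_topic. Returns an empty string when they match,
// otherwise a short diagnostic message.
std::string typeNameMismatch(const char* registered, const char* topicType) {
  if (registered == nullptr || topicType == nullptr) {
    return "type name is null";
  }
  if (std::strcmp(registered, topicType) == 0) {
    return "";
  }
  return std::string("registered as \"") + registered +
         "\" but topic uses \"" + topicType + "\"";
}
```

For the IDL in this thread, both names would be expected to be "Generic::Str"; any non-empty result is worth logging before going further.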

#7 slewman

Posted 05 November 2010 - 01:08 PM

Well, this code isn't written in C; it's C++. I am using the C bindings to make function calls to OpenSplice. I realize this might be the cause of the issue, since I don't believe it is something that is supported. I am going to paste the entirety of my type and topic code. Maybe that will help.

//type.cpp
namespace test
{
  template<typename T>
  Type<T>::Type()
  {
    if (!isInitialized())
    {
      initialize();
    }
  
    //create type
    mTypeSupport = Typedefs<T>::newTypeSupport();
    chkPtr(mTypeSupport, std::string("Failed to create ")+
                         typeid(mTypeSupport).name());
  
    //register type
    DDS_ReturnCode_t ret(Typedefs<T>::registerType(mTypeSupport, dp));
    chkRetcode(ret, std::string("Failed to register ")+
                    typeid(mTypeSupport).name());
  }
  
  template<typename T>
  Type<T>::~Type()
  {
    typename TopicMap::iterator it(mTopicMap.begin());
    while (it != mTopicMap.end())
    {
      delete it->second;
      it->second = 0;
      mTopicMap.erase(it++);
    }
  
    DDS_free(mTypeSupport);
  }
  
  template<typename T>
  Topic<T>* Type<T>::createTopic(const std::string& name)
  {
    Topic<T>* ret(new Topic<T>(mTypeSupport, name));
    mTopicMap.insert(std::pair<std::string, Topic<T>*>(name, ret));
    return ret;
  }
  
  template<typename T>
  void Type<T>::publish(T& data, const std::string& topicname)
  {
    Topic<T>* topic(findTopic(topicname, true));
    if (topic)
    {
      topic->publish(data);
    }
  }
  
  template<typename T>
  bool Type<T>::read(T& data, const std::string& topicname)
  {
    bool ret(false);
    Topic<T>* topic(findTopic(topicname, true));
    if (topic)
    {
      ret = topic->read(data);
    }
    return ret;
  }
  
  template<typename T>
  void Type<T>::subscribe(Callback<T>& callback, const std::string& topicname)
  {
    Topic<T>* topic(findTopic(topicname, true));
    if (topic)
    {
      topic->subscribe(callback);
    }
  }
  
  template<typename T>
  void Type<T>::unsubscribe(const std::string& topicname)
  {
    Topic<T>* topic(findTopic(topicname, true));
    if (topic)
    {
      topic->unsubscribe();
    }
  }
  
  template<typename T>
  Topic<T>* Type<T>::findTopic(const std::string& name, bool create) 
  {
    Topic<T>* ret(0);
    typename TopicMap::const_iterator it(mTopicMap.find(name));
    if (it == mTopicMap.end() && create)
    {
      ret = createTopic(name);
    }
    else if (it != mTopicMap.end())
    {
      ret = it->second;
    }
    return ret;
  }
}  


//topic.cpp

namespace test
{
  template<typename T>
  inline void onDataAvailable(void* listenerData, DDS_DataReader )
  {
    ListenerData<T>* ld(reinterpret_cast<ListenerData<T>*>(listenerData));
    if (ld && ld->topic && ld->callback)
    {
      ld->topic->read(ld->callback->mData);
      ld->callback->onData();
    }
  }

  template<typename T>
  Topic<T>::Topic(typename Typedefs<T>::TypeSupport ts, const std::string& name)
  {
    //check pointer passed into constructor
    chkPtr(ts, "Topic::Topic: ts is null pointer", true);
  
    //create topic
    mTopic = DDS_DomainParticipant_create_topic(dp,
                                                name.c_str(), 
                                                Typedefs<T>::typeName(ts),
                                                topicQos,
                                                0,
                                                DDS_STATUS_MASK_NONE);
    chkPtr(mTopic, std::string("Topic::Topic: Failed to create topic for ")+
                   typeid(typename Typedefs<T>::Type).name());
  
    //data writer
    mDataWriter = DDS_Publisher_create_datawriter(publisher,
                                                  mTopic,
                                                  DDS_DATAWRITER_QOS_USE_TOPIC_QOS,
                                                  0,
                                                  DDS_STATUS_MASK_NONE);
    chkPtr(mDataWriter, std::string("Topic::Topic: Failed to create datawriter for ")+
                        typeid(typename Typedefs<T>::Type).name());
  
    //data reader
    mDataReader = DDS_Subscriber_create_datareader(subscriber,
                                                   mTopic,
                                                   DDS_DATAREADER_QOS_USE_TOPIC_QOS,
                                                   0,
                                                   DDS_STATUS_MASK_NONE);
    chkPtr(mDataReader, std::string("Topic::Topic: Failed to create datareader for ")+
                        typeid(typename Typedefs<T>::Type).name());
  
    //data listener
    mDataListener = DDS_DataReaderListener__alloc();
    chkPtr(mDataListener, std::string("Topic::Topic: Failed to create datalistener for ")+
                          typeid(typename Typedefs<T>::Type).name());

    mDataListener->on_data_available = onDataAvailable<T>;
    mDataListener->on_requested_deadline_missed = 0;
    mDataListener->on_requested_incompatible_qos = 0;
    mDataListener->on_sample_rejected = 0;
    mDataListener->on_liveliness_changed = 0;
    mDataListener->on_subscription_matched = 0;
    mDataListener->on_sample_lost = 0;
    mDataListener->listener_data = &mListenerData;
    mListenerData.topic = this;
  }
  
  template<typename T>
  Topic<T>::~Topic()
  {
    DDS_free(mDataListener);
    DDS_Publisher_delete_datawriter(publisher, mDataWriter);
    DDS_Subscriber_delete_datareader(subscriber, mDataReader);
    DDS_DomainParticipant_delete_topic(dp, mTopic);
  }
  
  template<typename T>
  std::string Topic<T>::name() const
  {
    std::string ret;
    if (mTopic)
    {
      ret = DDS_Topic_get_name(mTopic);
    }
    return ret;
  }
  
  template<typename T>
  void Topic<T>::publish(T& data)
  {
    //send data
    DDS_ReturnCode_t ret2(DDS_Publisher_resume_publications(publisher));
    DDS_ReturnCode_t ret(Typedefs<T>::write(mDataWriter, &data));
    chkRetcode(ret, std::string("Failed to send ")+typeid(data).name());
  }
  
  template<typename T>
  bool Topic<T>::read(T& data)
  {
    bool ret(false);
  
    //gather sequences
    typename Typedefs<T>::Seq* seq(Typedefs<T>::newSequence());
    seq->_length = 0;
    seq->_maximum = 1;
    seq->_release = true;
    seq->_buffer = &data;
    DDS_SampleInfoSeq* infoSeq(DDS_SampleInfoSeq__alloc());
    infoSeq->_length = seq->_length;
    infoSeq->_maximum = seq->_maximum;
    infoSeq->_release = seq->_release;
    infoSeq->_buffer = DDS_SampleInfoSeq_allocbuf(1);
    //read
    DDS_ReturnCode_t r(Typedefs<T>::read(mDataReader, seq, infoSeq));
    chkRetcode(r, std::string("Failed to read ")+typeid(data).name());
    if (seq->_length)
    {
      if (r == DDS_RETCODE_OK)
      {
        ret = true;
      }
    }
    DDS_free(infoSeq);
    DDS_free(seq);
  
    return ret;
  }
  
  template<typename T>
  void Topic<T>::subscribe(Callback<T>& callback)
  {
    mListenerData.callback = &callback;
    DDS_DataReader_set_listener(mDataReader, mDataListener, DDS_DATA_AVAILABLE_STATUS);
  }
  
  template<typename T>
  void Topic<T>::unsubscribe()
  {
    DDS_DataReader_set_listener(mDataReader, 0, DDS_DATA_AVAILABLE_STATUS);
  }
}


#8 slewman

Posted 08 November 2010 - 02:03 PM

Let me rephrase my last post. We are interfacing with the OpenSplice C libraries from C++ code. The code I posted worked in version 5.1, but since upgrading to 5.3, it has stopped working. Does this provide any clues as to what my problem could be?

#9 slewman

Posted 08 November 2010 - 03:08 PM

Also, my error log is filling up with these errors very quickly: the log file gets to about 60 MB in less than a minute.

Report : ERROR
Date : Mon Nov 08 10:01:33 2010
Description : os_condInit failed; os_result = 5.
Node : mp4056laptopjs
Process : C:\Development\...\release\chat.exe <5464>
Thread : 5832
Internals : V5.3.0OSS/c_condInit/c_sync.c/317/0/890146338
========================================================================================
Report : ERROR
Date : Mon Nov 08 10:01:33 2010
Description : c_mutex or c_cond operation failed.
Node : mp4056laptopjs
Process : C:\Development\...\release\chat.exe <5464>
Thread : 5952
Internals : V5.3.0OSS/c_sync/c_sync.c/28/0/891068243
========================================================================================

Report : ERROR
Date : Mon Nov 08 10:01:33 2010
Description : os_condWait failed; os_result = 5.
Node : mp4056laptopjs
Process : C:\Development\...\release\chat.exe <5464>
Thread : 5952
Internals : V5.3.0OSS/c_condWait/c_sync.c/344/0/893186389
========================================================================================

#10 James Butcher

Posted 09 November 2010 - 09:15 PM

OK, I would say eliminating these errors is crucial in order to get OpenSpliceDDS working properly. It looks like a similar issue to this one: "log file in Windows".

Is it the same issue? Do you see errors being reported prior to the hang in the write call?

#11 slewman

Posted 10 November 2010 - 02:54 PM

It is literally the exact same issue. jftaylor21 is actually a colleague of mine working on this same issue. The errors are being reported before the write call. As soon as I start up my chat.exe program, the error log starts growing very quickly: 100 MB in roughly 30 seconds.

#12 slewman

Posted 10 November 2010 - 03:44 PM

After more debugging, it seems the error log starts spitting out errors on the create_participant function call:

    //initializing variables
    static DDS_DomainParticipantFactory dpf(0);
    static DDS_DomainId_t domain(0);
    DDS_DomainParticipant dp(0);

    //domain participant factory
    dpf = DDS_DomainParticipantFactory_get_instance();
    chkPtr(dpf, "Failed to create DDS::DomainParticipantFactory", true);

    //domain participant
    domain = DDS_string_alloc(domainId.size());
    strcpy(domain, domainId.c_str());

    dp = DDS_DomainParticipantFactory_create_participant(dpf,
                                                         domain,
                                                         DDS_PARTICIPANT_QOS_DEFAULT,
                                                         0,
                                                         DDS_STATUS_MASK_NONE);


#13 James Butcher

Posted 14 November 2010 - 09:58 PM

Hi again

I think I see the problem here. I was able to successfully run the "C++OnC" PingPong example from 5.3 on Windows without any messages appearing in the ospl-error.log file. I then tried to run the snippets of code above and from here, and got the errors you mention. The code is very similar to the example, but create_participant in the example uses a nil DomainId (meaning the OSPL_URI variable selects the domain), whereas your code snippets set a DomainId value for the call. When I used a nil value instead, as in the examples, the error messages went away. So, unfortunately, it looks like you have found a bug in the Windows version of OpenSplice.

Can you work around the issue by using the OSPL_URI variable like the examples do? Or maybe you could debug into this? Or you could raise a bug at http://www.opensplic...y/BugReporting. If you have a commercial support subscription for OpenSpliceDDS then you can influence the priority of bug fixes.
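If the nil DomainId workaround is tried, it may help to confirm first that OSPL_URI is actually visible to the process, since a nil DomainId relies on it. A minimal sketch using only the standard library; `osplUri` and `canUseNilDomainId` are hypothetical helpers, not OpenSplice functions:

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: return the value of the OSPL_URI environment variable,
// or an empty string when it is unset.
std::string osplUri() {
  const char* uri = std::getenv("OSPL_URI");
  return uri ? std::string(uri) : std::string();
}

// Hypothetical pre-flight check: a nil DomainId only makes sense when
// OSPL_URI is set to a non-empty value for this process.
bool canUseNilDomainId() {
  return !osplUri().empty();
}
```

Logging `osplUri()` at startup, before create_participant is called, shows which configuration the process will pick up and rules out an environment problem early.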

#14 slewman

Posted 22 November 2010 - 01:23 PM

Yeah, I saw someone mention the OSPL_URI method online. I tried it, and it still doesn't work.

#15 esc

esc

    DDS Expert

  • Members
  • 82 posts
  • Company:none

Posted 25 November 2010 - 01:35 PM

Hi Slewman,

Thank you very much for submitting a bugzilla report. We very much appreciate it when community members take the time to file a bug report. We will schedule a fix for this bug for a future release. Unfortunately, I cannot give any indication at this time of which release the fix will be scheduled for. Usually a specific number of bugzilla bugs are fixed in each release, along with the important bug fixes needed by commercial clients. If this is a blocking problem for you and you would like it fixed faster, you can contact our sales department (sales@prismtech.com) about commercial support, which would allow for a priority increase and access to a release of the software before it is pushed to the community edition.

Regards,

Emiel



