Networking and the state of NetMF 4.4

Having implemented the changes (handling each Socket.Accept call in its own thread) as suggested by @ RobvanSchelven, the problem remains exactly the same: at some point the code hangs on Socket.Accept().

Steve

@ sh - Yes, the accept does still hang, but now you can check this from the original thread and cancel the blocking thread after a timeout that you can choose.

Correct. The concept is that a separate thread is spun and then when the thread doesn’t perform it is killed. This is safe and used by most.
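
For comparison with the abort-after-timeout approach described above, here is a minimal sketch of a different technique: polling the listening socket before calling Accept, so that Accept is only called when a connection is already pending. AcceptHelper and AcceptWithTimeout are hypothetical names, not part of NETMF or of any sample in this thread, and whether this sidesteps the hang reported here is untested.

```csharp
using System.Net.Sockets;

// Hypothetical helper for illustration; not part of NETMF or the posted sample.
public static class AcceptHelper
{
    // Returns the accepted client socket, or null if no client connected
    // within timeoutMs milliseconds.
    public static Socket AcceptWithTimeout(Socket listener, int timeoutMs)
    {
        // Socket.Poll takes its timeout in microseconds. On a listening socket,
        // SelectRead reports true when a connection is waiting to be accepted.
        // Note: timeoutMs * 1000 overflows an int for timeouts above ~35 minutes.
        if (!listener.Poll(timeoutMs * 1000, SelectMode.SelectRead))
            return null;

        // A connection is pending, so Accept should return without blocking.
        return listener.Accept();
    }
}
```

The calling loop can then check a stop flag each time null comes back, rather than relying on another thread to abort it.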

OK thanks to everyone. As often happens in the forums a few code suggestions are offered and then there is a presumption that everything is fixed.

I am trying to tread gently as I appreciate everyone’s help and don’t want to upset a potential supplier but at some point I will have to walk away as it seems very difficult to get any straight answers to my questions.

I don’t want to start another round of discussion, but on the points about how to deal with the socket issues I have (as I have stated many times) tried any number of ways of wrapping the Socket.Accept call in a mechanism that allows the thread to continue. As to the suggested fix of spawning new threads, I have tried threads, timers and execution constraints on both the listening and working socket, and the behaviour is the same every time.

For sure you can continue the thread, but any subsequent calls to the socket also hang and trigger a chain of ConstraintExceptions or ThreadAbortExceptions, so the net effect is no different. Once one of these hung states occurs, the entire network stack has crashed. If we are saying that we use the timed-out thread to restart the whole network stack, then I would consider that completely unacceptable.

This is exactly the reason I didn’t offer a code example in the first place; they are not being used in the right way. If someone would run the sample I provided, replicate the problem and offer a solution then I would be most grateful, but what we get is general advice about improving code (which was a thrown together sample in the first place) and no focus on the underlying issue or questions. I get the feeling that behind these posts there is a general assumption that the ‘end users’ don’t really know what they are doing and that’s where the problem lies.

Telling me that “The concept is that a separate thread is spun and then when the thread doesn’t perform it is killed. This is safe and used by most.” really makes it clear where you think we are all at. The fact of the matter is that 99% of C# socket code in production is asynchronous by nature and the focus on this as being the cause completely missed the point of my questions or any of the reports, feedback or code I have provided.

After many posts and a code sample demonstrating the problem there is no acceptance that the problem even exists.

Just one more try from me, at the risk of souring a relationship and having to go elsewhere:

  • Are the NetMF 4.4 lwIP integration changes likely to have any effect on these socket issues? I don’t need comments like ‘don’t hold your breath’ or ‘it’s more complicated than that’. We come to GHI because they are experts in NetMF, and there are lots of changes afoot in that framework, so it would be great to get some guidance about those changes from a company that is, I presume, in close contact with or at least following developments in NetMF.

  • If there is some hope of an improvement under 4.4, would GHI have any guidance (not commitments) on plans to build a NetMF 4.4 based SDK / firmware? We are cautioned not to make assumptions, and I couldn’t agree more, but they are inevitable when there is a total lack of concrete information. We are told that GHI will keep us informed but that simply isn’t the case. No doubt when it’s done we will be told that it’s done, but that really isn’t the level of information required.

Sorry for my frustrated tone, I have spent many hours on this problem at the end of many weeks building a prototype. I would like nothing more than for it to work, and when it doesn’t I would like to be able to report at my end on what the roadmap to a fix looks like. For these reasons I am pretty much done.

BIG SHAME !!!

Shame?!
Sounds like you made up your mind without giving GHI a chance to help. If you are still interested, we will be more than happy to help. Our contact info is on the Contact Us page.

It’s only been 5 hours since you posted the code. Aside from not engaging directly with GHI, your expectations for community support don’t seem very realistic.

@ mcalsyn - correct, and we are on holiday. I just happened to be checking the forum from home.

@ Gus - I haven’t made my mind up, you guys have !!!

OK, just letting off steam, and very genuinely I do want a custom G400 board from GHI to be at the centre of my product. There is an opportunity to really turn an existing industry on its head by focusing on the interface between hardware and software and not on the very old embedded style of electronics. I am heavily ‘invested’ in this and want it to succeed.

I will happily contact you via the ‘contact’ pages but I did that recently to start the commercial discussion and it took 12 days to get a response and that e-mail thread has also gone cold. Tell me how to help you to help me and I will do it.

Thanks Gus,

Steve

@ sh - 12 days! This never happens and I will dig into this to see why there was a delay.

Please take a deep breath after all this venting and let’s talk calmly about how we can assist. Clearly you like our products but you want a good example on networking. We can help with that.

@ mcalsyn - Can I be really clear about this: I don’t want advice on coding, I want a simple response to 2 questions, and it’s like bashing my head against a wall.

Whether anyone wants to accept it or not there is an issue with this network stack and I don’t want wrappers, fixes or workarounds. None of them are sufficient to be considered of production quality on a commercial product and quite honestly none of them actually work.

I asked 2 simple questions. Firstly, whether GHI could advise me what they knew about the changes within NetMF 4.4 and whether there were likely to be any improvements to the lwIP integration which would affect this issue. I presume they keep abreast of the changes and are at least a little better informed than me.

Secondly I asked for some guidance on whether GHI had any plans to roll out the NetMF 4.4 based firmware and on what time scales.

Simple questions, they can be 2 negatives if no-one knows the answers, it might be commercially sensitive or maybe they simply don’t want to inform their users but it would take no more than a few sentences to answer them. Instead there is a reluctance to accept that the problem exists and no attempt to answer my questions.

I did contact GHI a while ago and it took nearly 2 weeks to get a one line response.

@ Gus - I have spent all day researching alternatives and I don’t want to be doing that as it means abandoning a lot of work and a path that I believe is correct.

If there is a working example of a socket based server that will stay up and running then I would be delighted to have a look over the code and / or test it. Also if there are issues with any impending NetMF rollouts I am happy to sign an NDA so that I can learn more.

Thanks again

Steve

It may comfort you a little to look at this thread where Gus indicates that 4.4 support is coming very soon.
https://www.ghielectronics.com/community/forum/topic?id=22026&page=1

Sorry - didn’t realize that the code was perfect. Surely must be a firmware problem then.

@ mcalsyn - If someone could provide a working example then of course I would be delighted, but as you well know every single TCP/IP socket based implementation written in C# or any other language will be different. The point is that nearly all of them work, even if they exhibit small bugs or performance issues.

When you have tried a multitude of ways, tried all sorts of wrappers and sticking-plaster fixes, and none of them work, it becomes relatively obvious that there is an underlying issue. No code is perfect, but nor should it need to be to get to a functioning application.

Thanks to everyone for all their help, despite my frustrated tone I do appreciate anyone taking time to help.

I look forward to more news on the GHI 4.4 rollout and if anyone out there has a silver bullet in the meantime then they will forever be in my good books. Have a nice weekend

Steve

@ andre.m - I think he lost your point. You might want to revise?

Working code below. I fixed a compile error and then successfully reproduced your problem. With the listen depth changed to ‘1’ and a thread spawned for each request, it has now run for 15 minutes at 1 req/sec. I will leave it running in case I have merely postponed the error.

using System;
using Microsoft.SPOT;
using System.Threading;
using System.Net.Sockets;
using System.Net;
using System.Text;

namespace Topic22141
{
    public class SampleTCPListener
    {
        private AutoResetEvent _listenerStarted;
        internal Socket _socket;
        internal Thread _thread;

        private IPAddress _interfaceAddress = IPAddress.Any;
        private int _receiveTimeout = -1;
        private int _sendTimeout = -1;
        private int _listenBacklog = 10;
        private bool _isActive = false;

        const int BufferSize = 1460;
        const int StringLength = 1 * 1024;

        public bool Start(int servicePort)
        {
            try
            {
                //BUG: This line referred to a non-existent var and _listenerStarted was never
                //     assigned, so I presume this is what you meant.
                _listenerStarted = new AutoResetEvent(false);

                _interfaceAddress = System.Net.IPAddress.GetDefaultLocalAddress();

                _socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);

                _socket.Bind(new IPEndPoint(_interfaceAddress, servicePort));
                _socket.ReceiveTimeout = _receiveTimeout;
                _socket.SendTimeout = _sendTimeout;

                //BUG: Listen with a backlog of n where n > 1 hangs after n requests
                _socket.Listen(1); // _listenBacklog);

                _isActive = true;
                _thread = new Thread(StartListen);
                _thread.Start();

                _listenerStarted.WaitOne();
            }
            catch (Exception)
            {
                return false;
            }
            return true;
        }

        public bool Stop()
        {
            try
            {
                _isActive = false;
                _socket.Close();
            }
            catch (Exception)
            {
                return false;
            }

            return true;
        }

        private void StartListen()
        {
            //Thread.CurrentThread.Priority = ThreadPriority.AboveNormal;

            _listenerStarted.Set();

            while (_isActive)
            {
                var clientSocket = _socket.Accept();

                //CHANGE: new thread for each request (a thread pool would be better)
                new Thread(() =>
                {
                    try
                    {
                        OnSocket(clientSocket);
                    }
                    catch (Exception)
                    {
                        // Rethrow with "throw;" so the original stack trace is preserved.
                        throw;
                    }
                }).Start();
            }

            _socket.Close();
        }

        protected virtual void OnSocket(Socket socket)
        {
            try
            {
                if (socket.Poll(-1, SelectMode.SelectRead))
                {
                    if (socket.Available == 0)
                        return;

                    byte[] response = Encoding.UTF8.GetBytes(new String(RandomResponse(StringLength)));

                    // Count the encoded bytes rather than assuming one byte per character.
                    int numBytesToWrite = response.Length;
                    int offset = 0;

                    do
                    {
                        var sendSize = (numBytesToWrite <= BufferSize) ? numBytesToWrite : BufferSize;

                        var bytesSent = socket.Send(response, offset, sendSize, SocketFlags.None);
                        numBytesToWrite -= bytesSent;
                        offset += bytesSent;

                    } while (numBytesToWrite > 0);
                }
            }
            catch (SocketException ex)
            {
                if (ex.ErrorCode == (int)SocketError.ConnectionReset)
                    return;
            }
            catch (Exception)
            {
                // Rethrow with "throw;" so the original stack trace is preserved.
                throw;
            }
            finally
            {
                socket.Close();
            }
        }

        private char[] RandomResponse(int length)
        {
            var chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

            var stringChars = new char[length];
            var random = new Random();

            for (int i = 0; i < stringChars.Length; i++)
            {
                stringChars[i] = chars[random.Next(chars.Length)];
            }

            return stringChars;
        }
    }
}
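
The comment in StartListen notes that a thread pool would be better than a new thread per request. A minimal sketch of that idea, assuming a fixed number of workers draining a queue of accepted sockets, might look like the following. SocketWorkerPool and SocketHandler are hypothetical names that are not part of NETMF or of the sample above. Only non-generic collections are used because NETMF has no generics, and the sketch has not been run against the hang discussed in this thread.

```csharp
using System;
using System.Collections;
using System.Net.Sockets;
using System.Threading;

// Hypothetical types for illustration; not part of NETMF or the sample above.
public delegate void SocketHandler(Socket client);

public class SocketWorkerPool
{
    private readonly Queue _pending = new Queue();   // accepted sockets waiting for a worker
    private readonly AutoResetEvent _workReady = new AutoResetEvent(false);
    private readonly SocketHandler _handler;
    private volatile bool _running = true;

    public SocketWorkerPool(int workerCount, SocketHandler handler)
    {
        _handler = handler;
        for (int i = 0; i < workerCount; i++)
            new Thread(WorkerLoop).Start();
    }

    // Called by the accept loop instead of starting a new thread.
    public void Enqueue(Socket client)
    {
        lock (_pending) { _pending.Enqueue(client); }
        _workReady.Set();
    }

    // Workers notice the flag within one wait interval and exit.
    public void Stop() { _running = false; }

    private void WorkerLoop()
    {
        while (_running)
        {
            Socket client = null;
            lock (_pending)
            {
                if (_pending.Count > 0)
                    client = (Socket)_pending.Dequeue();
            }

            if (client == null)
            {
                // Queue empty: sleep until signalled, or re-check after 250 ms.
                _workReady.WaitOne(250, false);
                continue;
            }

            try { _handler(client); }
            catch (Exception) { /* one failed request must not kill the worker */ }
        }
    }
}
```

In StartListen, the `new Thread(...)` block would then become `pool.Enqueue(clientSocket);`, with the pool created once, for example as `new SocketWorkerPool(4, OnSocket)`. The worker count of 4 is an arbitrary choice for illustration.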

@ sh - Hi Steve, we spoke last week about you using our Jump Start service to create a custom board based on the information provided. This week was spent creating the block diagram and setting up the invoice for the Jump Start service. With it being a holiday week there are varying days of vacation, and not everyone who was needed happened to be in the office at the same time. While this delay is not a common thing, it’s also not common for us to recommend a custom solution in a few days. We like to think through the information provided before making a recommendation; for example, you had mentioned you might want a different Ethernet solution from the ENC28 due to speed. We also don’t just accept what the customer says is the perfect core, which in your case was the G400. We research it to make sure, as there might be a cheaper or better solution. I am of course not going to disclose all the details on this thread, but know that we take your business very seriously and wouldn’t just throw some information out to you to collect the fee. We want to make sure that what we recommend is really the best solution we can offer you. You have my email; if you have any specific questions, please feel free to let me know.

Gary

@ mcalsyn - another example of why I have nerds like you on the payroll!
Have the weekend off on me :wink:
