BuiltInEthernet issue

The command for ping: ping -t 172.16.2.14

Yes, the firmware for both the EMX and the G120E is 4.3.8.1.

I will look at the ethernet traffic and let you know if I see anything abnormal.
Thanks.

@John

Would starting and stopping PWM or updating a screen every ten milliseconds cause interrupts to be masked, which would result in the loss of ICMP messages? Could it be a race condition?

@ssalmi
Why update the screen every 10 ms? Do you need a refresh rate of 100 frames per second?

Losing an ICMP packet and then recovering is not a major problem. I thought that once the packet was lost, the device stopped responding. Your heartbeat processing should handle the occasional lost heartbeat.

In a busy network, yes, it is possible. There is some internal buffering, but I am not sure how much data it can hold.

Or a very busy device?

John,
Now that I think about it, sometimes instead of a ping timeout it took 2-3 seconds to answer the ping. Do you still think the packets are getting lost?

Mike,

I’ve increased the screen refresh interval to 100 ms, but the problem still exists.
The rate of one missed ping every 2-3 minutes is for the very simple test code. It is much higher in my original application. Ping response times are in seconds, or the ping times out.
In the application, one G120E sends a ping every 5 seconds, and if it does not receive responses to 3 consecutive pings it assumes the other G120E has lost communication or is not present.
Although the other G120E is present, this condition (3 consecutive ping timeouts) occurs about every 5 minutes.
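
For readers following along, the heartbeat rule described above (declare the peer lost after 3 consecutive unanswered pings) can be sketched roughly as follows. This is a minimal Python illustration of the counting logic only, not the poster's NETMF code; the class and method names are made up for the example.

```python
class HeartbeatMonitor:
    """Tracks consecutive unanswered pings (hypothetical helper for illustration)."""

    def __init__(self, max_misses=3):
        self.max_misses = max_misses  # consecutive misses before the peer is declared lost
        self.misses = 0

    def record(self, replied):
        """Record one ping result; return True while the peer still counts as alive."""
        if replied:
            self.misses = 0  # any reply resets the counter
        else:
            self.misses += 1
        return self.misses < self.max_misses


monitor = HeartbeatMonitor(max_misses=3)
# Two misses followed by a reply do not trip the monitor; three in a row do.
results = [monitor.record(r) for r in (False, False, True, False, False, False)]
```

With this rule a single late or dropped reply is tolerated, but a burst of three is not, which matches the behaviour reported above.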

Do you have a GHI board that you can use to test? Just to rule out the custom hardware.

No, unfortunately I do not have a GHI board, but the same custom board is used for both the EMX and the G120E.

To clarify: is it only ever just late replies to pings or is there sometimes never a reply?
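
One way to answer that from a saved `ping -t` log is to count late replies and outright timeouts separately. The following is a rough Python sketch that assumes the English-language Windows `ping` output format (`Reply from ... time=2ms ...` and `Request timed out.`); the function name and the 1000 ms "slow" threshold are arbitrary choices for illustration, and the sample lines are invented.

```python
import re

# Matches "time=2ms" as well as "time<1ms" in Windows ping replies.
REPLY_RE = re.compile(r"time[=<](\d+)ms")


def summarize_ping_log(lines, slow_ms=1000):
    """Count normal replies, slow replies, and timeouts in ping output lines."""
    summary = {"ok": 0, "slow": 0, "timeout": 0}
    for line in lines:
        match = REPLY_RE.search(line)
        if match:
            key = "slow" if int(match.group(1)) >= slow_ms else "ok"
            summary[key] += 1
        elif "Request timed out" in line:
            summary["timeout"] += 1
        # Any other line (headers, statistics, unreachable messages) is ignored.
    return summary


# Invented sample lines, for illustration only.
sample = [
    "Reply from 172.16.2.14: bytes=32 time=2ms TTL=128",
    "Reply from 172.16.2.14: bytes=32 time=2731ms TTL=128",
    "Request timed out.",
    "Reply from 172.16.2.14: bytes=32 time<1ms TTL=128",
]
summary = summarize_ping_log(sample)
```

The log can be captured by redirecting the ping output to a file (for example `ping -t 172.16.2.14 > ping.log`) and passing the file's lines to the function.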

Load on the device can certainly cause degraded network performance, but we’re not doing much in this situation. Given that the problem is exacerbated in your full application it does seem like the device is doing too much to respond quickly.

Load on the network can also cause degraded performance and it’s possible the G120 firmware isn’t as resilient to that as the EMX firmware. Is the PC connected to another network (perhaps via Wi-Fi or another ethernet connection) or is it just connected to the device?

Were you able to get the Wireshark trace?

I set the ping timeout on my PC to 10 seconds and I still get timeouts. Sometimes, in addition to timeouts, I see long response times of 2-3 seconds.
Here are some of the images from the Wireshark trace:


Here it looks like there was no response.

In the next image the response was sent to the wrong address:


My PC is directly connected to the G120E through Ethernet. There is no other Ethernet connection, but the PC is connected to another network over Wi-Fi. However, I turned Wi-Fi off and it didn’t fix it.

I do understand the point, but why does the same code (both the small test code and the real application) not affect the EMX? Is the device doing something in the background that we can’t see?

Thanks for the trace captures. There is other stuff going on in the firmware beyond your application like interrupts and timers. Since the EMX apparently doesn’t show the issue, there is likely a difference in their firmwares (they do not share the same codebase). For what it’s worth, we are able to reproduce delayed ping response here, but not the lost response (which appears to go to the wrong address?).

While neither case is ideal, is it possible to tweak the timeout window or count in your full application such that you allow for an errant delayed or missed ping?

John,
Thanks for your reply.
I spent some time yesterday and today trying to tweak the application, but the rate of ping timeouts is so high that increasing the number of ignored timeouts would defeat the purpose of the check (monitoring the other system). If we go too long without any ping response, we won’t know whether the system is shut down, is in error mode, or is just not responding.
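
To put rough numbers on that trade-off, here is a minimal Python sketch (not the NETMF application code). It assumes pings go out on a fixed schedule and that the per-ping timeout is no longer than the interval. The 1-second timeout is a made-up value for illustration, since the application's actual timeout is not stated in this thread.

```python
def worst_case_detection_s(interval_s, max_misses, timeout_s):
    """Rough upper bound on the time from a peer's last reply to it being declared lost.

    Assumes the peer fails right after answering a ping, so the next `max_misses`
    pings (one every `interval_s`) all go unanswered, and the last of them is only
    counted as missed once `timeout_s` has elapsed.
    """
    return max_misses * interval_s + timeout_s


# Current rule from the thread: ping every 5 s, give up after 3 misses.
current = worst_case_detection_s(interval_s=5, max_misses=3, timeout_s=1.0)  # 16.0 s
# Doubling the tolerated misses roughly doubles the blind window.
relaxed = worst_case_detection_s(interval_s=5, max_misses=6, timeout_s=1.0)  # 31.0 s
```

Under these assumptions, every extra tolerated miss adds one full ping interval to the time during which a dead peer goes unnoticed.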

The issue does seem to scale with the load on the system. Is there some tight inner loop that you can break up? Perhaps making things more event driven? If the system is allowed to be idle for some periods that may help. Of course, this depends on your application.

Even if only as a test, try adding sleeps of increasing duration, up to a few hundred milliseconds, to whatever inner loops you have and see whether the ping issue is reduced.
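
As a rough illustration of that experiment, the sketch below inserts a sleep into each pass of a work loop. It is written in Python for brevity; in a NETMF application the equivalent call would be `Thread.Sleep`. The function name `run_with_yield` and the `work` callback are hypothetical stand-ins for the application's own inner loop.

```python
import time


def run_with_yield(work, iterations, sleep_s, sleep=time.sleep):
    """Run `work` repeatedly, sleeping after each pass so other tasks can run.

    `sleep` is injectable so the loop can be exercised without real delays.
    """
    for _ in range(iterations):
        work()          # one unit of the application's own processing
        sleep(sleep_s)  # yield the CPU, e.g. to the network stack


def example_work():
    # Placeholder for a screen update or PWM change in the real application.
    pass


# Try increasing sleep durations and watch the ping results for each setting.
for candidate_s in (0.01, 0.05, 0.1, 0.3):
    run_with_yield(example_work, iterations=2, sleep_s=candidate_s)
```

If the ping timeouts drop off as the sleep grows, that would support the idea that the device is too busy to answer promptly.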

Hello
We ran into a similar issue while working on our project a year ago - Remove socket's supply with c# code - #13 by przemo
Network communication was not reliable; there were many retransmissions and lost packets.
We tried really hard to solve it, but with no result.
So my opinion is that a G120E driving a 4.3" TFT display is not a combination that you can use for a commercial network product.

So you mean it works fine without display?

Yes. It also works fine with a character display.

That is a very interesting observation. This should help us in investigating TinyCLR.

Hello and happy new year.
I was out during the holidays and saw the conversation about LCD and Ethernet communication when I got back.
So does this mean that we can’t use Ethernet communication reliably because we have an LCD display? This is going to be a big problem for us.

Is there a plan to fix this? Is there a workaround that we can use?

We are certain that is the case and honestly it doesn’t make sense. What I suggest is that we discuss what you are doing directly with our engineers and see what the best path might be. We are now in the middle of the switch from NETMF to TinyCLR, so your timing and future plans are a big factor in the solution.

Thank you for your reply. I’ll get in touch with your engineering group.