Network Reliability Issue

This was originally posted at https://www.ghielectronics.com/community/forum/topic?id=22141&page=15#msg215369.

@ John requested that I create a stripped down version of the application so that GHI can test it; he also requested that I start a new thread.

Configure the Raptor according to the included definition. The Raptor should be on 2016 R1 Pre-Release 2. I couldn’t upload a .zip file here, so head over to https://www.ghielectronics.com/community/codeshare/entry/1081 to download the code. Copy the files contained within the “MANUALLY COPY THESE FILES TO THE ROOT OF THE SD CARD” folder to the root of the SD card. Deploy and run without attaching a debugger. (I don’t know if it matters, but in my test case, I did not have a debugger attached.)

The Raptor was connected via Ethernet cable to a Dell PowerConnect 5524P switch. I’ve also seen it fail pretty reliably on a Dell PowerConnect 2824. The client is a desktop web browser connected wirelessly through an access point. The Raptor will use DHCP, and the web application runs on port 80. The web application does an AJAX callback every minute (JSON).

Run a continuous ping so that you can see it drop off the network. If you cannot get the device to fail by repeatedly refreshing the page, leave it sitting in the background, go away for 30 minutes to an hour, then try refreshing the page again. It almost always dies for me then.
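For anyone who wants a timestamped record of when the board drops off, rather than watching a ping window, here is a minimal sketch in Python. It is not part of the test application. It swaps ICMP ping for a TCP connect to port 80, so it checks the web server itself rather than just the network stack. The function names, the 5-second interval, and the example address are my own assumptions.

```python
import socket
import time
from datetime import datetime


def is_reachable(host, port=80, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def find_outages(samples):
    """Collapse (timestamp, ok) samples into (first_failure, recovery) pairs.

    The recovery value is None if the device never came back in the samples.
    """
    outages = []
    down_since = None
    for stamp, ok in samples:
        if not ok and down_since is None:
            down_since = stamp          # first failed probe of an outage
        elif ok and down_since is not None:
            outages.append((down_since, stamp))
            down_since = None
    if down_since is not None:
        outages.append((down_since, None))
    return outages


def monitor(host, port=80, interval=5.0, rounds=720):
    """Probe host:port every `interval` seconds and print each state change."""
    samples = []
    last = None
    for _ in range(rounds):
        ok = is_reachable(host, port)
        now = datetime.now()
        samples.append((now, ok))
        if ok != last:
            print(f"{now:%Y-%m-%d %H:%M:%S} {'UP' if ok else 'DOWN'}")
            last = ok
        time.sleep(interval)
    return find_outages(samples)
```

To use it, call something like `monitor("192.168.1.50")`, where the address is a placeholder for whatever the Raptor received from DHCP. With the defaults it probes for about an hour and returns the list of outage windows.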

@ ethicalhacker - Unfortunately I can’t get the board to permanently go offline. I’ve tried your code with an ENC28 and the Wi-Fi RS21. With both interfaces, the application still loads fine after a few hours. The connection between the board and the browser passed through various switches and a wireless link.

The only issue I have seen is that when I refresh all 25 tabs repeatedly, the board gets backlogged and the connections time out. This is mostly because the browser closes connections abruptly and the debug output of the board fills up with exceptions and your tracing messages. It’s mitigated by switching to serial debugging or having the debugger connected. Even so, the board does eventually recover from this as everything times out and comes back.

If I let all 25 tabs load and then refresh them all once, they all eventually load without issue, though a bit slowly.

At this point I would make sure you’ve tried multiple FEZ Raptors, ENC28s, and cables if you have them. If everything is still good after that, I would connect the Raptor and a PC directly to each other and test. After that, I would connect them both to the switches you mentioned with nothing else connected (you’ll have to switch to static IP unless the switch has a DHCP server). If you can reproduce the failures with just that switch, we can take a look at testing with the same model.

I have done this. I have two Raptors with the exact same configuration but purchased at different times. They both fail. The cables, clients, and browsers are also different.

Confirmed.

I had a similar problem with the Raptor and ENC28. I created a web application with approximately 40 source files (html, js, css …) using jQuery, jQuery UI and AJAX. My server ran for about two days and then became unreachable. I spent a long time studying the problem and trying to solve it. Once I tried cooling the USB Client DP module, and since then the system has been running for 50 days without issue :wink:

@ Majo -

Are you talking about the USB Client DP or the ENC28 module? :think:

@ Dat -

USB Client DP, see my first picture.

Hmm, can someone with hardware skills explain what happens when the USB Client DP becomes hot… :wall: :think:

As per cyberh0me’s comment, you would see the voltage start to dip. The board uses linear regulators, which dissipate the excess voltage mainly as heat. This is the very reason I only use switching supplies for anything that has to be reliable, even for simple designs.
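To put rough numbers on the heat from a linear regulator: the power it burns is the voltage drop across it times the load current, and the die temperature rises by that power times the package's thermal resistance. The sketch below just encodes those two textbook relations in Python. The 5 V input, 3.3 V output, 0.3 A load, and 50 °C/W thermal resistance in the usage note are illustrative assumptions, not measurements from the USB Client DP or the Raptor.

```python
def linear_regulator_dissipation(v_in, v_out, i_load):
    """Power (W) a linear regulator turns into heat: (Vin - Vout) * Iload."""
    return (v_in - v_out) * i_load


def junction_temperature(p_watts, theta_ja, t_ambient):
    """Estimated junction temperature (deg C) for a given dissipation.

    theta_ja is the junction-to-ambient thermal resistance in deg C per watt,
    taken from the regulator's datasheet.
    """
    return t_ambient + p_watts * theta_ja
```

With the assumed values, `linear_regulator_dissipation(5.0, 3.3, 0.3)` comes to about 0.51 W, and `junction_temperature(0.51, 50.0, 25.0)` to about 50.5 °C. Any increase in load current or ambient temperature pushes that higher, which is consistent with forced cooling making a difference.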

@ Majo - Are you ready to try some experiments with your setup to measure the network stability and the voltage on the USB Client DP over longer periods of time, and see if there is a problem there? :slight_smile:

@ Dat

After your question, I realized that the ENC28 also gets secondary cooling from the air drawn into the fan. In any case, the uncooled system causes problems.

@ njbuch - Yes, there is no problem, I will continue to experiment :wink:

@ Majo - Great, but given that you are cooling both the ENC28 module and the USB Client DP, you might decide to expand the test to cooling them both together and individually…