G120+ENC28: Memory fragmentation under heavy network traffic?

Hi,

We have been chasing some strange crashes on a custom G120+ENC28 device for a while now (using FW V4.3.8.1).
We have finally identified the root cause.
Another device on the same network (an Atmel UC3 based device) had a multicast MAC address (first byte was 0x01).
Because of this (at least that is my guess), all messages from the PC to this device were also forwarded to the G120 device by the network switch (a simple unmanaged industrial-grade switch from MOXA).
Generally everything works fine as long as the traffic to the UC3 device is low.
But in some cases there is a lot of traffic to this device.
In this case the G120 device shows a huge performance drop (and often even a total crash) after about 3 hours. Reducing this traffic a little delayed the crash to about 7 hours. The crash times were very consistent at a given traffic rate to the UC3 device.

Funnily enough: if I call Debug.GC(false); every 30 seconds on the G120, I do not get any crash for days. Running Debug.GC(false); every 300 seconds did not really help.
Because the GC solved this issue and always returned a constant amount of free RAM (more than 6 MByte), I think it is more a case of memory fragmentation than of usage or a leak.

Changing the UC3 device to a unicast MAC (first Byte 0x02) and not running the GC on timer also solved the issue.
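For anyone who wants to check their own devices: whether a MAC address is unicast or multicast is decided by the least significant bit of the first byte (the I/G bit), and the next bit (0x02) marks the address as locally administered. Below is a minimal sketch of that check in Python. The function names and the example addresses are made up for illustration; only the first byte (0x01 and 0x02) matches what we actually used.

```python
def first_octet(mac: str) -> int:
    """Parse the first byte of a MAC address written as aa:bb:cc:dd:ee:ff or aa-bb-..."""
    return int(mac.replace("-", ":").split(":")[0], 16)


def is_multicast(mac: str) -> bool:
    """True if the I/G bit (bit 0 of the first byte) is set, i.e. a group address."""
    return bool(first_octet(mac) & 0x01)


def is_locally_administered(mac: str) -> bool:
    """True if the U/L bit (bit 1 of the first byte) is set."""
    return bool(first_octet(mac) & 0x02)


# Hypothetical addresses: only the first byte reflects the devices in this thread.
for mac in ("01:00:00:00:00:01", "02:00:00:00:00:01"):
    kind = "multicast" if is_multicast(mac) else "unicast"
    print(mac, kind, "locally administered:", is_locally_administered(mac))
```

A switch without multicast filtering will normally flood frames addressed to a multicast MAC to all ports, which would fit what we saw on the G120.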

@GHI: Does my analysis make any sense? Maybe this could be improved for TinyCLR?

When you ran Debug.GC(false) every 300 seconds, did it also return a constant amount of free RAM? Running it every 30 seconds may not have fixed the problem; more likely than not it just made the problem infrequent enough to escape notice in testing.

Depending on exactly how much memory is used and its characteristics, fragmentation shouldn’t be a big issue outside of memory pressure scenarios, because the GC will move objects around to compact the heap. That is especially true considering you have 6 MB of free space left.

Hard to say more without digging into it ourselves. An outright crash under load is interesting.

To your specific question, we’re of course interested in all manner of improvements for TinyCLR. As specific bottlenecks are identified we can investigate.

Now that you ask: I never looked at the RAM usage when I ran Debug.GC(false) every 300 seconds.
For now we have solved the issue by changing the other device's MAC address.
Further investigation on my side has to wait at least until February. But if possible I will try to reproduce the issue on a smaller scale.
