We are experiencing problems with SC20100S modules on custom boards crashing occasionally. They just randomly stop working. We have an LED on our boards that blinks continuously, controlled by its own thread. When the crash happens, this LED stops blinking, activity on the CAN bus stops, and the module becomes unpingable.
We tried connecting the debugger and waiting; it just gave us a message in the console like “process x has exited” and then the debugging session stopped.
We are still trying to make this reproducible. Due to time pressure we have used the watchdog to work around the problem, but we are soon releasing a river cruise ship autopilot based on SITCore and cannot have crashes every couple of hours…
Are there any ways or tools to debug this? Anything we can track?
We were thinking of monitoring memory usage to see if it decreases somehow or surrounding the entire code with a try/catch block and logging any caught exception. The latter probably won’t catch our issue as the entire system just locks up.
Are there ways to debug at a lower level, to see the last thing the CPU does before crashing? Any other variables we can track? Any other procedure we can follow?
Are you on 2.2.0.300?
Is the CAN traffic heavy?
How long does the device run before it crashes?
Does it use the network? If so, ENC or WiFi?
Did it happen in the previous version?
We don't think it is memory. What we suspect now is that some error occurs that raises an error interrupt, and the flag is not cleared, so the system keeps triggering the interrupt continuously.
But the problem is: which peripheral, and what kind of error?
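To illustrate the suspected failure mode, here is a minimal host-side simulation in C. It is only a sketch: `ERR_FLAG`, `sim_t`, `error_isr` and `simulate` are hypothetical names invented for this example, not real SC20100S registers or TinyCLR firmware code. It models a CPU where a pending interrupt always preempts the main loop, and compares a handler that clears the error flag with one that does not.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical error bit in a peripheral status register (illustration only). */
#define ERR_FLAG 0x01u

typedef struct {
    uint32_t status;      /* pending-interrupt flags */
    uint32_t isr_calls;   /* how many times the handler ran */
    uint32_t main_ticks;  /* how many slots the main loop got to run */
} sim_t;

/* Error interrupt handler; clear_flag selects the correct or the buggy variant. */
static void error_isr(sim_t *s, bool clear_flag)
{
    s->isr_calls++;
    if (clear_flag) {
        s->status &= ~ERR_FLAG;  /* acknowledge the error so it stops firing */
    }
}

/* Run `budget` CPU slots. A pending interrupt always wins over the main loop. */
static sim_t simulate(bool clear_flag, uint32_t budget)
{
    sim_t s = {0, 0, 0};
    for (uint32_t slot = 0; slot < budget; slot++) {
        if (slot == 10) {
            s.status |= ERR_FLAG;      /* a single error occurs in slot 10 */
        }
        if (s.status & ERR_FLAG) {
            error_isr(&s, clear_flag); /* interrupt preempts everything else */
        } else {
            s.main_ticks++;            /* normal work: LED thread, CAN, network */
        }
    }
    return s;
}
```

With a budget of 1000 slots, `simulate(true, 1000)` runs the handler once and the main loop keeps nearly all of the CPU, while `simulate(false, 1000)` never returns to the main loop after slot 10. In this model, that matches the symptoms described above: the LED thread, CAN and network all stop at the same moment.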
We have multiple modules that show this. Some also read UART, some read analog pins, but the only things all the crashing modules have in common are CAN and network.
It might be worth mentioning that we have about 2-3 TCP sockets always running in each module.
As a test, would it be useful to disable the CAN controllers and then wait a few days to see if the crash still happens?