SC20100S crashing

Hi,

We are experiencing occasional crashes with the SC20100S on our custom boards. They just randomly stop working. We have an LED on our boards that blinks continuously, controlled by its own thread. When a board crashes, this LED stops blinking, activity on the CAN bus stops, and the module becomes unpingable.

We tried connecting the debugger and waiting; it just gave us a message in the console like “process x has exited” and then the debugging session stopped.

We are still trying to make this reproducible. Due to time pressure we have used the watchdog to work around the problem, but we are soon releasing a river cruise ship autopilot based on SITCore and cannot have crashes every couple of hours…

Are there any ways or tools to debug this? Anything we can track?

We were thinking of monitoring memory usage to see if free memory decreases over time, or wrapping the entire code in a try/catch block and logging any caught exception. The latter probably won’t catch our issue, as the entire system just locks up.
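
For the memory-monitoring idea, here is a minimal host-side sketch of how logged samples could be checked for a downward trend. It assumes the device periodically reports a (seconds since boot, free bytes) pair over some channel of your choosing; the function names and the 1 KB/hour threshold are hypothetical, and how the free-memory figure is read on the device is outside the sketch.

```python
from statistics import mean


def leak_slope(samples):
    """Least-squares slope of free memory (bytes) versus time (seconds).

    `samples` is a list of (seconds_since_boot, free_bytes) pairs.
    """
    times = [t for t, _ in samples]
    free = [f for _, f in samples]
    t_mean, f_mean = mean(times), mean(free)
    denom = sum((t - t_mean) ** 2 for t in times)
    if denom == 0:
        raise ValueError("need samples at two or more distinct times")
    return sum((t - t_mean) * (f - f_mean) for t, f in samples) / denom


def looks_like_leak(samples, threshold_bytes_per_hour=1024):
    """True if free memory falls faster than the (hypothetical) threshold."""
    return leak_slope(samples) * 3600 < -threshold_bytes_per_hour


if __name__ == "__main__":
    log = [(0, 100_000), (3600, 98_000), (7200, 96_000)]
    print(looks_like_leak(log))
```

A fitted slope is used rather than comparing the first and last sample, so that normal garbage-collection jitter between readings does not trigger a false alarm.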

Are there ways to debug at a lower level, to see the last thing the CPU does before crashing? Any other variables we can track? Any other procedure we can follow?

Thanks!

Is it 2.2.0.300?
Is CAN traffic heavy?
How long does the device run before it crashes?
Does it use the network? If so, ENC or WiFi?
Did it happen in previous versions?

It is the latest firmware, but it has happened on all firmware versions we have tested.

CAN is about 600 msg/sec. Ethernet uses the built-in MAC and a DP83848 PHY. It runs for anywhere between a couple of hours and a few days.

CAN, network, anything else?

We don’t think it is memory. What we can think of now is that some error occurs that raises an error interrupt and the flag is not cleared, causing the system to trigger the interrupt continuously.

But the problem is which peripheral, and what kind of error.

We have multiple modules that have this. Some also read UART, some read analog pins, but the only things all the crashing modules have in common are CAN and network.

Might be worth mentioning that we have about 2-3 tcp sockets always running in each module.

To test, might it be useful to disable the CAN controllers and then wait a few days to see if the crash still happens?

We don’t have any better way than that for now.

Ok!

What we will do is set up three modules to test: one with just Ethernet and CAN code, one with just Ethernet, and one with just CAN.

No watchdog on any of them and then see what causes the crash.
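
One way to read the outcome of that three-module test is sketched below. This is only an illustration under the assumption that each build is described by the set of subsystems it enables and a crashed/survived flag; the `diagnose` function and the subsystem labels are hypothetical.

```python
def diagnose(results):
    """Narrow down suspects from an elimination test.

    `results` maps a frozenset of enabled subsystems to True if that
    build crashed and False if it survived the observation window.
    """
    crashed = [set(cfg) for cfg, did_crash in results.items() if did_crash]
    if not crashed:
        return "no crash observed"
    # A single-subsystem culprit must be present in every crashing build.
    common = set.intersection(*crashed)
    # Anything that ran in a surviving build is provisionally cleared.
    cleared = set().union(
        *(set(cfg) for cfg, did_crash in results.items() if not did_crash)
    )
    suspects = common - cleared
    if suspects:
        return "suspect: " + ", ".join(sorted(suspects))
    return "suspect an interaction or something outside the tested subsystems"


if __name__ == "__main__":
    outcome = {
        frozenset({"ethernet", "can"}): True,
        frozenset({"ethernet"}): False,
        frozenset({"can"}): True,
    }
    print(diagnose(outcome))
```

Since the crash can take anywhere from hours to days to appear, a "survived" result only clears a subsystem if that module has run well past the longest uptime seen so far.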

Also, just realized another thing these have in common is controlling WS2812B LEDs.

Power supply?

I thought about it, but I doubt it. The power supply is able to provide 2A and I am only running 3 WS2812s at 10% brightness.

Has any one of them failed yet?

Have not tested yet; it almost never happens in our lab. I will be spending some time in the field in the coming weeks and will do extensive testing.

Any update yet :))?

To us this looks like the last main issue we are waiting on before the release. :))

I understand! I installed the modules yesterday and am waiting for one to crash. As of now, nothing crashed yet.

I will report back as soon as I know more.

No crashes as of yet, surprisingly.

I’m sweating.

The modules have reached 7 days of uptime; we have not seen that happen before with these modules.

I’d say this issue is closed for now.

Thank you!