After two days of headaches trying to find out why my new G400 board silently hangs after running for a while, I may have found something.
Preamble:
This is a custom G400 board.
Externally powered.
There are ~1000 CAN messages received per second.
Posting ~100 CAN messages in rapid succession hangs the G400 (no exceptions, no errors, it simply freezes).
This is a snippet that gives me 100% failure within the first minute (debugger attached):
using System.Threading;
using GHI.Premium.Hardware;
using Microsoft.SPOT;
using Microsoft.SPOT.Hardware;

namespace G400CanTester {
    public class Program {
        private static Thread _blinkingThread;
        private static CAN _can;
        private static CAN.Message[] _msgList;

        public static void Main() {
            //A blinker indicates if the MCU is hanging or not
            var led = new OutputPort(GHI.Hardware.G400.Pin.PD18, false);
            _blinkingThread = new Thread(() => {
                while (true) {
                    led.Write(false);
                    Thread.Sleep(100);
                    led.Write(true);
                    Thread.Sleep(100);
                }
            });
            _blinkingThread.Priority = ThreadPriority.Lowest;
            _blinkingThread.Start();

            //Initializing CAN'n'stuff
            _msgList = new CAN.Message[1000];
            for (int i = 0; i < 1000; i++) {
                _msgList[i] = new CAN.Message();
            }
            var brp = 6;
            var sjw = 1;
            var propag = 1;
            var phase1 = 7;
            var phase2 = 7;
            _can = new CAN(CAN.Channel.Channel_1, (uint)((brp << 16) + (sjw << 12) + (propag << 8) + (phase1 << 4) + (phase2 << 0)), 1000); //1Mbit
            _can.DataReceivedEvent += CanDataReceivedEventHandler;

            //Creating an array for sending
            var sendList = new CAN.Message[100];
            for (int i = 0; i < 100; i++) {
                sendList[i] = new CAN.Message();
                sendList[i].ArbID = (uint) i;
                sendList[i].Data[0] = (byte) i;
            }

            //Send CAN messages and crash G400!
            while (true) {
                for (int i = 0; i < 100; i++) {
                    _can.PostMessages(sendList, i, 1);
                }
                Thread.Sleep(50); //<-smaller value gives higher probability
            }
        }

        private static void CanDataReceivedEventHandler(CAN sender, CANDataReceivedEventArgs args) {
            int count = sender.GetMessages(_msgList, 0, 1000);
            Debug.Print("CAN: " + count + " received;");
        }
    }
}
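For anyone double-checking the bit-timing argument, here is a minimal sketch of how the five fields pack into the single uint passed as the second argument of the CAN constructor. The names BitTiming and Pack are made up for illustration and are not part of the GHI library; the shifts are copied unchanged from the snippet. It is plain C# and needs no NETMF references.

```csharp
public static class BitTiming {
    // Hypothetical helper (not a GHI API): packs the timing fields
    // with the same shifts the repro snippet uses inline.
    public static uint Pack(int brp, int sjw, int propag, int phase1, int phase2) {
        return (uint)((brp << 16)      // baud rate prescaler
                    + (sjw << 12)      // synchronization jump width
                    + (propag << 8)    // propagation segment
                    + (phase1 << 4)    // phase segment 1
                    + (phase2 << 0));  // phase segment 2
    }
}
```

With the values from the snippet, BitTiming.Pack(6, 1, 1, 7, 7) gives 0x061177 (397687 decimal), which makes it easier to compare settings between boards.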
Important notes:
High traffic only makes the G400 fail faster. It eventually fails even with only 130 incoming messages per second and 1 outgoing message per second, but then one has to wait half a day.
The problem is somewhere in the PostMessages function; if nothing is sent, the G400 does not freeze even with higher traffic.
Guys at GHI, please take a look at this, my entire career now depends on this bug.
Case 1: I used two G400s, one sending and one receiving, at 1 Mbit/s over socket 7. It has run well past 3000 messages and is still running.
Case 2: I used one G400, sending and receiving together over sockets 6 and 7. It is still running well.
Case 3: I used one G400 and a LAWICEL CANUSB (running on a PC) at 1 Mbit/s. It works well.
Cases 1 and 2 run automatically. I only attach USB debug (MFDeploy) from time to time to make sure the board is still running, watch the LED, and then disconnect USB. In other words, I am using external power.
In case 3 the software runs on the PC, so I had to click the mouse more than 1000 times by hand. :))
I used your configuration to set up 1 Mbit/s, with a 50 ms sleep for every PostMessages call (as in your code).
Let me know if you have another suggestion for reproducing this bug.
Edit: I have now changed the sleep to 1 ms (to send 1000 messages per second).
@ Dat - I’ve tried your code on one G400HDR and one custom board. And indeed it works!
So the problem is probably not the message count itself but how the messages are distributed in time. In my system they arrive in batches, a few hundred in rapid succession. Maybe this is the problem? To test this assumption, I’ve modified your code; if I replace
It would have been yesterday, but you found a bug and others found a few issues. We hope for early next week, but we are not in a hurry. We need to make sure it is all good.