G120 module: diagnosing a problem?

SecretSquirrel · January 8, 2018, 1:17am

I have a very unusual problem for a GHI product, the G120 SOM. They’re rock-solid. They’re so reliable that we assume they’ll work fine and use the SOM to run our board diagnostics.

Several G120 SOM modules were installed by a third party on our custom circuit board. They are loading fine, but failing to run. Some modules won’t talk to the USB debugger at all. Some will boot, but deploying the application fails. Some will boot, take the application deployment, and the basic features work fine up until the application starts running; then they either crash without throwing an exception, or throw an exception and then crash.

I’ve done very careful checking to make absolutely sure that the loader and firmware are correct on these boards.

So, I’m suspecting the SOMs were damaged in some way during installation. Are there any low-level diagnostics that will progressively test the SOM to failure, so we can see what went wrong? And, if not, is there something GHI can do with one of these SOMs that has been removed from a board?

Dave_McLaughlin · January 8, 2018, 1:07pm

How is the board designed? Is it 4 layer with a power and ground plane?

What type of power supply does the board have? Switched mode, linear?

Are there enough caps at the processor module?

How have you wired up the RESET input?

Mr_John_Smith · January 8, 2018, 1:27pm

Did the PCB assembler follow the temp profile during reflow? What I would have done was install the first few modules by hand myself to verify that the board works fine. Try replacing one by hand.

SecretSquirrel · January 8, 2018, 1:53pm

Are these symptoms common for a G120 SOM that was overheated?

Is there a common area of failure if the module is overheated?

The third-party installer is aware of the temp profile.

Replacing a G120 by hand worked fine. However, we only did one that way.

Mr_John_Smith · January 8, 2018, 5:33pm

That is suspicious. You replaced one that wasn’t working properly, managing to fix the board?

SecretSquirrel · January 8, 2018, 6:29pm

Yes. We replaced the G120. The board worked fine after that.

I’m suspicious, too. Hoping someone can answer my questions:

Does overheating a G120 SOM when installing it on the board cause specific symptoms, and what are they?

Is there a way to diagnose the “overheated module” failure?

jwizard93 · January 8, 2018, 7:37pm

What’s the exception? Is it possible you could determine from a software perspective what is failing?

Recently I thought an outside company may have been mishandling our GHI products. The true error went unidentified for way too long. The board had external pulldown resistors I didn’t know about. When I measured the input impedance I thought the boards were damaged. I was able to get just enough working pins when I replaced some chips by hand. That was only because when I soldered the chips myself I broke some of the connections with the resistors…

Soldering chips with so many pins by hand is possible, and some people are very good at it I’m sure, but this kind of manual work is likely to be shoddy. It’s just too difficult. So my point is that soldering something by hand could simply be introducing a different “error” that makes it even more difficult to track the original problem.

jwizard93 · January 8, 2018, 8:00pm

I hope someone can answer this question for you as well. I contacted STMicro engineers about my problems since they produce the actual MCU. They basically said that in my case a damaged pin or port could look like anything, i.e., it was too complicated of an issue to address over email. And I only wasted time trying to prove it was the type of error I initially assumed (and probably aggravated the third party by asking them to be more careful) rather than searching for another cause.

Brett · January 8, 2018, 8:13pm

I doubt there’s any meaningful “standard failure mode” on this kind of potential mistreatment. It would totally depend on how much heat, how poorly applied, and for how long. I just don’t think people will be able to give you that level of information. You may find some people who have had actual experience and who can share with you their experience, but there’s nothing to say that a failure like theirs would be indicative of what you would see.

Think of it like this. Will a car tyre manufacturer tell you what kind of failure to expect when you use a standard street car tyre on a hot racetrack? No, they’re going to say that’s outside spec, don’t do it.

bfisher · January 8, 2018, 8:20pm

Assuming that there is code that is tested and validated:

My understanding is that when a package like the ARM Cortex-M3 (brain of the G120) is overheated, it can cause internal package damage that is difficult to detect from the exterior. I had a similar problem with another chip and it was caused by the oven reflow curve exposing the package to peak temperature for too long.

If inspection of the chip shows that there isn’t physical damage to the chip, solder bridging, or melted components, I think it would be safe to assume, given that the board works correctly after replacing the G120, that the original G120 was damaged from overheating. (This could be caused by an oven or by an iron held too long against the chip. I have had both.)

Please correct me if I have said something incorrect.

mhardy · January 8, 2018, 9:50pm

We initially thought we had an over-temperature reflow problem. We did the same, hand-soldered a G120 onto the board, and all worked fine.

It turned out some of the 200 G120s we purchased had GHI test firmware loaded, which caused problems like you might be seeing.

Ask GHI if this might be possible with the SOMs you ordered.

SecretSquirrel · January 10, 2018, 12:24am

We always erase the whole memory space (that we control) and reload TinyBooter and the firmware. If there was test firmware, it would have been erased. On the other hand, if the “special sauce” we don’t control was “extra-special” I would think the chip would just refuse to talk to us.

SecretSquirrel · January 10, 2018, 12:31am

So, I’m asking the question mhardy suggests. TO GHI: (Gus?) is it possible that a G120 module might not have the correct low-level firmware when it ships from GHI? Would the symptom be that the bootloader, firmware, and application will load fine but the application won’t run? Or, is this more likely a problem with installation? We already ruled out a problem with the board because it works when another G120 module is installed.

Brett · January 10, 2018, 1:26am

I’m not Gus nor someone from GHI, but I would think it’s unlikely you’d get a unit in that state if you do a full erase like you’re saying. Is it possible to now put that suspect/failed G120 module on another board and reflash it, to see if there’s a behavioural difference?

Gus_Issa · January 10, 2018, 9:55pm

The boot loader should not affect the firmware. And you should always load a specific format version that you need.

SecretSquirrel · January 11, 2018, 12:31pm

Is there a way to verify that a G120 SOM is fully operational before installing it on a circuit board?

Gus_Issa · January 11, 2018, 1:49pm

You really only need power and USB to see the module working. See the design considerations again in the datasheet as well please.

Mr_John_Smith · January 11, 2018, 2:07pm

@Gus, oh, since we’re on this topic, does the G120 need those USB protection ICs or does it have them onboard already?

SecretSquirrel · January 11, 2018, 2:48pm

So, to get power and USB onto the G120 SOM, I need some sort of jig or socket in the proper form factor so I can pop the SOM in, test it, and pop it out again.

I’m assuming the jig/socket isn’t something I can buy; I’ll have to make it. Any suggestions on a DIY solution?

Mr_John_Smith · January 11, 2018, 4:01pm

Solder wires to the castellated pins.