Marginal timing issue with MassStorage on G120

I have encountered what appears to be a marginal timing issue with the MassStorage implementation on the G120 module when accessed with some modern PC drivers.
Perhaps someone else has come across this, or GHI can take a look at their implementation and see if there is something obvious?

Background

Back in 2011 we started making an EMX-based board for use in industry. It has been very reliable in an extremely harsh environment.
It uses USB Client and microSD for data storage, with what we call “Disk drive Mode” (MassStorage) to allow users to download collected data and upload settings, new software for IFU, etc. from their PCs.

The EMX software still uses NETMF 4.2 (it just works!)

In 2013 we built some prototype boards using the G120 module in place of the EMX, and a Redpine WiFi module.
The same software worked with NETMF 4.2 and 4.3, although at that stage some things (e.g. hibernate) didn’t work with the G120.
We waited for GHI to fix the problems, but the clients who had asked for WiFi never went ahead with it.

In 2019 more clients asked for WiFi. Both EMX and Redpine were now EOL, so we did a replacement board using the G120 and Espressif ESP-12 (ESP8266 module) for WiFi.
In testing the original 4.3 code everything worked - except MassStorage writes. Reads were fine.

USB was working fine. We could do full debugging, everything loaded onto the board through the debugger and ran.
MicroSD read and write from the downloaded code running on the module worked perfectly. Settings were read, directories created, log files written.
And in DiskDrive Mode (MassStorage) reads were good, but ANY sort of write (create folder, create 4-byte text file, etc.) failed, with errors like “CRC Error” or “Catastrophic Failure (Error 0x8000FFFF)”.

Drilling down into the problem, we went back to the original prototype boards. Now we saw problems with microSD writes on them too!

We changed the software to the simplest MassStorage test code, as given by Dat in a forum response to someone else with a problem.

using System;
using System.IO;
using System.Threading;

using Microsoft.SPOT;
//using Microsoft.SPOT.IO;
using Microsoft.SPOT.Hardware;
using GHI.IO;
using GHI.Usb;
using GHI.Usb.Host;
using GHI.Processor;
using GHI.IO.Storage;

namespace MassStorageTest
{
    public class Program
    {
        public static void Main()
        {
            // The SD card is only created here, never mounted as a file system on the G120.
            GHI.IO.Storage.SDCard MySD = new GHI.IO.Storage.SDCard();

            // Create the MassStorage client with our USB IDs and descriptor strings.
            GHI.Usb.Client.MassStorage MyUSBClient = new GHI.Usb.Client.MassStorage(0x1B9F, 0xF002, 0x100, 250, "TriNeuron", "MFC", "123456", "xyz", 1);
            GHI.Usb.Client.Controller.ActiveDevice = MyUSBClient;
            Thread.Sleep(100);

            // Attach the SD card as logical unit 0, then enable it.
            MyUSBClient.AttachLogicalUnit(MySD, 0, " ", " ");
            Thread.Sleep(100);
            MyUSBClient.EnableLogicalUnit(0);

            // Nothing else runs; the main thread sleeps forever.
            Thread.Sleep(-1);
        }
    }
}

Doesn’t work. Tested on Windows 7. Tested on Windows 10.
Now, test on an Ubuntu Linux system - and it works perfectly, no problems reading or writing any size file, creating directories, etc!

And the original prototype boards? Of course, they were programmed and tested back on Windows XP, which is why they worked.

So, it appears something has changed in the way Windows 7 and later access the microSD over USB, which has exposed a problem in the MassStorage timing.
But only on the G120; the EMX-based boards still work fine.

More testing, back to basics. Patch up a test G120HDR board with USB and microSD to test.
Great long wires everywhere (see photo), but it works!

Testing

First thought: as the problem is only on write, check the supply voltage to the microSD. It should be OK with short, wide supply lines and a biggish capacitor nearby, but I added another big cap just in case. No difference.
Second thought, could it be the USB lines? Jumper them directly - no difference.

So, use a microSD breakout board to check out the signals on the scope.
Captured signals look good - but wait - now the card is working! The breakout board has no components, it only extends the traces and has some test pins, but when the microSD card is connected through it everything works fine.

Alright, back to having the microSD directly in the mainboard, and check out the signals there.
Long story short, when the scope probe is connected to the microSD DAT0 line (at the uSD socket end or the G120 end) things work correctly. Connecting it anywhere else doesn’t make a difference.

So, the scope probe loading is 10 MΩ in parallel with around 4-10pF, so let’s try that.
The microSD card specs say the maximum line loading is 20pF, and luckily I have some 18pF caps (for the G120 RTC crystal). I added one at the most convenient location, between G120 pins 26 & 27, and now everything functions correctly!
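
As a rough sanity check on how much delay the extra 18pF could add, here is a back-of-envelope first-order RC estimate. The driver output resistance R_out is purely an assumption (it has not been measured, and 50 Ω is only a placeholder value), so treat the result as an order of magnitude and not a measurement.

```latex
% First-order RC model: time for the DAT0 line to reach 50% of its swing.
% R_out = 50 ohm is an assumed driver output resistance, not a measured or datasheet value.
\[
t_{50\%} = R_{\mathrm{out}}\, C \ln 2
\approx 0.69 \times 50\,\Omega \times 18\,\mathrm{pF}
\approx 0.6\,\mathrm{ns}
\]
```

Under the same assumption, the 4-10pF of the scope probe works out to roughly 0.14-0.35 ns. The added capacitance also slows the edges, and this estimate cannot tell which of the two effects is the one that matters.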

Note that all the lines between microSD and G120 are short (around 30mm) and direct - this should be a good thing!

Conclusion

It appears that there is a critical timing issue that can be fixed by adding a delay to the DAT0 signal, either by adding extra length or extra capacitance to the line. This heuristic fix does not fill me with confidence. Hopefully GHI can take a look at the internals of their implementation and see if they can locate the problem and provide a better solution.

[Edit: Added test code]

We can look into it but no changes are scheduled for netmf. What I would like to do is run your test on TinyCLR.

Impressive problem identification!

@Gus_Issa Happy to test it, but I didn’t think TinyCLR supported USB Client MassStorage yet?

Can we be sure this is a timing issue, and not an issue of high frequency noise that is being filtered by the capacitor? If it was a timing issue, I would think it somewhat unlikely that only the SD D0 line would be affected.

Although it could still be timing. I’d have to look at the source code, but if D0 was written before the other data lines (or D0 data is getting there too soon for some other reason) the capacitor could help by adding a slight delay.

@Joel_Riley
As the problem goes away when I attach a probe, it is hard to observe exactly what is happening. However, consider the following:

  1. There is no problem reading/writing the uSD card from C# code running on the G120. On startup it creates a directory structure and populates it, and also keeps various log files on the card.
  2. There is no problem reading/writing the uSD card in USB Client MassStorage mode when the USB connects to a Linux laptop, reading and writing even very large files.
  3. There is a completely repeatable write failure when in USB Client MassStorage mode connected to a Windows 7 or Windows 10 PC.

I would expect any noise issue to have a similar effect in all of the above cases.

Regarding noise, the only noise should be coming from the G120 module itself. The board is designed to take microvolt-level readings in the field, and should be very low noise. Power supply is via a linear LDO from a battery source. The ESP module is on a separate LDO and is actually powered down. For testing, the display is disconnected and the code running is as shown above, i.e. nothing but MassStorage operating.
Lines between the uSD and G120 are short and direct. (microSD lines are on the same side of the G120 module as the card socket)