NMEA parser without strings

mcalsyn · November 15, 2020, 5:41pm

Instead of a native-code NMEA parser, you could generalize this into something like regex or sscanf and make it a generalized parser, e.g., template and struct in; populated struct out. What makes it an NMEA parser is the sentence templates. That could be written in managed or native code.

I would say ‘just use regex’ but you would still end up with some memory allocations since regex first creates an intermediate string that has to be converted to a numeric. But in the end, a generalized scanning grammar that includes the ability to convert numerics seems like a more reusable solution.

Bec_a_Fuel · November 15, 2020, 6:30pm

You are right. Regex is very memory-intensive and slow, at least in the managed implementation, and you end up with strings anyway.
To me, it’s not a good idea to use it for parsing NMEA data at the rate and volume at which it arrives from the GPS.

Before coding this byte parser, I tried this code:

>                 var receivedString = "$GPRMC,123519,A,4807.038,N,01131.000,E,022.4,084.4,230394,003.1,W*6A";
>                 var expRMC = "([$]GPRMC)[,]([0-9]{6})[,]([AV]{1})[,](.*)[,]([NS]{1})[,](.*)[,]([EW]{1})[,](.*)[,](.*)[,](.*)[,](.*)[,]([EW]{1})([*][0-9a-f]{2})";
>                 var NMEAPattern = "[$]([A-Z]{2})([A-Z]{3}).*[*]([0-9a-f]{2})";
> 
>                 Regex validNMEA = new Regex(NMEAPattern, RegexOptions.IgnoreCase);
> 
>                 if (validNMEA.IsMatch(receivedString))
>                 {
>                     Match m = validNMEA.Match(receivedString);
>                     switch (m.Groups[2].Value)
>                     {
>                         case "RMC":
>                             Regex rRMC = new Regex(expRMC, RegexOptions.IgnoreCase);
>                             Match mRMC = rRMC.Match(receivedString);
>                             RMCSentence.FixTime = mRMC.Groups[2].Value != String.Empty
>                             ? new TimeSpan(Convert.ToInt32(mRMC.Groups[2].Value.Substring(0, 2)), Convert.ToInt32(mRMC.Groups[2].Value.Substring(2, 2)), Convert.ToInt32(mRMC.Groups[2].Value.Substring(4, 2)))
>                             : new TimeSpan(0);
>                             RMCSentence.Status = String.IsNullOrEmpty(mRMC.Groups[3].Value) ? Char.MinValue : mRMC.Groups[3].Value[0];
>                             RMCSentence.Latitude = (Single)Double.Parse(mRMC.Groups[4].Value) / 100;
>                             RMCSentence.LatitudeHemisphere = String.IsNullOrEmpty(mRMC.Groups[5].Value) ? Char.MinValue : mRMC.Groups[5].Value[0];
>                             RMCSentence.Longitude = (Single)Double.Parse(mRMC.Groups[6].Value) / 100;
>                             RMCSentence.LongitudePosition = String.IsNullOrEmpty(mRMC.Groups[7].Value) ? Char.MinValue : mRMC.Groups[7].Value[0];
>                             RMCSentence.SpeedKnots = (Single)Double.Parse(mRMC.Groups[8].Value);
>                             RMCSentence.SpeedKm = RMCSentence.SpeedKnots * 1.852f;
>                             RMCSentence.TrackAngle = (Single)Double.Parse(mRMC.Groups[9].Value);
>                             RMCSentence.MagneticVariation = (Single)Double.Parse(mRMC.Groups[11].Value);
>                             RMCSentence.MagneticVariationDirection = String.IsNullOrEmpty(mRMC.Groups[12].Value) ? Char.MinValue : mRMC.Groups[12].Value[0];
>                             RMCSentence.Checksum = (Byte)Convert.ToInt32(mRMC.Groups[13].Value.Substring(1, 2), 16);
>                             break;
>                     }
>                 }

This does indeed work. But unfortunately it’s not efficient at all.

mcalsyn · November 15, 2020, 6:38pm

But my point is that you can still get the memory win without making this an NMEA-specific parser. Just struct+template in and populated struct out, in either native or managed code (though admittedly, without reflection, the native code implementation is more complex).

It becomes an NMEA parser when you use NMEA templates as the input.
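
As a rough illustration of the "template in, populated values out" idea, here is a minimal sketch. It is a simplification and not the parser discussed in this thread: the template is a string of one-letter field codes invented for this example, and results go into a caller-supplied array rather than a struct (which would need reflection or generated code). All names are hypothetical.

```csharp
using System;

static class TemplateScanSketch
{
    // Scans the comma-separated fields of 'data' according to 'template':
    //   'n' = numeric field, 'c' = single character, '-' = skip the field.
    // Values are written to 'output' in order; the return value is the count.
    // 'output' must be large enough for every 'n' and 'c' in the template.
    public static int Scan(byte[] data, string template, double[] output)
    {
        int pos = 0, written = 0;
        for (int f = 0; f < template.Length && pos <= data.Length; f++)
        {
            // A field ends at the next comma, at '*' (checksum marker) or at the end.
            int end = pos;
            while (end < data.Length && data[end] != (byte)',' && data[end] != (byte)'*') end++;

            char kind = template[f];
            if (kind == 'n') output[written++] = ParseNumber(data, pos, end);
            else if (kind == 'c') output[written++] = end > pos ? data[pos] : 0;

            pos = end + 1;
        }
        return written;
    }

    // Converts the ASCII digits in [start, end) to a double without creating a string.
    static double ParseNumber(byte[] s, int start, int end)
    {
        double value = 0, scale = 1;
        bool negative = false, fraction = false;
        for (int i = start; i < end; i++)
        {
            byte b = s[i];
            if (b == (byte)'-' && i == start) negative = true;
            else if (b == (byte)'.') fraction = true;
            else if (b >= (byte)'0' && b <= (byte)'9')
            {
                value = value * 10 + (b - (byte)'0');
                if (fraction) scale *= 10;
            }
        }
        return negative ? -value / scale : value / scale;
    }
}
```

With the template "-nc", the bytes of "$GAHDT,274.07,T*03" would yield two values: the heading and the character code of "T". Swapping in a different template is what would turn the same scanner into a CSV or config reader.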

Gus_Issa · November 15, 2020, 7:00pm

My idea is to implement a helper, nothing specific to GPS. Like CSV parser. Which then we can use for this example and other things, like simple configuration values (network config from a text file for example).

mcalsyn · November 15, 2020, 7:11pm

Config files are a whole 'nother basket of worms, but yes, in general, I agree.

For config files, I have been using bson because it is the most compact storage format that can be read directly (no decompression) and in place (without copying elements) while still allowing for schema and version management. Of course, if config files have to be human-readable, then text or json is better, but also less space-efficient.

I’ve also moved to logging via bson, and logging just the message type and data payload and then transforming that to readable text only when it needs to be human readable (and generally, that’s on a server or desktop machine).

Bec_a_Fuel · November 22, 2020, 3:16pm

I have finally fixed all issues I had so far. Those were mainly in GSV sentences.

Now the parser can handle such GSV sentences :

$GPGSV,3,1,11,03,03,111,00,04,15,270,00,06,01,010,00,13,06,292,00*74
$GPGSV,3,3,11,22,42,067,42,24,14,311,43,27,05,244,00,,,,*4D
$GPGSV,4,4,13,30,31,066,28*40
$GLGSV,3,3,10,84,06,036,,,,,18*52
$GLGSV,1,1,02,72,,,29,74,,,19*62
$GLGSV,1,1,02,65,50,140,28,,,,32*5F
$GPGSV,4,4,14,36,29,144,31,49,35,178,*74
$GPGSV,1,1,00*79

There are some sentences with “inconsistent data”, others with partial data and others with missing expected data…
Many of the sentences above were received from real hardware (GNSS Click, GNSS 4 Click and GNSS Zoe Click), so I’m pretty confident in the parser.

FYI, parsing 16,000 sentences (1,000 × 16) uses only about 19 KB of memory. Again, this consumption is stable, whether you parse 1, 10, 100 or 10,000 sentences.

Before loop :
+-----------+------------+------------+
| Memory    | Used       | Free       |
+-----------+------------+------------+
| Managed   |    173,968 |    333,824 |
| Unmanaged |          0 | 33,554,404 |
+-----------+------------+------------+

After loop :
+-----------+------------+------------+
| Memory    | Used       | Free       |
+-----------+------------+------------+
| Managed   |    193,008 |    314,784 |
| Unmanaged |          0 | 33,554,404 |
+-----------+------------+------------+

Parser is available now on our Github repo.
I have reactivated a PR on TinyCLR/Drivers repository.

Edit : Here is the program I’ve used for memory stress tests.

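// Assumed imports; the NMEAParser namespace depends on where the parser source is placed.
using System;
using System.Diagnostics;
using System.Text;
using System.Threading;
using GHIElectronics.TinyCLR.Native; // Memory.ManagedMemory / Memory.UnmanagedMemory

class Program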
    {
        private static readonly String GGAString1 = "$GPGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,*47";
        private static readonly String GGAString2 = "$GPGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,0123*47";
        private static readonly String GSAString = "$GLGSA,A,3,04,05,,09,12,,,24,,,,,2.5,1.3,2.1*39";
        private static readonly String GSVString1 = "$GPGSV,3,1,11,03,03,111,00,04,15,270,00,06,01,010,00,13,06,292,00*74";
        private static readonly String GSVString2 = "$GPGSV,3,3,11,22,42,067,42,24,14,311,43,27,05,244,00,,,,*4D";
        private static readonly String GSVString3 = "$GPGSV,4,4,13,30,31,066,28*40";
        private static readonly String GSVString4 = "$GLGSV,3,3,10,84,06,036,,,,,18*52";
        private static readonly String GSVString5 = "$GLGSV,1,1,02,72,,,29,74,,,19*62";
        private static readonly String GSVString6 = "$GLGSV,1,1,02,65,50,140,28,,,,32*5F";
        private static readonly String GSVString7 = "$GPGSV,4,4,14,36,29,144,31,49,35,178,*74";
        private static readonly String GSVString8 = "$GPGSV,1,1,00*79";
        private static readonly String RMCString = "$GBRMC,221030,A,4807.038,N,01131.000,E,022.4,084.4,101120,003.1,W*6A";
        private static readonly String VTGString = "$INVTG,220.86,T,,M,2.550,N,4.724,K,A*34";
        private static readonly String HDTString = "$GAHDT,274.07,T*03";
        private static readonly String GLLString = "$GNGLL,4404.14012,N,12118.85993,W,001037.00,A,A*67";
        private static readonly String UNKString = "$GNUNK,4404.14012,N,12118.85993,W,001037.00,A,A*67";

        static void Main()
        {
            TestByteArray();

            Thread.Sleep(Timeout.Infinite);
        }

        static void TestByteArray()
        {
            Byte[] GGA1 = Encoding.UTF8.GetBytes(GGAString1);
            Byte[] GGA2 = Encoding.UTF8.GetBytes(GGAString2);
            Byte[] GSA = Encoding.UTF8.GetBytes(GSAString);
            Byte[] RMC = Encoding.UTF8.GetBytes(RMCString);
            Byte[] GSV1 = Encoding.UTF8.GetBytes(GSVString1);
            Byte[] GSV2 = Encoding.UTF8.GetBytes(GSVString2);
            Byte[] GSV3 = Encoding.UTF8.GetBytes(GSVString3);
            Byte[] GSV4 = Encoding.UTF8.GetBytes(GSVString4);
            Byte[] GSV5 = Encoding.UTF8.GetBytes(GSVString5);
            Byte[] GSV6 = Encoding.UTF8.GetBytes(GSVString6);
            Byte[] GSV7 = Encoding.UTF8.GetBytes(GSVString7);
            Byte[] GSV8 = Encoding.UTF8.GetBytes(GSVString8);
            Byte[] VTG = Encoding.UTF8.GetBytes(VTGString);
            Byte[] HDT = Encoding.UTF8.GetBytes(HDTString);
            Byte[] GLL = Encoding.UTF8.GetBytes(GLLString);
            Byte[] UNK = Encoding.UTF8.GetBytes(UNKString);

            Info();

            for (var i = 0; i < 1000; i++)
            {
                NMEAParser.Parse(GGA1);
                NMEAParser.Parse(GGA2);
                NMEAParser.Parse(GSA);
                NMEAParser.Parse(RMC);
                NMEAParser.Parse(GSV1);
                NMEAParser.Parse(GSV2);
                NMEAParser.Parse(GSV3);
                NMEAParser.Parse(GSV4);
                NMEAParser.Parse(GSV5);
                NMEAParser.Parse(GSV6);
                NMEAParser.Parse(GSV7);
                NMEAParser.Parse(GSV8);
                NMEAParser.Parse(VTG);
                NMEAParser.Parse(HDT);
                NMEAParser.Parse(GLL);
                NMEAParser.Parse(UNK);

                Thread.Sleep(20);
            }

            Info();
        }

        private static void Info()
        {
            Debug.WriteLine($"+-----------+------------+------------+");
            Debug.WriteLine($"| Memory    | Used       | Free       |");
            Debug.WriteLine($"+-----------+------------+------------+");
            Debug.WriteLine($"| Managed   | {Memory.ManagedMemory.UsedBytes,10:N0} | {Memory.ManagedMemory.FreeBytes,10:N0} |");
            Debug.WriteLine($"| Unmanaged | {Memory.UnmanagedMemory.UsedBytes,10:N0} | {Memory.UnmanagedMemory.FreeBytes,10:N0} |");
            Debug.WriteLine($"+-----------+------------+------------+\r\n");
        }
}

Hopefully it will be useful to others.

Bec_a_Fuel · November 24, 2020, 7:13am

FYI, here is the result for 6,912,000 sentences parsed (16 sentences every 50 ms for 6 hours):

Before :
+-----------+------------+------------+
| Memory    | Used       | Free       |
+-----------+------------+------------+
| Managed   |    182,848 |    324,944 |
| Unmanaged |          0 | 33,554,404 |
+-----------+------------+------------+

After :
+-----------+------------+------------+
| Memory    | Used       | Free       |
+-----------+------------+------------+
| Managed   |    193,920 |    313,872 |
| Unmanaged |          0 | 33,554,404 |
+-----------+------------+------------+

LucaP · January 7, 2021, 10:19pm

I’m very impressed by the results, and I will be putting this library through heavy testing starting sometime next week…

I’ve tried to make sense of how the library works, but I have been unable to get my head around it. Would it be possible for you to add some more sentences? Or maybe give a quick explanation to how I could do it myself?

I need the following sentences (and possibly more in the near future):

RMC
GGA
ROT
HDT
VTG
DBT
MWV
MDA
VBW
VHW

I have example sentences for all of these; let me know if they are needed. (Let me know if there’s something I could do myself as well. I’m totally willing to help.)

Also, how does the library handle talker ids? Are they ignored when feeding in a sentence?

Bec_a_Fuel · January 7, 2021, 11:05pm

Thank you for the compliment.
I would gladly welcome the stress test from you.

Now, adding new sentences is not that hard :

You declare a struct that will contain the sentence data. There are some mandatory fields here, like TalkerID, Checksum and DataStatus.
You create a static variable and initialize it in the constructor.
In the “private vars” section, you add the byte array that describes the pattern of the sentence. The content is simply the ASCII codes of the 3 letters of the sentence.
You add that pattern to the list of “SupportedPatterns”.
You add a lock for that pattern. That’s just in case your code is not fast enough to handle the data when a second identical sentence comes in. Unlikely, but one never knows. It’s more about safety here.
Then you create the “ParseXXX()” method, modeled on the existing ones.
In the body of that parse method, the parameters of the “xxxFromAscii” methods are simply the positions of the commas in the sentence.
You have to create a “ClearXXX()” method that clears the data in your struct.
Finally, you add a call to your “ParseXXX()” method in the switch statement of the “Parse()” method.
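
The steps above could look roughly like the following for a ROT (rate of turn) sentence. This is only a sketch under assumptions: every type, field and method name is made up to mirror the description, the real library's members may differ, and the checksum is left unpopulated to keep the example short.

```csharp
using System;

// Step 1: a struct holding the sentence data (field names are illustrative).
public struct ROT
{
    public short TalkerID;    // two talker letters packed into 16 bits
    public double RateOfTurn;
    public byte DataStatus;   // raw status byte, e.g. 'A'
    public byte Checksum;     // not populated in this sketch
}

public static class RotSketch
{
    // Step 2: a static variable, initialized in the static constructor.
    public static ROT ROTSentence;
    // Step 3: the pattern is the ASCII codes of "ROT".
    static readonly byte[] ROTPattern = { 0x52, 0x4F, 0x54 };
    // Step 4: the pattern is registered with the supported patterns.
    static readonly byte[][] SupportedPatterns = { ROTPattern };
    // Step 5: one lock per sentence type.
    static readonly object ROTLock = new object();

    static RotSketch() { ROTSentence = new ROT(); }

    // Step 9: dispatch on the three bytes that follow '$' and the talker ID.
    public static bool Parse(byte[] s)
    {
        if (s.Length < 7 || s[3] != ROTPattern[0] || s[4] != ROTPattern[1] || s[5] != ROTPattern[2])
            return false;
        ParseROT(s);
        return true;
    }

    // Steps 6 and 7: fields are located through the positions of the commas.
    static void ParseROT(byte[] s)
    {
        int c0 = IndexOf(s, (byte)',', 0);
        int c1 = IndexOf(s, (byte)',', c0 + 1);
        if (c0 < 0 || c1 < 0) return;
        lock (ROTLock)
        {
            ClearROT();
            ROTSentence.TalkerID = (short)((s[1] << 8) | s[2]);
            ROTSentence.RateOfTurn = DoubleFromAscii(s, c0 + 1, c1 - c0 - 1);
            if (c1 + 1 < s.Length) ROTSentence.DataStatus = s[c1 + 1];
        }
    }

    // Step 8: reset the struct before each parse.
    static void ClearROT() { ROTSentence = new ROT(); }

    static int IndexOf(byte[] s, byte b, int from)
    {
        for (int i = from; i < s.Length; i++) if (s[i] == b) return i;
        return -1;
    }

    // Converts 'length' ASCII bytes starting at 'start' to a double, no strings involved.
    static double DoubleFromAscii(byte[] s, int start, int length)
    {
        double value = 0, scale = 1;
        bool negative = false, fraction = false;
        for (int i = start; i < start + length; i++)
        {
            byte b = s[i];
            if (b == (byte)'-' && i == start) negative = true;
            else if (b == (byte)'.') fraction = true;
            else if (b >= (byte)'0' && b <= (byte)'9')
            {
                value = value * 10 + (b - (byte)'0');
                if (fraction) scale *= 10;
            }
        }
        return negative ? -value / scale : value / scale;
    }
}
```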

It seems complicated, but really it is not. The hardest part is dealing with incomplete or malformed sentences…
The GSV sentence has been a real nightmare to debug, for example. Different devices send different data for the same satellites…
The HDT one, on the other hand, was really fast to code.

Regarding the talker IDs, the most common ones are stored in an enum at the beginning of the code if you need to have a reference in your own code. You may add others if you want but they are not used by the parser itself.
In the parser, the talker ID is simply the hex value of the two-letter ID, e.g. “GP” in “GPGGAxxxxx” is coded 0x4750 because the ASCII code for “G” is 0x47 and the ASCII code for “P” is 0x50. Putting them together in an Int16 as 0x4750 makes the ID value unique.
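
The packing described above amounts to one shift and one OR. A minimal sketch, where the class and method names are hypothetical and not taken from the library:

```csharp
using System;

static class TalkerIdSketch
{
    // Packs the two talker letters that follow '$' into one 16-bit value,
    // e.g. 'G' (0x47) and 'P' (0x50) become 0x4750.
    public static short Pack(byte[] sentence)
    {
        // sentence[0] is '$'; sentence[1] and sentence[2] are the talker ID.
        return (short)((sentence[1] << 8) | sentence[2]);
    }
}
```

Comparing that value against constants such as 0x4750 identifies the talker without ever building a two-character string.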

All those “tricks” were used to completely avoid strings.

If you can send me some sentences, I would help you, no problem !

LucaP · January 20, 2021, 12:38pm

Thank you for your explanation! I managed to add the sentences I needed:

DPT
MWV
MDA
ROT
VBW

I have also done some refactoring to make some things simpler and easier to follow, at least in my opinion.

I have created new DoubleFromAscii and IntFromAscii functions which are simpler to use. Instead of giving them the positions of the two commas to look in between, you just pass the index of the comma after which the value can be found. This way, we can turn

VTGSentence.CourseOverGroundDegrees = DoubleFromAscii(sentence, commas[0] + 1, commas[1] - commas[0] - 1);

into

VTGSentence.CourseOverGroundDegrees = DoubleFromAscii(sentence, 0);

because the CourseOverGroundDegrees comes after the first comma (array pos 0) in the VTG sentence. E.g. $GPVTG,220.86,T,M,2.550,N,4.724,K,A*34
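
For readers who want to see how a comma-index lookup might work, here is a minimal sketch. It is not the code from the pull request: the helper names and the choice to return 0 for an empty or missing field are assumptions. It copes with a leading minus sign and with fields of any length, because the field end is found by scanning to the next delimiter.

```csharp
using System;

static class FieldSketch
{
    // Finds the [start, end) byte range of the field that follows the comma
    // with the given zero-based index; start == end means the field is empty.
    static bool FindField(byte[] s, int commaIndex, out int start, out int end)
    {
        start = 0;
        end = 0;
        int seen = 0;
        for (int i = 0; i < s.Length; i++)
        {
            if (s[i] != (byte)',') continue;
            if (seen++ != commaIndex) continue;
            start = i + 1;
            end = start;
            // The field stops at the next comma, at '*' or at a trailing CR.
            while (end < s.Length && s[end] != (byte)',' && s[end] != (byte)'*' && s[end] != 13) end++;
            return true;
        }
        return false;
    }

    // Converts the field after comma 'commaIndex' to a double without strings.
    public static double DoubleFromAscii(byte[] s, int commaIndex)
    {
        int start, end;
        if (!FindField(s, commaIndex, out start, out end) || start == end) return 0.0;

        bool negative = s[start] == (byte)'-';
        int i = negative ? start + 1 : start;
        double value = 0.0, scale = 1.0;
        bool fraction = false;
        for (; i < end; i++)
        {
            byte b = s[i];
            if (b == (byte)'.') { fraction = true; continue; }
            if (b < (byte)'0' || b > (byte)'9') break; // stop at anything unexpected
            value = value * 10.0 + (b - (byte)'0');
            if (fraction) scale *= 10.0;
        }
        value /= scale;
        return negative ? -value : value;
    }
}
```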

I have created a function that reads the checksum from the right bytes. Every parse method contained the same few lines to extract the checksum, depending on whether a CRLF was appended to the sentence. I wrapped this code into a function that just returns the checksum as a byte.
This way, you can turn this

if (CRLFAppended)
{
    b0 = (Byte)(sentence[sentence.Length - 4] >= 65 ? sentence[sentence.Length - 4] - 55 : sentence[sentence.Length - 4] - 48);
    b1 = (Byte)(sentence[sentence.Length - 3] >= 65 ? sentence[sentence.Length - 3] - 55 : sentence[sentence.Length - 3] - 48);
}
else
{
    b0 = (Byte)(sentence[sentence.Length - 2] >= 65 ? sentence[sentence.Length - 2] - 55 : sentence[sentence.Length - 2] - 48);
    b1 = (Byte)(sentence[sentence.Length - 1] >= 65 ? sentence[sentence.Length - 1] - 55 : sentence[sentence.Length - 1] - 48);
}
VTGSentence.Checksum = (Byte)((b0 << 4) + b1);

into this

VTGSentence.Checksum = GetChecksum(sentence);
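
A wrapper of that kind could be sketched as follows. This is an assumption about its shape, not the actual code: here the function detects a trailing CR LF itself instead of reading a CRLFAppended flag, and the class name is made up.

```csharp
using System;

static class ChecksumSketch
{
    // Converts one ASCII hex digit to its value with the same arithmetic as
    // the snippet above: 'A'-'F' are offset by 55, '0'-'9' by 48.
    // Lowercase hex digits are not handled.
    static byte HexValue(byte c)
    {
        return (byte)(c >= 65 ? c - 55 : c - 48);
    }

    // Returns the checksum transmitted in the last two hex characters of the
    // sentence, skipping a trailing CR LF pair if one is present.
    public static byte GetChecksum(byte[] sentence)
    {
        int n = sentence.Length;
        bool crlf = n >= 2 && sentence[n - 2] == 13 && sentence[n - 1] == 10;
        int last = crlf ? n - 3 : n - 1; // index of the low hex digit
        return (byte)((HexValue(sentence[last - 1]) << 4) + HexValue(sentence[last]));
    }
}
```

Note that this only reads the transmitted checksum; it does not verify it against the sentence content.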

I have added new, public, methods that will parse a sentence that is not supported by the library. This is very useful for NuGet package users that want to interpret sentences not supported by the library. The method signatures are:

public static Double GetDouble(Byte[] Sentence, int position)
public static int GetInt(Byte[] Sentence, int position)

With these functions, you can just pass in an unsupported sentence as bytes, along with the position of the value you want. Let’s say you want to use the fictional sentence $ABCDE,10,20,30,40 and you want to extract the 30. You would use this function as

int value = NMEAParser.GetInt(sentenceInBytes, 2);
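
To make the position argument concrete, here is a minimal stand-in with the same signature. It is a sketch only, not the library's implementation, and returning 0 when the field does not exist is an assumption.

```csharp
using System;

static class GetIntSketch
{
    // Returns the integer that follows the comma with zero-based index
    // 'position', or 0 if the sentence has no such field.
    public static int GetInt(byte[] sentence, int position)
    {
        int commas = 0, i = 0;
        // Advance to the byte just after comma number 'position'.
        while (i < sentence.Length && commas <= position)
        {
            if (sentence[i] == (byte)',') commas++;
            i++;
        }
        if (commas <= position) return 0; // not enough commas in the sentence

        bool negative = i < sentence.Length && sentence[i] == (byte)'-';
        if (negative) i++;

        int value = 0;
        while (i < sentence.Length && sentence[i] >= (byte)'0' && sentence[i] <= (byte)'9')
        {
            value = value * 10 + (sentence[i] - (byte)'0');
            i++;
        }
        return negative ? -value : value;
    }
}
```

For the fictional sentence above, positions 0 to 3 map to 10, 20, 30 and 40.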

I have only made these refactors to the sentences I could test myself at this moment. This means that RMC, GGA, GSA, GSV and GLL are not refactored to the new methods described above.

If so desired, I can try to make a pull request to update the NMEAParser.cs in the drivers github so everyone can use these changes.

@Gus_Issa Would it help if I wrote a documentation page on how to use the parser and pull request this to the docs github?

edit:

Just noticed the DoubleFromAscii function does not work properly with a negative number in the string. I’m working on a fix for this.

edit2: fixed this problem

Bec_a_Fuel · January 23, 2021, 5:07pm

Thank you for all this !

You can submit a PR on the repo, I will handle it.

Edit: I’m interested to see how you did it in your new “DoubleFromAscii()” methods, because I tried that approach and had issues with field lengths that differ between devices and/or talkers.
I don’t remember exactly which sentences were concerned, but sometimes I got 123.45 and other times I got 123.456789 for the same field in the same sentence but from a different talker. Hence the “look in between commas” way of doing things.

Gus_Issa · April 14, 2021, 3:00pm

Hello @Bec_a_Fuel, thanks for contributing this to TinyCLR libs. We need to change the namespace and give you credit for the contribution.

this is what we think:
namespace/NuGet GHIElectronics.TinyCLR.Drivers.Gps.Nmea0183

and in source we add

// Ported and contributed to TinyCLR OS by MBN Software

Good or what would you like us to change?

Thanks a million!

Bec_a_Fuel · April 14, 2021, 5:08pm

That’s good for me.

mhardy · April 15, 2021, 9:46pm

@mcalsyn: I recall that you or maybe someone else put the bson assembly, docs, stuff like bson for dummies somewhere.

My ‘stack-of-stuff-to-do’ is at the level of being able to take a gander at bson.
Bson is something that really makes sense in our realm.