NMEA parser without strings

You are awesome :slight_smile:

There will be a significant change in the driver. I have used floats for latitude, longitude and similar fields, but some sentences send very precise information and a float is not accurate enough.

This is for example the case for the GLL sentence :

$GNGLL,4404.14012,N,12118.85993,W,001037.00,A,A*67

So I will have to use doubles instead of singles to maintain the precision of the received data, at the cost of larger structs.
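To make the precision issue concrete, here is a small sketch in plain C#. The class and method names are made up for illustration and are not part of the driver. It measures what is lost when a value such as the GLL longitude field above is stored in a Single.

```csharp
using System;

// Illustrative only: PrecisionDemo is not part of the driver.
public static class PrecisionDemo
{
    // Returns the absolute error introduced by storing 'value' in a Single.
    public static double SingleRoundTripError(double value)
    {
        float asSingle = (float)value; // what a Single field would hold
        return Math.Abs((double)asSingle - value);
    }
}
```

A Single has a 24-bit significand, so values between 8192 and 16384 are spaced 2^-10 (roughly 0.001) apart. Calling `PrecisionDemo.SingleRoundTripError(12118.85993)` therefore gives an error of a few ten-thousandths of a minute: the sentence carries five decimals, and the last ones do not survive. A Double keeps all of them.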

Hello,

Major overhaul on the NMEA parser…

After some tests, I noticed that available memory was slowly but steadily decreasing. Not good :frowning:

So I’ve changed many things : no more ArrayLists, no more Array.Split(). Those were the cause of the “memory leak”.

The code is also now very specific to parsing NMEA sentences. Before the rework, some methods were kind of generic but they were consuming too much memory.

Now the code is not as beautiful as before but it’s way more efficient. And memory consumption is minimal and steady.

After 10.000 loops, each one parsing 8 different NMEA sentences every 10ms, only 18KB have been eaten :

> Before loop 10.000 x 8 sentences
> +-----------+------------+------------+
> | Memory    | Used       | Free       |
> +-----------+------------+------------+
> | Managed   |    157,168 |    350,624 |
> | Unmanaged |          0 | 33,554,404 |
> +-----------+------------+------------+
> 
> After loop 10.000 x 8 sentences
> +-----------+------------+------------+
> | Memory    | Used       | Free       |
> +-----------+------------+------------+
> | Managed   |    175,536 |    332,256 |
> | Unmanaged |          0 | 33,554,404 |
> +-----------+------------+------------+

It’s the same memory consumption for 10 loops or 1000 loops. So now I know that memory doesn’t “leak” anymore.

Also, as said above, I have added the talker Id and changed Singles to Doubles.

Edit: PR created in the TinyCLR-Drivers repo.

Please do not use this parser yet.

It still needs some fixes, as different GPS devices do not send the same data format for the same sentences, e.g. some GPS devices send Latitude or Fix time as xxxxxx.xx while others send xxxxxxx only, without decimals.

Also, checksums are invalid at the moment for sentences that end with CRLF.
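For reference, the NMEA checksum is the XOR of every byte between `$` and `*`. The sketch below (hypothetical names, not the driver's code) computes it from a byte array and stops at `*`, so a trailing CRLF cannot affect the result.

```csharp
using System;

// Illustrative only: names are hypothetical, not the driver's API.
public static class NmeaChecksum
{
    // XOR of every byte strictly between '$' and '*'.
    // Returns -1 if the sentence does not have a '$' ... '*' frame.
    public static int Compute(byte[] sentence)
    {
        if (sentence == null || sentence.Length < 2 || sentence[0] != (byte)'$') return -1;

        int checksum = 0;
        for (int i = 1; i < sentence.Length; i++)
        {
            if (sentence[i] == (byte)'*') return checksum; // anything after '*' is ignored
            checksum ^= sentence[i];
        }
        return -1; // no '*' found
    }
}
```

The returned value can then be compared with the two hex digits transmitted after `*`.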

PR has been closed for now.

Please take your time

I will need some time, indeed…

Same device (SAM-M8Q) is sending different GSV sentences for the last satellite in view, depending on the talker :

> GPS : $GPGSV,4,4,13,30,31,066,28*40
> GLONASS : $GLGSV,3,3,10,84,06,036,,,,,18*52

This is quite challenging.

It’s the last pebble in my shoe. But it hurts :cry:

Writing an NMEA parser is easy; writing a proper parser that is efficient is not so easy. I have seen so many implementations that work, but they just kill the system with string manipulation.

One of the thoughts I had was to implement a native parser, or at least a “helper” in this case. Like a bit converter class that takes a byte array and breaks it into values. You search for $ and \n in C# and then pass that byte array to a native parser that returns an array of the values found between commas.

BitConverter.ParseCSV(byte[] rawData, int start, int size, double[] returnedValues)

We are open for ideas.
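As a rough idea of what such a helper could look like, here is a managed sketch of the proposed signature. The class name, the return value (number of fields written) and the use of NaN for empty or non-numeric fields are all assumptions made for this example; nothing like this is confirmed to exist in TinyCLR.

```csharp
using System;

// Hypothetical helper; not an existing TinyCLR API.
public static class CsvBytes
{
    // Parses comma-separated decimal numbers from rawData[start .. start + size)
    // into the caller-allocated 'values' array and returns how many were written.
    // Empty or non-numeric fields are stored as Double.NaN.
    public static int ParseCsv(byte[] rawData, int start, int size, double[] values)
    {
        int count = 0;
        int fieldStart = start;
        int end = start + size;
        for (int i = start; i <= end && count < values.Length; i++)
        {
            if (i == end || rawData[i] == (byte)',')
            {
                values[count++] = ParseDouble(rawData, fieldStart, i - fieldStart);
                fieldStart = i + 1;
            }
        }
        return count;
    }

    private static double ParseDouble(byte[] data, int offset, int length)
    {
        if (length <= 0) return Double.NaN;
        int end = offset + length;
        bool negative = data[offset] == (byte)'-';
        int i = negative ? offset + 1 : offset;
        if (i == end) return Double.NaN;

        double result = 0, scale = 1;
        bool seenPoint = false;
        for (; i < end; i++)
        {
            byte b = data[i];
            if (b == (byte)'.' && !seenPoint) { seenPoint = true; continue; }
            if (b < (byte)'0' || b > (byte)'9') return Double.NaN;
            result = result * 10 + (b - (byte)'0'); // accumulate digits without any string
            if (seenPoint) scale *= 10;
        }
        result /= scale;
        return negative ? -result : result;
    }
}
```

Because the caller owns `values`, the same array can be reused for every call and nothing is allocated while parsing.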

That’s exactly how this parser is working.
The problem is that not all devices send exactly the same sentences, and different talkers do not send “standard” sentences either.

The GSV example above is a “good” example of what can happen :frowning:

About the helper, it may be useful but, again, there would be memory issues at some time because of the arrays you would have to return. Each one would have a different size, thus needing a new one to be allocated at each call.

That’s why I’m using an array containing commas positions. It’s a small array of fixed size. This makes code less “beautiful” or readable, but it does not consume much memory. And it’s fast. Even in C#.
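A minimal sketch of that idea follows. The names and the array size of 32 are assumptions for the example, not taken from the driver: one array is allocated once and refilled for every sentence.

```csharp
using System;

// Illustrative only: not the driver's actual code.
public static class CommaIndex
{
    // Allocated once and reused for every sentence. 32 is an assumed upper bound.
    private static readonly int[] commas = new int[32];

    public static int[] Positions { get { return commas; } }

    // Fills the shared array and returns the number of commas found.
    public static int Find(byte[] sentence)
    {
        int count = 0;
        for (int i = 0; i < sentence.Length && count < commas.Length; i++)
        {
            if (sentence[i] == (byte)'*') break;            // the checksum follows, stop here
            if (sentence[i] == (byte)',') commas[count++] = i;
        }
        return count;
    }
}
```

Field number k then starts at `Positions[k] + 1` and has length `Positions[k + 1] - Positions[k] - 1`, which is the arithmetic used with the “xxxFromAscii” methods later in this thread.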

I was wrong in my analysis… This is an almost well-formed sentence, in fact.
Indeed, there are 10 satellites in view. Each GSV containing 4 satellites, the last GSV should contain 2 satellites. Which is the case for GLONASS.
But it turns out that the 10th satellite has empty data :frowning:

Instead of a native-code NMEA parser, you could generalize this into something like regex or sscanf and make it a generalized parser - e.g., template and struct in; populated struct out. What makes it an NMEA parser are the sentence templates. That could be written in managed or native code.

I would say ‘just use regex’ but you would still end up with some memory allocations since regex first creates an intermediate string that has to be converted to a numeric. But in the end, a generalized scanning grammar that includes the ability to convert numerics seems like a more reusable solution.

You are right. Regexp is very memory intensive and slow. At least with the managed implementation. And you end up with strings anyway.
To me, it’s not a good idea to use it for parsing NMEA data at the rate and quantity it comes from the GPS.

Before coding this byte parser, I’ve tried this code :

>                 var receivedString = "$GPRMC,123519,A,4807.038,N,01131.000,E,022.4,084.4,230394,003.1,W*6A";
>                 var expRMC = "([$]GPRMC)[,]([0-9]{6})[,]([AV]{1})[,](.*)[,]([NS]{1})[,](.*)[,]([EW]{1})[,](.*)[,](.*)[,](.*)[,](.*)[,]([EW]{1})([*][0-9a-f]{2})";
>                 var NMEAPattern = "[$]([A-Z]{2})([A-Z]{3}).*[*]([0-9a-f]{2})";
> 
>                 Regex validNMEA = new Regex(NMEAPattern, RegexOptions.IgnoreCase);
> 
>                 if (validNMEA.IsMatch(receivedString))
>                 {
>                     Match m = validNMEA.Match(receivedString);
>                     switch (m.Groups[2].Value)
>                     {
>                         case "RMC":
>                             Regex rRMC = new Regex(expRMC, RegexOptions.IgnoreCase);
>                             Match mRMC = rRMC.Match(receivedString);
>                             RMCSentence.FixTime = mRMC.Groups[2].Value != String.Empty
>                             ? new TimeSpan(Convert.ToInt32(mRMC.Groups[2].Value.Substring(0, 2)), Convert.ToInt32(mRMC.Groups[2].Value.Substring(2, 2)), Convert.ToInt32(mRMC.Groups[2].Value.Substring(4, 2)))
>                             : new TimeSpan(0);
>                             RMCSentence.Status = String.IsNullOrEmpty(mRMC.Groups[3].Value) ? Char.MinValue : mRMC.Groups[3].Value[0];
>                             RMCSentence.Latitude = (Single)Double.Parse(mRMC.Groups[4].Value) / 100;
>                             RMCSentence.LatitudeHemisphere = String.IsNullOrEmpty(mRMC.Groups[5].Value) ? Char.MinValue : mRMC.Groups[5].Value[0];
>                             RMCSentence.Longitude = (Single)Double.Parse(mRMC.Groups[6].Value) / 100;
>                             RMCSentence.LongitudePosition = String.IsNullOrEmpty(mRMC.Groups[7].Value) ? Char.MinValue : mRMC.Groups[7].Value[0];
>                             RMCSentence.SpeedKnots = (Single)Double.Parse(mRMC.Groups[8].Value);
>                             RMCSentence.SpeedKm = RMCSentence.SpeedKnots * 1.852f;
>                             RMCSentence.TrackAngle = (Single)Double.Parse(mRMC.Groups[9].Value);
>                             RMCSentence.MagneticVariation = (Single)Double.Parse(mRMC.Groups[11].Value);
>                             RMCSentence.MagneticVariationDirection = String.IsNullOrEmpty(mRMC.Groups[12].Value) ? Char.MinValue : mRMC.Groups[12].Value[0];
>                             RMCSentence.Checksum = (Byte)Convert.ToInt32(mRMC.Groups[13].Value.Substring(1, 2), 16);
>                             break;
>                     }
>                 }

This does indeed work. But unfortunately it’s not efficient at all.

But my point is that you can still get the memory win without making this an NMEA-specific parser. Just struct+template in and populated struct out, in either native or managed code (though admittedly, without reflection, the native code implementation is more complex).

It becomes an NMEA parser when you use NMEA templates as the input.

My idea is to implement a helper, nothing specific to GPS. Like CSV parser. Which then we can use for this example and other things, like simple configuration values (network config from a text file for example).

Config files are a whole 'nother basket of worms, but yes, in general, I agree.

For config files, I have been using BSON because it is the most compact storage format that can be directly read (no decompression) while still allowing for schema and version management, and it can be read in place (without copying elements). Of course, if config files have to be human readable, then text or JSON is better, but also less space efficient.

I’ve also moved to logging via BSON, logging just the message type and data payload and then transforming that to readable text only when it needs to be human readable (and generally, that’s on a server or desktop machine).

I have finally fixed all issues I had so far. Those were mainly in GSV sentences.

Now the parser can handle such GSV sentences :

$GPGSV,3,1,11,03,03,111,00,04,15,270,00,06,01,010,00,13,06,292,00*74
$GPGSV,3,3,11,22,42,067,42,24,14,311,43,27,05,244,00,,,,*4D
$GPGSV,4,4,13,30,31,066,28*40
$GLGSV,3,3,10,84,06,036,,,,,18*52
$GLGSV,1,1,02,72,,,29,74,,,19*62
$GLGSV,1,1,02,65,50,140,28,,,,32*5F
$GPGSV,4,4,14,36,29,144,31,49,35,178,*74
$GPGSV,1,1,00*79

There are some sentences with “inconsistent data”, others with partial data and others with missing expected data…
Many of the sentences above have been received by real hardware (GNSS Click, GNSS 4 Click and GNSS Zoe Click), so I’m pretty confident in the parser.

FYI, parsing 16.000 sentences (1.000 * 16) is using only 19KB of memory. Again, this is a stable consumption. No matter if you parse 1, 10, 100 or 10.000 sentences.

Before loop :
+-----------+------------+------------+
| Memory    | Used       | Free       |
+-----------+------------+------------+
| Managed   |    173,968 |    333,824 |
| Unmanaged |          0 | 33,554,404 |
+-----------+------------+------------+

After loop :
+-----------+------------+------------+
| Memory    | Used       | Free       |
+-----------+------------+------------+
| Managed   |    193,008 |    314,784 |
| Unmanaged |          0 | 33,554,404 |
+-----------+------------+------------+

Parser is available now on our Github repo.
I have reactivated a PR on TinyCLR/Drivers repository.

Edit : Here is the program I’ve used for memory stress tests.

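using System;
using System.Diagnostics;
using System.Text;
using System.Threading;
using GHIElectronics.TinyCLR.Native; // Memory.ManagedMemory / Memory.UnmanagedMemory
// plus the namespace that contains NMEAParser

class Program
{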
        private static readonly String GGAString1 = "$GPGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,*47";
        private static readonly String GGAString2 = "$GPGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,0123*47";
        private static readonly String GSAString = "$GLGSA,A,3,04,05,,09,12,,,24,,,,,2.5,1.3,2.1*39";
        private static readonly String GSVString1 = "$GPGSV,3,1,11,03,03,111,00,04,15,270,00,06,01,010,00,13,06,292,00*74";
        private static readonly String GSVString2 = "$GPGSV,3,3,11,22,42,067,42,24,14,311,43,27,05,244,00,,,,*4D";
        private static readonly String GSVString3 = "$GPGSV,4,4,13,30,31,066,28*40";
        private static readonly String GSVString4 = "$GLGSV,3,3,10,84,06,036,,,,,18*52";
        private static readonly String GSVString5 = "$GLGSV,1,1,02,72,,,29,74,,,19*62";
        private static readonly String GSVString6 = "$GLGSV,1,1,02,65,50,140,28,,,,32*5F";
        private static readonly String GSVString7 = "$GPGSV,4,4,14,36,29,144,31,49,35,178,*74";
        private static readonly String GSVString8 = "$GPGSV,1,1,00*79";
        private static readonly String RMCString = "$GBRMC,221030,A,4807.038,N,01131.000,E,022.4,084.4,101120,003.1,W*6A";
        private static readonly String VTGString = "$INVTG,220.86,T,,M,2.550,N,4.724,K,A*34";
        private static readonly String HDTString = "$GAHDT,274.07,T*03";
        private static readonly String GLLString = "$GNGLL,4404.14012,N,12118.85993,W,001037.00,A,A*67";
        private static readonly String UNKString = "$GNUNK,4404.14012,N,12118.85993,W,001037.00,A,A*67";

        static void Main()
        {
            TestByteArray();

            Thread.Sleep(Timeout.Infinite);
        }

        static void TestByteArray()
        {
            Byte[] GGA1 = Encoding.UTF8.GetBytes(GGAString1);
            Byte[] GGA2 = Encoding.UTF8.GetBytes(GGAString2);
            Byte[] GSA = Encoding.UTF8.GetBytes(GSAString);
            Byte[] RMC = Encoding.UTF8.GetBytes(RMCString);
            Byte[] GSV1 = Encoding.UTF8.GetBytes(GSVString1);
            Byte[] GSV2 = Encoding.UTF8.GetBytes(GSVString2);
            Byte[] GSV3 = Encoding.UTF8.GetBytes(GSVString3);
            Byte[] GSV4 = Encoding.UTF8.GetBytes(GSVString4);
            Byte[] GSV5 = Encoding.UTF8.GetBytes(GSVString5);
            Byte[] GSV6 = Encoding.UTF8.GetBytes(GSVString6);
            Byte[] GSV7 = Encoding.UTF8.GetBytes(GSVString7);
            Byte[] GSV8 = Encoding.UTF8.GetBytes(GSVString8);
            Byte[] VTG = Encoding.UTF8.GetBytes(VTGString);
            Byte[] HDT = Encoding.UTF8.GetBytes(HDTString);
            Byte[] GLL = Encoding.UTF8.GetBytes(GLLString);
            Byte[] UNK = Encoding.UTF8.GetBytes(UNKString);

            Info();

            for (var i = 0; i < 1000; i++)
            {
                NMEAParser.Parse(GGA1);
                NMEAParser.Parse(GGA2);
                NMEAParser.Parse(GSA);
                NMEAParser.Parse(RMC);
                NMEAParser.Parse(GSV1);
                NMEAParser.Parse(GSV2);
                NMEAParser.Parse(GSV3);
                NMEAParser.Parse(GSV4);
                NMEAParser.Parse(GSV5);
                NMEAParser.Parse(GSV6);
                NMEAParser.Parse(GSV7);
                NMEAParser.Parse(GSV8);
                NMEAParser.Parse(VTG);
                NMEAParser.Parse(HDT);
                NMEAParser.Parse(GLL);
                NMEAParser.Parse(UNK);

                Thread.Sleep(20);
            }

            Info();
        }

        private static void Info()
        {
            Debug.WriteLine($"+-----------+------------+------------+");
            Debug.WriteLine($"| Memory    | Used       | Free       |");
            Debug.WriteLine($"+-----------+------------+------------+");
            Debug.WriteLine($"| Managed   | {Memory.ManagedMemory.UsedBytes,10:N0} | {Memory.ManagedMemory.FreeBytes,10:N0} |");
            Debug.WriteLine($"| Unmanaged | {Memory.UnmanagedMemory.UsedBytes,10:N0} | {Memory.UnmanagedMemory.FreeBytes,10:N0} |");
            Debug.WriteLine($"+-----------+------------+------------+\r\n");
        }
}

Hopefully it will be useful to others.

FYI, here is the result for 6.912.000 sentences parsed (16 sentences every 50ms during 6 hours) :

Before :
+-----------+------------+------------+
| Memory    | Used       | Free       |
+-----------+------------+------------+
| Managed   |    182,848 |    324,944 |
| Unmanaged |          0 | 33,554,404 |
+-----------+------------+------------+

After :
+-----------+------------+------------+
| Memory    | Used       | Free       |
+-----------+------------+------------+
| Managed   |    193,920 |    313,872 |
| Unmanaged |          0 | 33,554,404 |
+-----------+------------+------------+

I’m very impressed by the results and I will be putting this library through heavy testing starting sometime next week…

I’ve tried to make sense of how the library works, but I have been unable to get my head around it. Would it be possible for you to add some more sentences? Or maybe give a quick explanation of how I could do it myself?

I need the following sentences (and possibly more in the near future):

  • RMC
  • GGA
  • ROT
  • HDT
  • VTG
  • DBT
  • MWV
  • MDA
  • VBW
  • VHW

I have example sentences for all of these. Let me know if they are needed. (Let me know if there’s something I could do myself as well, I’m totally willing to help :slight_smile:)

Also, how does the library handle talker ids? Are they ignored when feeding in a sentence?

Thank you for the compliment :blush:
I would gladly welcome the stress test from you.

Now, adding new sentences is not that hard :

  • You declare a struct that will contain the sentence data. There are some mandatory fields here, like TalkerID, Checksum and DataStatus.
  • You create a static variable and initialize it in the constructor.
  • In the “private vars” section, you add the byte array that describes the pattern of the sentence. The content is simply the ASCII codes of the 3 letters of the sentence.
  • You add that pattern to the list of “SupportedPatterns”.
  • You add a lock for that pattern. That’s just in case your code is not fast enough to handle the data when a second identical sentence comes in. Unlikely, but one never knows. It’s more about safety here.
  • Then you create the “ParseXXX()” method on the same model as the existing ones.
  • In the body of that parse method, the parameters of the “xxxFromAscii” methods are simply the positions of the commas in the sentence.
  • You have to create a “ClearXXX()” method that clears the data in your struct.
  • Finally, you add a call to your “ParseXXX()” method in the switch statement of the “Parse()” method.
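The steps above can be pictured with a miniature, self-contained example for HDT. Everything here is hypothetical: the type and member names, the field types, the meaning given to DataStatus and the simplified number conversion are assumptions for illustration, not the driver's actual source.

```csharp
using System;

// Hypothetical miniature of the steps above; names are illustrative only.
public struct HdtSentence
{
    public short TalkerID;   // two ASCII letters packed into 16 bits
    public double Heading;   // degrees
    public byte Checksum;    // left at 0 in this sketch
    public bool DataStatus;  // assumed meaning: true once fields are filled
}

public static class MiniParser
{
    public static HdtSentence HDT;                                     // static instance, reused
    private static readonly byte[] HdtPattern = { 0x48, 0x44, 0x54 };  // 'H', 'D', 'T'
    private static readonly object HdtLock = new object();             // one lock per pattern
    private static readonly int[] commas = new int[8];                 // fixed-size scratch array

    public static bool Parse(byte[] s)
    {
        if (s == null || s.Length < 7 || s[0] != (byte)'$') return false;
        // Bytes 3..5 hold the three-letter sentence type.
        if (s[3] == HdtPattern[0] && s[4] == HdtPattern[1] && s[5] == HdtPattern[2])
        {
            lock (HdtLock) { ParseHDT(s); }
            return true;
        }
        return false; // not a supported pattern
    }

    private static void ClearHDT()
    {
        HDT.TalkerID = 0;
        HDT.Heading = 0;
        HDT.Checksum = 0;
        HDT.DataStatus = false;
    }

    private static void ParseHDT(byte[] s)
    {
        ClearHDT();
        int n = 0;
        for (int i = 0; i < s.Length && n < commas.Length; i++)
            if (s[i] == (byte)',') commas[n++] = i;
        if (n < 2) return; // malformed: leave the struct cleared

        HDT.TalkerID = (short)((s[1] << 8) | s[2]);
        HDT.Heading = DoubleFromAscii(s, commas[0] + 1, commas[1] - commas[0] - 1);
        HDT.DataStatus = true;
    }

    // Simplified conversion: unsigned digits with an optional decimal point.
    private static double DoubleFromAscii(byte[] s, int start, int length)
    {
        double result = 0, scale = 1;
        bool seenPoint = false;
        for (int i = start; i < start + length; i++)
        {
            if (s[i] == (byte)'.') { seenPoint = true; continue; }
            if (s[i] < (byte)'0' || s[i] > (byte)'9') return 0; // kept simple for the sketch
            result = result * 10 + (s[i] - (byte)'0');
            if (seenPoint) scale *= 10;
        }
        return result / scale;
    }
}
```

Calling `MiniParser.Parse(bytes)` with the bytes of an HDT sentence fills `MiniParser.HDT`; any other sentence type returns false.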

It seems complicated but really it is not. The hardest part is dealing with incomplete or malformed sentences…
The GSV sentence has been a real nightmare to debug, for example :frowning: Different devices send different data for the same satellites…
Whereas the HDT one was really fast to code.

Regarding the talker IDs, the most common ones are stored in an enum at the beginning of the code if you need to have a reference in your own code. You may add others if you want but they are not used by the parser itself.
In the parser, the talker ID is simply the hex value of the two-letter ID, e.g. “GP” in “GPGGAxxxxx” is coded 0x4750 because the ASCII code for “G” is 0x47 and the ASCII code for “P” is 0x50. Putting them together in an Int16 0x4750 makes the ID value unique.
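The packing described above fits in a few lines. The class and method names below are hypothetical and only illustrate the arithmetic.

```csharp
using System;

// Illustrative only: not the driver's actual code.
public static class TalkerId
{
    // Packs the two talker letters (bytes 1 and 2, right after '$') into an Int16.
    public static short FromSentence(byte[] sentence)
    {
        return (short)((sentence[1] << 8) | sentence[2]);
    }
}
```

For a sentence starting with `$GPGGA`, this gives 0x4750, which can be compared against an enum or constant without ever building a string.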

All those “tricks” were used to completely avoid strings.

If you can send me some sentences, I would help you, no problem !

Thank you for your explanation! I managed to add the sentences I needed:

  • DPT
  • MWV
  • MDA
  • ROT
  • VBW

I have also done some refactoring to make some things simpler and easier to follow, at least in my opinion.
  1. I have created new DoubleFromAscii and IntFromAscii functions which are simpler to use. Instead of giving them the positions of the two commas to look between, you just pass the index of the comma after which the value can be found. This way, we can turn

VTGSentence.CourseOverGroundDegrees = DoubleFromAscii(sentence, commas[0] + 1, commas[1] - commas[0] - 1);

into

VTGSentence.CourseOverGroundDegrees = DoubleFromAscii(sentence, 0);

because the CourseOverGroundDegrees comes after the first comma (array pos 0) in the VTG sentence, e.g. $GPVTG,220.86,T,M,2.550,N,4.724,K,A*34
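One way such a position-based overload could be written is sketched below. This is not the code from the refactoring itself: the class name, the NaN return for empty or non-numeric fields and the handling of a leading minus sign are assumptions. The field is read up to the next comma or `*`, so its length does not need to be known in advance.

```csharp
using System;

// Illustrative only: a possible shape for a position-based overload.
public static class FieldReader
{
    // Returns the number that follows comma number 'position' (0-based),
    // or Double.NaN if the field is empty, missing or not numeric.
    public static double DoubleFromAscii(byte[] sentence, int position)
    {
        int i = 0, seen = -1;
        // Walk to the byte just after the requested comma.
        while (i < sentence.Length && seen < position)
        {
            if (sentence[i] == (byte)',') seen++;
            i++;
        }
        if (seen < position) return Double.NaN;

        bool negative = i < sentence.Length && sentence[i] == (byte)'-';
        if (negative) i++;

        double result = 0, scale = 1;
        bool seenPoint = false, anyDigit = false;
        for (; i < sentence.Length; i++)
        {
            byte b = sentence[i];
            // The field ends at the next comma, the checksum marker or the line end.
            if (b == (byte)',' || b == (byte)'*' || b == (byte)'\r' || b == (byte)'\n') break;
            if (b == (byte)'.' && !seenPoint) { seenPoint = true; continue; }
            if (b < (byte)'0' || b > (byte)'9') return Double.NaN;
            result = result * 10 + (b - (byte)'0');
            if (seenPoint) scale *= 10;
            anyDigit = true;
        }
        if (!anyDigit) return Double.NaN;
        result /= scale;
        return negative ? -result : result;
    }
}
```

Each call rescans the sentence from the start, which trades a little speed for a simpler call site compared with a precomputed comma array.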


  2. I have created a function that extracts the checksum from the right bytes. Every parse method contained a few lines that determined the checksum depending on whether a CRLF was appended. I wrapped this code into a function that just returns the checksum as a byte.
    This way, you can turn this
if (CRLFAppended)
{
    b0 = (Byte)(sentence[sentence.Length - 4] >= 65 ? sentence[sentence.Length - 4] - 55 : sentence[sentence.Length - 4] - 48);
    b1 = (Byte)(sentence[sentence.Length - 3] >= 65 ? sentence[sentence.Length - 3] - 55 : sentence[sentence.Length - 3] - 48);
}
else
{
    b0 = (Byte)(sentence[sentence.Length - 2] >= 65 ? sentence[sentence.Length - 2] - 55 : sentence[sentence.Length - 2] - 48);
    b1 = (Byte)(sentence[sentence.Length - 1] >= 65 ? sentence[sentence.Length - 1] - 55 : sentence[sentence.Length - 1] - 48);
}
VTGSentence.Checksum = (Byte)((b0 << 4) + b1);

into this

VTGSentence.Checksum = GetChecksum(sentence);
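A wrapper of this kind might look like the sketch below. It is a guess at the shape, not the actual function from the refactoring: it skips any trailing CR/LF instead of relying on a CRLFAppended flag, and it keeps the hex arithmetic of the lines quoted above (so lowercase hex digits are not handled).

```csharp
using System;

// Illustrative only: one possible shape for the wrapper.
public static class ChecksumField
{
    // Reads the two hex digits that end the sentence, whether or not CR/LF trails.
    public static byte GetChecksum(byte[] sentence)
    {
        int end = sentence.Length;
        // Skip trailing CR/LF if present.
        while (end > 0 && (sentence[end - 1] == (byte)'\r' || sentence[end - 1] == (byte)'\n')) end--;
        if (end < 2) return 0;
        return (byte)((HexValue(sentence[end - 2]) << 4) + HexValue(sentence[end - 1]));
    }

    // 'A'..'F' (65..70) map to 10..15 and '0'..'9' (48..57) map to 0..9.
    private static int HexValue(byte b)
    {
        return b >= 65 ? b - 55 : b - 48;
    }
}
```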


  3. I have added new public methods that parse sentences that are not supported by the library. This is very useful for NuGet package users who want to interpret such sentences. The method signatures are:
public static Double GetDouble(Byte[] Sentence, int position)
public static int GetInt(Byte[] Sentence, int position)

With these functions, you pass in the bytes of an unsupported sentence and the position of the value you want. Let’s say you want to use the fictional sentence $ABCDE,10,20,30,40 and you want to extract the 30. You would call the function as

int value = NMEAParser.GetInt(sentenceInBytes, 2);


I have only made these refactors to the sentences I could test myself at this moment, this means that RMC, GGA, GSA, GSV and GLL are not refactored to the new methods described above.


If so desired, I can try to make a pull request to update the NMEAParser.cs in the drivers github so everyone can use these changes.

@Gus_Issa Would it help if I wrote a documentation page on how to use the parser and pull request this to the docs github?


edit:

Just noticed the DoubleFromAscii function does not work properly with a negative number in the string. I’m working on a fix for this :slight_smile:

edit2: fixed this problem

Thank you for all this !

You can submit a PR on the repo, I will handle it.

Edit: I’m interested to see how you implemented your new “DoubleFromAscii()” methods, because I tried that approach and had issues with field lengths that differ between devices and/or talkers.
I don’t remember exactly which sentences were affected, but sometimes I got 123.45 and other times 123.456789 for the same field in the same sentence from a different talker. Hence the “look in between commas” way of doing things.