New SplitToArray function in GHI libs

A SplitToArray method in our Util class should make it into the next SDK.

The method takes a byte array containing comma-separated values (CSV), or values separated by any other mark, such as ‘|’ or a tab, and decodes the values into a float array. So a source array containing “23.5,6.0,0.4” will result in 3 float values: 23.5, 6.0 and 0.4.
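For anyone who wants to see the idea in managed code, here is a minimal sketch of how such a decoder could work. It is not GHI's implementation (that one is native); the class name CsvSketch, the helper ParseField, and the handling of empty and non-numeric fields are assumptions made for illustration only.

```csharp
using System;

public static class CsvSketch
{
    // Hypothetical managed stand-in for Util.SplitToArray; not GHI's code.
    // Assumes length <= source.Length. Returns the number of fields written.
    public static int SplitToArray(byte[] source, int length, char separator, float[] values)
    {
        if (length <= 0) return 0;

        int count = 0;
        int fieldStart = 0;
        for (int i = 0; i <= length && count < values.Length; i++)
        {
            // A field ends at a separator or at the end of the data.
            if (i == length || source[i] == (byte)separator)
            {
                values[count++] = ParseField(source, fieldStart, i);
                fieldStart = i + 1;
            }
        }
        return count;
    }

    // Parses source[start..end) as a decimal number without creating a string.
    // Assumption: an empty field yields 0, and a non-numeric field yields
    // the character code of its first byte (so "N" becomes 78).
    private static float ParseField(byte[] source, int start, int end)
    {
        if (start >= end) return 0f;

        int pos = start;
        bool negative = source[pos] == (byte)'-';
        if (negative || source[pos] == (byte)'+') pos++;

        double result = 0;
        double scale = 1;
        bool seenDot = false;
        bool seenDigit = false;
        for (; pos < end; pos++)
        {
            byte b = source[pos];
            if (b >= (byte)'0' && b <= (byte)'9')
            {
                seenDigit = true;
                result = result * 10 + (b - (byte)'0');
                if (seenDot) scale *= 10; // count digits after the decimal point
            }
            else if (b == (byte)'.' && !seenDot)
            {
                seenDot = true;
            }
            else
            {
                return (float)source[start]; // not a number: use the char code
            }
        }
        if (!seenDigit) return (float)source[start];

        result /= scale;
        return (float)(negative ? -result : result);
    }
}
```

Everything works on the caller's buffers, so the decoder itself does not allocate. A real implementation would also have to decide what to do with malformed numbers such as "1.2.3", which this sketch simply treats as non-numeric.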

Mainly, this was added to decode GPS data, but it is also a good fit for many other purposes, such as transferring data from a PC to a NETMF device.

Any suggestions on improving this? In the code below, the GPS decoding itself is done without a single object allocation. The GC will simply never run if you decode GPS data this way, and it will use MUCH less of the system resources.


using System;
using Microsoft.SPOT;

namespace GPSTester
{
    public class Program
    {
        public static void Main()
        {

            // some GPS data ... I purposely broke the sentences to make it look like a real data stream!
            string gpss = 
                ".3372,W,1,8,1.03,61.7,M,55.2,M,,*76" +
                "$GPGGA,092750.000,5321.6802,N,00630.3372,W,1,8,1.03,61.7,M,55.2,M,,*76" +
                "$GPGGA,092750.000,5321.6802,N";

            byte[] gps = System.Text.UTF8Encoding.UTF8.GetBytes(gpss);// we are faking a GPS stream for this test

            byte[] CSVLine = new byte[256];// our buffer to hold one line of comma separated values (CSV)
            float[] values = new float[100];


            ///////////////// GPS code //////////////////
            // find the start and end of the line we care about
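            // the markers are cast to byte so they are compared against the byte elements of "gps"
            int lineStart = Array.IndexOf(gps, (byte)'$');
            int firstComma = Array.IndexOf(gps, (byte)',', lineStart);
            firstComma++;//skip the comma
            int lineEnd = Array.IndexOf(gps, (byte)'*', lineStart);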
            int length = lineEnd - firstComma;

            Array.Copy(gps, firstComma, CSVLine, 0, length);

            // "CSVLine" now holds a single line of the length "length"...looks like this
            // 092750.000,5321.6802,N,00630.3372,W,1,8,1.03,61.7,M,55.2,M,,
            
            // GHI will add this
            // the comma separated values (CSV) are extracted and put in the float array
            // this example uses ',' but the separator can be anything!
            int floatCount = GHI.Premium.System.Util.SplitToArray(CSVLine, length, ',', values);
            /* from values above it will be
             * values[0] = 092750.000;
             * values[1] = 5321.6802;
             * values[2] = (float)'N';
             * ...etc.
            */
            ///////////////////////////////////////////////////
            
            
            ///debug test
            // show the part of the byte array that holds the line (not the unused rest of the buffer)
            string extracted = new string(System.Text.UTF8Encoding.UTF8.GetChars(CSVLine, 0, length));
            // show the decoded values
            for (int i = 0; i < floatCount; i++)
            {
                Debug.Print(values[i].ToString());
            }
            Debug.Print(extracted);

            System.Threading.Thread.Sleep(-1);
        }

    }
}

@ Gus - handy, premium?

Yes premium. OSHW users can still decode data like we always did in the past. This is a luxury item, not a required item.

Roger, lucky I have at least one of each then :slight_smile:

Something like that would be quite trivial to implement in the OSHW firmwares…

You should add quote checking to be able to Tokenize. :wink:

It’s always a tradeoff between functionality and space… if only 10% of people needed the functionality, it might not make sense to consume the space. This is why NETMF is modular, so when you build the firmware, you get to choose what modules are included.

This is an excellent point.

@ Gus - How will the new method treat strings? In the example, I see the character ‘N’ being converted to a float version of its ASCII value (I think), but what about a multicharacter string?

Not supported, but we are open to ideas. However, note that strings are immutable, so you would be back to allocating objects and relying on the GC.

One option would be that if something is not a number, the corresponding element in the float array is set to NaN.

There could be additional flags passed to the call which control how chars and strings are handled, i.e. convert the first char to its ASCII value, etc.

Chars are a must in the GPS data stream, like W for west, so we do convert ‘W’ to its ASCII value. The user would not automatically know whether a decoded value came from a ‘W’ or is an actual number, but on GPS, specific fields are known to be ASCII. This is shown in the example code.
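To show how the known field positions can be used, here is a small sketch that turns a decoded GGA latitude or longitude into signed decimal degrees. It relies on the NMEA convention that positions are sent as ddmm.mmmm (dddmm.mmmm for longitude) followed by a hemisphere letter. The class and method names are made up for this example.

```csharp
using System;

public static class GgaSketch
{
    // Converts an NMEA ddmm.mmmm (or dddmm.mmmm) value to decimal degrees.
    // 'hemisphere' is the character code stored in the float array, e.g. (float)'W'.
    public static double ToDecimalDegrees(float nmeaValue, float hemisphere)
    {
        int degrees = (int)(nmeaValue / 100);        // leading dd or ddd
        double minutes = nmeaValue - degrees * 100;  // remaining mm.mmmm
        double result = degrees + minutes / 60.0;

        // South and west are negative by convention.
        if (hemisphere == (float)'S' || hemisphere == (float)'W')
            result = -result;
        return result;
    }
}
```

With the sample sentence from the first post, the latitude would come from ToDecimalDegrees(values[1], values[2]) and the longitude from ToDecimalDegrees(values[3], values[4]). Keep in mind that a float only carries about seven significant digits, so the last digit of a dddmm.mmmm value may not survive the conversion.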

@ Gus - Agreed, which is why I would suggest that you have flags controlling the behavior.
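To make the flags idea concrete, here is one possible shape for it. Nothing like this exists in the GHI library; the enum NonNumericOptions, its members, and the Resolve helper are all hypothetical, and the sketch only covers the decision for a field that has already been found to be non-numeric.

```csharp
using System;

// Hypothetical option flags; not part of any GHI API.
[Flags]
public enum NonNumericOptions
{
    None = 0,              // every non-numeric field becomes NaN
    SingleCharAsCode = 1,  // "W" becomes 87, longer text becomes NaN
    FirstCharAsCode = 2    // "West" also becomes 87
}

public static class FieldPolicySketch
{
    // Decides what a non-numeric field source[start..end) should decode to.
    public static float Resolve(byte[] source, int start, int end, NonNumericOptions options)
    {
        int fieldLength = end - start;
        if (fieldLength <= 0) return float.NaN;

        bool single = fieldLength == 1 && (options & NonNumericOptions.SingleCharAsCode) != 0;
        bool first = (options & NonNumericOptions.FirstCharAsCode) != 0;
        return (single || first) ? (float)source[start] : float.NaN;
    }
}
```

With NaN as the default, the caller can tell a real number from text with float.IsNaN, which addresses the ambiguity between a decoded ‘W’ and the number 87.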

In my WebSocket project on Code Share, I have a set of Array extensions that includes an “IndexOf” function and a “Split” function.

IndexOf searches for an array of bytes in a larger array of bytes, and returns the index of the first byte.

Split uses a byte array as a delimiter to “split” a larger byte array into an array of strings.


public static int IndexOf(this byte[] array, byte[] value, int startIndex = 0)
{
	if (array == null || value == null || value.Length == 0 || startIndex < 0 || startIndex >= array.Length)
		return -1;

	int i = startIndex;
	int j = 0;
	int k = 0;
	while (i < array.Length)
	{
		j = i;
		k = 0;
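		// stop at the end of "array" so a partial match at the tail cannot read out of bounds
		while (k < value.Length && j < array.Length && array[j] == value[k])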
		{
			j++;
			k++;
		}
		if (k == value.Length)
		{
			return i;
		}
		i++;
	}

	return -1;
}


// Requires: using System.Collections; using System.Text;
public static ArrayList Split(this byte[] array, byte[] delimiter, int startIndex, int length, bool toLower = false)
{
	ArrayList list = new ArrayList();

	if (array == null || delimiter == null || delimiter.Length == 0 || startIndex < 0 || length < 0 || array.Length < (startIndex + length))
		return list;

	int posMax = startIndex + length;
	int posDelimiter = 0;
	// Only delimiter-terminated tokens are returned; bytes after the last delimiter are ignored.
	// A delimiter found beyond posMax ends the search, so nothing outside the requested range is read.
	while ((startIndex + delimiter.Length) <= posMax
		&& (posDelimiter = IndexOf(array, delimiter, startIndex)) >= 0
		&& (posDelimiter + delimiter.Length) <= posMax)
	{
		string token = new string(Encoding.UTF8.GetChars(array, startIndex, posDelimiter - startIndex));
		list.Add(toLower ? token.ToLower() : token);
		startIndex = posDelimiter + delimiter.Length;
	}
	return list;
}

You should use the native Array.IndexOf. It will be many times faster.

Thanks Gus. I’ll look into that.

It would be nice if the IndexOf() method could search a byte array for a pattern of bytes. And also if we can Split() using a pattern of bytes as a delimiter.

@ Gus -

Great… I could have used that yesterday!

I tried the Array.IndexOf method, but it turns out that it does not do the same thing as my extension method.

public static int IndexOf (Array array, Object value, int startIndex, int count)

If you pass a byte array for the “value” parameter, it does not find it in the “array”. I’m guessing that it compares each element of “array” against the “value” object as a whole, rather than searching for the bytes inside it?
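A small test makes the difference visible. Array.IndexOf compares each element of the array with the single value object, so a byte can be found but a byte[] pattern cannot, because no individual element is itself an array. The class and method names below are only for illustration.

```csharp
using System;

public static class IndexOfDemo
{
    public static int FindByte(byte[] data, byte value)
    {
        // Element search: each element is compared with one byte value.
        return Array.IndexOf(data, value, 0, data.Length);
    }

    public static int FindPatternNaively(byte[] data, byte[] pattern)
    {
        // Passes the whole pattern array as the value to look for.
        // No single byte element can equal a byte[] object.
        return Array.IndexOf(data, (object)pattern, 0, data.Length);
    }
}
```

On desktop .NET the second call reports "not found", which matches what was observed on NETMF above.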

My method searches for an array of bytes in a byte array.

public static int IndexOf(this byte[] array, byte[] value, int startIndex = 0)
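One way to combine both suggestions is to let Array.IndexOf locate candidates for the first byte of the pattern and only verify the remaining bytes in managed code. This is just a sketch under that assumption; the name IndexOfPattern is made up, and whether it is actually faster on a given device would need to be measured.

```csharp
using System;

public static class PatternSearchSketch
{
    public static int IndexOfPattern(byte[] data, byte[] pattern, int startIndex)
    {
        if (data == null || pattern == null || pattern.Length == 0 || startIndex < 0)
            return -1;

        int last = data.Length - pattern.Length; // last index where a match can start
        int i = startIndex;
        while (i <= last)
        {
            // Let Array.IndexOf find the next candidate for the first byte.
            i = Array.IndexOf(data, pattern[0], i, last - i + 1);
            if (i < 0) return -1;

            // Verify the remaining bytes of the pattern in managed code.
            int k = 1;
            while (k < pattern.Length && data[i + k] == pattern[k]) k++;
            if (k == pattern.Length) return i;

            i++; // false candidate, continue after it
        }
        return -1;
    }
}
```

Because the search range stops at the last position where the whole pattern still fits, the verification loop never reads past the end of the array.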

What performance gain can we expect from this method compared with a traditional implementation?