Using Gaussian Naive Bayes with Arduino

Looking for the best performing classifier with the fewest parameters to tune? Look no further: Gaussian Naive Bayes is what you're looking for, and thanks to microML, you can now easily port it to your microcontroller.

Gaussian Naive Bayes

Naive Bayes classifiers are simple models based on probability theory that can be used for classification.

They are called naive because they assume independence between the input variables. Although this assumption doesn't hold in the vast majority of cases, they usually perform very well on most classification tasks, which is why they're quite popular.

Gaussian Naive Bayes adds another (mostly false) assumption: that the variables follow a Gaussian probability distribution.
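In formulas (this is the standard Gaussian Naive Bayes formulation, not anything specific to this library), the classifier picks the class y that maximizes the product of the class prior and the per-feature Gaussian likelihoods:

    P(y \mid x_1, \dots, x_n) \propto P(y) \prod_{i=1}^{n} P(x_i \mid y),
    \qquad
    P(x_i \mid y) = \frac{1}{\sqrt{2 \pi \sigma_{y,i}^2}} \exp\!\left( -\frac{(x_i - \theta_{y,i})^2}{2 \sigma_{y,i}^2} \right)

where \theta_{y,i} and \sigma_{y,i}^2 are the mean and variance of feature i for class y: these are exactly the theta and sigma arrays you will see in the exported C code below.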

It may be hard to accept that so many false assumptions can lead to such good performance, but the empirical evidence is that they do: the classifier works quite well in practice, and that is the main reason it's so widely used.

What matters most to us, though, is that sklearn implements GaussianNB, so we can easily train such a classifier.
The most interesting part is that GaussianNB can be tuned with just a single parameter: var_smoothing, which adds a portion of the largest feature variance to all variances for numerical stability.

import sklearn.datasets as d
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import normalize
from sklearn.naive_bayes import GaussianNB

def pick_best(X_train, X_test, y_train, y_test):
    best = (None, 0)
    # try var_smoothing values from 1e-7 to 1e0
    for var_smoothing in range(-7, 1):
        clf = GaussianNB(var_smoothing=pow(10, var_smoothing))
        clf.fit(X_train, y_train)
        y_pred = clf.predict(X_test)
        # count the correctly classified test samples
        accuracy = (y_pred == y_test).sum()
        if accuracy > best[1]:
            best = (clf, accuracy)
    print('best accuracy', best[1] / len(y_test))
    return best[0]

iris = d.load_iris()
X = normalize(iris.data)
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
clf = pick_best(X_train, X_test, y_train, y_test)

This simple code trains a set of classifiers, each with a different var_smoothing factor, and selects the best performer.

Porting with EloquentML

Once you have your trained classifier, moving it to C is as easy as ever:

from micromlgen import port

clf = pick_best(X_train, X_test, y_train, y_test)
print(port(clf))

port is a versatile function that can handle many kinds of classifiers: it will automatically detect the proper converter for you.
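If you'd rather save the exported code to a file you can drop into your Arduino project, a minimal sketch will do (the GaussianNB.h filename here is an arbitrary choice, not something micromlgen mandates):

    # write the generated C code to a header file for the Arduino sketch
    with open('GaussianNB.h', 'w') as file:
        file.write(port(clf))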

What does the exported code look like?

#pragma once
namespace Eloquent {
    namespace ML {
        namespace Port {
            class GaussianNB {
                public:
                    /**
                     * Predict class for features vector
                     */
                    int predict(float *x) {
                        float votes[3] = { 0.0f };
                        float theta[4] = { 0 };
                        float sigma[4] = { 0 };
                        theta[0] = 0.801139789889; theta[1] = 0.54726920354; theta[2] = 0.234408773313; theta[3] = 0.039178084094;
                        sigma[0] = 0.000366881742; sigma[1] = 0.000907992556; sigma[2] = 0.000740960787; sigma[3] = 0.000274925514;
                        votes[0] = 0.3333333333333 - gauss(x, theta, sigma);
                        theta[0] = 0.748563871324; theta[1] = 0.349390892644; theta[2] = 0.536186138345; theta[3] = 0.166747384117;
                        sigma[0] = 0.000529727082; sigma[1] = 0.000847956504; sigma[2] = 0.000690057342; sigma[3] = 0.000311828658;
                        votes[1] = 0.3333333333333 - gauss(x, theta, sigma);
                        theta[0] = 0.704497203305; theta[1] = 0.318862439835; theta[2] = 0.593755956917; theta[3] = 0.217288784452;
                        sigma[0] = 0.000363782089; sigma[1] = 0.000813846722; sigma[2] = 0.000415475678; sigma[3] = 0.000758478249;
                        votes[2] = 0.3333333333333 - gauss(x, theta, sigma);
                        // return argmax of votes
                        uint8_t classIdx = 0;
                        float maxVotes = votes[0];

                        for (uint8_t i = 1; i < 3; i++) {
                            if (votes[i] > maxVotes) {
                                classIdx = i;
                                maxVotes = votes[i];
                            }
                        }

                        return classIdx;
                    }

                protected:
                    /**
                     * Compute gaussian value
                     */
                    float gauss(float *x, float *theta, float *sigma) {
                        float gauss = 0.0f;

                        for (uint16_t i = 0; i < 4; i++) {
                            gauss += log(sigma[i]);
                            gauss += pow(x[i] - theta[i], 2) / sigma[i];
                        }

                        return gauss;
                    }
            };
        }
    }
}
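To give an idea of how to use the exported class, here is a minimal, hypothetical Arduino sketch; it assumes you saved the generated code as GaussianNB.h next to the sketch, and that the features you pass in are normalized the same way as the training data:

    #include "GaussianNB.h"

    // the exported classifier lives in the Eloquent::ML::Port namespace
    Eloquent::ML::Port::GaussianNB clf;

    void setup() {
        Serial.begin(115200);
    }

    void loop() {
        // a hypothetical, already-normalized Iris sample
        float features[4] = { 0.80, 0.55, 0.23, 0.04 };

        // predict() returns the index of the winning class
        Serial.println(clf.predict(features));
        delay(1000);
    }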

Comparisons

Here are the results of a few benchmarks run on an Arduino Nano 33 BLE Sense:

Classifier    Dataset                  Flash     RAM      Execution time   Accuracy
GaussianNB    Iris (150×4)             82 kb     42 kb    65 ms            97%
LinearSVC     Iris (150×4)             83 kb     42 kb    76 ms            99%
GaussianNB    Breast cancer (80×40)    90 kb     42 kb    160 ms           77%
LinearSVC     Breast cancer (80×40)    112 kb    42 kb    378 ms           73%
GaussianNB    Wine (100×13)            85 kb     42 kb    130 ms           97%
LinearSVC     Wine (100×13)            89 kb     42 kb    125 ms           99%

We can see that accuracy is on par with a linear SVM, reaching 97% on some datasets. Its simplicity pays off on the larger dataset (breast cancer), where the execution time is cut in half: we can expect this pattern to repeat on other real-world, medium-sized datasets.

Troubleshooting

You may receive a TemplateNotFound error when using micromlgen; in that case, you can work around the problem by removing the pip version and installing the library manually:

pip uninstall micromlgen

Then go to Github, download the package as a zip and copy the micromlgen folder into your project.