TinyML Benchmark: Fully Connected Neural Networks

Have you ever wondered how fast popular microcontroller boards are at running TensorFlow Lite neural networks? In this article, we will answer that question for fully connected neural networks.

In this article we will compare several development boards on the task of running TensorFlow Lite neural networks.

The development boards compared are:

  • Arduino Nano 33 BLE Sense (Cortex M4 @ 64 MHz)
  • ESP32 (Xtensa dual-core @ 240 MHz)
  • Feather M4 Express (Cortex M4F @ 200 MHz)
  • STM32 Nucleo H743ZI2 (Cortex M7 @ 480 MHz)
  • Arduino Portenta (Cortex M7 @ 480 MHz)
  • Teensy 4.0 (Cortex M7 @ 600 MHz)
  • Raspberry Pi Pico (RP2040 / Cortex M0+ @ 125 MHz)

As you can see, they differ in CPU and clock frequency.

Three fully connected network topologies are compared:

  • 1 layer with 10 neurons
  • 2 layers, one with 10 neurons and the other with 50 neurons
  • 10 layers, each containing 10 neurons
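To get a feel for how much work each topology involves, the sketch below counts the weights (one multiply-accumulate each per inference) and biases of a fully connected network. The function name, the order of the 10 and 50 neuron layers, and the choice of 4 inputs and 3 outputs (the shape of the Iris dataset) are assumptions made for illustration, not details stated in this article.

```python
def dense_param_count(n_inputs, hidden_layers, n_outputs):
    """Return (weights, biases) of a fully connected network.

    Each weight costs one multiply-accumulate per inference.
    """
    sizes = [n_inputs, *hidden_layers, n_outputs]
    weights = sum(a * b for a, b in zip(sizes, sizes[1:]))
    biases = sum(sizes[1:])
    return weights, biases


# Hidden layers of the three topologies listed above.
TOPOLOGIES = {
    "FC 1 x 10": [10],
    "FC 10+50": [10, 50],      # assumed order: 10 neurons, then 50
    "FC 10 x 10": [10] * 10,
}

# Assumed problem shape: 4 input features, 3 output classes.
for name, hidden in TOPOLOGIES.items():
    weights, biases = dense_param_count(4, hidden, 3)
    print(f"{name}: {weights} multiply-accumulates, {biases} biases")
```

Under these assumptions the networks need 70, 690 and 970 multiply-accumulates respectively, so the amount of arithmetic grows quickly with depth even though every layer is small.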

Inference Times

The following charts show the inference time (in microseconds) of the different networks on each development board, on both linear and logarithmic scales.

[Figure: Fully connected benchmarks, inference time, linear scale, including RPi Pico]

[Figure: Fully connected benchmarks, inference time, linear scale, without RPi Pico]

[Figure: Fully connected benchmarks, slowest inference time, linear scale]

[Figure: Fully connected benchmarks, inference time, logarithmic scale, including RPi Pico]

What's the verdict?

  • The Teensy 4.0 is the fastest, as you would expect from its higher clock speed.
  • The Arduino Portenta and the Nucleo H743ZI2 are closely matched, since their CPUs come from the same family, but the Nucleo is faster in every topology.
  • In terms of price/performance, the ESP32 still offers a great ratio.
  • The Raspberry Pi Pico is the slowest, even though it does not have the lowest clock speed (the Arduino Nano 33 BLE Sense runs at a lower clock, but it has a Cortex M4 CPU).
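The last point can be made concrete by converting inference time into an approximate cycle count (microseconds multiplied by MHz). The sketch below does this for the Iris dataset with the 1 x 10 network, using the times from the raw data and the clock speeds from the board list; the helper name is hypothetical, and the result is only a rough estimate that ignores everything except the core clock.

```python
def cycles_per_inference(time_us, clock_mhz):
    """Approximate CPU cycles for one inference: microseconds x MHz."""
    return time_us * clock_mhz


# (inference time in microseconds for Iris / FC 1 x 10, clock in MHz),
# both values taken from this article.
boards = {
    "Raspberry Pi Pico": (313.77, 125),
    "Arduino Nano 33 BLE Sense": (113.61, 64),
}

for name, (time_us, clock_mhz) in boards.items():
    cycles = cycles_per_inference(time_us, clock_mhz)
    print(f"{name}: ~{cycles:,.0f} cycles per inference")
```

By this estimate the Pico spends roughly 39,000 cycles on one inference against roughly 7,300 for the Nano 33 BLE Sense, a bit more than five times as many, which suggests the gap comes from the CPU core rather than the clock.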

Raw Data

Board | Dataset | Classifier | Inference time (µs)
Arduino Nano 33 | Breast cancer | FC 1 x 10 | 138.71
Arduino Nano 33 | Breast cancer | FC 10 x 10 | 472.11
Arduino Nano 33 | Breast cancer | FC 10+50 | 286.86
Arduino Nano 33 | Digits | FC 1 x 10 | 390.25
Arduino Nano 33 | Digits | FC 10 x 10 | 719.08
Arduino Nano 33 | Digits | FC 10+50 | 589.75
Arduino Nano 33 | Iris | FC 1 x 10 | 113.61
Arduino Nano 33 | Iris | FC 10 x 10 | 442.75
Arduino Nano 33 | Iris | FC 10+50 | 266.54
Arduino Nano 33 | Wine | FC 1 x 10 | 130.1
Arduino Nano 33 | Wine | FC 10 x 10 | 460.02
Arduino Nano 33 | Wine | FC 10+50 | 283.82
Arduino Portenta M7 | Breast cancer | FC 1 x 10 | 13.75
Arduino Portenta M7 | Breast cancer | FC 10 x 10 | 55.16
Arduino Portenta M7 | Breast cancer | FC 10+50 | 31.72
Arduino Portenta M7 | Digits | FC 1 x 10 | 26.96
Arduino Portenta M7 | Digits | FC 10 x 10 | 69.54
Arduino Portenta M7 | Digits | FC 10+50 | 51.56
Arduino Portenta M7 | Iris | FC 1 x 10 | 8.71
Arduino Portenta M7 | Iris | FC 10 x 10 | 49.85
Arduino Portenta M7 | Iris | FC 10+50 | 27.35
Arduino Portenta M7 | Wine | FC 1 x 10 | 10.94
Arduino Portenta M7 | Wine | FC 10 x 10 | 52.11
Arduino Portenta M7 | Wine | FC 10+50 | 29.55
ESP32 Dev Module | Breast cancer | FC 1 x 10 | 36.31
ESP32 Dev Module | Breast cancer | FC 10 x 10 | 125.03
ESP32 Dev Module | Breast cancer | FC 10+50 | 74.86
ESP32 Dev Module | Digits | FC 1 x 10 | 77.25
ESP32 Dev Module | Digits | FC 10 x 10 | 172.94
ESP32 Dev Module | Digits | FC 10+50 | 130.61
ESP32 Dev Module | Iris | FC 1 x 10 | 20.83
ESP32 Dev Module | Iris | FC 10 x 10 | 109.23
ESP32 Dev Module | Iris | FC 10+50 | 61.17
ESP32 Dev Module | Wine | FC 1 x 10 | 28.89
ESP32 Dev Module | Wine | FC 10 x 10 | 117.95
ESP32 Dev Module | Wine | FC 10+50 | 69.28
Feather M4 Express {opt=fastest,speed=200} | Breast cancer | FC 1 x 10 | 31.81
Feather M4 Express {opt=fastest,speed=200} | Breast cancer | FC 10 x 10 | 132.66
Feather M4 Express {opt=fastest,speed=200} | Breast cancer | FC 10+50 | 79.13
Feather M4 Express {opt=fastest,speed=200} | Digits | FC 1 x 10 | 69.89
Feather M4 Express {opt=fastest,speed=200} | Digits | FC 10 x 10 | 167.29
Feather M4 Express {opt=fastest,speed=200} | Digits | FC 10+50 | 132.14
Feather M4 Express {opt=fastest,speed=200} | Iris | FC 1 x 10 | 17.79
Feather M4 Express {opt=fastest,speed=200} | Iris | FC 10 x 10 | 118.9
Feather M4 Express {opt=fastest,speed=200} | Iris | FC 10+50 | 67.17
Feather M4 Express {opt=fastest,speed=200} | Wine | FC 1 x 10 | 23.84
Feather M4 Express {opt=fastest,speed=200} | Wine | FC 10 x 10 | 124.46
Feather M4 Express {opt=fastest,speed=200} | Wine | FC 10+50 | 72.93
NUCLEO H743ZI2 {opt=o3} | Breast cancer | FC 1 x 10 | 8.5
NUCLEO H743ZI2 {opt=o3} | Breast cancer | FC 10 x 10 | 34.19
NUCLEO H743ZI2 {opt=o3} | Breast cancer | FC 10+50 | 20.18
NUCLEO H743ZI2 {opt=o3} | Digits | FC 1 x 10 | 18.08
NUCLEO H743ZI2 {opt=o3} | Digits | FC 10 x 10 | 44.16
NUCLEO H743ZI2 {opt=o3} | Digits | FC 10+50 | 33.8
NUCLEO H743ZI2 {opt=o3} | Iris | FC 10 x 10 | 31.51
NUCLEO H743ZI2 {opt=o3} | Iris | FC 10+50 | 17.8
NUCLEO H743ZI2 {opt=o3} | Wine | FC 10 x 10 | 32.57
NUCLEO H743ZI2 {opt=o3} | Wine | FC 10+50 | 19.06
Raspberry Pi Pico | Breast cancer | FC 1 x 10 | 872.85
Raspberry Pi Pico | Breast cancer | FC 10 x 10 | 3369.54
Raspberry Pi Pico | Breast cancer | FC 10+50 | 2413.44
Raspberry Pi Pico | Digits | FC 1 x 10 | 1982.31
Raspberry Pi Pico | Digits | FC 10 x 10 | 4503.25
Raspberry Pi Pico | Digits | FC 10+50 | 4314.19
Raspberry Pi Pico | Iris | FC 1 x 10 | 313.77
Raspberry Pi Pico | Iris | FC 10 x 10 | 2801.82
Raspberry Pi Pico | Iris | FC 10+50 | 1953.96
Raspberry Pi Pico | Wine | FC 1 x 10 | 509.76
Raspberry Pi Pico | Wine | FC 10 x 10 | 3021.03
Raspberry Pi Pico | Wine | FC 10+50 | 2176.92
Teensy 4.0 | Breast cancer | FC 1 x 10 | 5.16
Teensy 4.0 | Breast cancer | FC 10 x 10 | 20.15
Teensy 4.0 | Breast cancer | FC 10+50 | 12.32
Teensy 4.0 | Digits | FC 10 x 10 | 26.09
Teensy 4.0 | Digits | FC 10+50 | 21.01
Teensy 4.0 | Iris | FC 1 x 10 | 3.14
Teensy 4.0 | Iris | FC 10 x 10 | 18.12
Teensy 4.0 | Iris | FC 10+50 | 11.13
Teensy 4.0 | Wine | FC 1 x 10 | 3.86
Teensy 4.0 | Wine | FC 10 x 10 | 18.92
Teensy 4.0 | Wine | FC 10+50 | 11.43
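
As one way to read the table, the sketch below ranks the boards for a single configuration (Iris with the 10 x 10 network, which was measured on every board) and reports each board's speedup over the slowest one. The times are copied from the table above; the function and variable names are hypothetical, and the board names are shortened for readability.

```python
# Inference times in microseconds for Iris / FC 10 x 10, from the table above.
iris_fc_10x10 = {
    "Arduino Nano 33": 442.75,
    "Arduino Portenta M7": 49.85,
    "ESP32": 109.23,
    "Feather M4 Express": 118.9,
    "NUCLEO H743ZI2": 31.51,
    "Raspberry Pi Pico": 2801.82,
    "Teensy 4.0": 18.12,
}


def rank_by_speed(times_us):
    """Return (board, time_us, speedup_vs_slowest) tuples, fastest first."""
    slowest = max(times_us.values())
    ordered = sorted(times_us.items(), key=lambda item: item[1])
    return [(board, t, slowest / t) for board, t in ordered]


for board, time_us, speedup in rank_by_speed(iris_fc_10x10):
    print(f"{board:<22} {time_us:>9.2f} us   {speedup:>6.1f}x")
```

For this configuration the Teensy 4.0 comes out roughly 155 times faster than the Raspberry Pi Pico. The same function can be applied to any other dataset and topology pair from the table.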