Where embedded meets ML

Edge AI & TinyML on the ESP32

The interesting AI is increasingly moving to the edge: low-latency decisions, no upstream bandwidth for raw data, no recurring inference bill, data that never leaves the device, and a product that keeps working when the network doesn't. The hard part is making a useful model fit inside a microcontroller's memory and latency budget. That fit is exactly what we engineer.

What we build on-device

| Edge-AI use case | How it runs on the ESP32 | What it replaces |
| --- | --- | --- |
| Object / person detection | ESP32-S3 + camera, quantized CNN via TFLite Micro / ESP-DL | A cloud vision API round-trip + bandwidth + latency |
| Anomaly / predictive maintenance | On-device autoencoder or FFT over accelerometer data | Streaming raw vibration data to the cloud |
| Keyword / wake-word spotting | TinyML model on the live mic stream | Always-on cloud audio (privacy + recurring cost) |
| Visual classification / counting | Quantized model using the ESP32-S3 vector instructions | Manual inspection or per-frame cloud inference |
| Sensor-fusion classification | Small on-device model over multi-sensor input | Brittle rule-trees that fail on edge cases |
On-device ML we ship on the ESP32-S3 — and the cloud round-trip each one removes.
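
To make the "FFT over accelerometer data" row concrete, here is a minimal host-side sketch in Python. It is an illustration under assumptions, not our production firmware: the function names, the 64-sample window, the four frequency bands, and the 3x energy threshold are all hypothetical, and a real firmware build would use an optimized FFT rather than this plain DFT.

```python
import cmath
import math


def band_energies(samples, num_bands):
    """Split the one-sided DFT power spectrum into equal-width bands."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the DC (gravity) offset
    power = []
    for k in range(1, n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(centered))
        power.append(abs(acc) ** 2 / n)
    width = len(power) // num_bands
    return [sum(power[b * width:(b + 1) * width]) for b in range(num_bands)]


def is_anomalous(samples, baseline, num_bands=4, ratio=3.0, floor=1e-6):
    """Flag a window if any band exceeds `ratio` times its baseline energy.

    `floor` (hypothetical) keeps near-silent baseline bands from
    triggering on numerical noise.
    """
    current = band_energies(samples, num_bands)
    return any(c > ratio * max(b, floor) for c, b in zip(current, baseline))


# Synthetic data: a healthy machine vibrates at one low frequency;
# the faulty one adds a strong high-frequency component.
N = 64
healthy = [math.sin(2 * math.pi * 4 * i / N) for i in range(N)]
faulty = [h + 0.8 * math.sin(2 * math.pi * 24 * i / N)
          for i, h in enumerate(healthy)]

baseline = band_energies(healthy, 4)
print(is_anomalous(healthy, baseline), is_anomalous(faulty, baseline))
```

Only the one-bit verdict (or a short summary) needs to leave the device; the raw vibration samples stay local.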

Our edge

  1. Both halves. We train/optimize the model and integrate it into production ESP-IDF firmware — no handoff across a data-science / embedded boundary where most edge-AI projects stall.
  2. Real proof. We've shipped an ESP32 camera object-detection system: on-device vision, not a cloud demo.
  3. The right tools. TensorFlow Lite Micro, ESP-DL, and Edge Impulse, plus the ESP32-S3 vector instructions — chosen per project, not by habit.

Frequently asked questions

Can the ESP32 really run machine learning?

Yes — within limits, and that is the whole skill. The ESP32-S3 has vector/SIMD instructions and enough SRAM/PSRAM to run small quantized neural networks (object detection, keyword spotting, anomaly detection) in real time. The art is shrinking and quantizing the model to fit the memory and latency budget without losing the accuracy that matters.
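
As an illustration of what "quantizing" means, the sketch below maps float values to int8 with a scale and zero point, using the affine form real = scale * (q - zero_point) that int8 TensorFlow Lite models are built around. It is a simplified host-side example: the helper names and the sample value range are assumptions, and real toolchains choose ranges per tensor or per channel from calibration data.

```python
def quant_params(min_val, max_val, qmin=-128, qmax=127):
    """Scale and zero point for asymmetric int8 quantization of a value range."""
    min_val = min(min_val, 0.0)  # the range must contain 0.0 so it maps exactly
    max_val = max(max_val, 0.0)
    scale = (max_val - min_val) / (qmax - qmin)
    if scale == 0.0:
        return 1.0, 0
    zero_point = round(qmin - min_val / scale)
    return scale, max(qmin, min(qmax, zero_point))


def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Float -> int8, clamped to the representable range."""
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]


def dequantize(q_values, scale, zero_point):
    """int8 -> approximate float."""
    return [scale * (q - zero_point) for q in q_values]


# Hypothetical activation values observed in the range [-1.0, 3.0].
values = [-1.0, -0.3, 0.0, 0.7, 3.0]
scale, zero_point = quant_params(-1.0, 3.0)
q = quantize(values, scale, zero_point)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(values, restored))
print(q, max_error)
```

Each value now takes one byte instead of four, and the round-trip error for in-range values stays within half a quantization step. Whether that error is acceptable for a given model is the accuracy question that has to be measured on real data.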

ESP32 or ESP32-S3 for edge AI?

The ESP32-S3 for almost any real on-device ML. It adds the vector instructions that ESP-DL and TensorFlow Lite Micro use to accelerate inference, plus faster and larger PSRAM options that leave headroom for the model and camera buffers. The classic ESP32 can run very small models, but the S3 is the practical default for vision or anything non-trivial.
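
The memory headroom question comes down to simple arithmetic, sketched below in Python. Every number except the 512 KiB of internal SRAM on the ESP32-S3 is an assumption for illustration: the 8 MiB PSRAM module, the 200 KiB tensor arena, the 250 KiB reserved for the RTOS, network stack, and task stacks, and the placement itself (arena in internal SRAM, frame buffers in PSRAM, model weights read from flash and therefore not counted).

```python
from dataclasses import dataclass

KIB = 1024
MIB = 1024 * KIB


@dataclass
class Budget:
    internal_sram: int  # bytes
    psram: int          # bytes


def frame_buffer_bytes(width, height, bytes_per_pixel):
    """Size of one uncompressed camera frame."""
    return width * height * bytes_per_pixel


def plan(budget, arena_bytes, frame_bytes, num_frames, reserved_sram):
    """Report headroom with the arena in SRAM and frame buffers in PSRAM."""
    sram_left = budget.internal_sram - reserved_sram - arena_bytes
    psram_left = budget.psram - frame_bytes * num_frames
    return {
        "sram_left": sram_left,
        "psram_left": psram_left,
        "fits": sram_left >= 0 and psram_left >= 0,
    }


s3 = Budget(internal_sram=512 * KIB, psram=8 * MIB)   # PSRAM size is assumed
qvga_rgb565 = frame_buffer_bytes(320, 240, 2)          # 153,600 bytes per frame

result = plan(s3, arena_bytes=200 * KIB, frame_bytes=qvga_rgb565,
              num_frames=2, reserved_sram=250 * KIB)
too_big = plan(s3, arena_bytes=300 * KIB, frame_bytes=qvga_rgb565,
               num_frames=2, reserved_sram=250 * KIB)
print(result, too_big["fits"])
```

In this hypothetical plan a 200 KiB arena leaves 62 KiB of internal SRAM spare, while a 300 KiB arena does not fit and would have to shrink or move to slower PSRAM.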

Do you use TensorFlow Lite Micro or Edge Impulse?

Both, depending on the project. Edge Impulse is excellent for fast data collection, training, and a first deployable model; we reach for TensorFlow Lite Micro (and ESP-DL) when we need full control over the model, quantization, and integration into production ESP-IDF firmware. We pick based on the accuracy target, the data you have, and how much custom control the product needs.

When does on-device ML beat sending data to the cloud?

When latency, bandwidth, cost, privacy, or connectivity matter — which is most real deployments. Running inference on the device means instant decisions, no per-image bandwidth, no recurring cloud-inference bill, data that never leaves the device, and a product that still works when the network does not. We help you decide honestly which parts belong on the edge and which belong in the cloud.
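
A back-of-the-envelope calculation shows how large the bandwidth difference can be. The figures below are hypothetical (a 3-axis accelerometer sampled at 1 kHz with 16-bit samples, versus one 16-byte status message per minute from on-device inference) and ignore protocol overhead and compression.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def raw_stream_bytes_per_day(axes, sample_rate_hz, bytes_per_sample):
    """Payload bytes per day when every raw sample is sent to the cloud."""
    return axes * sample_rate_hz * bytes_per_sample * SECONDS_PER_DAY


def event_bytes_per_day(messages_per_day, bytes_per_message):
    """Payload bytes per day when only inference results are sent."""
    return messages_per_day * bytes_per_message


raw = raw_stream_bytes_per_day(axes=3, sample_rate_hz=1000, bytes_per_sample=2)
events = event_bytes_per_day(messages_per_day=24 * 60, bytes_per_message=16)

print(f"raw stream:  {raw / 1e6:.1f} MB/day")
print(f"edge events: {events / 1e3:.1f} kB/day")
print(f"reduction:   {raw / events:.0f}x")
```

Under these assumptions the raw stream is about 518 MB per day per device, against roughly 23 kB for on-device results. The same arithmetic, run with your own sensor rates and message sizes, is a reasonable first input to the edge-versus-cloud decision.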