Where embedded meets ML
Edge AI & TinyML on the ESP32
The most interesting AI is increasingly moving to the edge: instant decisions, no upload bandwidth per frame or sample, no recurring inference bill, data that never leaves the device, and a product that keeps working when the network doesn't. The hard part is making a useful model fit inside a microcontroller's memory and latency budget. That fit is exactly what we engineer.
What we build on-device
| Edge-AI use case | How it runs on the ESP32 | What it replaces |
|---|---|---|
| Object / person detection | ESP32-S3 + camera, quantized CNN via TFLite Micro / ESP-DL | A cloud vision API round-trip + bandwidth + latency |
| Anomaly / predictive maintenance | On-device autoencoder or FFT over accelerometer data | Streaming raw vibration data to the cloud |
| Keyword / wake-word spotting | TinyML model on the live mic stream | Always-on cloud audio (privacy + recurring cost) |
| Visual classification / counting | Quantized model using the ESP32-S3 vector instructions | Manual inspection or per-frame cloud inference |
| Sensor-fusion classification | Small on-device model over multi-sensor input | Brittle rule-trees that fail on edge cases |
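To make the anomaly-detection row concrete, here is a minimal sketch of a spectral check over one axis of accelerometer data. It is not our production pipeline. It uses the Goertzel recurrence (a single-bin alternative to a full FFT) and assumes one monitored frequency bin is enough. The function names, the `baseline_power` value recorded on a healthy machine, and the `ratio_limit` threshold are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Squared magnitude of a single DFT bin, computed with the Goertzel
 * recurrence. coeff must equal 2*cos(2*pi*k/n) for the bin k of interest;
 * precomputing it offline keeps trigonometry out of the sampling loop. */
float goertzel_power(const float *samples, size_t n, float coeff)
{
    assert(samples != NULL && n > 0);
    float s_prev = 0.0f;
    float s_prev2 = 0.0f;
    for (size_t i = 0; i < n; i++) {
        float s = samples[i] + coeff * s_prev - s_prev2;
        s_prev2 = s_prev;
        s_prev = s;
    }
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2;
}

/* Hypothetical decision rule: flag the window when the monitored bin holds
 * more than ratio_limit times the power measured on a healthy machine. */
bool vibration_is_anomalous(const float *samples, size_t n, float coeff,
                            float baseline_power, float ratio_limit)
{
    return goertzel_power(samples, n, coeff) > baseline_power * ratio_limit;
}
```

Only the boolean result (or a short summary) would need to leave the device, which is the point of the "What it replaces" column: the raw vibration stream stays local.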
Our edge
- Both halves. We train/optimize the model and integrate it into production ESP-IDF firmware — no handoff across a data-science / embedded boundary where most edge-AI projects stall.
- Real proof. We've shipped an ESP32 camera object-detection system: on-device vision, not a cloud demo.
- The right tools. TensorFlow Lite Micro, ESP-DL, and Edge Impulse, plus the ESP32-S3 vector instructions — chosen per project, not by habit.
Frequently asked questions
Can the ESP32 really run machine learning?
Yes — within limits, and that is the whole skill. The ESP32-S3 has vector/SIMD instructions and enough SRAM/PSRAM to run small quantized neural networks (object detection, keyword spotting, anomaly detection) in real time. The art is shrinking and quantizing the model to fit the memory and latency budget without losing the accuracy that matters.
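As a rough illustration of what "quantizing" means here, the sketch below shows the affine int8 mapping commonly used for quantized models, where a real value is approximated as `scale * (q - zero_point)`. Storing a weight as one `int8_t` instead of a 4-byte `float` is where the fourfold size reduction comes from. The function names are hypothetical, and in a real project the toolchain's converter picks `scale` and `zero_point` per tensor rather than hand-written code like this.

```c
#include <assert.h>
#include <stdint.h>

/* Map a real value to int8 using real ~= scale * (q - zero_point).
 * Values outside the representable range saturate at -128 or 127. */
int8_t quantize_int8(float x, float scale, int32_t zero_point)
{
    assert(scale > 0.0f);
    assert(zero_point >= INT8_MIN && zero_point <= INT8_MAX);

    float scaled = x / scale;
    /* Clamp in the float domain first so the integer cast cannot overflow. */
    if (scaled > 256.0f) scaled = 256.0f;
    if (scaled < -256.0f) scaled = -256.0f;

    /* Round half away from zero without pulling in math.h. */
    int32_t q = (int32_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
    q += zero_point;

    if (q > INT8_MAX) q = INT8_MAX;
    if (q < INT8_MIN) q = INT8_MIN;
    return (int8_t)q;
}

/* Recover the approximate real value from its int8 code. */
float dequantize_int8(int8_t q, float scale, int32_t zero_point)
{
    return scale * (float)((int32_t)q - zero_point);
}
```

The gap between a value and its dequantized round trip is the quantization error. Keeping that error from eroding the accuracy that matters is the "art" referred to above.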
ESP32 or ESP32-S3 for edge AI?
Use the ESP32-S3 for almost any real on-device ML. It adds the vector instructions that ESP-DL and TensorFlow Lite Micro use to accelerate inference, plus larger and faster external PSRAM options for the model and camera buffers (its internal SRAM is roughly the same size as the classic ESP32's). The classic ESP32 can run very small models, but the S3 is the practical default for vision or anything non-trivial.
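The memory headroom question is mostly arithmetic, and a small budget check makes it visible early. The sketch below is an illustration only: the `ram_plan_t` structure, its field names, and the example figures are assumptions, not measurements from a specific project.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bytes needed for one uncompressed camera frame. */
size_t frame_buffer_bytes(size_t width, size_t height, size_t bytes_per_pixel)
{
    return width * height * bytes_per_pixel;
}

/* Hypothetical RAM plan for one vision pipeline. */
typedef struct {
    size_t tensor_arena_bytes; /* scratch memory for model activations */
    size_t frame_bytes;        /* size of one camera frame buffer */
    size_t frame_count;        /* e.g. 2 for double buffering */
    size_t other_bytes;        /* stacks, network buffers, and so on */
} ram_plan_t;

/* Sum every consumer in the plan. */
size_t ram_plan_total(const ram_plan_t *plan)
{
    assert(plan != NULL);
    return plan->tensor_arena_bytes
         + plan->frame_bytes * plan->frame_count
         + plan->other_bytes;
}

/* True when the whole plan fits in the given memory budget. */
bool ram_plan_fits(const ram_plan_t *plan, size_t budget_bytes)
{
    return ram_plan_total(plan) <= budget_bytes;
}
```

For example, a 320 x 240 frame at 2 bytes per pixel (RGB565) is 153,600 bytes. Two such frames plus an assumed 200 KB tensor arena come to 512,000 bytes, which does not fit in a 300 KB internal-RAM budget but fits easily in a few megabytes of PSRAM. Dropping to a 96 x 96 grayscale input (9,216 bytes per frame) changes the picture entirely, which is why input resolution is one of the first things to negotiate.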
Do you use TensorFlow Lite Micro or Edge Impulse?
Both, depending on the project. Edge Impulse is excellent for fast data collection, training, and a first deployable model; TensorFlow Lite Micro (and ESP-DL) when we need full control over the model, quantization, and the integration into production ESP-IDF firmware. We pick based on the accuracy target, the data you have, and how much custom control the product needs.
When does on-device ML beat sending data to the cloud?
When latency, bandwidth, cost, privacy, or connectivity matter — which is most real deployments. Running inference on the device means instant decisions, no per-image bandwidth, no recurring cloud-inference bill, data that never leaves the device, and a product that still works when the network does not. We help you decide honestly which parts belong on the edge and which belong in the cloud.