✏️ AI Demo: Quick Draw Doodle Recognition (RZ/V2N)

A touchscreen sketch recogniser on the SolidRun RZ/V2N SOM. INT8 MobileNetV2 on the DRP-AI3 accelerator — 345 categories, ~1 ms inference, entirely on-device.

Live recognition on the HummingBoard-IIoT — top-5 guesses re-rank with every stroke.

📋 Overview

Production-grade AI, on a single industrial SOM, at a price built for scale.

You draw. The board knows. In one millisecond — locally, from 345 categories.

No GPU server. No cloud subscription. No per-inference billing.

Just one industrial SOM — the RZ/V2N with the DRP-AI3 accelerator — handling everything on-device.

You draw on the touchscreen. The board recognises what you are sketching — live, in milliseconds.

A MobileNetV2 classifier, quantised to INT8 and accelerated by DRP-AI3 (15 Sparse / 4 Dense TOPS), picks the right answer from 345 categories in roughly one millisecond.

See it in action

Sketch a circle. The model offers its best guesses — cookie · pizza · sun · clock.

Add two pointed ears on top. The top prediction pivots — cat.

Add whiskers and a tail. Confidence locks in — cat, 92 %.

That is the demo: live recognition that sharpens with every stroke.

Draw an apple, a bicycle, an airplane, a house, a guitar, a hamburger — any of the 345 categories in Google's Quick, Draw! taxonomy — and the board's 5 best guesses re-rank live.
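The live top-5 re-ranking described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the board application's code: it assumes the classifier returns one raw score (logit) per category, and the names `softmax` and `top_k` are hypothetical helpers introduced here.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(logits, labels, k=5):
    """Return the k most likely (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Toy example with six categories instead of 345 (scores are made up).
labels = ["cookie", "pizza", "sun", "clock", "cat", "apple"]
scores = [2.0, 1.5, 1.0, 0.5, 3.0, -1.0]
for name, prob in top_k(scores, labels):
    print(f"{name}: {prob:.2f}")
```

Re-running `top_k` on the scores produced after each stroke is what makes the list re-rank as the drawing evolves.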

👉 Source: github.com/SolidRun/quickdraw-rzv2n


🎯 Why It Matters

Real-time AI at the touch surface. Inference takes roughly one millisecond, so there is no perceptible latency between stroke and prediction.

Cloud-free by design. Drawings, predictions, and feedback never leave the board. A natural fit for privacy-sensitive HMI, kiosks, education, and industrial controls.

Cost-effective edge intelligence. DRP-AI3 delivers GPU-class inference performance at a small fraction of the price.

No discrete GPU. No add-on accelerator card. No per-inference cloud bill.

Just one industrial SOM doing the work — at a price point that makes mass deployment realistic.

A complete, reproducible recipe. Train → quantise → compile → cross-compile → deploy → run. Fully scripted. Fully documented. Use it as the starting point for your own edge-AI product.


🧠 The Platform

The demo runs on the HummingBoard-IIoT carrier with the RZ/V2N SOM — SolidRun's industrial Edge-AI platform.

The headlines are below. For the full datasheets and pinouts, follow the links beneath each table.

RZ/V2N SOM — the brain

| Highlight | Value |
|---|---|
| AI engine | DRP-AI3: 15 Sparse TOPS / 4 Dense TOPS |
| CPU | 4× Arm Cortex-A55 @ up to 1.8 GHz + 1× Cortex-M33 (real-time) |
| Memory | Up to 8 GB LPDDR4 with inline ECC |
| Form factor | 47 × 30 mm, industrial −40 °C to +85 °C |

→ Full specs: RZ/V2N SOM product page · Hardware User Manual

HummingBoard-IIoT — the carrier

| Highlight | Value |
|---|---|
| Industrial I/O | RS232/RS485 · CAN-FD · 2× GbE · USB 3.2 · PoE 802.3at |
| Display + camera | MIPI-DSI · MIPI-CSI 4-lane |
| Power | 7–32 V wide-input, reverse-polarity protected, or PoE |
| Operating + OS | Industrial −40 °C to +85 °C · Yocto Linux + Weston |

→ Full specs: HummingBoard RZ/V2N IIOT SBC product page · Quick Start Guide

The demo uses a 1024 × 600 touchscreen over MIPI-DSI, running under Weston (Wayland).

No camera required — input comes from the touch surface only.


🏗️ How It Works

The application runs a tight, deterministic loop between Cairo, the C++ preprocessor, the MERA2 runtime on DRP-AI3, and GTK3.

The on-board preprocessor and the training pipeline share an identical normalisation convention.

mean = [0,0,0], std = [1,1,1] — sketch-specific, not ImageNet.

Mismatch these values and predictions will silently break — TRAINING.md flags this as critical. The Configuration section repeats the warning below.
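The normalisation convention above can be illustrated with a short NumPy sketch. It is not the on-board C++ preprocessor: the function name `preprocess` is hypothetical, the canvas is assumed to be already resized to 128 × 128 grayscale, and the division by 255 is an assumption about how pixel values are scaled before `mean` and `std` are applied.

```python
import numpy as np

# Normalisation constants stated in the text: sketch-specific, not ImageNet.
MEAN = np.array([0.0, 0.0, 0.0], dtype=np.float32)
STD = np.array([1.0, 1.0, 1.0], dtype=np.float32)

def preprocess(gray):
    """Convert a (128, 128) uint8 grayscale image to a (1, 3, 128, 128) float32 tensor."""
    if gray.shape != (128, 128):
        raise ValueError("expected a 128 x 128 grayscale image")
    x = gray.astype(np.float32) / 255.0            # ASSUMPTION: scale pixels to [0, 1]
    x = np.repeat(x[np.newaxis, :, :], 3, axis=0)  # replicate grayscale to 3 channels
    x = (x - MEAN[:, None, None]) / STD[:, None, None]
    return x[np.newaxis, ...]                      # add the batch axis (NCHW)

canvas = np.zeros((128, 128), dtype=np.uint8)
canvas[10, 20] = 255                               # a single drawn pixel
tensor = preprocess(canvas)
print(tensor.shape, tensor.dtype)
```

With `mean = 0` and `std = 1` the normalisation step leaves the values unchanged, which is exactly why swapping in ImageNet constants on only one side would shift every input the model sees.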

Full reference: 🔗 docs/APP.md.


🧮 The Model

| Property | Value |
|---|---|
| Architecture | MobileNetV2: ImageNet-pretrained backbone, custom 1280 → 768 → 345 head |
| Dataset | Google Quick, Draw! (345 classes) |
| Input | [1, 3, 128, 128] (NCHW), grayscale replicated to 3 channels |
| Quantisation | INT8, Percentile 99.99 calibration over 1,725 representative PNGs |
| Compiler | DRP-AI Translator i8 v1.11 → DRP-AI TVM v2.7+ (MERA2 backend) |
| Inference | ~1 ms on DRP-AI3 |
| Training | Two-stage transfer learning · ~6 hours on RTX 5060 Ti · ~82 % validation accuracy |
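To give a feel for what percentile calibration means, here is a generic NumPy sketch of the technique. It is not the DRP-AI Translator's implementation, which this page does not describe: the function names are hypothetical, and symmetric per-tensor quantisation to the range [-127, 127] is an assumption made for the example.

```python
import numpy as np

def percentile_scale(samples, percentile=99.99):
    """Pick an INT8 scale from the given percentile of absolute values."""
    threshold = float(np.percentile(np.abs(samples), percentile))
    if threshold == 0.0:
        raise ValueError("calibration samples are all zero")
    return threshold / 127.0  # ASSUMPTION: symmetric range [-127, 127]

def quantise_int8(values, scale):
    """Round to the nearest INT8 step and clip anything beyond the range."""
    q = np.round(np.asarray(values, dtype=np.float32) / scale)
    return np.clip(q, -127, 127).astype(np.int8)

# Calibration data with one extreme outlier (made-up numbers).
samples = np.concatenate([np.linspace(-1.0, 1.0, 10001), [50.0]])
scale = percentile_scale(samples)
print(scale, quantise_int8([0.0, 1.0, 50.0], scale))
```

Using a high percentile rather than the absolute maximum keeps a rare outlier from stretching the range, so the bulk of the values keep more of the 8-bit resolution.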

Training is optional.

The repository ships a trained qd_model.onnx and a pre-compiled drpai_model/ so you can deploy in minutes.

Retrain only when you want to narrow the category set, swap the dataset, or experiment with the head.

Hardware requirements: training and building have different needs.

For training, a consumer GPU is enough. Reference run: ~6 hours on an RTX 5060 Ti, ~3 GB VRAM, ~82 % accuracy.

You will need either:

  • A local machine with an NVIDIA GPU — CUDA-capable, ≥ 4 GB VRAM at the default batch size, or

  • A cloud ML training platform — Roboflow, Google Colab, Vertex AI, AWS SageMaker, Azure ML, or Paperspace — that handles the GPU for you.

For the full training recipe — NDJSON download, augmentations, two-stage hyper-parameters, ONNX export verification, calibration generation — see 🔗 docs/TRAINING.md.


🛠️ Build & Deploy

Just want to see it run? Build the quickdraw-demo-image Yocto target instead — a Weston-based image that auto-starts the demo on boot.

→ See all available image targets in SolidRun's BSP layer (branch scarthgap_rzv2n_dev).

The pipeline below is the development path — for retraining, recompiling, or modifying the app.

End to end, the pipeline is five stages — most of them one command.

1. Toolchain (one-time). Build the DRP-AI TVM Docker image. It bundles the Renesas RZ/V LP SDK, the DRP-AI Translator i8, and DRP-AI TVM (MERA2), and is reusable across every project. Reference: docs/INSTALL.md.

2. Compile the model. compile_model.sh runs inside the Docker container: it quantises ONNX to INT8, runs the DRP-AI Translator, and emits a MERA2-ready runtime. Result: ONNX → INT8 → DRP-AI runtime.

3. Cross-compile the C++ app. docker_build.sh runs on the host. It auto-detects the container, sources the SDK, and produces an ARM64 binary in board_app/build/. Result: app_quickdraw_gui (~13 MB).

4. Package. Runs automatically at the end of docker_build.sh. It assembles a ~46 MB deploy/ folder with the binary, model, MERA2 libs, configs, and run.sh. Result: ready to ship.

5. Deploy & run. deploy.sh <board-ip> --run copies the bundle to the board over SCP and launches the app under Weston (Wayland). The first run installs the MERA2 libs into /usr/lib64/. Result: live in < 30 seconds.

The fast path is two scripts: docker_build.sh on the host, then deploy.sh <board-ip> --run.

That is the entire developer loop on a known-good toolchain.

The scripts handle SDK sourcing, Docker container detection, the Wayland environment, and MERA2 library installation on first run.

To recompile the model from a fresh ONNX export — for example after retraining on a custom subset of categories — the model-compile step lives inside the Docker container and is invoked via docker cp + docker exec.

The exact recipe is documented step by step in 🔗 docs/BUILD.md.
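As a rough illustration of the docker cp + docker exec pattern mentioned above, the sketch below only builds the two command lines and prints them; it executes nothing. The container name, the in-container working directory, and the assumption that compile_model.sh is run from that directory without arguments are all hypothetical placeholders. The real values are in docs/BUILD.md.

```python
import shlex

# HYPOTHETICAL placeholders: take the real names and paths from docs/BUILD.md.
CONTAINER = "drpai-tvm"
WORKDIR = "/work"

def recompile_commands(onnx_path, container=CONTAINER, workdir=WORKDIR):
    """Return the docker commands as argument lists; nothing is executed here."""
    return [
        # Copy the fresh ONNX export from the host into the container.
        ["docker", "cp", onnx_path, f"{container}:{workdir}/"],
        # Run the compile script inside the container, from that directory.
        ["docker", "exec", "-w", workdir, container, "./compile_model.sh"],
    ]

for command in recompile_commands("qd_model.onnx"):
    print(shlex.join(command))
```

Keeping the commands as argument lists makes them easy to review before handing them to something like `subprocess.run` once the real names are filled in.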


📚 The Full Documentation Set

The upstream repository's docs/ folder is the authoritative, command-by-command reference.

Use this page for orientation. Then follow the link that matches your task.

| Goal | Read this |
|---|---|
| Set up the host PC, the Docker image, and the board OS | docs/INSTALL.md |
| Train the MobileNetV2 model and export ONNX | docs/TRAINING.md |
| Compile the model for DRP-AI, cross-compile, and deploy | docs/BUILD.md |
| Understand the C++ application and its configuration | docs/APP.md |


🎛️ Configuration in Brief

Two text files in the deployed deploy/ directory control everything tunable.

Both are editable on the target. No rebuild required.

  • config.ini — model path · DRP core frequency · AI-MAC frequency.

  • config.json — UI layout · colour palette · prediction smoothing · confidence thresholds · the AI commentary system that drives the witty one-liners above the canvas.

| What you can change without rebuilding | Where |
|---|---|
| DRP / AI-MAC clock (trade power vs. throughput) | config.ini → [drpai] |
| Canvas size, brush radius, refresh rate | config.json → UI keys |
| Auto-predict delay, live-prediction interval | config.json → auto_predict_delay_ms, live_predict_interval_ms |
| Confidence threshold and temporal smoothing window | config.json → confidence_threshold, smooth_window |
| AI commentary tone, templates, no-repeat buffer | config.json → comment pools |
| Brand text, logo placement | config.json → title / subtitle / badge |
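How a confidence threshold and a temporal smoothing window might interact can be sketched as follows. This is a guess at the general idea, not the application's actual logic: the class `SmoothedPredictor` is hypothetical, a plain moving average over the last few probability vectors is assumed, the flat key layout of the JSON snippet is assumed, and the values 0.5 and 3 are made up for the example. Only the key names `confidence_threshold` and `smooth_window` come from the table above.

```python
import json
from collections import deque

class SmoothedPredictor:
    """Average the last N probability vectors, then apply a confidence gate."""

    def __init__(self, smooth_window, confidence_threshold):
        self.history = deque(maxlen=smooth_window)  # oldest frame drops out automatically
        self.threshold = confidence_threshold

    def update(self, probs):
        """Add one frame; return (class index or None, smoothed confidence)."""
        self.history.append(list(probs))
        count = len(self.history)
        averaged = [sum(column) / count for column in zip(*self.history)]
        best = max(range(len(averaged)), key=averaged.__getitem__)
        if averaged[best] >= self.threshold:
            return best, averaged[best]
        return None, averaged[best]  # below threshold: report no confident guess

# HYPOTHETICAL values and layout; the real file may nest these keys.
cfg = json.loads('{"confidence_threshold": 0.5, "smooth_window": 3}')
predictor = SmoothedPredictor(cfg["smooth_window"], cfg["confidence_threshold"])
for frame in ([0.2, 0.8], [0.6, 0.4], [0.6, 0.4], [0.6, 0.4]):
    print(predictor.update(frame))
```

A larger window makes the displayed guess steadier at the cost of reacting more slowly to new strokes, which is the trade-off the two settings let you tune on the target.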

Full reference: 🔗 docs/APP.md.

