Conversational AI on Embedded Hardware: Baru HRI System

Maedcore builds Baru: a conversational AI system running on embedded hardware with multi-modal sensor input (voice, touch, distance), adaptive NLP, and session-level personalisation. Full HRI case study.

Written by Eduardo Fuentevilla Blanco

Robotics Engineer at Maedcore

December 1, 2024 · 5 min read
Reviewed by Maedcore Team
Executive summary: Baru is a complete human-robot interaction (HRI) system built by Maedcore. It combines a conversational AI engine running on embedded hardware with a multi-modal sensor array (distance, touch, voice), session-state management, and continuous NLP personalisation. The deployment form factor is a child-safe zoomorphic enclosure, a design decision aimed at removing the acceptance barrier in constrained interaction contexts. The underlying architecture (edge AI inference, multi-modal input fusion, adaptive response generation) is directly applicable to industrial HMI panels, voice-controlled machinery, and human-machine collaboration systems in manufacturing environments.


The Engineering Challenge

Building a conversational AI system that runs reliably on constrained embedded hardware — without cloud dependency — while handling multi-modal input in real time presents three core challenges:

Latency under resource constraints. NLP inference and sensor polling must run concurrently on a single embedded system without perceptible response delay. Any lag between user input and system response breaks the interaction loop and degrades perceived intelligence.

Multi-modal input fusion. The system receives simultaneous input from three sensor types — ultrasonic distance sensors, capacitive touch sensors, and a microphone array — each with different polling rates and data formats. The controller must fuse these streams into a coherent interaction context.

Adaptive personalisation without cloud dependency. Session state and interaction history are stored and processed on-device, allowing the AI to personalise responses over time without transmitting sensitive data to external servers.


System Architecture

Figure: Baru conversational AI system with multi-modal sensor interface and adaptive NLP engine.

The Baru system operates on three integrated layers:

Layer 1 — Multi-Modal Sensor Input

Three sensor streams feed the interaction controller simultaneously:

  • Ultrasonic distance sensors — detect user proximity and presence, triggering wake-on-approach behaviour without requiring explicit user action.
  • Capacitive touch sensors — register intentional contact input, mapped to interaction triggers and conversational branching points.
  • Microphone array — captures voice input for NLP processing, with hardware-level noise filtering for noisy environments.
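As an illustration of wake-on-approach, the following minimal sketch applies hysteresis to ultrasonic distance readings so that the system does not flicker between awake and asleep when a user hovers near the threshold. The class name `PresenceDetector` and both distance thresholds are assumptions made for this example; the source does not state how Baru implements presence detection.

```python
from dataclasses import dataclass


@dataclass
class PresenceDetector:
    """Wake-on-approach with hysteresis. Thresholds are hypothetical values."""
    wake_cm: float = 80.0    # assumed: wake when the user comes closer than this
    sleep_cm: float = 120.0  # assumed: sleep only once the user is farther than this
    awake: bool = False

    def update(self, distance_cm: float) -> bool:
        """Feed one distance reading and return the current awake state."""
        if not self.awake and distance_cm < self.wake_cm:
            self.awake = True
        elif self.awake and distance_cm > self.sleep_cm:
            self.awake = False
        return self.awake


detector = PresenceDetector()
readings_cm = [200.0, 150.0, 75.0, 100.0, 130.0]
states = [detector.update(r) for r in readings_cm]
print(states)  # [False, False, True, True, False]
```

The gap between the two thresholds is the design choice here: a reading of 100 cm keeps an already awake system awake, but would not wake a sleeping one.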

All three streams are polled asynchronously and merged by the input fusion controller, which assigns priority weights based on interaction context (e.g., voice dominates when the user is speaking; proximity dominates at session initialisation).
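One way to picture context-dependent priority weighting is the sketch below. It assumes each sensor stream reports an event with a confidence score, and that the controller picks the event with the highest context-weighted score. The names `Context`, `SensorEvent`, `fuse`, and every weight value are hypothetical; the source only states that voice dominates while the user is speaking and proximity dominates at session initialisation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Sequence


class Context(Enum):
    SESSION_INIT = "session_init"
    USER_SPEAKING = "user_speaking"


# Hypothetical priority weights per interaction context (not from the source).
PRIORITY_WEIGHTS = {
    Context.SESSION_INIT: {"proximity": 0.6, "touch": 0.3, "voice": 0.1},
    Context.USER_SPEAKING: {"proximity": 0.1, "touch": 0.2, "voice": 0.7},
}


@dataclass(frozen=True)
class SensorEvent:
    source: str        # "proximity", "touch" or "voice"
    confidence: float  # detection confidence in [0.0, 1.0]


def fuse(events: Sequence[SensorEvent], context: Context) -> Optional[SensorEvent]:
    """Return the event with the highest context-weighted confidence."""
    if not events:
        return None
    weights = PRIORITY_WEIGHTS[context]
    return max(events, key=lambda e: weights[e.source] * e.confidence)


events = [SensorEvent("proximity", 0.9), SensorEvent("voice", 0.8)]
dominant_at_init = fuse(events, Context.SESSION_INIT)
dominant_while_speaking = fuse(events, Context.USER_SPEAKING)
print(dominant_at_init.source, dominant_while_speaking.source)  # proximity voice
```

The same pair of events resolves differently depending on context, which is the behaviour the paragraph above describes.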

Layer 2 — Conversational AI Engine

The NLP pipeline processes fused input and generates responses on-device:

Speech-to-intent classification maps spoken input to one of the system’s defined interaction intents, handling natural linguistic variation without requiring exact phrasing.
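To make the idea of mapping varied phrasing onto a fixed intent set concrete, here is a deliberately simple keyword-overlap classifier. It assumes the speech has already been transcribed to text, and the intent names and keyword sets are invented for illustration. The source does not describe the actual classification model, which is likely more sophisticated.

```python
import re

# Hypothetical intents and trigger keywords (not taken from the Baru system).
INTENT_KEYWORDS = {
    "greet": {"hello", "hi", "hey", "morning"},
    "play_game": {"play", "game", "fun"},
    "tell_story": {"story", "tale", "read"},
}


def classify_intent(utterance: str, min_hits: int = 1) -> str:
    """Return the intent whose keywords overlap most with the utterance."""
    tokens = set(re.findall(r"[a-z']+", utterance.lower()))
    best_intent, best_hits = "unknown", 0
    for intent, keywords in INTENT_KEYWORDS.items():
        hits = len(tokens & keywords)
        if hits > best_hits:
            best_intent, best_hits = intent, hits
    return best_intent if best_hits >= min_hits else "unknown"


print(classify_intent("Can you tell me a story?"))  # tell_story
print(classify_intent("Hey, let's play a game!"))   # play_game
print(classify_intent("What time is it"))           # unknown
```

Different phrasings that share a keyword land on the same intent, and anything with no overlap falls through to `unknown` rather than being forced into a wrong branch.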

Contextual response generation selects and adapts output based on the current session state, the user’s interaction history, and the active intent. Responses are generated as parameterised templates, allowing variation without requiring generative inference at runtime.
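A parameterised-template approach could look roughly like the sketch below, which uses Python's standard `string.Template`. The template texts, the session fields (`name`, `sessions_completed`), and the rule for switching between a simple and an advanced variant are all assumptions for this example, not details from the Baru implementation.

```python
from string import Template

# Hypothetical templates keyed by (intent, complexity level).
TEMPLATES = {
    ("greet", "simple"): Template("Hi $name!"),
    ("greet", "advanced"): Template("Hello $name, it is good to see you again."),
    ("tell_story", "simple"): Template("Story time, $name!"),
    ("tell_story", "advanced"): Template("Shall we continue our story, $name?"),
}

FALLBACK = "Sorry, I did not understand."


def render_response(intent: str, session: dict) -> str:
    """Pick a template from the intent and session state, then fill it in."""
    # Assumed rule: returning users get the more complex phrasing.
    level = "advanced" if session.get("sessions_completed", 0) >= 3 else "simple"
    template = TEMPLATES.get((intent, level))
    if template is None:
        return FALLBACK
    return template.safe_substitute(name=session.get("name", "friend"))


print(render_response("greet", {"name": "Ana", "sessions_completed": 5}))
# Hello Ana, it is good to see you again.
print(render_response("greet", {}))
# Hi friend!
```

Selecting and filling a template is a dictionary lookup plus string substitution, which is why this style of response generation avoids generative inference at runtime.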

Emotion and engagement signalling is conveyed through the expressive display (facial states) and audio output, synchronised with the NLP response.

Layer 3 — Session State and Personalisation

Interaction data is persisted per user session:

  • Vocabulary level and response complexity adapt to demonstrated linguistic patterns.
  • Engagement metrics (response latency, touch frequency, proximity patterns) update the personalisation model after each session.
  • Cumulative data is available for export to external analysis systems via a secure local API — no cloud transmission required.
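The per-session update of engagement metrics can be sketched with an exponential moving average, which keeps a running profile without storing every past session. This is an illustrative technique chosen for the example; the source does not say which model Baru uses, and the class name, field names, and smoothing factor `alpha` are assumptions.

```python
from dataclasses import dataclass


@dataclass
class EngagementProfile:
    """Running engagement profile, updated once per session (hypothetical model)."""
    response_latency_s: float = 0.0
    touches_per_min: float = 0.0
    sessions: int = 0

    def update(self, latency_s: float, touches_per_min: float, alpha: float = 0.5) -> None:
        """Blend the latest session into the profile with smoothing factor alpha."""
        if self.sessions == 0:
            # First session: take the observed values as the starting point.
            self.response_latency_s = latency_s
            self.touches_per_min = touches_per_min
        else:
            self.response_latency_s += alpha * (latency_s - self.response_latency_s)
            self.touches_per_min += alpha * (touches_per_min - self.touches_per_min)
        self.sessions += 1


profile = EngagementProfile()
profile.update(latency_s=2.0, touches_per_min=4.0)
profile.update(latency_s=4.0, touches_per_min=8.0)
print(profile)
```

A larger `alpha` makes the profile react faster to the most recent session; a smaller one favours long-term behaviour.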

Implementation: Embedded Hardware Integration

Figure: Baru system during the embedded hardware integration and sensor validation phase.

The hardware integration process involved three engineering phases:

Component specification and layout. Processing unit selection balanced NLP inference performance against power envelope and thermal constraints. Sensor placement was validated against occlusion patterns and interaction geometry — the system must detect approach from any angle, regardless of user height.

Real-time OS configuration. The embedded OS was configured for deterministic task scheduling, ensuring the NLP inference loop and sensor polling loops share CPU time without priority inversion or starvation under peak load.
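One standard way to reason about whether periodic tasks can share a CPU without starvation is the rate-monotonic utilisation bound of Liu and Layland: n periodic tasks are guaranteed schedulable under rate-monotonic priorities if total utilisation does not exceed n(2^(1/n) - 1). The sketch below applies that check. The source does not state which scheduling policy Baru uses, and the execution times and the NLP period are hypothetical; only the 60 Hz polling rate comes from the article.

```python
def rm_utilisation_bound(n: int) -> float:
    """Liu and Layland bound for n periodic tasks under rate-monotonic scheduling."""
    return n * (2 ** (1 / n) - 1)


def is_schedulable(tasks):
    """tasks: list of (worst_case_exec_ms, period_ms).

    Returns (passes_bound, utilisation). The test is sufficient, not necessary:
    a task set that fails it may still be schedulable.
    """
    utilisation = sum(exec_ms / period_ms for exec_ms, period_ms in tasks)
    return utilisation <= rm_utilisation_bound(len(tasks)), utilisation


# Hypothetical task set: sensor polling at 60 Hz and an NLP step every 100 ms.
tasks = [
    (2.0, 1000 / 60),  # assumed 2 ms of work per polling period
    (40.0, 100.0),     # assumed 40 ms of work per NLP step
]
ok, utilisation = is_schedulable(tasks)
print(ok, round(utilisation, 2), round(rm_utilisation_bound(len(tasks)), 3))
# True 0.52 0.828
```

With these assumed numbers the task set uses about 52% of the CPU, below the two-task bound of roughly 82.8%, so both loops would meet their deadlines.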

Acoustic enclosure engineering. The microphone array required an acoustic geometry that maximises voice pickup while attenuating structural vibration from the servo-driven display actuators embedded in the same chassis.


Performance Results

Metric | Result
Voice-to-response latency | < 800 ms end-to-end on-device
Sensor fusion polling rate | 60 Hz across all three streams
Session personalisation data | Stored and updated per interaction, no cloud dependency
Operating environment | Continuous multi-hour operation at room temperature
Input variability tolerance | Handles natural speech variation, background noise, and partial sensor occlusion
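To relate the two timing figures reported above, the short calculation below converts the 60 Hz polling rate into a per-cycle period and counts how many polling cycles fit inside the 800 ms latency ceiling. It uses only the two numbers from the results; the variable names are chosen for this example.

```python
POLL_RATE_HZ = 60        # sensor fusion polling rate from the results
LATENCY_BUDGET_MS = 800  # upper bound on voice-to-response latency from the results

poll_period_ms = 1000 / POLL_RATE_HZ
cycles_in_budget = LATENCY_BUDGET_MS * POLL_RATE_HZ / 1000

print(f"{poll_period_ms:.2f} ms per polling cycle")    # 16.67 ms per polling cycle
print(f"{cycles_in_budget:.0f} cycles within budget")  # 48 cycles within budget
```

The fusion controller therefore refreshes its view of the sensors up to 48 times while a single voice response is being produced.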

Technology Applications Beyond This Deployment

The Baru HRI architecture addresses a class of problems that recurs across industrial and enterprise contexts:

Industrial HMI panels. A voice and touch interface running conversational AI on embedded hardware — with no cloud dependency — is directly applicable to factory-floor control panels where network connectivity is unreliable and response latency is critical.

Voice-controlled machinery. The multi-modal input fusion layer (voice + proximity + touch) provides a more robust control interface than single-modality voice systems, reducing false-trigger rates in noisy industrial environments.

Human-machine collaboration. The adaptive session state layer — building a behavioural model of the operator over time — is the foundation for assistive systems that adjust to individual work patterns rather than requiring fixed interaction protocols.

Accessibility interfaces. The system’s low-pressure, non-screen-dependent interaction model translates directly to accessible interfaces for operators with motor or cognitive constraints.


Technologies Used

Project developed with: Conversational AI — NLP — Embedded Systems — Edge AI Inference — Multi-Modal Sensor Fusion — Human-Robot Interaction (HRI) — Session State Management


Building an HRI or Conversational AI System?

Baru demonstrates Maedcore’s ability to take a conversational AI system from architecture to embedded hardware deployment — without cloud dependency, with full sensor integration and adaptive personalisation. If you need an HRI solution, an industrial voice interface, or an edge AI system for a constrained environment, request a technical consultation.

About the Author

Eduardo Fuentevilla Blanco

Robotics Engineer

For over a decade, I have been driven by a single mission: leveraging AI and robotics to build a world of automated production. I believe that by creating self-sufficient systems, we can empower people to refocus on what truly matters—their families and their passions. My expertise spans from winning prestigious European startup competitions to architecting complex, integrated hardware and software projects. I specialize in bridging the gap between today's industrial challenges and tomorrow's autonomous solutions.

AI & Robotics · Industrial Automation · Hardware & Software Integration · IoT