Full-Stack Autonomous Robot: Computer Vision, NLP, and Mechatronics

Maedcore designs and builds a complete autonomous robot: dual-camera computer vision, on-board NLP inference, 15-servo actuation, a 22-part 3D-printed chassis, and thermal-printer output. An end-to-end robotics case study.

Written by Eduardo Fuentevilla Blanco

Robotics Engineer at Maedcore

February 1, 2024 · 7 min read
Reviewed by Maedcore Team

Executive summary: Maedcore designed and built a complete autonomous mobile robot from the ground up: a dual-camera computer vision system for object detection and targeting, an on-board NLP inference engine that generates natural-language output, a 15-servo actuation system for locomotion and expression, a 3D-printed chassis manufactured across 22 parts, and a mini thermal printer for physical output delivery. The project covers the full robotics stack — mechanical design, electronics wiring, embedded systems integration, and AI code development — in a single end-to-end build. The deployment application is autonomous art critique; the engineering stack is applicable to inspection robots, autonomous navigation platforms, and human-robot collaboration systems.


The Engineering Scope

This project required Maedcore to operate across five simultaneous engineering disciplines:

  1. Mechanical design — CAD modelling of a quadruped chassis optimised for 3D printability, internal component packaging, and maintenance access.
  2. 3D manufacturing — 22-part print job, with individual parts requiring up to 36 hours of print time, followed by post-processing and assembly.
  3. Electronics integration — full wiring of a 15-servo actuation system, dual cameras, audio hardware, thermal printer, cooling, and power management.
  4. Embedded systems — real-time OS configuration, task scheduling for concurrent computer vision and locomotion control, sensor-to-actuator latency management.
  5. AI development — computer vision model for object detection and targeting, NLP pipeline for natural-language output generation, integration of both into a single operational loop.

The key constraint across all five disciplines: every component must fit within a sealed chassis, with no external wiring or exposed hardware.


Stage 1: Mechanical Design

Figure: 3D model of the autonomous robot chassis with all electronic components positioned for fit and maintenance access.

The mechanical design phase established the constraints for everything downstream:

Conceptual CAD model — overall geometry, joint positions, and movement envelope. The quadruped configuration was chosen for stability on irregular surfaces and for the range of expressive poses achievable through coordinated servo actuation.

Detailed component layout — every electronic component was modelled into the chassis at this stage. Decisions made here determined:

  • Internal airflow paths for the cooling fan.
  • Cable routing from the 15 servos to the central controller.
  • Camera mounting geometry for the required field-of-view angles.
  • Thermal printer placement relative to the output slot in the body.

Manufacturing split — the chassis was divided into 22 printable parts, each sized to fit within the print volume while minimising support material and post-processing time.


Stage 2: 3D Manufacturing and Assembly

The 22 chassis parts were printed sequentially, with the largest parts requiring up to 36 hours each. Print parameters (infill density, wall count, layer height) were tuned per part according to structural load: structural members use higher infill than enclosure panels.
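One way to keep per-part print settings consistent is a small lookup keyed by the part's structural role. The sketch below is illustrative only: the role names, the `PrintProfile` type, and every numeric value are placeholders, since the article does not publish the settings that were actually used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrintProfile:
    """Slicer settings for one class of part (hypothetical type)."""
    infill_pct: int          # infill density, percent
    wall_count: int          # number of perimeter walls
    layer_height_mm: float   # layer height, millimetres


# Placeholder values for illustration; not the settings used in the project.
PROFILES = {
    "structural": PrintProfile(infill_pct=40, wall_count=4, layer_height_mm=0.2),
    "enclosure": PrintProfile(infill_pct=15, wall_count=2, layer_height_mm=0.2),
}


def profile_for(part_role: str) -> PrintProfile:
    """Return the profile for a part role, failing loudly on unknown roles."""
    try:
        return PROFILES[part_role]
    except KeyError:
        raise ValueError(f"unknown part role: {part_role!r}") from None
```

The only property taken from the text is the ordering: the structural profile carries more infill than the enclosure profile.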

Assembly proceeded in three sub-stages:

  1. Head assembly — integrating the wide-angle camera, autofocus camera, speaker, and microphone into the head unit before closing the enclosure.
  2. Body assembly — routing all cabling from the servo harness, power management board, and thermal printer into the main chassis before closing.
  3. Limb assembly — attaching and calibrating the 15 servos for the legs, tail, and head, with endpoint position verification before software integration.
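Endpoint verification during limb assembly can be pictured as comparing each servo's measured travel against its calibrated limits. The following is a minimal sketch under stated assumptions: the 0 to 180 degree range, the 500 to 2500 microsecond pulse range, and the 2 degree tolerance are typical hobby-servo placeholders, not values from this project, and `ServoCalibration` and `endpoints_ok` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServoCalibration:
    """Angle limits and matching pulse widths for one servo (placeholder defaults)."""
    min_angle_deg: float = 0.0
    max_angle_deg: float = 180.0
    min_pulse_us: float = 500.0
    max_pulse_us: float = 2500.0

    def clamp(self, angle_deg: float) -> float:
        """Limit a requested angle to the calibrated travel."""
        return max(self.min_angle_deg, min(self.max_angle_deg, angle_deg))

    def pulse_for(self, angle_deg: float) -> float:
        """Linearly map a (clamped) angle to a pulse width in microseconds."""
        angle = self.clamp(angle_deg)
        span = self.max_angle_deg - self.min_angle_deg
        fraction = (angle - self.min_angle_deg) / span
        return self.min_pulse_us + fraction * (self.max_pulse_us - self.min_pulse_us)


def endpoints_ok(cal: ServoCalibration, measured_min_deg: float,
                 measured_max_deg: float, tolerance_deg: float = 2.0) -> bool:
    """True if both measured endpoints fall within tolerance of the calibration."""
    return (abs(measured_min_deg - cal.min_angle_deg) <= tolerance_deg
            and abs(measured_max_deg - cal.max_angle_deg) <= tolerance_deg)
```

Clamping before the pulse conversion means an out-of-range command can never drive a joint past its mechanical stop, which matters once the chassis is closed.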

Stage 3: Electronics and Wiring

The electronics integration required coordinating the following hardware components:

| Component | Function |
| --- | --- |
| Wide-angle camera | Scene detection: identifies objects in the environment |
| Autofocus camera | Target capture: high-resolution image acquisition of selected objects |
| Central processing unit | Runs CV model, NLP model, and locomotion controller |
| 15 servos | Leg joints (12), tail (1), head pan (1), head tilt (1) |
| Internal cooling fan | Thermal management for CPU under inference load |
| Speaker | Audio output for NLP-generated speech |
| Microphone | Environmental audio input and interaction detection |
| Mini thermal printer | Physical output: prints generated text on paper |
| Communication antenna | Wireless connectivity for remote monitoring |
| Power management board | Voltage regulation and battery management |
| 4 wheel drive motors | Autonomous locomotion on flat surfaces |

All wiring was completed before the chassis was closed, because the sealed design leaves no post-assembly access to internal cabling.
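The servo allocation listed above (12 leg joints, 1 tail, 1 head pan, 1 head tilt) lends itself to a named channel map, so that software refers to joints by name rather than by wire position. This is a sketch, not the project's wiring: it assumes three joints on each of four legs, and the leg names, joint names, and channel order are all hypothetical.

```python
# Hypothetical naming scheme: 4 legs x 3 joints = 12 leg servos.
LEGS = ("front_left", "front_right", "rear_left", "rear_right")
JOINTS_PER_LEG = ("hip", "upper", "lower")


def build_channel_map() -> dict[str, int]:
    """Assign one controller channel to each of the 15 servos."""
    names = [f"{leg}_{joint}" for leg in LEGS for joint in JOINTS_PER_LEG]
    names += ["tail", "head_pan", "head_tilt"]
    return {name: channel for channel, name in enumerate(names)}


CHANNELS = build_channel_map()

# Guard against a miscounted harness before any servo is driven.
assert len(CHANNELS) == 15
```

Checking the count at import time is a cheap way to catch a mismatch between the harness and the software before the enclosure is sealed.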


Stage 4: AI Integration

Figure: Autonomous robot computer vision system during object detection validation tests.

The AI system operates as a two-stage sequential pipeline:

Stage A — Object Detection and Navigation

The wide-angle camera feeds a computer vision model that detects and classifies objects within the robot’s field of view. When a valid target is identified, the system:

  1. Calculates the angular offset between the robot’s current heading and the target.
  2. Transmits navigation commands to the locomotion controller.
  3. Drives the robot forward, steering toward the target.
  4. Halts when the target is within the autofocus camera’s optimal capture range.
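The first and last steps of this list can be sketched with basic geometry. The example assumes a pinhole camera model, a wide-angle camera aligned with the robot's heading, a simple proportional steering law, and a distance estimate supplied by some other part of the system; none of these details are stated in the article, and all function names, gains, and ranges are hypothetical.

```python
import math


def horizontal_offset_deg(target_x_px: float, image_width_px: int,
                          hfov_deg: float) -> float:
    """Angle from the optical axis to the target; positive means target to the right."""
    half_width = image_width_px / 2.0
    # Focal length in pixels, derived from the horizontal field of view.
    focal_px = half_width / math.tan(math.radians(hfov_deg) / 2.0)
    return math.degrees(math.atan((target_x_px - half_width) / focal_px))


def steering_command(offset_deg: float, gain: float = 0.02,
                     max_cmd: float = 1.0) -> float:
    """Proportional steering command, clamped to [-max_cmd, max_cmd]."""
    return max(-max_cmd, min(max_cmd, gain * offset_deg))


def within_capture_range(distance_m: float, near_m: float, far_m: float) -> bool:
    """True when the estimated distance lies inside the capture window."""
    return near_m <= distance_m <= far_m
```

For a 640-pixel-wide frame, a target centred at x = 320 gives a zero offset and therefore a zero steering command, so the robot drives straight ahead.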

The navigation loop runs at 10 Hz, updating the heading correction on every frame.
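A 10 Hz loop is usually implemented against fixed deadlines rather than with a fixed sleep, so that processing time does not accumulate as drift. The sketch below shows one way to do that; `run_fixed_rate` and its parameters are hypothetical, and the injectable `clock` and `sleep` exist only to make the function testable without real delays.

```python
import time
from typing import Callable, Optional


def run_fixed_rate(step: Callable[[], Optional[bool]], rate_hz: float = 10.0,
                   max_iterations: Optional[int] = None,
                   clock: Callable[[], float] = time.monotonic,
                   sleep: Callable[[float], None] = time.sleep) -> int:
    """Call step() at a fixed rate until it returns False or the limit is hit.

    Returns the number of steps that completed without requesting a stop.
    """
    period = 1.0 / rate_hz
    next_deadline = clock()
    iterations = 0
    while max_iterations is None or iterations < max_iterations:
        if step() is False:
            break
        iterations += 1
        next_deadline += period
        remaining = next_deadline - clock()
        if remaining > 0:
            sleep(remaining)
        else:
            # The step overran its slot: resynchronise instead of bursting to catch up.
            next_deadline = clock()
    return iterations
```

Here `step` would wrap one pass of detection, offset calculation, and command transmission, returning `False` once the robot reaches the capture position.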

Stage B — Analysis and Output Generation

Once positioned, the autofocus camera captures a high-resolution image of the target. This image is processed by the second AI stage:

  1. The CV model extracts visual features from the captured image.
  2. These features are passed to the NLP model, which generates a natural-language description of the target.
  3. The generated text is routed to the speaker for audio output.
  4. Simultaneously, the thermal printer produces a paper printout of the output.
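The four steps above amount to a short chain followed by a fan-out to two output devices. The sketch below models that shape with plain callables and a thread pool; `extract_features`, `generate_text`, and the sinks stand in for the CV model, the NLP model, the speaker, and the thermal printer, and are hypothetical placeholders rather than the project's actual interfaces.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Iterable


def deliver(text: str, sinks: Iterable[Callable[[str], Any]]) -> None:
    """Send the same text to every sink concurrently and wait for all of them."""
    sink_list = list(sinks)
    with ThreadPoolExecutor(max_workers=max(1, len(sink_list))) as pool:
        futures = [pool.submit(sink, text) for sink in sink_list]
        for future in futures:
            future.result()  # re-raises any exception from a sink


def run_stage_b(image: Any,
                extract_features: Callable[[Any], Any],
                generate_text: Callable[[Any], str],
                sinks: Iterable[Callable[[str], Any]]) -> str:
    """Image -> features -> text -> concurrent delivery to all sinks."""
    features = extract_features(image)
    text = generate_text(features)
    deliver(text, sinks)
    return text
```

Waiting on every future means a printer jam or audio fault surfaces as an exception instead of failing silently, which is one reasonable choice for an unattended robot.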

The full sequence (detect, navigate, capture, generate) runs without human intervention from initial detection to final output.
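The detect, navigate, capture, generate sequence can be read as a small state machine. The sketch below is one possible encoding; the `Phase` names and the flags are hypothetical, and two transitions are assumptions not described in the article: falling back to detection when the target is lost, and returning to detection after output is delivered.

```python
from enum import Enum, auto


class Phase(Enum):
    DETECT = auto()
    NAVIGATE = auto()
    CAPTURE = auto()
    GENERATE = auto()


def next_phase(phase: Phase, *, target_found: bool = False,
               in_capture_range: bool = False, image_captured: bool = False,
               output_done: bool = False) -> Phase:
    """Return the phase that follows `phase` given the current observations."""
    if phase is Phase.DETECT:
        return Phase.NAVIGATE if target_found else Phase.DETECT
    if phase is Phase.NAVIGATE:
        if not target_found:
            return Phase.DETECT  # assumption: a lost target restarts detection
        return Phase.CAPTURE if in_capture_range else Phase.NAVIGATE
    if phase is Phase.CAPTURE:
        return Phase.GENERATE if image_captured else Phase.CAPTURE
    if phase is Phase.GENERATE:
        # assumption: the robot looks for a new target once output is delivered
        return Phase.DETECT if output_done else Phase.GENERATE
    raise ValueError(f"unhandled phase: {phase!r}")
```

Keeping the transitions in one pure function makes the autonomous behaviour easy to test on a desk, without cameras or servos attached.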


Performance Metrics

| Metric | Result |
| --- | --- |
| Object detection accuracy | High (validated across multiple target types and lighting conditions) |
| Navigation-to-target time | ~15 seconds from detection to optimal capture position |
| NLP output generation | < 3 seconds from image capture to audio output |
| Servo coordination | 15 servos operating in coordinated gait patterns without actuation conflicts |
| Continuous operation | Multi-hour autonomous operation between maintenance interventions |
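Timing figures such as the under-3-second generation latency are straightforward to check with a small wrapper around the call being measured. This is a generic sketch, not the measurement method used for the numbers above; `timed` and `within_budget` are hypothetical helpers, and the 3-second default simply mirrors the figure in the table.

```python
import time
from typing import Any, Callable, Tuple


def timed(fn: Callable[..., Any], *args: Any,
          clock: Callable[[], float] = time.perf_counter,
          **kwargs: Any) -> Tuple[Any, float]:
    """Run fn and return (result, elapsed seconds)."""
    start = clock()
    result = fn(*args, **kwargs)
    return result, clock() - start


def within_budget(elapsed_s: float, budget_s: float = 3.0) -> bool:
    """True when the elapsed time is strictly under the latency budget."""
    return elapsed_s < budget_s
```

Wrapping the whole capture-to-audio path, rather than the model call alone, would match how the metric is worded.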

Technology Applications

The engineering disciplines exercised in this project map directly to industrial robotics applications:

Autonomous inspection. The perception-navigation-capture pipeline — detect target, move to optimal position, capture high-resolution data — is the core architecture for automated inspection robots in manufacturing, infrastructure, and energy.

Industrial navigation. The multi-servo locomotion system and real-time CV-driven navigation loop scale to warehouse AGVs and assembly-line robots operating in unstructured environments.

Human-robot collaboration. The audio I/O system and NLP output layer are the foundation for robots that communicate their status, findings, and actions to human operators in natural language.

Computer vision quality control. The two-stage CV pipeline (wide-angle detection + autofocus capture) is directly applicable to automated visual inspection systems for defect detection on production lines.


Technologies Used

Project developed with: Autonomous Robotics — Computer Vision — NLP — Embedded Systems — 3D Design and Manufacturing — Servo Control — Mechatronics — Edge AI Inference


Building an Autonomous Robot or Computer Vision System?

This project demonstrates that Maedcore can take a complex robotics system from CAD design to fully autonomous operation — integrating mechanical engineering, electronics, embedded systems, and AI in a single build. If you have a robotics, inspection, or computer vision requirement, request a technical quote.


About the Author

Eduardo Fuentevilla Blanco

Robotics Engineer

For over a decade, I have been driven by a single mission: leveraging AI and robotics to build a world of automated production. I believe that by creating self-sufficient systems, we can empower people to refocus on what truly matters—their families and their passions. My expertise spans from winning prestigious European startup competitions to architecting complex, integrated hardware and software projects. I specialize in bridging the gap between today's industrial challenges and tomorrow's autonomous solutions.

AI & Robotics · Industrial Automation · Hardware & Software Integration · IoT
