Communication Protocols: ROS 2, EtherCAT, and Real-Time Networking for Humanoid Robot Subsystems
DOI: 10.5281/zenodo.19153562[1] · View on Zenodo (CERN)
| Badge | Metric | Value | Status | Description |
|---|---|---|---|---|
| [s] | Reviewed Sources | 7% | ○ | ≥80% from editorially reviewed sources |
| [t] | Trusted | 50% | ○ | ≥80% from verified, high-quality sources |
| [a] | DOI | 21% | ○ | ≥80% have a Digital Object Identifier |
| [b] | CrossRef | 0% | ○ | ≥80% indexed in CrossRef |
| [i] | Indexed | 100% | ✓ | ≥80% have metadata indexed |
| [l] | Academic | 21% | ○ | ≥80% from journals/conferences/preprints |
| [f] | Free Access | 79% | ○ | ≥80% are freely accessible |
| [r] | References | 14 refs | ✓ | Minimum 10 references required |
| [w] | Words [REQ] | 2,872 | ✓ | Minimum 2,000 words for a full research article. Current: 2,872 |
| [d] | DOI [REQ] | ✓ | ✓ | Zenodo DOI registered for persistent citation. DOI: 10.5281/zenodo.19153562 |
| [o] | ORCID [REQ] | ✓ | ✓ | Author ORCID verified for academic identity |
| [p] | Peer Reviewed [REQ] | — | ✗ | Peer reviewed by an assigned reviewer |
| [h] | Freshness [REQ] | 42% | ✗ | ≥80% of references from 2025–2026. Current: 42% |
| [c] | Data Charts | 0 | ○ | Original data charts from reproducible analysis (min 2). Current: 0 |
| [g] | Code | — | ○ | Source code available on GitHub |
| [m] | Diagrams | 3 | ✓ | Mermaid architecture/flow diagrams. Current: 3 |
| [x] | Cited by | 0 | ○ | Referenced by 0 other hub article(s) |
Abstract #
A humanoid robot with more than forty actuated degrees of freedom generates a continuous stream of sensor readings, motor commands, and state estimates that must be exchanged between distributed controllers within deterministic time bounds. This article examines the communication protocols that form the nervous system of an open-source humanoid platform: the Robot Operating System 2 (ROS 2) middleware layer for high-level coordination, the EtherCAT fieldbus for sub-millisecond servo-loop networking, and emerging standards such as Time-Sensitive Networking (TSN) and Zenoh that promise to unify best-effort and hard-real-time traffic on a single physical medium. We analyse the layered architecture from application publish-subscribe down to wire-level frame processing, quantify latency and jitter constraints for bipedal balance control, and propose a dual-bus topology that separates safety-critical motion data from perception and planning traffic. The discussion incorporates recent benchmarks from the ROS 2 real-time community, the NXP-NVIDIA Holoscan Sensor Bridge reference design for humanoid motor control, and the Meta-ROS middleware evaluation. By the end of this article, the reader will understand how to select, integrate, and validate communication stacks for a forty-plus-joint humanoid that must walk, manipulate, and interact in real time.
1. Introduction #
In the previous article, we analysed battery architecture, energy harvesting strategies, and runtime optimisation techniques for autonomous humanoid robots, establishing that sustained bipedal operation demands careful co-design of power electronics and control software (Ivchenko, 2026). With an energy budget defined, the next engineering challenge is equally fundamental: how do the dozens of sensors, actuators, and compute units inside the robot exchange data fast enough to keep it upright?
The communication subsystem of a humanoid is analogous to the human nervous system. Peripheral sensors — joint encoders, inertial measurement units, force-torque plates — must report to central controllers with latencies measured in microseconds, while higher-level perception and planning modules tolerate millisecond-scale delays (Casini et al., 2026[2]). A mismatch between the timing requirements of the control loop and the guarantees of the communication stack leads to instability: a bipedal robot that receives a foot-contact signal 2 ms late may already be falling.
The robotics community has converged on ROS 2 as the de facto middleware for modular, distributed software development (Ranjan et al., 2026[3]). However, ROS 2 alone cannot satisfy the sub-100-microsecond cycle times demanded by high-bandwidth servo loops. Industrial fieldbus protocols — EtherCAT foremost among them — fill this gap by providing hardware-synchronised, deterministic Ethernet communication at cycle times as low as 62.5 microseconds (L&P, 2025[4]). Meanwhile, the IEEE 802.1 Time-Sensitive Networking standard suite and the Zenoh protocol are reshaping the landscape by converging best-effort and guaranteed traffic onto unified Ethernet infrastructure (NXP-NVIDIA, 2026[5]).
This article maps the full communication stack of an open-source humanoid — from application-layer publish-subscribe semantics to wire-level frame scheduling — and proposes a practical dual-bus architecture that separates safety-critical motion traffic from perception data.
2. Communication Requirements for Bipedal Humanoids #
2.1 Timing Domains #
Not all data inside a humanoid shares the same urgency. We identify three distinct timing domains, each with its own latency and jitter budget.
The hard-real-time domain encompasses the servo control loop: joint position, velocity, and torque commands exchanged between the central motion controller and each actuator driver. For stable bipedal locomotion at a 1 kHz control rate, end-to-end latency from sensor sampling to actuator command must remain below 500 microseconds, with jitter under 50 microseconds (Casini et al., 2026[2]). Missing a single deadline can excite resonance modes in the mechanical structure.
The soft-real-time domain covers perception pipelines: depth-camera point clouds, LIDAR scans, and speech-recognition streams. These typically operate at 30 to 60 Hz with tolerable latencies of 10 to 30 milliseconds. Occasional frame drops degrade performance but do not cause immediate instability.
The best-effort domain includes logging, telemetry, diagnostics, and over-the-air updates. Throughput matters more than determinism; standard TCP/IP suffices.
flowchart TD
subgraph Hard Real-Time
A[Joint Encoders] -->|< 500 us| B[Motion Controller]
B -->|< 500 us| C[Actuator Drivers]
D[IMU / F-T Sensors] -->|< 500 us| B
end
subgraph Soft Real-Time
E[Depth Cameras] -->|10-30 ms| F[Perception Pipeline]
G[LIDAR] -->|10-30 ms| F
F --> H[Planning Module]
end
subgraph Best Effort
I[Diagnostics] --> J[Logging Server]
K[Telemetry] --> L[Cloud Dashboard]
end
H --> B
2.2 Data Volume Estimation #
A humanoid with 42 actuated joints, each reporting a 12-byte state vector (position, velocity, torque) at 1 kHz, generates approximately 504 bytes per millisecond of servo data — modest by Ethernet standards. However, when six depth cameras stream 640 x 480 depth maps at 30 fps, the perception bus must sustain roughly 110 megabits per second. These wildly different bandwidth profiles reinforce the argument for bus separation.
3. ROS 2 as the Application-Layer Middleware #
3.1 Architecture and the DDS Abstraction #
ROS 2 adopts a layered architecture: application nodes communicate through a client library (rclcpp for C++), which delegates to the ROS client layer (rcl), which in turn interfaces with a pluggable middleware layer (rmw). The rmw layer historically wraps a Data Distribution Service (DDS) implementation — Fast DDS, Cyclone DDS, or Connext DDS — each offering publish-subscribe semantics with configurable Quality of Service (QoS) policies (Casini et al., 2026[2]).
For humanoid robotics, QoS configuration is critical. The RELIABLE reliability policy with KEEPLAST history depth of 1, combined with a DEADLINE QoS of 10 milliseconds, ensures that perception nodes detect stale data. For servo-loop topics, however, the BESTEFFORT reliability policy paired with KEEP_LAST depth 1 minimises buffering delay at the cost of occasional message loss — acceptable because the next cycle’s data supersedes the lost sample.
3.2 Real-Time Executors and Scheduling #
The default ROS 2 single-threaded executor processes callbacks sequentially, introducing priority inversion and unbounded blocking when high-priority and low-priority callbacks share an executor. The community has responded with specialised executor designs: the events executor, the priority-based executor, and the LET (Logical Execution Time) executor that decouples read and write phases to eliminate data races (Casini et al., 2026[2]).
For our humanoid reference design, we recommend a multi-executor topology: a dedicated real-time executor on an isolated CPU core for the motion-control node, a separate executor for perception callbacks, and a default executor for diagnostics. Combined with the PREEMPT_RT Linux kernel patch, this arrangement achieves worst-case callback latencies below 200 microseconds on commodity ARM processors.
3.3 Zenoh as an Alternative RMW #
Since ROS 2 Jazzy Jalisco, Zenoh has been available as an alternative rmw implementation (rmw_zenoh). Benchmarks show Zenoh delivering up to 30 percent higher throughput than Fast DDS while reducing tail latency, particularly in mesh-network topologies where DDS discovery overhead becomes significant (Ranjan et al., 2026[3]). For a humanoid operating in a factory alongside other robots, Zenoh’s peer-to-peer routing and built-in bridging capability simplify multi-robot coordination without the UDP multicast storms that plague DDS in large deployments.
Chovet et al. demonstrated in real-world non-line-of-sight mesh networks that Zenoh as an ROS 2 RMW provides faster communication and greater message reachability with lower CPU usage, though at the cost of increased RAM consumption (Kluener et al., 2026[6]). For humanoid platforms where onboard compute is typically an NVIDIA Jetson Orin or equivalent, the RAM trade-off is acceptable given the latency benefits.
4. EtherCAT: The Fieldbus for Servo-Loop Networking #
4.1 Operating Principles #
EtherCAT (Ethernet for Control Automation Technology) processes standard Ethernet frames using a “processing on the fly” model: each slave node reads its addressed data from the passing frame and inserts its response in the same frame within nanoseconds, without store-and-forward buffering. The master transmits a single frame that traverses the entire daisy-chain and returns with all slave responses in a single network round-trip (L&P, 2025[4]).
This architecture yields sub-microsecond jitter and supports cycle times as low as 62.5 microseconds for small process-data images. For a 42-joint humanoid, a typical EtherCAT telegram carrying 504 bytes of joint data completes a full bus cycle in approximately 80 microseconds on a 100 Mbit/s link — well within the 500-microsecond budget for the servo loop.
4.2 Distributed Clocks and Synchronisation #
EtherCAT’s Distributed Clock (DC) mechanism synchronises all slave devices to a common time reference with sub-microsecond accuracy. Each slave maintains a local 64-bit nanosecond counter that is periodically corrected against the reference clock (typically the first slave in the chain). For humanoid locomotion, DC synchronisation ensures that all joint actuators sample their encoders and apply torque commands at precisely the same instant — a prerequisite for coordinated multi-joint trajectories.
flowchart LR
M[EtherCAT Master] -->|Frame Out| S1[Hip Joint L]
S1 --> S2[Knee Joint L]
S2 --> S3[Ankle Joint L]
S3 --> S4[Hip Joint R]
S4 --> S5[Knee Joint R]
S5 --> S6[Ankle Joint R]
S6 -->|Frame Return| M
S1 -.->|DC Sync| S2
S2 -.->|DC Sync| S3
S4 -.->|DC Sync| S5
S5 -.->|DC Sync| S6
4.3 Integration with ROS 2 via ros2_control #
The ros2control framework provides a hardware abstraction layer that bridges ROS 2 topics and EtherCAT process data. The ethercatdriver_ros2 package implements a hardware interface that maps joint state and command interfaces to EtherCAT slave Process Data Objects (PDOs). At each control cycle, the controller manager reads joint states from the EtherCAT bus, executes the active controller (e.g., a whole-body inverse-dynamics controller), and writes computed torques back to the bus — all within a single real-time thread (Stogl and Banovic, 2026[7]).
The Electromate HEJ 90 high-efficiency joint module exemplifies modern EtherCAT-enabled actuators for mobile robotics: it supports real-time control rates up to 1 kHz with dual EtherCAT ports for daisy chaining, simplifying wiring in multi-joint robots (Electromate, 2026[8]).
5. Emerging Standards: TSN and Unified Networking #
5.1 Time-Sensitive Networking (IEEE 802.1) #
TSN is a suite of IEEE 802.1 standards that add deterministic scheduling, bounded latency, and precise time synchronisation to standard Ethernet. Key mechanisms include IEEE 802.1Qbv (Time-Aware Shaper), which reserves guaranteed time slots for critical traffic within each network cycle, and IEEE 802.1AS (generalised Precision Time Protocol), which synchronises all network endpoints to a common clock with sub-microsecond accuracy.
For humanoid robotics, TSN’s promise is convergence: instead of maintaining separate physical networks for servo data, perception, and diagnostics, a TSN-capable switch can guarantee sub-millisecond latency for motion-control streams while simultaneously carrying best-effort perception traffic on the same Gigabit Ethernet links. The NXP-NVIDIA collaboration announced in March 2026 explicitly targets this architecture, with NXP’s S32J TSN switch aggregating kinematic chains of i.MX RT1180 motor-control MCUs and connecting directly to NVIDIA’s Jetson Thor robot brain via the Holoscan Sensor Bridge (NXP-NVIDIA, 2026[5]).
5.2 OPC UA FX over TSN #
The industrial automation community is converging on OPC UA Field eXchange (FX) as the application-layer protocol for TSN-based factory networks. OPC UA FX provides publish-subscribe semantics with integrated security and information modelling, running directly over TSN Ethernet without requiring proprietary fieldbus stacks. For open-source humanoid platforms, OPC UA FX offers a vendor-neutral alternative to EtherCAT — though as of early 2026, the ecosystem of OPC UA FX-compatible servo drives remains limited compared to the mature EtherCAT device catalogue (PLC DCS Pro, 2026[9]).
5.3 The Convergence Timeline #
The report by Global and China Next-Generation Embodied AI Robot Communication Network Topology and Chip Industry (2026) highlights that humanoid robots with more than 40 joint degrees of freedom place extraordinary demands on communication node integration and real-time performance, driving chip vendors to develop unified communication SoCs that combine EtherCAT slave controllers, TSN switches, and wireless connectivity on a single die (GlobeNewswire, 2026[10]).
graph TD
subgraph Current Architecture
A1[EtherCAT Bus] -->|Servo Data| B1[Motion Controller]
A2[Ethernet] -->|Perception| B2[Planning Node]
A3[Wi-Fi] -->|Telemetry| B3[Cloud]
end
subgraph TSN Converged Architecture
C1[TSN Switch] -->|Priority Queue 7| D1[Servo Data Stream]
C1 -->|Priority Queue 5| D2[Perception Stream]
C1 -->|Priority Queue 0| D3[Diagnostics Stream]
D1 --> E1[Motion Controller]
D2 --> E2[Planning Node]
D3 --> E3[Cloud]
end
6. Proposed Dual-Bus Architecture #
6.1 Design Rationale #
Despite TSN’s convergence promise, we recommend a pragmatic dual-bus topology for the Open Humanoid reference platform in 2026. The primary bus is a 100 Mbit/s EtherCAT daisy chain connecting all 42 joint actuators to the real-time motion controller, operating at a 1 kHz cycle rate. The secondary bus is a Gigabit Ethernet backbone with optional TSN scheduling, carrying ROS 2 perception, planning, and diagnostic traffic between the central compute unit, camera modules, and external interfaces.
This separation provides defence in depth: even if a perception node crashes and floods the Ethernet backbone with error traffic, the EtherCAT servo loop remains unaffected. The two buses connect only through the motion-controller node, which runs on an isolated CPU core with PREEMPT_RT scheduling and acts as a gateway between the hard-real-time and soft-real-time domains.
6.2 Wiring Topology #
The EtherCAT chain follows the robot’s kinematic tree. Starting from the motion-controller board mounted in the torso, the chain splits into three branches via an EtherCAT junction slave: left leg (6 joints), right leg (6 joints), and upper body (12 arm joints plus 6 hand joints plus 4 neck/head joints plus 8 torso/spine joints). Each branch terminates with an EtherCAT end terminator. The total chain length of 42 slaves with 12 bytes of process data each yields a telegram size of approximately 560 bytes (including headers), completing a round trip in under 100 microseconds.
The Gigabit backbone uses a compact managed switch mounted in the torso, with individual Ethernet runs to each camera module, the central GPU compute board, and an external RJ45 port for development and diagnostics.
6.3 Fault Isolation and Redundancy #
EtherCAT supports cable redundancy through its hot-connect and ring topology capabilities. By connecting the last slave in each branch back to a secondary EtherCAT port on the master, the bus can tolerate a single cable break without losing communication to any slave. For the Open Humanoid, we implement ring topology on the leg chains (where cable stress from walking is highest) and leave the upper-body chain in standard daisy-chain mode.
7. Performance Validation Framework #
7.1 Latency and Jitter Measurement #
Validating the communication stack requires instrumented benchmarks at multiple layers. At the EtherCAT level, the IgH EtherCAT Master for Linux provides built-in timing statistics: cycle jitter, frame round-trip time, and working-counter error rates. These should be logged during representative motion sequences — walking, stair climbing, object manipulation — to capture worst-case timing under realistic computational load.
At the ROS 2 level, the ros2tracing package and LTTng kernel tracing enable end-to-end callback latency measurement from DDS message arrival through executor dispatch to application callback completion. Combined with cyclictest results on the PREEMPTRT kernel, this provides a complete picture of the real-time budget allocation.
A Docker-enabled real-time framework for robotic applications in heterogeneous ROS 2 environments has demonstrated that containerised ROS 2 nodes can achieve near-native real-time performance when configured with CPU pinning, memory locking, and appropriate scheduling policies, enabling reproducible deployment of validated communication stacks (MDPI Processes, 2026[11]).
7.2 Acceptance Criteria #
For the Open Humanoid reference platform, we define the following communication performance acceptance criteria: EtherCAT servo-loop jitter must remain below 20 microseconds RMS during walking at 1.0 m/s; ROS 2 perception callback latency must stay below 15 milliseconds at the 99th percentile; and zero EtherCAT working-counter errors must occur during a continuous 24-hour soak test. These criteria should be verified during full-body commissioning, which the next article in this series will address.
| Metric | Target | Measurement Tool |
|---|---|---|
| EtherCAT cycle jitter (RMS) | < 20 us | IgH Master statistics |
| EtherCAT round-trip time (42 slaves) | < 100 us | Oscilloscope + GPIO toggle |
| ROS 2 perception callback (p99) | < 15 ms | ros2_tracing + LTTng |
| DC clock deviation (max) | < 1 us | EtherCAT DC diagnostics |
| Working-counter errors (24h soak) | 0 | IgH Master error log |
| Zenoh pub-sub latency (p99) | < 5 ms | rmw_zenoh benchmarks |
8. Security Considerations #
Communication protocols for humanoid robots introduce attack surfaces that must be addressed before deployment in human-occupied environments. Abaza et al. provided a systematic cybersecurity assessment of the humanoid ecosystem, identifying ROS 2 DDS discovery, EtherCAT slave firmware updates, and wireless telemetry links as primary attack vectors (Abaza et al., 2025[12]). The study recommends DDS Security (the OMG DDS-Security specification) for encrypting and authenticating ROS 2 topic traffic, EtherCAT Safety-over-EtherCAT (FSoE) for functional-safety integrity of the servo bus, and WPA3-Enterprise with certificate-based authentication for any wireless management interface.
For the Open Humanoid, we adopt a defence-in-depth posture: the EtherCAT bus is air-gapped from external networks (accessible only through the motion-controller gateway), ROS 2 DDS Security is enabled for all inter-node communication on the Gigabit backbone, and the external diagnostic port requires SSH tunnel authentication before exposing any ROS 2 topics.
9. Conclusion #
The communication subsystem is the nervous system of a humanoid robot, and its design determines whether forty-plus joints can move in concert or collapse into uncoordinated failure. This article has mapped the full protocol stack from ROS 2 publish-subscribe semantics and real-time executor scheduling, through EtherCAT fieldbus networking with distributed-clock synchronisation, to emerging TSN and Zenoh technologies that promise network convergence.
Our proposed dual-bus architecture — an EtherCAT servo loop for hard-real-time motion control and a Gigabit Ethernet backbone for perception and planning — balances determinism with flexibility. The architecture leverages mature industrial technology where timing guarantees are non-negotiable, while embracing open standards where modularity and extensibility matter most.
The NXP-NVIDIA Holoscan Sensor Bridge reference design, the Meta-ROS middleware benchmarks, and the comprehensive ROS 2 real-time survey collectively demonstrate that the building blocks for open-source humanoid communication are rapidly maturing. As TSN-capable switches and unified communication SoCs become commercially available in 2026, the path toward single-network convergence will shorten — but for today’s Open Humanoid builder, the dual-bus approach provides the safest and most predictable foundation.
The next article in this series will address human-robot interaction — gesture recognition, emotion detection, and social behaviour — exploring how the communication infrastructure established here carries not just motor commands but the higher-level signals that enable a humanoid to operate alongside people.
References (12) #
- Stabilarity Research Hub. Communication Protocols: ROS 2, EtherCAT, and Real-Time Networking for Humanoid Robot Subsystems. doi.org. dti
- (2026). [2601.10722] A Survey of Real-Time Support, Analysis, and Advancements in ROS 2. doi.org. dti
- (2026). [2601.21011] Meta-ROS: A Next-Generation Middleware Architecture for Adaptive and Scalable Robotic Systems. doi.org. dti
- EtherCAT Explained: The Definitive Guide to Real-Time Industrial Ethernet. resources.l-p.com. iv
- (2026). NXP and Nvidia collaborate on edge computing platform for humanoid robots. roboticsandautomationnews.com. iv
- (20or). [2412.07817] Modern Middlewares for Automated Vehicles: A Tutorial. arxiv.org. tii
- (2026). ROS 2 in Industry: Key Takeaways from the ROS-Industrial Conference 2025 — ROS-Industrial. rosindustrial.org. ia
- Just a moment…. control.com. iv
- Humanoid Robotics: Navigating the Limits of Wheeled Automation – PLC DCS Pro Ltd.. plcdcspro.com. iv
- (2026). Global and China Next-Generation Embodied AI Robot. globenewswire.com. iv
- Access Denied. mdpi.com. rtil
- (20or). [2508.17481] SoK: Cybersecurity Assessment of Humanoid Ecosystem. arxiv.org. tii