Proprioception and Internal State Estimation: Joint Encoders, Torque Sensing, and Body Schema for Humanoid Robots
DOI: 10.5281/zenodo.19057213[1] · View on Zenodo (CERN)
| Badge | Metric | Value | Status | Description |
|---|---|---|---|---|
| [s] | Reviewed Sources | 45% | ○ | ≥80% from editorially reviewed sources |
| [t] | Trusted | 90% | ✓ | ≥80% from verified, high-quality sources |
| [a] | DOI | 55% | ○ | ≥80% have a Digital Object Identifier |
| [b] | CrossRef | 25% | ○ | ≥80% indexed in CrossRef |
| [i] | Indexed | 85% | ✓ | ≥80% have metadata indexed |
| [l] | Academic | 60% | ○ | ≥80% from journals/conferences/preprints |
| [f] | Free Access | 65% | ○ | ≥80% are freely accessible |
| [r] | References | 20 refs | ✓ | Minimum 10 references required |
| [w] | Words [REQ] | 2,972 | ✓ | Minimum 2,000 words for a full research article. Current: 2,972 |
| [d] | DOI [REQ] | ✓ | ✓ | Zenodo DOI registered for persistent citation. DOI: 10.5281/zenodo.19057213 |
| [o] | ORCID [REQ] | ✓ | ✓ | Author ORCID verified for academic identity |
| [p] | Peer Reviewed [REQ] | — | ✗ | Peer reviewed by an assigned reviewer |
| [h] | Freshness [REQ] | 50% | ✗ | ≥80% of references from 2025–2026. Current: 50% |
| [c] | Data Charts | 0 | ○ | Original data charts from reproducible analysis (min 2). Current: 0 |
| [g] | Code | — | ○ | Source code available on GitHub |
| [m] | Diagrams | 3 | ✓ | Mermaid architecture/flow diagrams. Current: 3 |
| [x] | Cited by | 0 | ○ | Referenced by 0 other hub article(s) |
Abstract #
A humanoid robot that cannot sense its own body is a humanoid robot that falls down. Proprioception — the internal sensing of joint positions, velocities, torques, and overall body configuration — is the foundation upon which every higher-level capability depends. Without accurate proprioceptive state estimation, locomotion controllers cannot maintain balance, manipulation pipelines cannot close force loops, and safety systems cannot detect collisions. This article examines the engineering of proprioceptive systems for humanoid robots: the sensor hardware (joint encoders, torque sensors, inertial measurement units), the estimation algorithms (extended Kalman filters, factor graphs, learned estimators), and the emerging concept of robotic body schema — a continuously updated internal model of the robot’s own geometry and dynamics. We ground the discussion in recent advances from 2025–2026 research and connect each subsystem to the open-source design philosophy of the Open Humanoid project.
```mermaid
graph TD
A[Joint Encoders] --> D[State Estimator]
B[IMU Array] --> D
C[Torque Sensors] --> D
D --> E[Body Schema Model]
E --> F[Locomotion Controller]
E --> G[Manipulation Pipeline]
E --> H[Safety System]
D --> I[Contact Estimator]
I --> F
style D fill:#4a90d9,color:#fff
style E fill:#7b68ee,color:#fff
```
Why Proprioception Is the Hardest Sensing Problem #
External perception — cameras, LiDAR, depth sensors — gets most of the engineering attention in robotics. Yet the most critical sensing challenge for a walking humanoid is internal. Consider what a bipedal robot must know about itself at every control cycle:
- Joint positions to within 0.01 degrees across 30+ degrees of freedom
- Joint velocities for derivative control terms, updated at 1 kHz or faster
- Joint torques for force control, compliance, and collision detection
- Base pose and velocity — the position and orientation of the torso in world coordinates, despite having no direct sensor for this quantity
- Contact state — which feet (or hands, or body surfaces) are in contact with the environment, and with what forces
The difficulty is compounded by the fact that humanoid robots are fundamentally underactuated systems. The torso floats freely — no sensor directly measures its position in space. Every estimate of base state must be inferred from the combination of inertial measurements, joint kinematics, and contact assumptions. This inference problem is the core of proprioceptive state estimation.
As Roychoudhury et al. (2023)[2] note in their comprehensive review of perception for humanoid robots, internal state estimation makes extensive use of Bayesian filtering methods and of optimization techniques based on maximum a-posteriori formulations over proprioceptive measurements. The field has matured considerably, but significant challenges remain, particularly for humanoids that must operate in unstructured environments where contact assumptions frequently break down.
Joint Encoders: The Foundation of Position Sensing #
Encoder Types and Trade-offs #
Every proprioceptive system begins with joint encoders — sensors that measure the angular position of each actuated joint. For humanoid robots, the choice of encoder technology has cascading consequences for the entire control architecture.
Optical incremental encoders remain the workhorse for high-speed joints. They provide relative position changes at resolutions of 10,000+ counts per revolution, with latency under 1 microsecond. The limitation: they require a homing procedure at startup and can lose track during power failures or severe impacts. For a humanoid that may be powered off in an arbitrary configuration, this is a non-trivial operational constraint.
Absolute magnetic encoders solve the homing problem by providing absolute position at any time. Modern designs based on Hall-effect arrays or magnetoresistive elements achieve 14-bit resolution (0.022 degrees) in packages small enough to integrate directly into joint actuator modules. The trade-off is lower maximum update rate and susceptibility to magnetic interference from nearby motors.
Capacitive encoders represent a newer alternative that combines absolute measurement with high resolution and immunity to magnetic fields. Several commercial offerings now achieve 19-bit resolution with update rates above 10 kHz, making them increasingly attractive for humanoid applications.
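The resolution figures quoted above follow directly from the bit depth of the encoder: an n-bit absolute encoder divides one revolution into 2^n counts. A quick check:

```python
# Angular resolution implied by an absolute encoder's bit depth.
def encoder_resolution_deg(bits: int) -> float:
    """Degrees per count for an absolute encoder with 2**bits counts/rev."""
    return 360.0 / (1 << bits)

print(f"14-bit: {encoder_resolution_deg(14):.4f} deg/count")   # matches the 0.022 deg figure above
print(f"19-bit: {encoder_resolution_deg(19):.5f} deg/count")
```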
Encoder Placement and the Elasticity Problem #
A subtle but critical design decision is where to place the encoder relative to the joint’s transmission. In a rigid transmission (direct drive or low-ratio gearbox), a single encoder on the motor shaft suffices — motor position directly maps to joint position through a known gear ratio. But humanoid joints increasingly use series elastic actuators (SEAs) or high-ratio harmonic drives that introduce compliance between motor and output.
For these joints, a single motor-side encoder cannot accurately measure the actual output position under load. The solution is dual encoding: one sensor on the motor shaft and one on the joint output. The difference between these two measurements, combined with the known spring constant, provides a direct estimate of joint torque — a technique that forms the basis of most torque sensing in modern humanoid robots.
```mermaid
graph LR
M[Motor] --> E1[Motor Encoder]
M --> G[Gear/Spring]
G --> J[Joint Output]
J --> E2[Output Encoder]
E1 --> T[Torque Estimate]
E2 --> T
T --> |"τ = k(θ_motor/N - θ_output)"| C[Controller]
style T fill:#e67e22,color:#fff
style C fill:#27ae60,color:#fff
```
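The torque relation in the diagram reduces to a few lines of code. The gear ratio and spring constant below are illustrative values, not a specific joint design:

```python
def sea_torque_estimate(theta_motor: float, theta_output: float,
                        gear_ratio: float, spring_k: float) -> float:
    """Estimate joint torque from the deflection of a series elastic element:
    tau = k * (theta_motor / N - theta_output).
    Angles in radians, spring_k in Nm/rad."""
    deflection = theta_motor / gear_ratio - theta_output
    return spring_k * deflection

# Example: 100:1 gearbox, 300 Nm/rad spring stiffness.
# Motor at 0.5 rad reads 0.005 rad at the output side of the gearbox;
# the output encoder reads 0.004 rad, so 0.001 rad is stored in the spring.
tau = sea_torque_estimate(0.5, 0.004, gear_ratio=100.0, spring_k=300.0)
print(tau)  # ~0.3 Nm
```

Note that this estimate is only as good as the two encoders and the spring model: hysteresis and temperature drift in the elastic element appear directly as torque error.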
Torque Sensing: Closing the Force Loop #
Why Torque Matters for Humanoids #
Position control alone is insufficient for humanoid robots. A robot reaching for a door handle needs to sense the resistance when it pushes. A robot walking on soft ground needs to detect that the expected ground reaction force has changed. A robot carrying a heavy object needs to compensate for the load’s effect on joint dynamics.
Torque sensing enables all of these capabilities. More fundamentally, torque sensing transforms a position-controlled robot into a force-controlled robot — and force control is what separates industrial manipulators (which work in carefully controlled environments) from humanoids (which must interact safely and adaptively with the unstructured world).
Dedicated Torque Sensors #
While encoder-based torque estimation through series elasticity works well for many applications, dedicated torque sensors offer higher bandwidth and accuracy. Research on structural optimization of high-sensitivity torque sensors for robotic joints[3] published in Sensors (2026) demonstrates ongoing advances in strain-gauge-based designs that achieve sub-0.1 Nm resolution while fitting within the tight dimensional constraints of humanoid joint modules.
The iCub humanoid platform pioneered the integration of dedicated joint torque sensors in a humanoid upper body, demonstrating that direct torque measurement enables whole-body dynamics estimation and compliant interaction control that would be difficult to achieve with encoder-only approaches.
Whole-Body Tactile Skin as Extended Proprioception #
A recent advance extends the concept of proprioception beyond joints to the entire body surface. Armleder et al. (2025)[4] present a real-time control system for a humanoid robot using whole-body tactile skin sensing. Their key insight is that methods relying solely on joint-torque sensing suffer from ambiguity in multi-contact scenarios, while vision is prone to occlusion. A distributed tactile skin sensor network resolves these ambiguities by providing direct contact force and location information across the entire body.
This approach effectively treats the robot’s skin as a proprioceptive organ — not sensing joint state, but sensing the body’s interaction state with the environment. For humanoid robots that must operate in close proximity to humans, this extended proprioception may be as important as traditional joint sensing.
Inertial Measurement and Base State Estimation #
The Floating Base Problem #
The central challenge of humanoid proprioception is estimating the pose and velocity of the robot’s torso (the “floating base”) in world coordinates. Unlike a fixed-base industrial arm, a humanoid’s base link is not bolted to the ground. Its position must be inferred.
The standard approach combines an inertial measurement unit (IMU) mounted on the torso with kinematic chains through the legs to ground contact points. The IMU provides high-frequency acceleration and angular velocity measurements. Forward kinematics through the leg joints, combined with an assumption that the stance foot is stationary, provides position corrections. An extended Kalman filter (EKF) fuses these two information sources.
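The prediction/correction pattern just described can be sketched in one dimension: IMU acceleration drives the prediction step, and a kinematic position fix through the stance leg drives the correction. This is a deliberately simplified illustration (scalar base height only, hand-picked noise values), not the full error-state formulation used on real humanoids:

```python
import numpy as np

class BaseHeightEKF:
    """1-D Kalman filter illustrating IMU + leg-kinematics fusion.
    State x = [position, velocity] of the floating base along one axis."""
    def __init__(self, dt: float):
        self.dt = dt
        self.x = np.zeros(2)                      # [position, velocity]
        self.P = np.eye(2) * 0.1                  # state covariance
        self.F = np.array([[1.0, dt], [0.0, 1.0]])  # constant-velocity model
        self.Q = np.diag([1e-6, 1e-4])            # process noise (illustrative)
        self.H = np.array([[1.0, 0.0]])           # kinematics observes position
        self.R = np.array([[1e-4]])               # kinematic measurement noise

    def predict(self, accel: float):
        """High-rate IMU step: integrate measured acceleration."""
        self.x = self.F @ self.x + np.array([0.5 * self.dt**2, self.dt]) * accel
        self.P = self.F @ self.P @ self.F.T + self.Q

    def correct(self, kin_pos: float):
        """Correction from forward kinematics through the stance leg."""
        y = kin_pos - self.H @ self.x             # innovation
        S = self.H @ self.P @ self.H.T + self.R   # innovation covariance
        K = self.P @ self.H.T @ np.linalg.inv(S)  # Kalman gain
        self.x = self.x + (K @ y).ravel()
        self.P = (np.eye(2) - K @ self.H) @ self.P
```

On hardware, the same structure is applied to the full SE(3) base pose, with IMU biases appended to the state and the kinematic correction gated by the contact estimator.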
Multi-IMU Architectures #
Recent research has demonstrated significant improvements from distributing multiple IMUs across the robot’s body rather than relying on a single trunk-mounted unit. The DogLegs framework (Wu et al., 2025)[5] fuses measurements from a body-mounted IMU, joint encoders, and multiple leg-mounted IMUs using an error-state EKF. The additional IMUs on the legs provide direct observation of leg dynamics, reducing the reliance on kinematic models that degrade under joint flexibility, backlash, and unmodeled compliance.
Multi-IMU proprioceptive state estimation (Ramuzat et al., 2023)[6] specifically addresses the problem that standard algorithms assume feet remain flat and in constant position during ground contact — a hypothesis easily violated during dynamic walking. By adding IMUs to the feet, the estimator can detect and compensate for foot slip and rotation, dramatically improving state estimation accuracy during aggressive gaits.
The trend toward multi-IMU architectures reflects a broader principle: more proprioceptive sensors, distributed closer to the points of interest, consistently outperform fewer sensors with more sophisticated algorithms. Hardware redundancy beats algorithmic cleverness for proprioception.
Learning-Assisted State Estimation #
The learning-assisted multi-IMU approach (2025)[7] demonstrates how neural networks can augment traditional filtering. A learned component processes raw proprioceptive observations to provide better contact detection and velocity estimates, which then feed into a conventional EKF. This hybrid architecture preserves the mathematical guarantees of Kalman filtering while leveraging learned models to handle the nonlinearities and unmodeled dynamics that defeat purely model-based estimators.
Fast decentralized state estimation (2024)[8] takes a different approach, using moving horizon estimation (MHE) as an alternative to EKF. MHE naturally handles constraints (joint limits, contact force bounds) and can fuse multirate sensors more gracefully than Kalman filters. The computational cost, historically prohibitive, has been reduced through decentralized formulations that distribute the optimization across multiple processors.
```mermaid
graph TD
subgraph Sensors
IMU_B[Body IMU]
IMU_L1[Left Leg IMU]
IMU_L2[Right Leg IMU]
ENC[Joint Encoders x30]
FT[Force/Torque Sensors]
end
subgraph Estimation
NN[Learned Contact Detector]
EKF[Error-State EKF]
KIN[Forward Kinematics]
end
IMU_B --> EKF
IMU_L1 --> EKF
IMU_L2 --> EKF
ENC --> KIN
ENC --> NN
FT --> NN
KIN --> EKF
NN -->|Contact State| EKF
EKF -->|Pose, Velocity, Bias| OUT[State Output @ 1kHz]
style EKF fill:#4a90d9,color:#fff
style NN fill:#e74c3c,color:#fff
style OUT fill:#27ae60,color:#fff
```
Slip Detection and Robust Contact Estimation #
One of the most failure-prone assumptions in proprioceptive state estimation is the no-slip contact model. When a foot slips on a wet surface, the kinematic correction that anchors the estimator to world coordinates becomes corrupted, leading to rapid drift.
Sun et al. (2025)[9] address this directly with proprioceptive slip detection for multi-legged robots in slippery scenarios. Their method uses only proprioceptive signals — no vision or external sensing — to detect slip events and adjust the estimator’s confidence in contact-based corrections. The approach is particularly relevant for humanoid robots that must operate on diverse surfaces without prior knowledge of friction coefficients.
The robust state estimation framework using dual beta Kalman filtering[10] proposes a comprehensive measurement model that accounts for both foot slippage and variable leg length by analyzing the relative motion between foot contact points and the robot’s body center. This represents a more fundamental rethinking of the measurement model rather than simply detecting and rejecting slip events.
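One simple proprioceptive slip test, consistent with the ideas above though not a reproduction of either paper's specific method, is to check the no-slip assumption directly: the stance-foot velocity computed from base velocity and leg kinematics should be near zero, and a sustained violation flags slip. The threshold and debounce count below are illustrative, not tuned values:

```python
import numpy as np

class SlipDetector:
    """Flags slip when the kinematically computed stance-foot velocity
    exceeds a threshold for several consecutive control cycles
    (debouncing guards against encoder noise and single-sample spikes)."""
    def __init__(self, threshold_mps: float = 0.05, hold_samples: int = 5):
        self.threshold = threshold_mps
        self.hold = hold_samples
        self.count = 0

    def update(self, foot_velocity: np.ndarray) -> bool:
        """foot_velocity: 3-vector of the stance foot's velocity in world
        frame, from base velocity estimate + leg Jacobian * joint rates."""
        if np.linalg.norm(foot_velocity) > self.threshold:
            self.count += 1
        else:
            self.count = 0
        return self.count >= self.hold
```

When the detector fires, the estimator should inflate (or temporarily drop) the covariance of the contact-based correction rather than continue anchoring to a sliding foot.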
For the Open Humanoid project, robust slip handling is not optional. An open-source robot will be deployed in environments its designers never anticipated. The proprioceptive system must degrade gracefully when contact assumptions fail, rather than catastrophically diverging.
Body Schema: The Robot’s Self-Model #
From State Estimation to Self-Awareness #
Beyond estimating instantaneous joint positions and base pose, a humanoid robot needs a continuously updated model of its own body — a body schema. This concept, borrowed from neuroscience, refers to the brain’s internal representation of the body’s geometry, dynamics, and capabilities.
For a robot, the body schema includes:
- Kinematic model — link lengths, joint axes, joint limits
- Dynamic model — link masses, inertias, center-of-mass locations
- Calibration parameters — encoder offsets, IMU biases, sensor alignments
- Wear and damage state — degraded joints, loose connections, payload changes
Traditional robotics treats these as fixed parameters loaded from a CAD model and a calibration file. But a robot that operates for months or years will experience wear, minor damage, and changes (carrying objects, wearing shoes, joint degradation). A true body schema must be continuously estimated and updated from proprioceptive data.
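One classical way to keep dynamic parameters current, sketched here as an illustration rather than a prescribed method, is recursive least squares (RLS) over the linear-in-parameters form of rigid-body dynamics, tau = Y(q, dq, ddq) theta, where Y is the dynamics regressor and theta stacks the inertial parameters:

```python
import numpy as np

class RLSParameterEstimator:
    """Online refinement of inertial parameters from torque measurements.
    Rigid-body dynamics are linear in the parameters, so each torque
    sample gives one scalar equation tau ~= Y_row @ theta."""
    def __init__(self, n_params: int, forgetting: float = 0.999):
        self.theta = np.zeros(n_params)      # inertial parameter estimate
        self.P = np.eye(n_params) * 1e3      # parameter covariance
        self.lam = forgetting                # < 1 discounts stale data (wear, payload)

    def update(self, Y_row: np.ndarray, tau: float):
        """Standard RLS update for one measurement row."""
        Py = self.P @ Y_row
        gain = Py / (self.lam + Y_row @ Py)
        self.theta = self.theta + gain * (tau - Y_row @ self.theta)
        self.P = (self.P - np.outer(gain, Py)) / self.lam
```

The forgetting factor is what turns a one-time calibration into a living body schema: with lam < 1 the estimate tracks slow drift from wear or a picked-up payload instead of averaging it away.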
Learning Body Models from Proprioception #
Chen et al. (2025)[11] demonstrate learning object properties from robot proprioception via differentiable robot-object interaction. While focused on manipulated objects rather than the robot’s own body, the technique — using proprioceptive signals during interaction to infer physical properties through differentiable simulation — applies directly to body schema learning. A robot pushing against a wall can infer its own mass distribution; a robot swinging its arm can refine its inertial parameters.
The emerging field of proprioceptive-tactile representation learning[12] takes this further, demonstrating that neural networks trained on humanoid proprioceptive data can predict touch and its location from joint state alone. This suggests that a sufficiently rich proprioceptive representation can implicitly encode body geometry in a way that generalizes to novel situations.
Proprioception in Reinforcement Learning #
Modern humanoid locomotion increasingly relies on reinforcement learning (RL) policies trained in simulation and transferred to hardware. As comprehensive surveys of humanoid locomotion and manipulation (2025)[13] document, RL-based controllers process proprioceptive observations (joint angles, joint velocities, body orientation, and angular velocities) and output joint torque or position commands.
The choice of proprioceptive observation space is critical for sim-to-real transfer. Observations that are easy to simulate accurately (joint positions, IMU readings) transfer well. Observations that depend on difficult-to-model phenomena (contact forces, joint friction) transfer poorly. This has led to a design philosophy where the proprioceptive sensor suite is chosen not just for control performance in isolation, but for simulability — how faithfully the sensor can be modeled in the training simulator.
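An observation vector built on this philosophy might be assembled as follows. The dimensions assume a hypothetical 30-DoF humanoid, and the scaling factors are illustrative normalization choices, not values from any specific policy:

```python
import numpy as np

def build_observation(q: np.ndarray, dq: np.ndarray,
                      base_quat: np.ndarray, base_angvel: np.ndarray,
                      last_action: np.ndarray) -> np.ndarray:
    """Assemble a proprioceptive observation for an RL locomotion policy.
    Only quantities that simulate faithfully are included; contact forces
    are deliberately left out and handled by a learned estimator."""
    assert q.shape == (30,) and dq.shape == (30,)
    assert base_quat.shape == (4,) and base_angvel.shape == (3,)
    # Per-block scaling keeps the policy input roughly unit-magnitude.
    return np.concatenate([
        q,                   # joint positions (rad)
        dq * 0.1,            # joint velocities, scaled
        base_quat,           # torso orientation (unit quaternion, from IMU)
        base_angvel * 0.25,  # torso angular velocity, scaled
        last_action,         # previous policy output (common in RL observations)
    ])
```

Keeping the observation restricted to well-simulated signals is exactly the simulability criterion described above: every element here has a faithful counterpart in the training simulator.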
A review of advancements in humanoid robot dynamics (2025)[14] describes how learning-based locomotion methods map proprioceptive observations to joint commands through proportional-derivative (PD) control. The most successful approaches use privileged information during training (ground truth contact state, terrain properties) but rely only on proprioceptive observations during deployment, with a learned estimator bridging the gap.
Proprioceptive Architecture for the Open Humanoid #
Based on the surveyed literature and engineering constraints, we propose the following proprioceptive architecture for the Open Humanoid:
Sensor Suite #
| Sensor Type | Count | Placement | Update Rate | Purpose |
|---|---|---|---|---|
| Absolute magnetic encoder (output side) | 30 | Each actuated joint | 5 kHz | Joint position |
| Incremental optical encoder (motor side) | 30 | Each motor | 10 kHz | Motor position, torque estimation via SEA |
| 6-axis IMU | 1 | Torso (pelvis) | 1 kHz | Base orientation, acceleration |
| 3-axis IMU | 4 | Each foot, each hand | 1 kHz | Limb dynamics, contact detection |
| Force-sensitive resistor array | 2 | Foot soles (4×4 grid each) | 500 Hz | Contact pressure distribution |
| Joint temperature sensor | 30 | Each actuator | 10 Hz | Thermal protection, wear monitoring |
Estimation Pipeline #
- Low-level (10 kHz): Motor encoder differencing for velocity; dual-encoder torque estimation
- Mid-level (1 kHz): Error-state EKF fusing body IMU, leg IMUs, forward kinematics, and contact detection
- High-level (100 Hz): Body schema update incorporating learned corrections for kinematic/dynamic parameter drift
- Background (1 Hz): Thermal monitoring, wear detection, calibration drift assessment
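The low-level velocity path named in the pipeline can be sketched as finite differencing of encoder readings followed by a first-order low-pass filter to suppress quantization noise. The cutoff frequency and rates here are illustrative defaults, not tuned specifications:

```python
import math

class FilteredDifferentiator:
    """Velocity from successive encoder positions, with first-order
    low-pass smoothing to suppress quantization noise in the difference."""
    def __init__(self, dt: float, cutoff_hz: float = 200.0):
        self.dt = dt
        # Exponential smoothing factor derived from the cutoff frequency.
        self.alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz * dt)
        self.prev_pos = None
        self.vel = 0.0

    def update(self, pos: float) -> float:
        if self.prev_pos is None:          # first sample: no difference yet
            self.prev_pos = pos
            return 0.0
        raw_vel = (pos - self.prev_pos) / self.dt
        self.prev_pos = pos
        self.vel += self.alpha * (raw_vel - self.vel)
        return self.vel
```

The trade-off is the usual one: a lower cutoff gives smoother velocity but adds phase lag, which directly erodes derivative-term stability margins in the 10 kHz loop.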
Open-Source Implementation Notes #
All estimation code should target real-time Linux (PREEMPT_RT) with deterministic scheduling. The EKF and kinematics run in a hard real-time thread at 1 kHz. The learned components (contact detection, body schema update) run in a soft real-time thread at 100 Hz, with bounded worst-case execution time enforced by the scheduler.
The sensor interface layer abstracts hardware specifics behind a common API, allowing the community to swap encoder types, IMU brands, and force sensor technologies without modifying the estimation pipeline. This abstraction is essential for an open-source project where contributors will use diverse hardware.
Connection to the Series #
This article builds directly on Sensing and Perception (Ivchenko, 2026)[15], which introduced the sensor hardware for external and internal sensing. Here we have drilled deeper into the proprioceptive subsystem — the sensors and algorithms that give the robot knowledge of its own state.
The proprioceptive system provides critical inputs to several previously discussed subsystems:
- Bipedal Locomotion (Ivchenko, 2026)[16] depends on accurate base state estimation for balance control
- Force Control and Compliant Motion (Ivchenko, 2026)[17] requires torque sensing for impedance control
- Safety Systems (Ivchenko, 2026)[18] uses proprioceptive signals for collision detection
- Actuation (Ivchenko, 2026)[19] defined the mechanical interface that proprioceptive sensors must instrument
Conclusion #
Proprioception is the quiet foundation of humanoid robotics. It lacks the visual drama of computer vision or the intellectual appeal of motion planning, but without it, nothing else works. The field has advanced significantly in 2024–2026, with multi-IMU architectures, learning-assisted estimation, robust slip detection, and the emergence of continuous body schema learning. For the Open Humanoid project, we commit to a proprioceptive architecture that is redundant, modular, and honest about its uncertainty — because a robot that knows what it does not know about itself is safer than one that assumes perfect self-knowledge.
References (20) #
- Stabilarity Research Hub. Proprioception and Internal State Estimation: Joint Encoders, Torque Sensing, and Body Schema for Humanoid Robots. doi.org.
- Roychoudhury, Arindam; Khorshidi, Shahram; Agrawal, Subham; Bennewitz, Maren. (2023). Perception for Humanoid Robots. link.springer.com.
- (2026). Structural Optimization of High-Sensitivity Torque Sensors for Robotic Joints. Sensors. mdpi.com.
- Armleder, Simon; Bergner, Florian; Guadarrama-Olvera, Julio Rogelio; Nakanishi, Jun; Cheng, Gordon. (2025). Real-Time Control of a Humanoid Robot for Whole-Body Tactile Interaction. advanced.onlinelibrary.wiley.com.
- Wu et al. (2025). DogLegs: Robust Proprioceptive State Estimation for Legged Robots Using Multiple Leg-Mounted IMUs. arXiv:2503.04580. arxiv.org.
- Ramuzat et al. (2023). Multi-IMU Proprioceptive State Estimator for Humanoid Robots. IEEE. ieeexplore.ieee.org.
- (2025). Learning-assisted multi-IMU proprioceptive state estimation. mdpi.com.
- (2024). Fast Decentralized State Estimation for Legged Robot Locomotion via EKF and MHE. arxiv.org.
- Sun, Peng; Li, Qi; Hu, Hao; Qiang, Junjie; Wu, Weiwei. (2025). Proprioceptive slip detection and state estimation of multi-legged robots in slippery scenarios. link.springer.com.
- Robust State Estimation for Legged Robots With Dual Beta Kalman Filter. IEEE. ieeexplore.ieee.org.
- Chen et al. (2025). Learning Object Properties from Robot Proprioception via Differentiable Robot-Object Interaction. chaoliu.tech.
- (2023). Recent advancements in multimodal human–robot interaction. Frontiers. frontiersin.org.
- (2025). Humanoid Locomotion and Manipulation: Current Progress and Challenges in Control, Planning, and Learning. arxiv.org.
- (2025). Advancements in humanoid robot dynamics and learning-based locomotion control methods. oaepublish.com.
- Stabilarity Research Hub. (2026). Sensing and Perception: IMU, Depth Cameras, Force-Torque Sensors, and Sensor Fusion for Humanoid Robots. doi.org.
- Stabilarity Research Hub. (2026). Specifying the Impossible: A Complete Engineering Specification for an Autonomous Humanoid Robot. doi.org.
- Stabilarity Research Hub. (2026). Force Control and Compliant Motion: Impedance Control, Contact Estimation, and Safe Physical Interaction for Humanoid Robots. doi.org.
- Stabilarity Research Hub. (2026). Safety Systems and Fault Tolerance: Emergency Stop, Collision Detection, and Safe Failure Modes for Humanoid Robots. doi.org.
- Stabilarity Research Hub. (2026). Actuation: Selecting Motors, Torque Budgets, and Degrees of Freedom for a Walking Robot. doi.org.
- (2025). Multimodal perception-driven decision-making for human-robot interaction: a survey. Frontiers. frontiersin.org.