Andrew Thompson

SSVAE-Based Latent Control Mapping for Assistive Robotic Interfaces

Structured generative models for real-time user intent embedding and control-space customization
Andrew Thompson, Fiona Neylon, Brenna Argall | Northwestern University + Shirley Ryan AbilityLab


Overview

This project explores the use of variational autoencoders (VAEs) to learn structured latent control spaces from body-machine interface (BoMI) data. The goal is to enable intuitive, low-dimensional user intent embeddings that generalize across users and time, and to support robust control of high-dimensional robotic systems from sparse or noisy human input.

Unlike the PCA-based mappings used during deployment, the SSVAE-based approach is generative and probabilistic. It allows for intent inference under uncertainty, meaningful and continuous interpolation between embeddings, and latent-level customization. These capabilities support safer and more generalizable assistive control.


Motivation

  • Improve robustness of control mappings to signal drift, fatigue, and inter-user variability.
  • Reduce calibration overhead by learning generalized motion embeddings across users.
  • Enable few-shot or zero-shot transfer of control strategies between sessions or individuals.
  • Capture control uncertainty to improve shared autonomy, safety, and intent disambiguation.

System Design

Architecture

  • Input: 8–24 channel filtered IMU signals or joystick control time windows
  • Encoder: 3-layer MLP with ReLU activations, layer norm
  • Latent space: 2–6D Gaussian embedding (μ, σ) with KL-divergence penalty
  • Decoder: Symmetric 3-layer MLP; reconstructs control signals or time-aligned action vectors
  • Loss: ELBO with optional auxiliary losses (e.g., latent decorrelation, temporal smoothness)
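The architecture listed above can be sketched roughly as follows. This is a minimal illustration, not the project's code. The class name `WindowVAE`, the hidden width, the placement of layer norm, the MSE reconstruction term, and the `beta` weight are all assumptions. The semi-supervised component and the auxiliary losses are not shown.

```python
import torch
from torch import nn
import torch.nn.functional as F


class WindowVAE(nn.Module):
    """Hypothetical VAE over flattened signal windows (names and sizes are assumptions)."""

    def __init__(self, input_dim: int, hidden_dim: int = 64, latent_dim: int = 4):
        super().__init__()
        # Encoder: three linear layers with ReLU and layer norm on the hidden layers.
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, hidden_dim), nn.LayerNorm(hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.LayerNorm(hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 2 * latent_dim),  # concatenated [mu, log-variance]
        )
        # Decoder: mirror image of the encoder, mapping latents back to the window.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.LayerNorm(hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.LayerNorm(hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, input_dim),
        )

    def forward(self, x):
        mu, logvar = self.encoder(x).chunk(2, dim=-1)
        std = torch.exp(0.5 * logvar)
        z = mu + std * torch.randn_like(std)  # reparameterization trick
        return self.decoder(z), mu, logvar


def negative_elbo(x, x_hat, mu, logvar, beta: float = 1.0):
    """Per-sample negative ELBO: MSE reconstruction plus beta-weighted KL to N(0, I)."""
    batch = x.shape[0]
    recon = F.mse_loss(x_hat, x, reduction="sum") / batch
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp()) / batch
    return recon + beta * kld
```

For example, `WindowVAE(input_dim=8 * 20)` would accept 20-sample windows of 8-channel input flattened to 160 values. The window length here is illustrative only.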

Training Pipeline

  • Data sampled from longitudinal BoMI sessions (~190 total)
  • Signals preprocessed with filtering, windowing, and normalization
  • Models trained with PyTorch, Adam optimizer, and cosine learning rate decay
  • Evaluation on reconstruction accuracy, latent smoothness, and downstream control decoding
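As a rough illustration of the preprocessing and learning-rate steps above, the sketch below shows per-channel normalization, sliding-window extraction, and a closed-form cosine decay. The function names, window sizes, and the choice of z-score normalization are assumptions made for the example, and the filtering stage is omitted. It uses only the standard library so the arithmetic is easy to follow; in a PyTorch pipeline a built-in scheduler would normally handle the decay.

```python
import math
from statistics import mean, pstdev


def zscore_channels(samples):
    """Normalize each channel to zero mean and unit variance.

    samples: list of time steps, each a list of per-channel values.
    """
    channels = list(zip(*samples))
    mus = [mean(c) for c in channels]
    sigmas = [pstdev(c) or 1.0 for c in channels]  # avoid dividing by zero on flat channels
    return [[(v - m) / s for v, m, s in zip(row, mus, sigmas)] for row in samples]


def sliding_windows(samples, window, stride):
    """Cut a (time x channel) sequence into windows, each flattened to one vector."""
    return [
        [v for row in samples[start:start + window] for v in row]
        for start in range(0, len(samples) - window + 1, stride)
    ]


def cosine_lr(step, total_steps, base_lr, min_lr=0.0):
    """Cosine decay from base_lr at step 0 down to min_lr at total_steps."""
    progress = min(step, total_steps) / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

With a 4-step, 2-channel recording, `sliding_windows(zscore_channels(samples), window=2, stride=1)` yields three windows of four values each.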

Experiments

| Experiment | Goal | Outcome |
|---|---|---|
| Latent stability across sessions | Assess how latent axes shift with session index | Axes remain consistent across 3+ sessions with single-user training |
| Cross-user generalization | Train on one user, test on another | Latents retain structure, but decoder needs fine-tuning |
| Control-space mapping | Latent → 6-DOF robot control mapping | Mapping feasible with linear decoder or MLP; interpretable axes emerge |
| Few-shot retraining | Fine-tune encoder with small new-user dataset | Rapid convergence observed with 20–50 examples |
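To make the linear option in the control-space mapping experiment concrete, here is a minimal sketch of a latent-to-6-DOF map. The function name, the weight values, and the output clamp are hypothetical. In practice the weights would be fitted to data, and the source does not describe how that was done.

```python
def linear_control_map(z, weights, bias, max_speed=1.0):
    """Map a latent vector z (length d) to a 6-DOF command.

    weights is a 6 x d matrix and bias has 6 entries; both are placeholders here.
    The clamp to [-max_speed, max_speed] is an assumed safety limit.
    """
    command = [sum(w * zi for w, zi in zip(row, z)) + b for row, b in zip(weights, bias)]
    return [max(-max_speed, min(max_speed, c)) for c in command]


# Hypothetical 2D latent: each latent axis drives one translational degree of freedom.
example_weights = [
    [1.0, 0.0],  # x velocity follows latent axis 0
    [0.0, 1.0],  # y velocity follows latent axis 1
    [0.0, 0.0],  # remaining degrees of freedom are left at zero in this toy example
    [0.0, 0.0],
    [0.0, 0.0],
    [0.0, 0.0],
]
example_bias = [0.0] * 6
example_command = linear_control_map([0.5, -2.0], example_weights, example_bias)
```

A map of this form keeps each latent axis tied to a fixed direction in command space, which is one reason a linear decoder tends to be easy to interpret.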


Key Insights

  • VAE-based latents capture user-specific movement signatures in a compact and reusable form.
  • Latent spaces are smoother and more robust than PCA in the presence of sensor noise or posture shift.
  • The probabilistic nature of the model supports uncertainty-aware intent prediction, which is valuable for shared autonomy systems.
  • While not yet deployed in real-time trials, offline control decoding from the latent space has yielded promising results.

Additionally, ablation studies were run on each of the auxiliary loss terms to assess their individual effects.


Relation to PCA-Based BoMI Deployment

This work builds directly on the PCA-based BoMI system used in a longitudinal study of 190+ sessions, in which we collected data from, and evaluated the teleoperation performance of, a cohort of 10 individuals with cervical spinal cord injuries (cSCI). The SSVAE model was trained entirely on data collected during those sessions. Whereas PCA was used for control deployment, the SSVAE supports:

  • Post hoc analysis of motor strategies
  • Improved control personalization
  • Future real-time deployment with adaptive blending
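The source does not specify how adaptive blending would work. One possibility, sketched here purely as an assumption, is to let the encoder's posterior variance set how much authority the user command receives relative to an autonomy command. The function names and the `1 / (1 + scale * std)` weighting rule are hypothetical.

```python
import math


def user_authority(logvar, scale=1.0):
    """Turn encoder log-variances into a blending weight in (0, 1].

    A confident encoding (small standard deviation) gives a weight near 1,
    meaning the user command dominates. The weighting rule is an assumption.
    """
    mean_std = sum(math.exp(0.5 * lv) for lv in logvar) / len(logvar)
    return 1.0 / (1.0 + scale * mean_std)


def blend(user_cmd, autonomy_cmd, alpha):
    """Linear blend: alpha weights the user command, (1 - alpha) the autonomy command."""
    return [alpha * u + (1.0 - alpha) * a for u, a in zip(user_cmd, autonomy_cmd)]
```

With a log-variance of 0 on every latent dimension (standard deviation 1), `user_authority` returns 0.5, so the two commands are weighted equally.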

Ongoing Work

  • Incorporating temporal priors (e.g., LSTM-VAE, temporal VAE) to model motion transitions
  • Conditioning latent space with goal or task context
  • Real-time latent encoding + ROS2 interface for on-device teleoperation
  • Ablation studies on dimensionality, loss structure, and encoder architecture

Citation

Preprint (In Review):
Structured Semi-Supervised Generative Methods for Learning Robust Control Embeddings from Human Motion Data
Andrew Thompson, Fiona Neylon, Brenna Argall


Access

All models and training scripts are currently housed in private repositories. Code excerpts, latent visualizations, and sanitized motion samples are available upon request.