Vision Language Action (VLA) Poster - Digital Download
$8.00
Vision Language Action (VLA) models are a powerful new class of robot models that directly output control signals, given images from on-board cameras, natural language prompts, and the current robot state. This poster walks through the architecture of the Physical Intelligence Pi0 foundation model (Black, Kevin, et al. arXiv:2410.24164), with a detailed breakdown and visualization of how data flows through each component of the model.
The download is a high-quality PDF (~50 MB).

