What happens when we train the largest vision-language model and add in robot experiences?
The result is PaLM-E 🌴🤖, a 562-billion parameter, general-purpose, embodied visual-language generalist - across robotics, vision, and language.
Website: https://t.co/ouMkeQiGr5