Training Large Language Models to Reason in a Continuous Latent Space [pdf]
