Synthetic data will play a big role, yes. There are other challenges though, like how verbal descriptions of objects would affect their spatial behavior. Building a generalized simulator that combines those modalities is hard.
In this particular case with Factorio, I suspect generating the synthetic data would be easier, since the rules of the environment are relatively simple and well defined, with quantifiable outcomes.