numba888 9 months ago | on: Ask HN: Any insider takes on Yann LeCun's push aga...

Great, but how do you imagine a multimodal model with text and video? Taking just those two for simplicity, what would be in the training set? With text, the model tries to predict the next token, and then more steps were added. But what do you do with multimodal?