D is the dimensionality, and they give a picture example with D = X.pixels * Y.pixels
How does D change when you introduce video? (Picture arrays)
E.g. can a NN recognize something more by having access to an array of approximately the same image as it sweeps through space/time?
D is the dimensionality, and they give a picture example with D = X.pixels * Y.pixels
How does D change when you introduce video? (Picture arrays)
E.g. can a NN recognize something more by having access to an array of approximately the same image as it sweeps through space/time?