I have been doing some research on MPEG motion vectors lately, and I think they could be used to make a real-time object avoidance system for a robot.
Specifically, I want to look for variation in the motion vector grid to find places that move faster than others. This motion vector grid is updated at about 24 frames per second, and I hope to make a simple decision-making system for avoiding obstacles that will also run this fast.
So far, I think I could detect obstacles that rise above flat ground with a very small amount of code if I could get access to the MPEG vector information. This system wouldn't be fooled by changes in ground color or by shadows.
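Here is a minimal sketch of what "places that move faster than others" could look like in code. It assumes the motion vectors have already been pulled out of the decoder into two plain arrays, one entry per block; the function name flag_fast_blocks, the array layout, and the factor parameter are all made up for illustration and do not come from the Berkeley player.

```c
#include <stddef.h>

/*
 * Hypothetical helper: mark every block whose squared speed is more than
 * `factor` times the mean squared speed of the whole grid.
 * dx, dy : motion vector components, one entry per block (assumed layout)
 * flags  : output, 1 = block moves noticeably faster than average
 * Returns the number of flagged blocks.
 */
size_t flag_fast_blocks(const int *dx, const int *dy, size_t n,
                        double factor, unsigned char *flags)
{
    double mean = 0.0;
    size_t count = 0;
    size_t i;

    if (n == 0) {
        return 0; /* nothing to compare, and avoids dividing by zero */
    }

    /* First pass: mean squared speed over the grid. */
    for (i = 0; i < n; i++) {
        mean += (double)dx[i] * dx[i] + (double)dy[i] * dy[i];
    }
    mean /= (double)n;

    /* Second pass: flag blocks that stand out from the mean. */
    for (i = 0; i < n; i++) {
        double sq = (double)dx[i] * dx[i] + (double)dy[i] * dy[i];
        flags[i] = (unsigned char)(sq > factor * mean);
        count += flags[i];
    }
    return count;
}
```

Comparing against the grid's own average, rather than a fixed speed, is one way to keep the test independent of how fast the robot happens to be driving. Whether a simple mean is good enough on real footage is something I would have to try.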
As the camera moves from a to b, side A has a larger downward motion vector than the ground, relative to the camera.
Looking at the vector map, a decision of left, right, or straight can easily be made.
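The left/right/straight decision could be sketched roughly like this. It assumes the vector map is a row-major grid, that positive dy means downward motion in the image, and that the side with more downward motion is the one to steer away from. The MotionVec and Steer types, the function names, and the threshold are hypothetical, chosen only for this example.

```c
#include <stddef.h>

/* Hypothetical types for illustration; not taken from the Berkeley player. */
typedef struct {
    int dx; /* horizontal motion, +x = right */
    int dy; /* vertical motion, assumed +y = down in the image */
} MotionVec;

typedef enum { STEER_STRAIGHT, STEER_LEFT, STEER_RIGHT } Steer;

/* Sum the downward motion (positive dy only) over columns [c0, c1). */
static long downward_flow(const MotionVec *grid, size_t rows, size_t cols,
                          size_t c0, size_t c1)
{
    long sum = 0;
    size_t r, c;

    for (r = 0; r < rows; r++) {
        for (c = c0; c < c1; c++) {
            int dy = grid[r * cols + c].dy;
            if (dy > 0) {
                sum += dy;
            }
        }
    }
    return sum;
}

/*
 * Compare the left and right halves of the grid and steer away from the
 * half with more downward motion. With an odd number of columns the middle
 * column is ignored. `threshold` is the smallest difference that counts.
 */
Steer choose_direction(const MotionVec *grid, size_t rows, size_t cols,
                       long threshold)
{
    size_t half = cols / 2;
    long left = downward_flow(grid, rows, cols, 0, half);
    long right = downward_flow(grid, rows, cols, cols - half, cols);

    if (left - right > threshold) {
        return STEER_RIGHT; /* obstacle seems to be on the left */
    }
    if (right - left > threshold) {
        return STEER_LEFT;  /* obstacle seems to be on the right */
    }
    return STEER_STRAIGHT;
}
```

The threshold is there so that small differences between the two halves do not make the robot twitch back and forth; a sensible value would have to come from experiment.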
I am studying an open source MPEG player made by some people at Berkeley, but I am having a tough time because it is very complex and I haven't done much programming before. I think that just extracting the motion vectors will be much easier than playing the entire video.
Any help deciphering this code would be much appreciated.
6/14/02...
The file video.h in the Berkeley player seems to hold the key to how the MPEG data is stored. An object (?) called vid_stream has many sub-variables; this is where I will focus my efforts.
6/16/02...1:02am
After about 3 hours of trying to follow code, and deleting code I didn't think I would need, I think I have hit paydirt in video.c, in a function called ParseMacroBlock. A comment says "Here's where everything really happens. Welcome to the heart of darkness." I think that's a good sign. I don't know what Parse means, but I'm sure I'll find out.
6/18/02...
I couldn't figure out how the motion vector codes translate into +y = up and +x = right, which is how I usually make sense of vectors. I didn't know which corner was zero or which direction was positive. Today I looked for differences between mpeg_play and mpeg_blocks, and soon found how the motion vectors are used. The key is a comment above a new function, CalcDot, in video2.c: "sx, sy = x,y coordinates of the upper left corner of the box", then a little lower, "The box size is 24x24, so add 12 [from upper left stated earlier] is about in the center". Now I finally have a reference orientation that lets me visualize what is going on in other functions. All I have to do is find function calls and compare parameters to "carry in" the orientation to the rest of the program.
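To pin the orientation down for myself, here is a small sketch based only on the two quoted comments. The 24x24 box size and the "add 12" come from those comments. Everything else is my assumption: the Point type, the function names, and especially the idea that y grows downward from the upper left corner, which is the usual convention for images but which I have not yet confirmed in the code.

```c
/* Hypothetical helper type; not from the Berkeley sources. */
typedef struct {
    int x;
    int y;
} Point;

#define BOX_SIZE 24 /* box size quoted in the comment above CalcDot */

/*
 * Center of a box whose upper left corner is (sx, sy), still in image
 * coordinates. Adding half the box size (12) lands about in the center.
 */
Point box_center(int sx, int sy)
{
    Point p;
    p.x = sx + BOX_SIZE / 2;
    p.y = sy + BOX_SIZE / 2;
    return p;
}

/*
 * Convert a vector from image coordinates (assumed +y = down) to the
 * +y = up convention. Only the sign of y changes.
 */
Point to_y_up(int dx, int dy)
{
    Point v;
    v.x = dx;
    v.y = -dy;
    return v;
}
```

If that assumption holds, a box with its corner at (48, 24) has its center at (60, 36), and a vector of (3, 4) in image terms points right and down, which is (3, -4) in the +y = up convention.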