I see this test/cmp after the instruction all the time, and I don't understand it. pcmpestri sets ZF if |edx| < 16 and SF if |eax| < 16 (the threshold is 8 for word elements), so it already gives you the status you need. On top of that, testing a sub-register of the larger register is very slow and is a pipeline hazard.
You've got this monster of an instruction and then people place all this paranoid slowness around it. Am I reading the x86 manual wrong?
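To make the flag behavior concrete, here is a rough plain-C model of what I mean. It is only a sketch, not the real instruction: it assumes imm8 = 0x00 (unsigned bytes, equal-any, least significant index), and the names `pcmpestri_model`, `model_equal_any`, and `find_first_of` are made up for illustration. The point is that the loop branches only on the modeled CF and ZF and never re-tests the length registers.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical result struct: models ECX and three of the flags. */
typedef struct {
    int ecx; /* index of first match in the second operand, 16 if none */
    int cf;  /* set if there was any match (IntRes2 != 0) */
    int zf;  /* set if |edx| < 16: second operand ends in this chunk */
    int sf;  /* set if |eax| < 16: first operand is shorter than 16 */
} pcmpestri_model;

/* Explicit lengths are taken as absolute values and saturated to 16. */
static int clamp_len(int len)
{
    long long v = len < 0 ? -(long long)len : (long long)len;
    return v > 16 ? 16 : (int)v;
}

/* Models byte-mode equal-any: a/eax is the character set,
 * b/edx is the chunk of text being searched. */
static pcmpestri_model model_equal_any(const unsigned char *a, int eax,
                                       const unsigned char *b, int edx)
{
    assert(a != NULL && b != NULL);
    int la = clamp_len(eax);
    int lb = clamp_len(edx);
    pcmpestri_model r = { 16, 0, lb < 16, la < 16 };
    for (int j = 0; j < lb; j++) {
        if (la > 0 && memchr(a, b[j], (size_t)la) != NULL) {
            r.ecx = j;
            r.cf = 1;
            break;
        }
    }
    return r;
}

/* Offset of the first byte of text[0..n) that is in set, or -1.
 * The branches mirror a jc / jz pair placed right after the instruction
 * (the labels in the comments are hypothetical). */
static long find_first_of(const unsigned char *set, int set_len,
                          const unsigned char *text, long n)
{
    for (long off = 0; ; off += 16) {
        long remaining = n - off;
        int edx = remaining > 16 ? 16 : (int)remaining;
        pcmpestri_model r = model_equal_any(set, set_len, text + off, edx);
        if (r.cf)
            return off + r.ecx; /* jc .found */
        if (r.zf)
            return -1;          /* jz .end_of_text */
    }
}
```

Unlike this model, the real instruction with a memory operand always loads a full 16 bytes, so actual code still has to make sure that load is safe near the end of a buffer.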