It came as something of a surprise to me to learn that people actually optimize their software for the specific processor it runs on. So when we say MATLAB is very well optimized software, it means someone would have painstakingly sat down and coded in assembly the functions and routines that MATLAB implements, so that whatever hardware features and advances are available get used to the fullest.
I am not very clear on this part, but sometimes companies offer a cross-compiler, so that an ordinary programmer can write the routines in a popular language like C on a PC and then cross-compile them into code optimized for a particular piece of hardware, say a DSP processor.
On upcoming platforms like the Beagle Board and Hawk Board, which have a different processor than your PC or laptop, an application will not be able to make full use of the hardware unless somebody has optimized it for that processor, say a TMS320 DSP or an ARM9.
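To make the idea concrete, here is a minimal sketch of what "optimizing a routine for a processor" can look like in C. It is only an illustration under my own assumptions: the function name dot_product is made up, and I am assuming a compiler that defines the usual __ARM_NEON or __SSE__ macros (GCC and Clang do). The same source picks a vector code path when built for a NEON-capable ARM core or an x86 chip with SSE, and falls back to plain C everywhere else.

```c
#include <stddef.h>

/* Pull in the intrinsics header that matches the target, if any. */
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define USE_NEON 1
#elif defined(__SSE__)
#include <xmmintrin.h>
#define USE_SSE 1
#endif

/* Hypothetical example routine: dot product of two float arrays. */
float dot_product(const float *a, const float *b, size_t n)
{
    size_t i = 0;
    float sum = 0.0f;

#if defined(USE_NEON)
    /* ARM NEON: multiply-accumulate four floats per iteration. */
    float32x4_t acc = vdupq_n_f32(0.0f);
    float lanes[4];
    for (; i + 4 <= n; i += 4)
        acc = vmlaq_f32(acc, vld1q_f32(a + i), vld1q_f32(b + i));
    vst1q_f32(lanes, acc);
    sum = lanes[0] + lanes[1] + lanes[2] + lanes[3];
#elif defined(USE_SSE)
    /* x86 SSE: same idea, four floats per iteration. */
    __m128 acc = _mm_setzero_ps();
    float lanes[4];
    for (; i + 4 <= n; i += 4)
        acc = _mm_add_ps(acc, _mm_mul_ps(_mm_loadu_ps(a + i),
                                         _mm_loadu_ps(b + i)));
    _mm_storeu_ps(lanes, acc);
    sum = lanes[0] + lanes[1] + lanes[2] + lanes[3];
#endif

    /* Portable tail: handles leftovers, or everything when no
       vector path was compiled in. */
    for (; i < n; ++i)
        sum += a[i] * b[i];

    return sum;
}
```

Built with the native compiler on a PC, this would take the SSE path. Built with an ARM cross-compiler (for example something like arm-linux-gnueabihf-gcc -O2 -mfpu=neon, where the exact toolchain name depends on your setup), it would take the NEON path. Note that NEON exists on cores like the Cortex-A8 but not on an ARM9, which would simply get the plain C loop.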
Intel, for example, has Integrated Performance Primitives (IPP), a library of optimized functions for tasks such as signal and image processing. The same goes for other processor makers.