In almost all assembly books you’ll find some nice tricks to do fast multiplications. E.g. instead of “imul eax, ebx, 3” you can do “lea eax, [ebx+ebx*2]” (ignoring flag effects). It’s pretty clear how this works. But how can we speed up, say, a division by 3? This is quite important since division is still a really slow operation. If you never thought or heart about this problem before, get pen and paper and try a little bit. It’s an interesting problem.
64 bit is a lot!
When people talk about porting their applications to 64 bit, I sometimes hear them wonder how long it will be until they have to port everything to 128 bit – after all, the swiches from 8 to 16 bit (e.g. CP/M to DOS), 16 to 32 bit (DOS/Windows 3 to Windows 95/NT) and 32 to 64 have all happened in the last 25 years.