I found this link helpful (it references this post). It puts the difference between sleep(), wait(), and yield() in human terms. In case the links ever go dead, I've included the post below with additional markup.
It all eventually makes its way down to the OS's scheduler, which hands out timeslices to processes and threads.
sleep(n) says "I'm done with my timeslice, and please don't give me another one for at least n milliseconds." The OS doesn't even try to schedule the sleeping thread until the requested time has passed.
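To make the "at least n milliseconds" part concrete, here is a minimal sketch that times a call to Thread.sleep(). The class name SleepDemo and the helper sleepAndMeasure are made up for illustration. The measured time is normally a little more than what you asked for, since the thread still has to be scheduled again after the delay expires.

```java
import java.util.concurrent.TimeUnit;

public class SleepDemo {
    // Hypothetical helper: sleeps for the requested time and returns how long
    // the calling thread was actually away, in milliseconds.
    static long sleepAndMeasure(long millis) {
        long start = System.nanoTime();
        try {
            Thread.sleep(millis); // give up the CPU for (at least) this long
        } catch (InterruptedException e) {
            // Another thread interrupted the sleep; keep the flag set for callers.
            Thread.currentThread().interrupt();
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        long elapsed = sleepAndMeasure(100);
        System.out.println("asked for 100 ms, was away for about " + elapsed + " ms");
    }
}
```

The exact number printed depends on the OS timer resolution and on how busy the machine is, so treat it as a rough measurement rather than a guarantee.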
yield() says "I'm done with my timeslice, but I still have work to do." The OS is free to immediately give the thread another timeslice, or to give some other thread or process the CPU the yielding thread just gave up.
.wait() says "I'm done with my timeslice. Don't give me another timeslice until someone calls notify()." As with sleep(), the OS won't even try to schedule your task unless someone calls notify() (or one of a few other wakeup scenarios occurs).
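A small sketch of the wait()/notify() handshake described above. The class WaitNotifyDemo and its methods awaitMessage, publish, and demo are invented for this example. Two details that the "human terms" description skips: wait() must be called while holding the object's monitor (and releases it while waiting), and it should sit in a loop that rechecks the condition, because a thread can wake up without a matching notify(). The sketch uses notifyAll() rather than notify(), which is the more defensive choice when you are not sure how many threads are waiting.

```java
public class WaitNotifyDemo {
    private final Object lock = new Object();
    private boolean ready = false; // the condition, guarded by lock
    private String message;

    // Blocks the calling thread until some other thread calls publish().
    String awaitMessage() {
        synchronized (lock) {
            while (!ready) { // recheck after every wakeup
                try {
                    lock.wait(); // releases the monitor while waiting
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return null; // give up if interrupted
                }
            }
            return message;
        }
    }

    // Stores a value and wakes up any threads blocked in awaitMessage().
    void publish(String value) {
        synchronized (lock) {
            message = value;
            ready = true;
            lock.notifyAll();
        }
    }

    // Runs one waiting thread and one publisher; returns what the waiter saw.
    static String demo() {
        WaitNotifyDemo box = new WaitNotifyDemo();
        String[] received = new String[1];
        Thread waiter = new Thread(() -> received[0] = box.awaitMessage());
        waiter.start();
        box.publish("hello");
        try {
            waiter.join(); // wait for the waiter thread to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return received[0];
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "hello"
    }
}
```

Because the waiter checks the ready flag before calling wait(), it does not matter whether publish() runs before or after the waiter reaches the synchronized block.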
Threads also lose the remainder of their timeslice when they perform blocking IO and under a few other circumstances. If a thread works through the entire timeslice, the OS forcibly takes control roughly as if yield() had been called, so that other processes can run.
You rarely need yield(), but if you have a compute-heavy app with logical task boundaries, inserting a yield() might improve system responsiveness (at the expense of time: context switches, even just to the OS and back, aren't free). Measure and test against goals you care about, as always.
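As a rough illustration of yielding at task boundaries, the sketch below sums a range in chunks and calls Thread.yield() after each chunk. The names YieldDemo, sumWithYield, and chunkSize are made up for the example. Thread.yield() is only a hint to the scheduler, so the result is the same whether or not the hint is honored; only the timing may differ.

```java
public class YieldDemo {
    // Hypothetical compute loop: sums 1..n and offers up the CPU
    // each time a chunk of chunkSize additions is finished.
    static long sumWithYield(int n, int chunkSize) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
            if (i % chunkSize == 0) {
                Thread.yield(); // a hint only; the scheduler may ignore it
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumWithYield(1000, 100)); // prints 500500
    }
}
```

Whether this helps responsiveness at all depends on the JVM, the OS, and the workload, so it is worth timing both variants before keeping the yield() call.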