Each ray tracing toolkit does this a little differently. But they all have the same pattern:
color = BRDF(random direction) * cosine / pdf(random direction)
The complications are:
0. That formula comes from Monte Carlo integration, which is a bit to wrap your mind around.
2. pdf() is a function of direction and is somewhat arbitrary, through you get noise if it is kind of like the BDRF in shape.
3. Even once you know what pdf() is for a given BRDF, you need to be able to generate random_direction so that it is distributed like pdf
Those 4 together are a bit overwhelming. So if you are in this for the long haul, I think you just need to really grind through it all. #0 is best absorbed in 1D first, then 2D, then graduate to the sphere.