This slide shows an example in which the calculation starts from $u_1$. If we start from $u_2$ instead, we get a different result, and both are "correct" orthonormal bases. Is there an order we should follow in practice, or can we start by normalizing any vector?
For a large number of vectors, or for nearly parallel vectors, which algorithms work better, and how do they work?
A much better way is to compute the QR decomposition of a matrix whose columns are the vectors of interest, using Householder reflections or Givens rotations. There's some good discussion of the trade-offs on the Wikipedia page.
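To make the difference concrete, here is a rough sketch in Python with NumPy (my choice of tools, not something from the slides). It runs textbook Gram-Schmidt and `np.linalg.qr`, which calls LAPACK's Householder-based routines, on three nearly parallel vectors. The helper names `classical_gram_schmidt` and `orthogonality_error`, the test matrix, and the value of `eps` are all made up for illustration.

```python
import numpy as np

def classical_gram_schmidt(A):
    """Orthonormalize the columns of A with textbook (classical) Gram-Schmidt."""
    A = np.asarray(A, dtype=float)
    Q = np.zeros_like(A)
    for j in range(A.shape[1]):
        # Subtract from the original column its projections onto all earlier q's.
        v = A[:, j] - Q[:, :j] @ (Q[:, :j].T @ A[:, j])
        Q[:, j] = v / np.linalg.norm(v)
    return Q

def orthogonality_error(Q):
    """Frobenius norm of Q^T Q - I; zero for exactly orthonormal columns."""
    return np.linalg.norm(Q.T @ Q - np.eye(Q.shape[1]))

# Three nearly parallel columns: they differ only in tiny entries of size eps.
eps = 1e-8  # illustrative value, chosen so that eps**2 is below double precision
A = np.array([[1.0, 1.0, 1.0],
              [eps, 0.0, 0.0],
              [0.0, eps, 0.0],
              [0.0, 0.0, eps]])

Q_cgs = classical_gram_schmidt(A)
Q_qr, R = np.linalg.qr(A)  # reduced QR: Q_qr is 4x3, R is 3x3

err_cgs = orthogonality_error(Q_cgs)
err_qr = orthogonality_error(Q_qr)

print(f"classical Gram-Schmidt: ||Q^T Q - I|| = {err_cgs:.2e}")
print(f"Householder QR:         ||Q^T Q - I|| = {err_qr:.2e}")
```

On this input the Gram-Schmidt columns should come out far from orthogonal, while the QR factor stays orthonormal to roughly machine precision. The columns of `Q_qr` span the same space as the original vectors, so they can be used as the orthonormal basis directly.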