This slide shows an example where the calculation starts from $u_1$. If we start from $u_2$ instead, we get a different result, and both are "correct" orthonormal bases. Is there an order we should follow in practice, or do we just pick the starting vector arbitrarily?
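To make the order dependence concrete, here is a minimal sketch in Python with NumPy. The helper `gram_schmidt` and the two example vectors are made up for illustration and are not taken from the slide. Both orderings should give an orthonormal basis for the same plane, just a different one.

```python
import numpy as np

def gram_schmidt(vectors):
    """Classical Gram-Schmidt: orthonormalize the vectors in the order given.

    Returns an array whose rows are the orthonormal basis vectors.
    """
    basis = []
    for v in vectors:
        w = v.astype(float)
        for e in basis:
            # Subtract the component of v along each earlier basis vector.
            w = w - np.dot(v, e) * e
        basis.append(w / np.linalg.norm(w))
    return np.array(basis)

# Hypothetical input vectors, chosen only for illustration.
u1 = np.array([1.0, 1.0, 0.0])
u2 = np.array([1.0, 0.0, 0.0])

A = gram_schmidt([u1, u2])  # start from u1
B = gram_schmidt([u2, u1])  # start from u2

print(A)
print(B)
# Both bases span the same subspace, so their projectors agree.
print(np.allclose(A.T @ A, B.T @ B))
```

The first vector in the list is only normalized, so its direction is preserved exactly; every later vector gets modified. That is the sense in which the order matters.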

ericchan

For a large number of vectors, or for vectors that are nearly parallel, which algorithms work better, and how do they work?

keenan

A much better way is to compute the QR decomposition of the matrix whose columns are the vectors of interest, using Householder reflections or Givens rotations. There’s some good discussion of the trade-offs on the Wikipedia page.
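As a rough illustration of the difference, the sketch below compares classical Gram-Schmidt against `np.linalg.qr`, which calls LAPACK's Householder-based QR routines. The test matrix, the value of `eps`, and the helper names `classical_gram_schmidt` and `orthogonality_error` are assumptions chosen for this example; the columns are nearly parallel, which is where Gram-Schmidt tends to lose orthogonality.

```python
import numpy as np

def classical_gram_schmidt(V):
    """Orthonormalize the columns of V, left to right, with classical Gram-Schmidt."""
    V = np.asarray(V, dtype=float)
    Q = np.zeros_like(V)
    for j in range(V.shape[1]):
        # Remove the components of column j along all earlier columns of Q.
        w = V[:, j] - Q[:, :j] @ (Q[:, :j].T @ V[:, j])
        Q[:, j] = w / np.linalg.norm(w)
    return Q

def orthogonality_error(Q):
    """Frobenius norm of Q^T Q - I; zero for exactly orthonormal columns."""
    return np.linalg.norm(Q.T @ Q - np.eye(Q.shape[1]))

# Three nearly parallel columns (hypothetical test case).
eps = 1e-8
V = np.array([[1.0, 1.0, 1.0],
              [eps, 0.0, 0.0],
              [0.0, eps, 0.0],
              [0.0, 0.0, eps]])

Q_gs = classical_gram_schmidt(V)
Q_qr, R = np.linalg.qr(V)  # reduced QR: Q_qr is 4x3, R is 3x3

err_gs = orthogonality_error(Q_gs)
err_qr = orthogonality_error(Q_qr)
print("Gram-Schmidt error:", err_gs)
print("QR error:          ", err_qr)
```

On this input the Gram-Schmidt columns end up far from orthogonal, while the QR factor stays orthonormal to roughly machine precision. The columns of `Q_qr` play the role of the orthonormal basis, and `R` records how the original vectors are expressed in it.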
