Why are neural networks' initial weights initialized as random numbers?
Question
Why are neural networks' initial weights initialized as random numbers? I had read somewhere that this is done to "break the symmetry" and that this makes the neural network learn faster. How does breaking the symmetry make it learn faster?
Wouldn't initializing the weights to 0 be a better idea? That way the weights would be able to find their values (whether positive or negative) faster?
Is there some other underlying philosophy behind randomizing the weights apart from hoping that they would be near their optimum values when initialized?
Explanation / Answer
The basic intuition behind initializing the weights to small (and different) values is to break the symmetry of the system, so that individual weights are free to move apart and settle on different values.
More concretely, you'd want your initial weights to be distinct, with a small gap between them. Those differences grow during training as each weight receives its own gradient, which lets the units specialize and helps the network converge faster, i.e. the learning process speeds up.
If you instead set all the weights to the same constant, every unit in a layer computes the same output and therefore receives the same gradient update, so the units remain identical copies of one another. That doesn't help much, especially if the initial values are very far from the final ones.
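Here is a minimal sketch (a hypothetical toy network in NumPy, with made-up shapes and data) that illustrates the symmetry problem: with constant initial weights, every hidden unit receives exactly the same gradient, whereas small random weights give each unit its own gradient so they can specialize.

```python
# Toy demonstration of symmetry breaking in weight initialization.
# All names and shapes here are illustrative, not from the original post.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))          # toy inputs: 8 samples, 3 features
y = rng.normal(size=(8, 1))          # toy regression targets

def hidden_gradients(W1, W2):
    """Gradient of a squared-error loss w.r.t. the first-layer weights
    of a 2-layer tanh network (manual backprop)."""
    H = np.tanh(X @ W1)               # hidden activations, shape (8, 4)
    pred = H @ W2                     # network output, shape (8, 1)
    d_pred = 2 * (pred - y) / len(X)  # dLoss/dpred
    d_H = (d_pred @ W2.T) * (1 - H**2)
    return X.T @ d_H                  # dLoss/dW1, shape (3, 4)

# Constant initialization: every hidden unit (column) gets the same gradient,
# so the units stay identical forever.
W1_const = np.full((3, 4), 0.5)
W2_const = np.full((4, 1), 0.5)
g_const = hidden_gradients(W1_const, W2_const)
print(np.allclose(g_const, g_const[:, :1]))   # True -> symmetric updates

# Small random initialization: gradients differ across units,
# so they can move apart and learn different features.
W1_rand = 0.01 * rng.normal(size=(3, 4))
W2_rand = 0.01 * rng.normal(size=(4, 1))
g_rand = hidden_gradients(W1_rand, W2_rand)
print(np.allclose(g_rand, g_rand[:, :1]))     # False -> symmetry broken
```

Note that zero initialization is the worst case of this: with all weights at 0, the gradient flowing back to the first layer is exactly zero as well, so the hidden layer would not move at all.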