Which optimizer can automatically adjust the learning rate?


The AdaGrad optimizer automatically adjusts the learning rate based on the accumulated historical gradients of each parameter: the base learning rate is divided by the square root of the sum of squared past gradients for that parameter. As a result, parameters that are updated infrequently (small accumulated gradients) keep a larger effective learning rate, while frequently updated parameters receive a smaller one. This per-parameter adaptation helps convergence during training, especially on sparse data, because the optimizer tailors the step size to the update history of each individual parameter. A minimal sketch of this update rule is shown below.
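The following is an illustrative sketch of the AdaGrad update, not a library implementation; the function name `adagrad_step` and its arguments are hypothetical.

```python
import numpy as np

def adagrad_step(params, grads, accum, lr=0.01, eps=1e-8):
    """One AdaGrad update (illustrative sketch).

    params, grads, accum are NumPy arrays of the same shape;
    accum holds the running sum of squared gradients per parameter.
    """
    # Accumulate squared gradients for each parameter.
    accum += grads ** 2
    # Effective step size shrinks for parameters with large accumulated
    # gradients and stays larger for rarely updated parameters.
    params -= lr * grads / (np.sqrt(accum) + eps)
    return params, accum
```

In practice, `accum` starts at zeros and is carried across training steps, which is exactly how the learning rate adapts over time without any manual schedule.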

In contrast, mini-batch gradient descent, SGD (Stochastic Gradient Descent), and Momentum optimizers use a fixed learning rate that does not change adaptively during training. While SGD can be combined with momentum or an external learning rate schedule, it does not perform the per-parameter adjustment that AdaGrad does. AdaGrad therefore stands out for its built-in ability to auto-tune learning rates, making it the correct answer. For comparison, see the sketch below.
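For comparison, here is an illustrative sketch of plain SGD and Momentum updates; the function names and arguments are hypothetical. Note that both apply a single scalar learning rate to every parameter.

```python
import numpy as np

def sgd_step(params, grads, lr=0.01):
    # Same scalar learning rate for every parameter.
    return params - lr * grads

def momentum_step(params, grads, velocity, lr=0.01, beta=0.9):
    # Momentum smooths the update direction, but the learning rate
    # itself remains a fixed scalar rather than a per-parameter value.
    velocity = beta * velocity - lr * grads
    return params + velocity, velocity
```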