
Optimizer step being called 2 times when using deepspeed #45656

@harsh2912

Description


System Info

With transformers==4.57.3 and deepspeed==0.18.3:

As shown in the screenshot below, when accelerator.backward is called, DeepSpeed's backward path internally calls engine.step, which performs the optimizer step at the gradient-accumulation boundary.

The snapshot below is from trainer.py in the transformers library.

[Screenshot: the accelerator.backward call in trainer.py, which triggers engine.step under DeepSpeed]
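
For context, a paraphrased sketch of what I believe the DeepSpeed path in accelerate does (simplified from accelerate's DeepSpeedEngineWrapper; the body below is my own paraphrase, not verbatim library code):

```python
# Paraphrased sketch of accelerate's DeepSpeedEngineWrapper (simplified,
# not verbatim library code): Accelerator.backward delegates to this
# wrapper when DeepSpeed is enabled.
class DeepSpeedEngineWrapper:
    def __init__(self, engine):
        self.engine = engine  # the deepspeed.DeepSpeedEngine instance

    def backward(self, loss, **kwargs):
        self.engine.backward(loss, **kwargs)
        # DeepSpeed expects engine.step() after every backward; the engine
        # decides internally (from its own gradient_accumulation_steps
        # setting) whether this call actually steps the optimizer or only
        # accumulates gradients.
        self.engine.step()
```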

Also, inside the trainer itself, optimizer.step is called again after this backward at the gradient-accumulation step; see the screenshot below for reference.

[Screenshot: the optimizer.step call in the trainer's training loop]
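
And the corresponding spot in transformers' training loop where the second step call happens; the structure below is my simplified summary of trainer.py, not an exact copy:

```python
# Simplified paraphrase of the gradient-accumulation boundary inside
# Trainer._inner_training_loop (transformers/trainer.py); not verbatim.
for step, inputs in enumerate(epoch_iterator):
    # training_step computes the loss and calls self.accelerator.backward(loss),
    # which (per the screenshot above) already triggers engine.step()
    # when DeepSpeed is enabled.
    tr_loss_step = self.training_step(model, inputs)

    if (step + 1) % args.gradient_accumulation_steps == 0:
        # ...gradient clipping elided...
        self.optimizer.step()  # the second step call in the same iteration
        self.lr_scheduler.step()
        model.zero_grad()
```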

So within a single iteration the optimizer step is effectively executed twice, which is wrong. Please fix this bug.

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

The double step occurs at every gradient-accumulation boundary when the Trainer runs with DeepSpeed enabled (versions above); the current code treats this behavior as if it were a feature. A reproduction sketch follows below.
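
A minimal sketch to reproduce (my own construction; the model, dataset, and DeepSpeed config choices below are illustrative assumptions, not taken from the original run):

```python
# Hypothetical minimal reproduction; model/dataset/config values are
# illustrative assumptions, not from the original report.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

ds = load_dataset("glue", "sst2", split="train[:256]")
ds = ds.map(lambda ex: tokenizer(ex["sentence"], truncation=True,
                                 padding="max_length", max_length=64))
ds = ds.rename_column("label", "labels")

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,  # the double step appears at each boundary
    deepspeed="ds_config.json",     # any ZeRO-style config file (assumed)
    max_steps=8,
)

Trainer(model=model, args=args, train_dataset=ds).train()
# Breakpoints in accelerator.backward (engine.step) and at the trainer's
# optimizer.step call show both firing within the same iteration.
```

Launch with the DeepSpeed launcher (e.g. deepspeed repro.py) so the engine is actually initialized.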

Expected behavior

There should be a single optimizer step per iteration, but currently two optimizer step calls are made, which is wrong.
