Cannot create distributed environment

I have set up a distributed environment with `accelerate config`, but Accelerate still reports that no distributed environment exists.

- `Accelerate` version: 0.16.0
- Platform: Linux-5.15.0-58-generic-x86_64-with-glibc2.31
- Python version: 3.9.13
- Numpy version: 1.21.5
- PyTorch version (GPU?): 1.13.0+cu116 (True)
- `Accelerate` default config:
        - compute_environment: LOCAL_MACHINE
        - distributed_type: MULTI_GPU
        - mixed_precision: no
        - use_cpu: False
        - dynamo_backend: NO
        - num_processes: 1
        - machine_rank: 0
        - num_machines: 1
        - gpu_ids: [0]
        - rdzv_backend: static
        - same_network: True
        - main_training_function: main
        - deepspeed_config: {}
        - fsdp_config: {}
        - megatron_lm_config: {}
        - downcast_bf16: no
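
For reference, the state block that `accelerate test` prints under **Initialization** can also be reproduced with a minimal script; this is just a sketch (the filename `check_env.py` is a placeholder):

```python
# check_env.py -- print the state that Accelerate resolves from the config.
# Run with: accelerate launch check_env.py
from accelerate import Accelerator

accelerator = Accelerator()
# The repr of the state is the same block `accelerate test` prints under
# **Initialization**: Distributed environment, Num processes, Device, ...
print(accelerator.state)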

When I run `accelerate test`, it still reports no distributed environment:

```
stdout: **Initialization**
stdout: Testing, testing. 1, 2, 3.
stdout: Distributed environment: NO
stdout: Num processes: 1
stdout: Process index: 0
stdout: Local process index: 0
stdout: Device: cpu
stdout: Mixed precision type: no
stdout: 
stdout: 
stdout: **Test random number generator synchronization**
stdout: All rng are properly synched.
stdout: 
stdout: **DataLoader integration test**
stdout: 0 tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
stdout:         18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]) <class 'accelerate.data_loader.DataLoaderShard'>
stdout: Non-shuffled dataloader passing.
stdout: Shuffled dataloader passing.
stdout: Non-shuffled central dataloader passing.
stdout: Shuffled central dataloader passing.
stdout: 
stdout: **Training integration test**
stdout: Model dtype: torch.float32, torch.float32. Input dtype: torch.float32
stdout: Model dtype: torch.float32, torch.float32. Input dtype: torch.float32
stdout: Training yielded the same results on one CPU or distributed setup with no batch split.
stdout: Model dtype: torch.float32, torch.float32. Input dtype: torch.float32
stdout: Training yielded the same results on one CPU or distributes setup with batch split.
stdout: BF16 training check.
stdout: Model dtype: torch.float32, torch.float32. Input dtype: torch.float32
Test is a success! You are ready for your distributed training!
```
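
In case it helps narrow this down, whether the launcher actually started a distributed job can also be checked from the environment variables that `torchrun`/`accelerate launch` export in multi-process mode; a small sketch (the filename `dist_vars.py` is my own):

```python
# dist_vars.py -- sanity check: a distributed launcher (torchrun, or
# accelerate launch in multi-GPU mode) exports RANK, LOCAL_RANK,
# WORLD_SIZE, and MASTER_ADDR for each worker it spawns.
# If these are all unset, the script was started as a single plain process.
import os

import torch

print("cuda available:", torch.cuda.is_available())
print("gpu count:", torch.cuda.device_count())
for var in ("RANK", "LOCAL_RANK", "WORLD_SIZE", "MASTER_ADDR"):
    print(var, "=", os.environ.get(var))
```

If these variables are all `None` here, that would match the `Distributed environment: NO` line in the test output above.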