python src/lerobot/scripts/server/robot_client.py \
--server_address=127.0.0.1:8080 \
--robot.type=piper_follower \
--robot.port=can0 \
--robot.id=black \
--robot.cameras="{ \
top: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}, \
left: {type: opencv, index_or_path: 4, width: 640, height: 480, fps: 30}}" \
--task="Grasp the object and put it in the bin" \
--policy_type=smolvla \
--pretrained_name_or_path=outputs/train/piper_smolvla_pick_yellow_cars/checkpoints/020000/pretrained_model \
--policy_device=cuda \
--actions_per_chunk=50 \
--chunk_size_threshold=0.5 \
--aggregate_fn_name=weighted_average \
--debug_visualize_queue_size=True
Server log:
INFO 2025-10-01 16:08:42 y_server.py:191 Observation #0 has been filtered out
INFO 2025-10-01 16:08:42 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
INFO 2025-10-01 16:08:42 y_server.py:175 Received observation #0 | Avg FPS: 27.46 | Target: 30.00 | One-way latency: 1.73ms
INFO 2025-10-01 16:08:42 y_server.py:191 Observation #0 has been filtered out
INFO 2025-10-01 16:08:42 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
INFO 2025-10-01 16:08:42 y_server.py:175 Received observation #0 | Avg FPS: 27.46 | Target: 30.00 | One-way latency: 1.79ms
Client log:
INFO 2025-10-01 16:08:42 t_client.py:217 Sent observation #0 |
INFO 2025-10-01 16:08:42 t_client.py:470 Control loop (ms): 4.07
actions_available False
INFO 2025-10-01 16:08:42 t_client.py:217 Sent observation #0 |
INFO 2025-10-01 16:08:42 t_client.py:470 Control loop (ms): 5.66
actions_available False
There’s a robotics channel on the Hugging Face Discord, so asking there is probably the most reliable option.
The recently created Hugging Face science Discord, Hugging Science, may also be worth trying for this.
Missing language input. The StreamActions error on 'observation.language.tokens' means the SmolVLA policy was initialized with a language modality, but your client payload lacks the tokenized instruction. This is a known SmolVLA failure mode and surfaces as a KeyError on that field. Either send an instruction so the client or server produces tokens, or run a checkpoint/config without language. (GitHub)
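To see what the missing field looks like, here is a hedged sketch of attaching the instruction to the payload. The key name comes from the error above; the tokenizer is a toy stand-in, since the real client would use the tokenizer bundled with the SmolVLA checkpoint:

```python
# Toy whitespace tokenizer, used ONLY to illustrate the payload shape.
# A real client would use the checkpoint's own tokenizer instead.
def stub_tokenize(text: str, vocab_offset: int = 100) -> list[int]:
    return [vocab_offset + i for i, _ in enumerate(text.split())]

def build_observation(frames: dict, task: str) -> dict:
    """Attach a tokenized instruction so the policy's language branch has input."""
    obs = dict(frames)
    obs["observation.language.tokens"] = stub_tokenize(task)
    return obs

obs = build_observation(
    {"observation.images.top": None},
    "Grasp the object and put it in the bin",
)
assert "observation.language.tokens" in obs  # the key the KeyError complains about
```

If that key is present with a valid tensor, the language branch no longer raises on lookup; the helper names here (build_observation, stub_tokenize) are hypothetical.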
Duplicate timestep.
“obs number not change” together with “Skipping observation #0 – Timestep predicted already!” means the async server’s dedup logic is seeing the same step ID repeatedly, so it filters those observations out. Async only enqueues actions after accepting a new observation step, so make sure the client increments the step index (or uses fresh timestamps) for every frame. This behavior follows the async streaming contract. (Hugging Face)
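The filtering described above can be illustrated with a minimal dedup sketch (not the actual LeRobot server code, just the behavior it implies):

```python
# Minimal model of the server's dedup: remember the last accepted timestep
# and drop anything already predicted.
class TimestepFilter:
    def __init__(self) -> None:
        self.last_accepted = -1

    def accept(self, timestep: int) -> bool:
        """Return True if this step is new; False if already predicted."""
        if timestep <= self.last_accepted:
            return False  # "Timestep predicted already!" -> filtered out
        self.last_accepted = timestep
        return True

f = TimestepFilter()
assert f.accept(0) is True    # first #0 is accepted
assert f.accept(0) is False   # repeated #0 is filtered, as in the log
assert f.accept(1) is True    # incrementing the step unblocks the pipeline
```

This is why a client that keeps sending step #0 never receives actions: every observation after the first is silently dropped.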
Fixes in order:
A) Provide language or disable it
Fast path: pass a task string from the client so tokens are generated.
Example launch knobs are documented in the async guide; the policy expects language tokens whenever the language modality is enabled. (Hugging Face)
No-language path: use a SmolVLA config or checkpoint that omits the language branch, or stub observation.language.tokens on the server with a valid empty/default tensor to confirm the rest of the pipeline works. The exact KeyError has been reported and tracked. (GitHub)
B) Make steps monotonic
In your robot_client loop, increment the timestep on each send, and make sure client_timestamp increases as well. Verify that the server log advances: Received observation #0, #1, #2, …. The async tutorial describes the per-step flow and acceptance before action chunking. (Hugging Face)
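The client-side fix can be sketched as follows. The field names (timestep, client_timestamp) mirror the contract described above, while get_frames and send are hypothetical stand-ins for the real client's capture and gRPC calls:

```python
import itertools
import time

def observation_stream(get_frames, send, max_steps=None):
    """Send observations whose step IDs and timestamps never repeat."""
    counter = itertools.count()                    # 0, 1, 2, ... never reused
    if max_steps is not None:
        counter = itertools.islice(counter, max_steps)
    for timestep in counter:
        send({
            "timestep": timestep,                  # server dedups on this
            "client_timestamp": time.monotonic(),  # never goes backwards
            "frames": get_frames(),
        })
```

With a stream like this, the server should log Received observation #0, #1, #2, … instead of filtering repeats of #0.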
C) Re-prime the async pipeline
Start policy_server and confirm the model loads and prints its expected inputs.
Start robot_client with the instruction and a corrected, monotonic step counter. Actions appear only after the first valid step is accepted, which matches reports where no actions flow until the inputs are correct. (Hugging Face)
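The "confirm the model loads" part of this step can be automated with a small helper that scans the server's log output for a readiness line before launching the client. The pattern below is an assumption; match whatever your policy_server actually prints:

```python
import re

def wait_for_line(lines, pattern, max_lines=1000):
    """Return the first log line matching pattern, or None if never seen."""
    regex = re.compile(pattern)
    for i, line in enumerate(lines):
        if i >= max_lines:          # give up rather than block forever
            break
        if regex.search(line):
            return line
    return None
```

In practice you would feed this the server process's stdout and only start robot_client once the load line appears.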
Minimal verification checklist:
Server prints “Loaded SmolVLA …” and lists the expected features, including language if enabled. (Hugging Face)
First inference no longer throws the KeyError on observation.language.tokens. (GitHub)
Server log increments the observation index and stops filtering steps as “predicted already.” (Hugging Face)
Client flips to actions_available: True after the first accepted step, consistent with prior async issues. (GitHub)
Context to remember:
SmolVLA expects inputs exactly as configured at train/eval time: if language is part of the policy, the language field is required. The SmolVLA write-ups and docs describe language tokens as a standard input. (arXiv)
Async inference decouples sensing from acting; the server deduplicates per step to avoid recomputing, so reused step IDs are dropped by design. (Hugging Face)
Apply A and B, then recheck C. If anything is still blocked, paste the server’s “expected inputs” line and a dump of the received keys for one accepted step.