Hmm…?
What you want and what Python actually does are different things.
The Hugging Face snippet is hand-formatted for the docs. It is not the literal result of print(inputs). Your output is what Python + PyTorch really produce, and you can’t get the HF layout by just tweaking torch.set_printoptions.
I’ll go through your numbered points and then give a practical workaround.
1. Why your layout looks “wrong”
You are printing a dict that contains tensors:
inputs = {
    'input_ids': tensor(...),
    'attention_mask': tensor(...),
}
print(inputs)
Two independent formatters are involved:
- PyTorch tensor repr
  Produces strings like
  tensor([[ 0, 1, 2, 3],
          [ 4, 5, 6, 7]])
  Newlines and the indentation of the second row come from PyTorch. Controlled by torch.set_printoptions.
- Python dict repr
  Takes each key/value pair ('input_ids', <that tensor repr string>) and glues them into:
  "{'input_ids': " + tensor_repr + "}"
  It does not insert extra newlines or re-indent the tensor; it just concatenates strings. Controlled by the pprint module if you use it, but not by PyTorch. (Python documentation)
So the real Python output is essentially:
{'input_ids': tensor([[ ... first row ...],
        [ ... second row ...]]), 'attention_mask': tensor([[ ... ]])}
This is exactly what you see in your screenshot. The Hugging Face page shows a manually cleaned version:
{
    'input_ids': tensor([
        [ ... ],
        [ ... ]
    ]),
    'attention_mask': tensor([
        [ ... ],
        [ ... ]
    ])
}
Those extra line breaks after tensor( and after each ]), are simply edited into the Markdown on the site. You will not get them from plain print(inputs).
That explains:
- 'input_ids' on the first row: that’s dict repr, no padding step.
- First list “continuing” on the next row: that’s either
  - a newline coming from the tensor repr itself, or
  - VS Code’s output word wrap breaking the line visually.
- Second list not starting exactly under the first: again, that’s VS Code wrapping a long physical line into multiple visual lines.
- Closing ]]) not on its own row, and 'attention_mask' appearing right after: dict repr has no reason to insert a newline between key–value pairs.
There is no PyTorch option that changes how dictionaries align keys or where newlines between items go.
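The string-gluing behaviour described above can be checked without PyTorch. The sketch below uses a hypothetical stand-in class, FakeTensor (not a real PyTorch type), whose only job is to return a multi-line repr. The assumption is that a real tensor's repr is treated the same way when it sits inside a dict.

```python
class FakeTensor:
    """Hypothetical stand-in for torch.Tensor; it only mimics a multi-line repr."""

    def __init__(self, rows):
        self.rows = rows

    def __repr__(self):
        # Join rows with a newline plus padding, roughly like a tensor repr does.
        body = ",\n        ".join(str(row) for row in self.rows)
        return f"tensor([{body}])"


ids = FakeTensor([[101, 1045, 102], [101, 5223, 102]])
mask = FakeTensor([[1, 1, 1], [1, 1, 1]])
batch = {"input_ids": ids, "attention_mask": mask}

# Build by hand what dict repr produces: "key: value" pairs joined by ", ".
glued = "{'input_ids': " + repr(ids) + ", 'attention_mask': " + repr(mask) + "}"

print(repr(batch) == glued)  # True: the dict adds no newlines of its own
print(batch)
```

The only newlines in the printed dict are the ones already inside each value's repr, which is why 'attention_mask' follows the closing brackets on the same line.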
2. Why your torch.set_printoptions(...) changes did nothing
torch.set_printoptions affects only tensor formatting, not the dict surrounding it. Options like linewidth, edgeitems, profile apply when the tensor builds its own string.
Example:
torch.set_printoptions(linewidth=120)
print(inputs["input_ids"]) # affected
print(inputs) # tensor part is affected, dict layout is not
The dict is still a single logical line with embedded newlines from the tensor. Dict doesn’t know about linewidth and won’t decide to put 'attention_mask' on a new line.
So your experiments in (2) were correctly applied to the tensors, but they can’t fix the dict layout you dislike.
3. Why pprint() and rich.pretty still look “off”
- pprint(inputs) again uses the dict’s keys and values; for values it just calls repr(value) and uses that as an opaque string when deciding where to break lines. It doesn’t reformat the innards of that tensor string. (Python documentation)
- rich.pretty.pprint(inputs) does the same thing, just with color. It does not rewrite tensor reprs or move 'attention_mask' to its own block.
So they can:
- put each key–value pair on its own line,
- change order (sort_dicts), indentation, etc.,
but they will not give you the exact HF style automatically.
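As a rough illustration of the "opaque string" point, the sketch below runs the standard library's pprint.pformat on a dict whose values have a multi-line repr. FakeTensor is a hypothetical stand-in for a tensor (an assumption made so the example runs without PyTorch); pformat and sort_dicts are the real standard-library names.

```python
from pprint import pformat


class FakeTensor:
    """Hypothetical stand-in for torch.Tensor; it only mimics a multi-line repr."""

    def __init__(self, rows):
        self.rows = rows

    def __repr__(self):
        body = ",\n        ".join(str(row) for row in self.rows)
        return f"tensor([{body}])"


batch = {
    "input_ids": FakeTensor([[101, 1045, 102], [101, 5223, 102]]),
    "attention_mask": FakeTensor([[1, 1, 1], [1, 1, 1]]),
}

# sort_dicts=False keeps insertion order (Python 3.8+).
formatted = pformat(batch, sort_dicts=False)
print(formatted)

# The value's repr is copied in unchanged; pprint only breaks between items.
print(repr(batch["input_ids"]) in formatted)
```

Each key starts on its own line here, but the second row of every value keeps the indentation baked into its own repr, so the rows do not line up under the opening bracket.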
4. “I actually want to see a long line”
If you want the entire row [101, 1045, ...] on one unbroken line, then:
- PyTorch must not insert internal breaks inside that row (controlled by linewidth), and
- Your front-end (VS Code / Jupyter) must not soft-wrap the output.
You already tried the first part. For typical 2×16 tensors the PyTorch repr of a single row is shorter than 120 characters, so linewidth=120 is fine.
The second part is VS Code’s setting, not Python:
- VS Code has a setting "notebook.output.wordWrap" that controls whether output cells wrap or scroll horizontally. (Stack Overflow)
- If this is true, very long physical lines are wrapped visually into multiple lines, which is exactly what you’re seeing.
- To get “one long line with a scrollbar”, you want wrapping off (notebook.output.wordWrap: false) or a wider notebook pane.
So: to see truly long lines, you must handle visual wrapping in VS Code; Python cannot override that from inside the notebook.
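If it is unclear whether a given break is a real newline or only visual wrapping, one way to check is to inspect the string itself. The helper below is a minimal sketch: describe_lines and its pane_width parameter are hypothetical names, and pane_width is only a guess at how many columns the output pane shows, since Python cannot query that.

```python
def describe_lines(text, pane_width=80):
    """Return (length, would_soft_wrap) for each physical line of `text`.

    `pane_width` is an assumed column count for the output pane.
    """
    return [(len(line), len(line) > pane_width) for line in text.split("\n")]


# A shortened example of what print(inputs) emits: one real newline inside.
printed = "{'input_ids': tensor([[101, 1045, 102],\n        [101, 5223, 102]])}"

# repr() of the string makes real newlines visible as \n.
print(repr(printed))

for length, wraps in describe_lines(printed, pane_width=30):
    print(length, "soft-wrapped by the pane" if wraps else "fits")
```

Any break that does not show up as \n in the repr of the string comes from the front-end, not from Python or PyTorch.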
5. Practical workaround: custom “HF-style” printer
If you really want something very close to the Hugging Face layout, you need to format it yourself.
Here is a small helper that:
- Prints braces on their own lines.
- Puts each key on its own line.
- Indents the tensor nicely.
- Preserves dict key order.
import torch

def hf_print(batch, indent=4):
    """
    Pretty-print a dict like the HF docs example.
        hf_print(inputs)
    """
    items = list(batch.items())
    sp = " " * indent
    print("{")
    for i, (k, v) in enumerate(items):
        v_str = repr(v)
        # indent tensor's own internal lines one extra level
        v_str = ("\n" + sp * 2).join(v_str.splitlines())
        comma = "," if i < len(items) - 1 else ""
        print(f"{sp}{k!r}: {v_str}{comma}")
    print("}")
Usage:
inputs = tokenizer(raw_inputs, padding=True, truncation=True, return_tensors="pt")
hf_print(inputs)
This will produce something close to:
{
    'input_ids': tensor([[  101,  1045,  1005,  2310,  2042,  3403,  2005,  1037, 17662, 12172,
                   2607,  2026,  2878,  2166,  1012,   102],
                [  101,  1045,  5223,  2023,  2061,  2172,   999,   102,     0,     0,
                      0,     0,     0,     0,     0,     0]]),
    'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
                [1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]])
}
If you want to combine this with Rich:
from contextlib import redirect_stdout
from io import StringIO

from rich import print as rprint
from rich.panel import Panel
from rich.text import Text

def hf_print_rich(batch, title="BatchEncoding"):
    buf = StringIO()
    # Capture hf_print's output in a string. redirect_stdout is used instead of
    # patching __builtins__["print"], which fails in a script or notebook because
    # __builtins__ is a module there, not a dict.
    with redirect_stdout(buf):
        hf_print(batch)
    text = Text.from_ansi(buf.getvalue())
    rprint(Panel(text, title=title))
That gets you both a “nice” textual format and a boxed background.
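A possible simplification, sketched here as an assumption rather than part of the original helper: build the layout as a string and return it, so nothing has to be captured from stdout. hf_format is a hypothetical name; its logic mirrors hf_print above.

```python
def hf_format(batch, indent=4):
    """Return the hf_print-style layout as a string instead of printing it."""
    sp = " " * indent
    items = list(batch.items())
    lines = ["{"]
    for i, (key, value) in enumerate(items):
        # Push the value's own continuation lines one extra level to the right.
        value_str = ("\n" + sp * 2).join(repr(value).splitlines())
        comma = "," if i < len(items) - 1 else ""
        lines.append(f"{sp}{key!r}: {value_str}{comma}")
    lines.append("}")
    return "\n".join(lines)


# Plain lists stand in for tensors so the example runs without PyTorch.
example = {"input_ids": [[101, 102]], "attention_mask": [[1, 1]]}
print(hf_format(example))
```

The returned string can be printed directly or wrapped in rich.text.Text and passed to a Panel in the same way as the captured output.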
6. What you can and cannot control
- Can control with Python / PyTorch:
  - How tensors format internally (torch.set_printoptions).
  - Whether dict items go on separate logical lines (pprint / custom printer).
  - Whether keys are in insertion order (pprint(..., sort_dicts=False)). (Python documentation)
- Cannot control from Python:
  - VS Code’s choice to wrap output lines instead of horizontal scroll.
  - The exact HF-docs layout; that output in the docs is curated Markdown, not the raw console dump.
Condensed summary
- Your output is the correct default repr of a dict containing PyTorch tensors.
- The Hugging Face page shows a manually formatted version; you won’t get that exact style from plain print(inputs). (Hugging Face)
- torch.set_printoptions only affects how the tensors themselves are printed, not the dict keys or where 'attention_mask' starts.
- Long lines broken in the middle are caused by VS Code / notebook word wrap, not by PyTorch; toggle "notebook.output.wordWrap" or widen the output pane to see unbroken lines. (Stack Overflow)
- To get HF-style layout, write a small custom printer (like hf_print above) or wrap its output in a Rich Panel if you want backgrounds.