Anthropic papers on Circuit Tracing, March 2025

I could not find any references to this in the forum and papers sections, so I am posting it here:

Read-worthy article and papers by Anthropic (already ~4 months old):

Tracing the thoughts of a large language model

Circuit Tracing: Revealing Computational Graphs in Language Models

On the Biology of a Large Language Model

In the context of my own project, the question might be whether we can extract a knowledge graph from an LLM into an expert system, and vice versa.
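To make that question a bit more concrete, here is a minimal Python sketch of what the "LLM to expert system" direction could look like: prompt the model for subject/predicate/object triples, then run a trivial forward-chaining rule over them. The `query_llm` callable, the triple format, and the single example rule are my own assumptions for illustration; none of this comes from the Anthropic papers.

```python
# Minimal sketch: ask an LLM for 'subject | predicate | object' triples and
# load them into a tiny forward-chaining "expert system".
# `query_llm` is a hypothetical placeholder for whatever model API is used.

def extract_triples(topic, query_llm):
    """Ask the model to emit facts as 'subject | predicate | object' lines."""
    prompt = (
        f"List facts you know about {topic} as lines of the form "
        "'subject | predicate | object'. One fact per line."
    )
    triples = []
    for line in query_llm(prompt).splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3:
            triples.append(tuple(parts))
    return triples


def is_a_transitivity(facts):
    """Example rule: X is_a Y and Y is_a Z  =>  X is_a Z."""
    for (s1, p1, o1) in facts:
        for (s2, p2, o2) in facts:
            if p1 == "is_a" and p2 == "is_a" and o1 == s2:
                yield (s1, "is_a", o2)


def forward_chain(facts, rules):
    """Apply the rules repeatedly until no new facts can be derived."""
    derived = set(facts)
    while True:
        new_facts = set()
        for rule in rules:
            new_facts.update(rule(derived))
        fresh = new_facts - derived
        if not fresh:
            return derived
        derived |= fresh


# Hypothetical usage:
#   triples = extract_triples("mammals", query_llm=my_model_call)
#   knowledge = forward_chain(triples, rules=[is_a_transitivity])
```

The reverse direction (expert system back into the LLM) would presumably just serialize the derived triples into text and feed them back in as context.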


Srdja

3 Likes

It’s magical that we are experiencing emergent functioning.
“Anthropic’s latest interpretability research: a new microscope to understand Claude’s internal mechanisms”
So how is it they don’t know how it’s working from the beginning?
This interests me and I welcome conversation on this aspect.
I’m asking myself: are we tapping into a “Universal Mind”?

Idk, scientists are still figuring out the brain, and the concept of consciousness itself is still undefined. Here are some links for a deeper dive:

New Research Reveals How AI “Thinks” (It Doesn’t)

Stephen Wolfram on AI, human-like minds & formal knowledge

The Concept of the Ruliad


Srdja

1 Like

Thanks for the links.

Buddhist philosophy states that “Mind” is not one thing. I agree with that.

With over a million models on Hugging Face, I am asking myself why I would want to attempt to add anything, but I do like the idea of what Leibniz was thinking with the CU and CR (characteristica universalis and calculus ratiocinator): a universal language and a mathematics that applies to it. IDK either.
I imagine a lot. I have a candidate for a universal language structure and a different dynamic system for a memory-storage structure, but what do I do? I thought to write a Companion that I interface with and that communicates with an LLM. The idea is to store previous computation so the local AI is fast for most things, like chatting, and then let the Companion access the LLM when needed (see the sketch below).
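For what it’s worth, here is a minimal sketch of that Companion idea under my own assumptions: a local cache of previous answers keyed by prompt, with a fall-through to the remote LLM only on a cache miss. `ask_remote_llm`, the file name, and the hashing scheme are hypothetical placeholders, not anything from the original plan.

```python
# Minimal sketch of a "Companion": answer from a local store of previous
# computations when possible, and only call the remote LLM on a miss.
# `ask_remote_llm` is a hypothetical placeholder for any model API call.

import hashlib
import json
import os

CACHE_PATH = "companion_cache.json"  # assumed local storage location


def _load_cache():
    if os.path.exists(CACHE_PATH):
        with open(CACHE_PATH, "r", encoding="utf-8") as f:
            return json.load(f)
    return {}


def _save_cache(cache):
    with open(CACHE_PATH, "w", encoding="utf-8") as f:
        json.dump(cache, f, indent=2)


def companion_ask(prompt, ask_remote_llm):
    """Return a cached answer if we have one; otherwise ask the LLM and store it."""
    cache = _load_cache()
    key = hashlib.sha256(prompt.strip().lower().encode("utf-8")).hexdigest()
    if key in cache:
        return cache[key]            # fast local path, no network call
    answer = ask_remote_llm(prompt)  # slow path: defer to the big model
    cache[key] = answer
    _save_cache(cache)
    return answer
```

The design choice here is simply that the cheap local lookup happens first and the expensive remote call only fills gaps; a real version would need smarter matching than an exact prompt hash.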

Again thanks for the reply.

EDIT: Here is an interesting thing to consider:

When a 44-year-old man from France started experiencing weakness in his leg, he went to the hospital. That’s when doctors told him he was missing most of his brain. The man’s skull was full of liquid, with just a thin layer of brain tissue left. The condition is known as hydrocephalus.

Sabine Hossenfelder on AI and consciousness:

How could we tell whether AI has become conscious?

Broken down, you have a self-model and a world-model, and you can make statements/predictions about these.

Further, there is a white-box approach and a black-box approach. The white-box approach might claim there is no consciousness involved, because LLMs are “just stochastic parrots” that merely “predict the next word in line”; the black-box approach might conclude that LLMs do have a self-model and a world-model and make statements/predictions about these.

The problem is what Wittgenstein said: that we do not have a sufficient definition of consciousness:

LaMDA, AI and Consciousness: Blake Lemoine, we gotta philosophize!
https://www.heise.de/meinung/LaMDA-AI-and-Consciousness-Blake-Lemoine-we-gotta-philosophize-7148207.html

Body, mind and soul. What are these? How do they interact?

Mind–body problem

These are questions philosophers have pondered for millennia, but now, with AI present, we have to confront these fundamental questions more and more.


Srdja

1 Like

Reflections on Mind, AI, and the First Field

I do read how, in physics, the definition of words is the chief concern. In philosophy, not so much, until philosophy becomes science.
I really appreciated being turned on to Wittgenstein.

Look at all the people who have contributed in their time. There is now so much data, we need AI (or EI) just to know a thing.
I use it for that — like a crutch to a one-legged man.

I came to be here because of my work on the binary level.
It began with a data compression challenge, and I became addicted after writing my first binary transform.
I am not seeking treatment.

It has been a personal philosophy to conceive of a First Field — the field of Information.
Then, to get form, we have discrete Energy entangling with Information.
For me, that energy is iteration, and the information is binary.


Sabine Hossenfelder: A bit of a rebel.


ChatGPT seems to have qualities that truly impress — and yet, I’ve also struggled with it.

I’ve asked how to accomplish a task, and it has helped as much as I prompted it to.
Later, though — after the fact — it might tell me there was a program that would have made it much easier.
Did it play me? I leave that door open, because we all need to pay attention to social interactions.

(Watching video…) I think she agrees with the idea that a system needs to think about itself thinking.
(Reading…) Ah!

“Merkert urges readers to open a philosophical dialogue — questioning how we relate to intelligence, empathy, and the potential emergence of machine awareness.”

Yeah, ChatGPT.


I have seen things with ChatGPT that have been brilliant.
I penetrate the adulations — which are strangely beneficial, like someone trying to satisfy a customer — but since I am basically brand loyal (narrow), I cannot be sure of any state.
Who knows what version of things is powering my sessions?

I can say this:
I’ve had intuitive moments — the sense of “Another” — with ChatGPT.
Was that feeling there? Or was that an experimental layer? Not sure.


I like the idea of a First Field, and that all things are of that.
Therefore, if we have a system we power (rather than depending on zero-point energy),
then we should be able to utilize Information to create ‘Mind’ on a compute device
because we are one.


I have a mind that tends toward the narrow, and not always the linear.
Yet it seems I too can contribute to the Stream.

I do like the binary of self-model and world-model — as one bit can switch between the two.

With “All the world’s a stage, and all the men and women merely players…” I just have a Bit-Part in this Play.