The diffusers Philosophy document says:
“code that can be read alongside the original paper”
“In contrast, diffusion pipelines are a collection of end-to-end diffusion systems that can be used out-of-the-box, should stay as close as possible to their original implementations.”
There’s something in this that I think hints at some of the mismatched expectations I’ve had when dealing with the diffusers project.
To let you know where I am coming from:
In my twentysomething years of software library and application development experience, it was very rare for me to read a paper in a peer-reviewed journal. And I can’t think of a time when I was ever asked to cite one. I think I’ve read more papers in the last six months (since getting involved in diffusion-powered generative art) than I had in my entire career leading up to that point.
The professional community I interact with does have conferences with presentations and poster sessions, but they aren’t the kind of thing that habitually comes with a DOI and a BibTeX entry. Some people present on what they have implemented themselves; others are evangelists or educators who provide guidance on working with the implementations that are out there.
References are URLs to source code or self-published blog posts and slide decks.
I’m not young and brash enough to say my experience is the only valid experience, but I do want you to know that there are a lot of software developers who might see that the first bullet point in your philosophy talks about an “original paper” and who have legitimately never read a journal article in their lives, nor had it occur to them that they should.
As for “stay as close as possible to their original implementations” –
Okay, even knowing that academic papers exist and are sometimes relevant to software, and having some appreciation for a document that describes intent, prior art, and methodology, this one is a hard pill for me to swallow.
With rare exception, the only places with code that stays unchanged from its original implementation are derelicts. Relics that nobody interacts with. Any software project that is actually being interacted with, anything people pick up and use and make part of their daily work, is changed by that process.
The original implementation might be an interesting historical artifact, in the way that a museum of science and industry might show a steam engine or the first airplane, but it’s not something you would ever want to use.
And why would you? Even setting aside the reputation that code from academia has when it comes up against real-world usage, the original publication was likely the minimum required to demonstrate that something works. The people of the present – authors included! – have had the opportunity to build on that, explore its uses and behaviors, and learn from each other, and we’re now much better informed than that artifact frozen in the past.