The Linux graphics stack in a nutshell, part 2 (lwn.net)
53 points by zorgmonkey on Dec 28, 2023 | 48 comments


Very opinionated article really.

It starts with unsubstantiated claims that X suffers from "bloat", despite X being capable of running on a 486, and dismisses the fact that the average Wayland compositor plus its necessary infrastructure is far more "bloated" than X ever was.

It fails to distinguish between compositing and hardware accelerated blit operations that allow X to display multiple windows without the need for a compositor.

It talks about hardware planes but fails to mention that no Wayland compositor so far is capable of using them, whereas X has been using them for the mouse cursor for a long time (which avoids cursor stuttering under high GPU/CPU load).

It proclaims that certain protocols are "commonly used" when they are really in an experimental phase or not actually commonly used.

I get it. Linux userland graphics is a huge mess right now. Mostly because Wayland caused a huge amount of uncertainty and fragmentation. At least call it out as such and don't pretend otherwise.


Agreed. I feel the author even manages to trip over their own feet in this manful effort to show malice towards X.

> In contrast to X, Wayland applications always run on the same host as their compositor. Implementations are thus free to optimize for this case

Sweet.. the "go fast" generation is thrilled. We're optimized for a single popular commercialized case!

> Transferring data via shared memory is good enough for software rendering but, for high-performance hardware rendering, it is insufficient.

Whoops.. we're no longer optimal.

> To avoid that penalty, the graphics buffer has to remain in graphics memory. Wayland provides a protocol extension to share buffer objects via a Linux dma-buf, which represents a memory buffer that is shareable among hardware devices

Which, X also has. So, we've gone completely full circle, and we're less capable for it. I'm hopeful for the era of "move a little slower and try to fix a few things along the way."


> Sweet.. the "go fast" generation is thrilled. We're optimized for a single popular commercialized case!

Yes, you should optimize for the 99% case. That is basic software engineering.

> Whoops.. we're no longer optimal.

I don't know what this means. X toolkits since GTK in 1998 have been drawing bitmaps to shared memory. The basic X 2D vector graphics primitives haven't been commonly used for, like, upwards of 20 years.

> Which, X also has. So, we've gone completely full circle, and we're less capable for it.

X doesn't have the synchronization features that necessitated Wayland. It's simply not the case that X can do everything that Wayland can do.


> Yes, you should optimize for the 99% case. That is basic software engineering.

Without considering /why/ it's the 99% case? Even so, just turn this around: top-tier games and "tear-free" experiences on Linux are the 1% case. Why should we rip up the whole graphics stack to chase a "year of Linux on the desktop" that's probably never going to materialize?

> I don't know what this means.

I might have failed to follow that thread from the original post, but the author implies this is an issue with Wayland, not X. He then points out that a simple extension covers it, which, ironically, means implementations are _not_ free to optimize for any particular case.

> X doesn't have the synchronization features that necessitated Wayland. It's simply not the case that X can do everything that Wayland can do.

Would that be "implicit sync" or "explicit sync"? Anyway, I'm not trying to show malice toward one project or the other, or even express favoritism; I'm just trying to show that the author's criticism of X is far too strenuous to be useful and seems to be rooted in a particular and peculiar modern mindset.


> Yes, you should optimize for the 99% case. That is basic software engineering.

Wayland is optimized for car entertainment systems and digital signage. Is this your 99% case?

> X toolkits since GTK in 1998 have been drawing bitmaps to shared memory.

Not true. E.g. several Gtk 2.x rendering backends used XRender to draw buttons with gradients without a single bitmap in shared memory.

> X doesn't have the synchronization features that necessitated Wayland. It's simply not the case that X can do everything that Wayland can do.

The X11 extension "Present" does exactly the same as Wayland concerning "synchronization features". What exactly can Wayland do that X11 can't do?


> unsubstantiated claims

Indeed:

> It used to be easy to have any kind of X client connect to a remote X server. Back in the early 1990s on the HP Apollo workstations in college, it was a simple matter of setting the DISPLAY environment variable, but that was before network security was a concern.
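For anyone who never saw that era: the setup the article describes really was this small. A minimal sketch, assuming a hypothetical workstation `apollo` running the X server and legacy host-based access control (long since superseded by xauth/SSH tunneling):

```shell
# On the workstation running the X server (hypothetical hostname "apollo"),
# allow the remote host to connect -- legacy, unauthenticated access:
xhost +remote-host

# On the remote machine, point X clients at apollo's first display:
export DISPLAY=apollo:0
xclock &   # the clock window appears on apollo's screen
```

Today you would use `ssh -X` instead, which sets DISPLAY and handles authentication for you.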

Remote controlling an HP 1670G logic analyzer with a Linux PC X server

https://tomverbeure.github.io/2023/12/26/Controlling-an-HP-1...

10 hours ago here on HN: https://news.ycombinator.com/item?id=38778807

and also:

> I recently scored a Hewlett Packard 1670A Deep Memory Logic Analyzer and I finally had a chance to fire it up. This unit dates back to 1992 and is packed with all sorts of interesting options for connecting peripherals to it. One particular feature that caught my eye was the option to connect to an X Server.

A Testament to X11 Backwards Compatibility

http://www.theresistornetwork.com/2013/12/a-testament-to-x11...

10 years ago on HN: https://news.ycombinator.com/item?id=6850591


Yeah, I run a Linux VM in the background and an X server on my Windows host to work on it. It works very well. The problem is bloated browsers like Firefox, which lag terribly over X.

  gfx.webrender.force-disabled = true
That helped for a while...


> "bloat" despite being capable of running on a 486

The worst bloat is structural and involves network round-trips, usleep(10000) calls, etc. that can't be removed due to expectations that have thoroughly baked themselves into an ecosystem. No amount of Moore's law can make usleep(10000) faster.

I don't know if that applies to X, but I have vivid memories of learning how to tunnel X sessions only to find that VLC was faster and more responsive while using less bandwidth on extremely simple UIs. Something in that protocol had become so shamefully suboptimal that it lost a box drawing contest to pixel pushing, and I tend to suspect there is more where that came from.


You mean VNC?


> It starts with unsubstantiated claims that X suffers from "bloat" despite being capable of running on a 486 and dismissing the fact that the average Wayland compositor + necessary infrastructure is much more "bloated" than X ever was.

"Bloat" doesn't mean "slow" or "memory-hungry". What "bloat" refers to is all the server-side windowing stuff that's no longer used by anything but Xaw or Motif, as well as extensions such as XRENDER that are useless nowadays. X servers carry around all of this legacy baggage that essentially goes entirely unused.

> It fails to distinguish between compositing and hardware accelerated blit operations that allow X to display multiple windows without the need for a compositor.

Compositing has been non-optional in every major desktop OS since 2006 except X. There's no point in supporting a non-compositing mode anymore, as it results in unavoidable tearing. 17 years is a long enough timespan to drop obsolete technologies.

> It talks about hardware planes but fails to mention that no Wayland compositor so far is capable of using them whereas X is using them for the mouse for a long time (which avoids mouse stuttering on high GPU/CPU load).

Huh? Weston has been using them since 2013 [1].

[1]: https://www.phoronix.com/news/MTI5NTE


Throwing out X11 because some features of it are legacy is and always was the entire point. And I don't think it's justified or good engineering practice.


According to my recollection, it was more that the codebase became difficult and unpleasant to work on, to the point that it was becoming unmaintainable. And nobody was interested in cleaning it up because that was just not interesting work.


That's true, it was a bit of both. But in my experience "codebase became difficult and unpleasant" usually just means that the codebase is large and complex, and people don't want to take the time to understand it. It's always funny to me that people will talk about how elegant a codebase is one day, and the very next decide that it's crufty and old. Did it flip overnight? Did the cruft build up while they were calling it elegant? No, the only thing that changed was the complexity level. When people say cruft, I usually assume they're just frustrated by necessary complexity.


> Mostly because Wayland caused a huge amount of uncertainty and fragmentation

The most divisive thing was CSD (client-side decorations). That is mostly a solved issue now (options like libdecor to deal with GNOME, etc.), though an annoying one. What other fragmentation was there?

The article was pretty on point in general anyway.


> What other fragmentation was there?

No strong reference implementation. No protocol for access control and resource sharing. No centralized font rendering (which is a mistake: every app looks different now, even when using the same library). No centralized drawing (which pushes things like multi-DPI monitors and fractional scaling down to clients). And many other things. Essentially everything a typical application needs except pushing some pixels.


No centralized XYZ isn't necessarily a bad thing, and I don't think replicating X's everything-and-the-kitchen-sink design is a good idea either. But having a common way to do it (libinput, PipeWire, etc.) surely helps to reduce fragmentation.


With "centralized" I really meant standardized. X11 has a standardized API that tells you how to implement window managers. Window managers themselves can be swapped, even at runtime, without swapping out the whole display server or GUI toolkit, and are as such "decentralized". Similarly, there should be a standardized interface for rendering fonts that every application can/should use, where the actual rendering backend can be swapped (ideally at runtime as well), because a lot of people have vastly different opinions about how fonts should be rendered (which comes down to taste in the end) but want all their applications to look the same.


The problem is that Wayland predates DX12, Vulkan, and Metal.

The graphics card vendors have converged to a standard. Wayland isn't compatible with that standard. No one on Wayland is willing to throw it out and rearchitect to the new standard.


Huh? Wayland is a surface compositing protocol. It's like DWM/DXGI, not like DX12; like EGL (sorta), not Vulkan; and like Core Animation and the private windowing API in Core Graphics, not like Metal. Different layer of the stack entirely.


I think the above comment refers to the common Wayland compositor design using implicit synchronization, which is aligned with the ideas of OpenGL / EGL. It doesn't have to be an inherent part of the Wayland world, but it de facto became one. Explicit synchronization, being an idea from Vulkan / WSI, requires a lot of effort to be plumbed through all the layers.

There was a good post about it: https://www.collabora.com/news-and-blog/blog/2022/06/09/brid...

It doesn't mean that Wayland is somehow incompatible with such ideas, but there is clearly a lot of work to do before they become fully applicable.


Thanks for that article. It is a good post.


Am I the only one who finds it ironic that this is Part 2 of an 'in a nutshell' document?


O’Reilly is famous for its "in a nutshell" books, some of them quite thick.

Some nuts are bigger than others.


Speaking of nuts, have they cracked HDR yet? I remember hearing some optimism about a year ago and I really look forward to it.


KDE Plasma 6 will have some basic support for HDR features when it comes out. The Wayland color management protocol needed for full support is not yet finalized although there is a working informal implementation of HDR for the Steam Deck OLED.


Well, Stephen Hawking's book uses the same term and it's both gigantic and as complex as it gets :D


It's crazy reading the back and forth in these comments. It seems each side (pro-X11, pro-Wayland) always gets something wrong about the other. Makes it hard as an outsider to figure out what's true.

FWIW as a regular user of Linux desktops, fractional dpi scaling is very good on Wayland and sucks on X11. That's been the main thing driving me to want to use Wayland.


I'm a total noob when it comes to graphical Linux environments, but isn't the issue with scaling caused by the applications themselves? Like GNOME only accepting integer scales, etc.

Setting X11 DPI worked just fine for me to achieve exactly the scale I want.
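For reference, a sketch of the usual ways to set a global DPI under X11 (144 here is just an example value, 150% of the 96 DPI baseline; toolkits that only honor integer scales will still ignore it):

```shell
# Persistent: set the Xft DPI resource and reload the resource database
echo "Xft.dpi: 144" >> ~/.Xresources
xrdb -merge ~/.Xresources

# Or tell the X server directly at runtime:
xrandr --dpi 144
```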


And it allows applications to display a PDF with an A4 page at 100% to have exactly the size of an A4 paper.


> On top of the application windows, the compositor draws its own user interface, such as a taskbar where the user can interact with the compositor itself.

Isn't this exactly what the compositor is no longer guaranteed to provide (in case of Wayland)?


That's a pretty informative overview! It's funny that CRTC refers to cathode ray tube even for all modern displays.

> This DRM fbdev emulation acts like a DRM client from user space, but runs entirely within the kernel

I wish the kernel's terminal emulation would allow more modern features like true-color support and sixels. There is no reason it can't work, right? Though if the plan is to replace it with a userspace one, as the article suggests, maybe it will be easier to use better terminals when switching to a tty.


haha, what could be less modern than sixel? it predates even phigs and it's been obsolete for over 30 years


lol, true, but Linux kernel terminal support is so barebones and hardcore, that it doesn't even have that.


that's because it's a bad idea


The idea is good, but it's bad to have it in the kernel. So moving it out should give more room to make it look better.


no, sixel for terminals is a terrible idea. there is no universe in which it ever made sense, but it's an even worse design now than when it was new. (sixel for printers was, by contrast, a good idea)

sending raw, uncompressed pixels over a bytestream to display them on a monitor is reasonable sometimes, but never packed six bits per 8-bit byte, and certainly not with an extra start and stop bit. sending uncompressed pixels over a network is basically never a good idea; you should always compress them with at least zstd. when you're running over a network with possible packet loss, a guaranteed-delivery bytestream of pixels (compressed or otherwise) is only a reasonable idea for applications like displaying a static image; for displaying a gui you don't want to retransmit old frames of video that have been corrupted by packet loss, you want to discard them and display the current state of the stream, the way netflix does. xpra works that way, mosh works that way, and x11 should have worked that way, but didn't

sending pixels over a unix pseudo-tty is an even worse idea, because asynchronous output from background processes will corrupt your screen image and, in many cases, go unnoticed

vnc is a bytestream protocol for viewing a remote gui, and you can implement a minimally working vnc server in 300 lines of golang (http://canonical.org/~kragen/sw/dev3/vncs.go). multiple people can connect terminals to the same remote gui at once, you can disconnect and reconnect later (as with mosh and xpra), it's generally reasonably efficient, and although it does uselessly retransmit outdated pixel data in the face of packet loss, it gracefully handles the situation where there are too many screen updates to transmit to the terminal. you could do better than vnc but there's no reason to do worse

the blit terminal protocol and mgr are two somewhat more reasonable approaches to extending traditional serial terminal i/o to support full graphical interaction

sixel in dumb terminals was always a stupid idea because, if your terminal is smart enough to have a color framebuffer to decode the sixel data into, it's smart enough to run a more reasonable protocol than sixel. like, it can run tcp/ip and x11. dec was desperately trying to keep people from fleeing from the hierarchical world of the vax controlling a bunch of dumb terminals to the world where everybody had a computer of their own, but obviously it didn't work

sixel between two processes on the same system is even stupider. seriously, just put the uncompressed image in a shared memory buffer or a file. you can even use inotify to get asynchronously notified when the file has changed if you want. there's no point in encoding and decoding the image with some kind of shitty 01980s kludge designed to run over a 9600-baud rs-232 cable

i mean obviously if you want to write games in brainfuck or display graphics with sixel or whatever there's nothing wrong with that. but it's important to keep in mind that sixel belongs to the set of things that people do because they make easy things hard, not the set of things that people do because they make hard things easy


Regarding sixels: a bad standard is still better than no standard at all. A better terminal protocol for a pixel framebuffer, controllable via ASCII sequences, would indeed be a good thing.

> it's smart enough to run a more reasonable protocol than sixel. like, it can run tcp/ip and x11.

...at the cost of an incredibly bloated software stack. Sixels just need the equivalent of a VGA framebuffer and would also work on Linux installations that don't boot into a windowing system.


no, ascii sequences are the wrong tool for blitting pixels into your framebuffer

there is no point in looking for a 'better standard' for pounding nails with mashed potatoes

i've definitely run tcp/ip on a lot of things that don't boot into a windowing system or even have framebuffers at all


Hm, looks like there is another alternative - kitty:

https://sw.kovidgoyal.net/kitty/graphics-protocol/

But if anything, mpv works better with sixels than with kitty, for whatever reason.


ascii terminals don't support anything that looks better than ascii art. they don't support sixel either. there are a variety of proposals for how to add graphics to ascii terminal emulators in a backwards-compatible way, such as mgr and notty https://github.com/alacritty/alacritty/issues/51 but they all suffer from some of the same problems as sixel

but the 'terminal' that a terminal emulator is emulating is a device which provides a user access to a remote computer. it is called a 'terminal' because it is at the end of a long wire connecting it to the computer (perhaps through a network). normally nowadays this is a laptop or cellphone. in that case, yes, you can use x-windows, xpra, vnc, rfb, spice, or a web browser


That's the whole point. I mean terminals that can work without graphics environment. Such as what kernel implements for tty.


well, you can't display mpv videos in your terminal emulator with sixel graphics or any other kind of graphics if your terminal emulator is running in an environment doesn't support graphics, such as if it's displaying on an adm3a. perhaps you mean you don't want your terminal emulator to run inside a window manager. well, the x server and the wayland server don't run under a window manager; the window system runs under them, if it runs at all


That's exactly the point. Why can't you? Because no one implemented that support in that terminal emulator? Or it's impossible for some reason?

I get the reason of may be kernel not wanting to overload the minimalist terminal with features due to security concerns and etc, but as the article suggests, it can be moved out of the kernel, so I don't see why it shouldn't be doable.


i said you can't 'if your terminal emulator is running in an environment doesn't support graphics, such as if it's displaying on an adm3a'. that is because the adm3a hardware physically cannot draw any graphics


There is plenty of hardware where graphics is possible; it's just not happening because of terminal limitations.


yes, that was why i was trying to clarify what you meant by 'without graphics environment'; whether you meant a window manager (which i'd already given several suggestions for) or an environment capable of supporting graphics (in which case you've made the problem insoluble)


You need to use --vo-kitty-use-shm=yes when using the kitty backend with mpv.


shared memory is, by contrast, an extremely reasonable way to get pixels into your user interface (i was going to say 'terminal' but it doesn't really make sense in this context)



