Exploring the Fragmentation of Wayland, an xdotool adventure (semicomplete.com)
100 points by viraptor 8 days ago | 93 comments




Xdotool and Xmodmap are the two main reasons why, after a few months running Wayland+keyd+dotool, I went back to X11. I found it really hard to have the following things working at once:

- Italian layout for my keyboard with heavily-customized AltGr keys for mathematical notation (in X11 it's just a matter of having a Xmodmap file)

- Using Espanso for many common shortcuts like :date: (current YYYY-MM-DD date) and :pidigits:

- A reasonable way to run Windows in a VM while using an Italian layout for my keyboard

- The possibility to use automation scripts using something as close as possible to xdotool

- Sometimes I use my home keyboard, sometimes I use my work keyboard, and sometimes I use my laptop keyboard. I expect the system to work in the same way regardless of my input device

It's not that Wayland prevents one from doing all this stuff, but the available solutions were fragile and complicated, and it took me a long time to figure out solutions that only worked partially... For instance, to make keyd work as expected, I was forced to set up my Italian keyboard as an English keyboard and then remap all the keys manually... And every time I plugged in a new keyboard, I had to tell keyd to enable my customizations on it, because telling it to use the layout with any keyboard conflicted with VirtualBox.

I understand that X11 is too complicated to be maintained, but from a user's perspective, so far I am far more efficient in X11.


> A reasonable way to run Windows in a VM

What's wrong with this case? Virtual machine reports invalid key codes to the guest? You need to have the proper layout in Windows, as (virtual) hardware only reports key codes.


A few months have passed and I might not remember everything correctly, but there was a series of problems:

- I use several symbols such as Greek letters (α, β, γ…) and mathematical operators (×, −, ·, ∂…), and after much digging I found that the only way I could make keyd work with them was to choose a US keyboard layout. So, I had to write a configuration file for keyd to remap not only the special characters listed above, but every character of the Italian keyboard (è, é, ò, à, ù…). This extensive remapping then required an exception for Espanso to prevent `keyd` from intercepting its virtual keyboard output.

- However, this forced US-layout setup created a conflict with VirtualBox that I was unable to solve. When I installed Windows and selected the Italian layout inside the VM, the guest OS received the raw key codes corresponding to a US physical keyboard (due to the keyd remapping layer). Since the guest OS expected Italian key codes, all the standard Italian keys (like è, à, ò) stopped working correctly. Without keyd enabled, the standard Italian layout worked perfectly in the VM.

- The attempts to create application-specific exceptions (e.g., to disable keyd for the VM window) using tools like keyd-application-mapper did not function correctly in my KDE environment because of known issues in these tools.

- Finally, introducing new hardware like my Corsair keyboard added another layer of complexity, as its Linux driver (ckb-next) was incompatible with the active keyd remapping layer. This was the point when I decided to revert to X11.
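
To make the VirtualBox conflict above concrete, here is a toy model of it. It is only a sketch: the `SYNTH_*` codes, the `REMAP` table and the helper functions are hypothetical names made up for illustration, and this is not how keyd is actually implemented. It only assumes that a key code names a physical key position, that a layout turns it into a character, and that the guest only ever sees key codes.

```python
# Toy model: a key code names a physical key; a layout maps it to a character.
# Three keys that (to my knowledge) differ between the US and Italian layouts.
US_LAYOUT = {"KEY_LEFTBRACE": "[", "KEY_SEMICOLON": ";", "KEY_APOSTROPHE": "'"}
IT_LAYOUT = {"KEY_LEFTBRACE": "è", "KEY_SEMICOLON": "ò", "KEY_APOSTROPHE": "à"}

# Hypothetical remap layer: with the host forced to a US layout, every Italian
# character has to be produced by rewriting the key code into something synthetic.
REMAP = {
    "KEY_LEFTBRACE": "SYNTH_E_GRAVE",
    "KEY_SEMICOLON": "SYNTH_O_GRAVE",
    "KEY_APOSTROPHE": "SYNTH_A_GRAVE",
}
# Only the host knows how to turn the synthetic codes back into characters.
HOST_SYNTH_TABLE = {"SYNTH_E_GRAVE": "è", "SYNTH_O_GRAVE": "ò", "SYNTH_A_GRAVE": "à"}


def remap(code: str) -> str:
    """Apply the (hypothetical) remap layer to one key code."""
    return REMAP.get(code, code)


def host_char(code: str) -> str:
    """Host side: synthetic codes first, then the US layout."""
    return HOST_SYNTH_TABLE.get(code) or US_LAYOUT.get(code, "?")


def guest_char(code: str, layout: dict) -> str:
    """Guest side: it only has its own layout table to interpret key codes."""
    return layout.get(code, "?")


pressed = ["KEY_LEFTBRACE", "KEY_SEMICOLON", "KEY_APOSTROPHE"]
on_host = "".join(host_char(remap(c)) for c in pressed)
without_remap = "".join(guest_char(c, IT_LAYOUT) for c in pressed)
with_remap = "".join(guest_char(remap(c), IT_LAYOUT) for c in pressed)
print(on_host, without_remap, with_remap)
```

In this model the host and the un-remapped guest both get the Italian characters, while the guest behind the remap layer gets codes its Italian layout cannot interpret, which is roughly the failure described in the list.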

I should definitely collect all these details and write a blog post about it…


You'll never find me saying that Wayland development is good in its present state. I think it's a mess and it has a lot of issues.

But let's be honest about Xorg. The overwhelming majority of people who worked on Xorg are now developing Wayland. Why? Because developing Xorg is a massive pain in the butt. It is a 400K LOC behemoth of a project and it has a ridiculous amount of technical debt. I would have to imagine that if the Xorg developers thought they could fix Xorg, they would do that instead of making a new thing.


> Because developing Xorg is a massive pain in the butt.

That's really no reason to build an entirely new system, and then half-ass it the way Wayland did. The Wayland gang should have started with a new, modernized, cleaned up window system API running as a layer on top of X11 and then start replacing the cruft piece by piece while keeping both the original X11 API and the new API working all the time, basically build the whole project from the user perspective (both 'regular' Linux users and programmers who need to build Linux apps).

I guess though the main problem is that feature parity with X11 wasn't even a design goal, they intentionally threw out the baby with the bathwater, and also intentionally fragmented the Linux desktop even more. It almost smells like sabotage (at least self-sabotage).

Also, it's been 17 years since Wayland was released, that's as if X11 would have barely started to become usable by around 2005.


> It almost smells like sabotage (at least self-sabotage).

See also: Ubuntu Unity, Gnome 3, KDE 4... all widely panned by their most loyal users. 2010s were the lost decade of Linux.


there were many of us who really liked gnome 3 though. Not that it was issue free, but the workflow was very appealing.

Sure, I quite enjoy it myself now. But I'd be lying if I said it wasn't borderline unusable for years.

It's not sabotage or self-sabotage, implying some intention behind it. It's just classic second system syndrome: hyper-fixating on real or perceived flaws that were never able to be solved or done right in the previous system, which can be to the detriment of other considerations.

There wasn't a need to have 10s of different wayland compositors. There is not a need to endlessly bikeshed over extensions instead of delivering user value. These are failures of leadership in driving the replacement of X.

Just compare this to Windows and how they made this rearchitecture of making their compositor more modern without splitting into 10s of compositors and breaking a ton of apps.


Here you're just comparing proprietary closed source development to open source development. In the proprietary version the goal is to improve a product. The OSS goals are much harder to pin down and can be different person to person, but it wouldn't be unreasonable to have a goal of "make it so that other devs can make their own compositors easily" and therefore you're describing an obvious success.

Short term this might be a far slower and worse approach. It's not clear that's the case long term though, making things easier to try out different ideas and then finding a winning compositor project could be better than being stuck with one.


It isn't particularly easier to make your own compositor either, as you now also have to bring your own window manager. What made the X architecture much more interesting is that it avoided coupling the window manager to the compositor. Hell: there even are multiple popular compositors for X, as they also managed to avoid coupling the compositor to the display server (which would be the one part of the system that you don't find too many of -- though there were multiple implementations over the years! -- but that's not really much different than Wayland where everyone is using the same library to implement the behaviors as part of their coupled-together balls of mud.)

>What made the X architecture much more interesting is that it avoided coupling the window manager to the compositor

This is the industry standard, putting the compositor and window manager in separate processes.

Android separates SurfaceFlinger and WindowManagerService.

iOS separates quartz compositor and springboard.

Windows separates dwm and explorer.

MacOS separates WindowServer and Dock.


Dwm is both the compositor and window manager in Windows. It’s the same with WindowServer in macOS.

If you are talking about the shell or task switcher, then yeah, your point stands with Gnome, but KDE has kwin and plasmashell processes.


Tell that to the Wayland people? ;P

But why would I ever want to have a separate compositor and window manager? Like the display stack benefits from "vertical integration", being modular is a tradeoff, often of performance and significant complexity.

Why not just make a display server (which handles everything rendering related, compositing included), and then add a window manager as a plugin/extension on top? Window managers are not that complicated.


I dunno: I have never ever wanted to make a compositor -- which, to me, feels like a really boring piece of graphics infrastructure -- and yet I have used multiple window managers over the years and have absolutely wanted to make my own window manager? In X, making your own window manager was so popular of an activity that it honestly felt kind of unreasonable just how many window managers existed, and yet everyone used one of a handful of compositors and I'm honestly not sure why anyone in their right mind would bother making their own display server?

for instance as a X11 user I don't want a compositor at all

(Same... I know people use them to get some pretty effects; but, they add a frame of latency I do not want and require lots of memory and assume acceleration I don't need.)

There is no way to avoid a frame of latency without "racing the beam", which AFAIK is quite complicated and not compatible with most GUI frameworks. That is, if you don't want tearing.

But I may be wrong here


One frame of latency and adding a frame of latency are different things. The first is required (without tearing); the second should be avoided at all costs (although high display refresh rates reduce the problem of "long" swapchains quite a bit).
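
For a rough sense of scale, the arithmetic can be sketched like this. It assumes the simplest possible model, where every extra buffered frame costs exactly one refresh interval; the function names are just made up for the example.

```python
def frame_time_ms(refresh_hz: float) -> float:
    """Duration of one refresh interval in milliseconds."""
    return 1000.0 / refresh_hz


def added_latency_ms(refresh_hz: float, extra_frames: int = 1) -> float:
    """Latency added by queuing `extra_frames` additional frames (simplified model)."""
    return extra_frames * frame_time_ms(refresh_hz)


for hz in (60, 144, 240):
    print(f"{hz} Hz: one extra frame adds {added_latency_ms(hz):.2f} ms")
```

So one extra frame costs about 16.7 ms at 60 Hz but only about 4.2 ms at 240 Hz, which is why high refresh rates make longer swapchains hurt less.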

> Short term this might be a far slower and worse approach.

We are way past the short term with Wayland!

Wayland is 17 years old.


Existing != used

It does not matter how long the devs worked on it in isolation; what matters is at what point there was public use and how it has evolved since. The earliest you could argue it was being used by users on a distro would be 2016 in Fedora. Actual mainstream use in Ubuntu came around 2021, but optional, and default literally only just this year.


Perhaps proprietary closed source development is better for making operating systems. Is it a coincidence that Google was able to scale Linux to billions of devices while open source development ones weren't? Open source development should take some lessons if they want to be successful and not aggravate developers writing apps for your platform like what happened in the article, forcing them to do extra work.

If development for X is ceasing now, there isn't time to experiment on finding the true successor.


I think the hard part about the Linux desktop ecosystem and its development pattern is the cobbled-up-parts nature of the system, where different teams and individuals work on different subsystems with no higher leadership directing how all of these parts should be assembled to create a cohesive whole. We have a situation where GUI applications depended on X.org, yet the X.org developers didn't want to work on X.org any more. If the desktop Linux ecosystem were more like FreeBSD in the sense that FreeBSD has control over both the kernel and its bundled userland, there'd be a clearer transition away from X.org since X.org would have been owned by the overall Linux project. However, that's not how development in the Linux ecosystem works, and what we ended up with is a very messy, dragged-out transition from X to Wayland, complete with competing compositors.

Bazaar-style development seems to work for command-line tools, but I don't think it works well for a coherent desktop experience. We've had so much fragmentation, from KDE/Qt vs GNOME/GTK, to now X11 vs Wayland. Even X11 itself didn't come from the bazaar, but rather from MIT, DEC, and IBM (https://en.wikipedia.org/wiki/X_Window_System).


> Perhaps proprietary closed source development is better

Perhaps...

> Open source development should take some lessons if they want to be successful

A lot of people who write the gui stuff for Linux do it because they want to. Success is not necessarily the same metric as a company making a product.

There are companies working within the space and I doubt the licensing really makes much difference to the outcome (i.e. your Google example)

> If development for X is ceasing now, there isn't time to experiment on finding the true successor.

Why? Again, the people working on it because they want to don't need to do anything, they can experiment. Someone can still fix up issues in X. Some companies will fund the development of things that are important to them. You make it sound like the oss community should be acting like one entity to achieve something, but there is no overarching goal nor a reason for there to be one. People will continue pulling in different directions.


> A lot of people who write the gui stuff for Linux do it because they want to.

Think about how many people might want to write for it if it had a compelling ui stack, tho


Access to virtually infinite cash had more to do with Android's success than the source being proprietary.

Linux (the kernel) is also open source and doesn't suffer from the fragmentation problem. It's pretty much unique to the Linux desktop because there are too many cooks involved.

But even if there's only one cook, it could be worse (if that cook is the gnome team). At least with multiple cooks we can pick kde instead of gnome.


You think Linux isn't widely used at scale?

Phones are a different market from computers, even though they're technically the same thing. A large segment of people own "phones" but not a computer. Linux runs a large chunk of the internet. I think it's used quite well at scale.


Even in the server market the success of having a stable app platform can be attributed to Linux, the kernel, solely for having a policy to never break userspace. The base of the app platform was already figured out a long time ago, and if you look at the bulk of Linux contributions you will see that they are coming from companies using Linux commercially.

> You think Linux isn't widely used at scale?

Certainly not in a high-productivity environment. Google has to swap out most of the runtime components with distributed alternatives to make it compelling in a corporate (distributed) environment.


> it wouldn't be unreasonable to have a goal of "make it so that other devs can make their own compositors easily"

I can't say i've ever wanted a second compositor to choose from. Ideally it would just be part of the window server.


How can you compare the Cathedral with a bazaar? This is not a technical difference at all.

Apple/Microsoft can do whatever they want, just break compatibility at any point and everyone else wanting to have their programs supported on their platform will adapt.

Meanwhile for Linux network effect has a much bigger role to play, you can't tell anyone else what to do, but protocols can only emerge from working together.

Also, I wouldn't bring up Microsoft's display stack as a positive example at all.


> Apple/Microsoft can do whatever they want

Those two are worlds apart when it comes to backward compatibility.

> Also, I wouldn't bring up Microsoft's display stack as a positive example at all.

Why not? It's doing exactly what it's supposed to do, and has been since the late 90s. There's tons of fundamental improvements since then, but they're all under the hood without affecting user-facing features. I'd say the Windows display stack modernization is an excellent example of how it should be done (a real shame though that Microsoft is actively ruining Windows by adding user-hostile features on top of the pretty good technical base).


User experience and developer experience for an OS are real things. It's easy to make a bad experience, people have to actually care about being able to deliver a good experience. Even if you can't tell people what to do, it should be possible to align on something that can deliver a good experience for users and developers.

>This is not a technical difference at all.

Which is why I said it was more a problem with leadership than with the technical merits.


"Failures of leadership" implies that leadership actually exists. Does it?

Right, this is basically peoples' hobby projects. Nobody is incentivized to "lead" the Wayland project.

Actually, that'd probably be a better outcome. But as it is, Red Hat & Ubuntu et al pay people to work on Wayland and those people follow corporate priorities rather than centralized priorities.

I think Red Hat wants a working desktop but I don't think they have strong official opinions on how to get there. I think individual people are responsible for the GNOME/Wayland/Freedesktop messes.

Windows gets to completely rearchitect their compositor because they only provide one stable ABI to get pixels on the screen: link to USER32.DLL, create the necessary objects to represent a window of your application's class, then create and pump a message queue for it. It's ancient, but it works, and more specifically will never change. Even the higher level toolkits Windows ships ultimately are creating USER windows, and USER has been the only UI ABI since version 1.

macOS is the same way, except Carbon (a light modification to the procedural Toolbox API) and Cocoa (the Mac's first OOP toolkit) were "toll-free bridged" to each other rather than, say, writing Cocoa in terms of Carbon.

In contrast, X11 is a protocol anyone can implement and speak. There is no blessed library that you must use. No, Xlib doesn't count. Servers have to take their clients as they come. And Wayland, while very much deliberately stripped down from X, still retains this property of "the demarc point is a protocol" while every proprietary OS (and Android) went with "the demarc point is a library".
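
As an illustration of "the demarc point is a protocol": the first thing an X11 client sends is just a small block of bytes, and nothing forces you to use Xlib to produce it. A minimal sketch, assuming I am reading the core protocol's connection setup layout correctly (byte-order byte, protocol version 11.0, then the lengths of the authorization name and data); the function name is made up, and it only builds the bytes without connecting to anything:

```python
import struct


def x11_setup_request(auth_name: bytes = b"", auth_data: bytes = b"") -> bytes:
    """Build the initial X11 connection setup request (little-endian client)."""

    def pad4(data: bytes) -> bytes:
        # Variable-length fields are padded to a multiple of 4 bytes.
        return data + b"\x00" * (-len(data) % 4)

    header = struct.pack(
        "<BxHHHHxx",
        0x6C,            # byte order: 'l' means least significant byte first
        11,              # protocol major version
        0,               # protocol minor version
        len(auth_name),  # length of authorization protocol name
        len(auth_data),  # length of authorization protocol data
    )
    return header + pad4(auth_name) + pad4(auth_data)


request = x11_setup_request()
print(len(request), request.hex())
```

Any program in any language that can write those bytes to the server's socket is a client as far as the server is concerned, which is exactly why servers "have to take their clients as they come".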


You don't see me working on this stuff, but people keep complaining about this because instead of one thing that works but is a pain, we have two things that work but are a pain. It's pretty obvious that while Xorg works for a lot of people, it's not the way forward; but I think it's apparent that Wayland might not be either... although I think it's likely some will end up running a wayland server with Xwayland as the single wayland client to get continuing driver support.

This is a lot different than say OSS vs ALSA. OSS really could have worked (and still does on FreeBSD afaik), but ALSA fully replaced OSS. I think pipewire seems likely to replace PulseAudio, even if it may not have PulseAudio's key functionality of ruining audio when things used to work just fine.


>This is a lot different than say OSS vs ALSA.

ALSA is an absolute nightmare to work with, infinitely worse than Wayland. At best 10% of it is ‘documented’ through Doxygen. For the rest the only reference is the source code. This is one reason applications don’t tend to support ALSA anymore.


It's easier to write documentation than to completely rewrite a subsystem.

You're right, Xorg and X11 should be abandoned and for good reason. That should have happened decades ago. But Wayland doesn't actually fix anything that really needed fixing, other than wiping the slate clean. It's a good thing that Arcan exists, or the future of Unixland would be quite bleak.

How big is the Arcan development team? Is there any prospect of Gtk or Qt adding at least basic native support?

I don't know about GTK (and frankly hope anything will be ported to something else and the whole GNOME project get nixed). As for Qt, they recently implemented a QPA: https://codeberg.org/vimpostor/qtarcan

That raises a question: if they had that much experience, why did they choose to structure wayland in a way that's such a PITA to write for? This just looks like some massive second-system effect.

They just decided X11 did everything wrong and did it differently rather than pick up the pieces (if in spirit of idea, not code) that work and fix parts that don't


I wrote an app using Wayland and XCB/X11 and honestly, I found the Wayland part to be much easier to write than the XCB part, even though it required me to write more code.

This is partly due to the fact that everything you can do with Wayland is defined in protocols that are straightforward to use whereas in X11 you have atoms and messages with arcane name and structures for everything, a lackluster documentation and terrible error handling.
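
To give a flavour of what "defined in protocols" means on the wire, here is a sketch of the very first request most Wayland clients send, `wl_display.get_registry`. It assumes the usual wire format (a 32-bit object ID, then a 32-bit word with the message size in the upper 16 bits and the opcode in the lower 16, in host byte order); the helper name and the choice of `2` for the new registry ID are just for the example, and nothing is sent to a compositor:

```python
import struct


def wayland_message(object_id: int, opcode: int, payload: bytes = b"") -> bytes:
    """Frame one Wayland request: 8-byte header followed by the arguments."""
    size = 8 + len(payload)  # the size field counts the header too
    return struct.pack("=II", object_id, (size << 16) | opcode) + payload


WL_DISPLAY_ID = 1            # the display is always object 1
WL_DISPLAY_GET_REGISTRY = 1  # opcode 0 is sync, opcode 1 is get_registry
registry_id = 2              # client-chosen ID for the new wl_registry object

msg = wayland_message(
    WL_DISPLAY_ID, WL_DISPLAY_GET_REGISTRY, struct.pack("=I", registry_id)
)
print(len(msg), msg.hex())
```

Every other request follows the same framing, with the argument list taken from the protocol XML, so there is far less "arcane" per-message structure to memorize than with X11 atoms and client messages.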


Well, yes. As you say:

Q: if they had that much experience, why did they choose to structure wayland in a way that's such a PITA to write for?

A: Because they were reacting to Xorg, so they wrote the exact opposite of that.

And for bonus points, because one of the problems they wanted to solve was "Xorg is hard to maintain", they made sure that the replacement was much much easier to maintain and develop... for them. Not for application devs, not for users, but for the folks making wayland, I have no doubt it's very well streamlined and easy to work on.


> they made sure that the replacement was much much easier to maintain and develop... for them

Tbh, if that were the case I would expect much faster progress.


The reason Wayland progress is slow is not technical. We have a coordination problem, people have differing priorities and views on what should be allowed.

There are people opposed to things like allowing windows to specify their own bounds, and unless all the stakeholders agree to implement such protocols in their respective projects, the ecosystem will remain fragmented. Multiply this against every feature that people want.


It's unfortunately possible for multiple things to be true here. Xorg is an unmaintainable piece of tech debt. And Wayland made several poor design decisions that are now sending it down an eerily similar path, just with different, fresh tech debt.

And that tech debt possibly duplicated across multiple different projects now, too.

Right, I think we can all mostly agree that the old state of things wasn't great/sustainable. The problem, IMHO, is that they went hard on the second-system syndrome and went way too far the other way. This allowed them to replace a massive messy codebase with a nice clean codebase that doesn't do the things people actually need from it.

Xorg put everything - way too many features - into one single display server (Xorg). Wayland put everything in the hands of the compositor, and then spawned an endless array of them (most of them implementing only a fraction of needed features).

X11 de jure and de facto required all those features to be present. In theory you could have an X server missing new features, but there was no way to get rid of really old features, and in practice you really needed all the new ones or apps would break. Wayland made essentially everything optional, to the point of fracturing the ecosystem.

Xorg was a monolithic reference implementation. Wayland ships a reference implementation in the form of weston, and it's so feature poor as to be useless.

X11 has, in practice, really poor security. (There were/are attempts to improve this, but it's not been terribly successful.) Wayland is really big on security. So much so that they refused to implement little things like screen shots and a11y features because they could be abused.

IMHO, with hindsight, they should have done this in 2 stages: First, do the backend refactoring to get the nice driver-facing parts (GBM, AIUI). Essentially, make rootful XWayland the only Xorg, but in a way that is completely invisible to users. (Or, put differently, ship https://gitlab.freedesktop.org/wayback/wayback in 2010 instead of 2025.)

Second, after you've done that and vastly simplified a huge chunk of code and made upkeep and refactoring easier, start working on X12. For the sake of argument, this can still be basically the same protocol as the wayland we actually got. However, don't actually ship that at first. Instead, go build/port an actual complete desktop environment to it, including all the features people actually want - clipboard, screen sharing, a11y and automation tools, remote desktop, etc. - and actually implement all the protocols needed for those. By all means make them optional add-ons to the core protocol, but make them up front. Also, I really recommend making one of those a window management protocol, so that 90% of window managers don't have to be a compositor, though some will.

Then, after the thing is actually functional, start trying to get people to switch over. Don't start pushing people to adopt something half-baked and mess about for years on basic protocols that should have shipped day one (last I checked, in 2025 there are still 3 different incompatible wayland screenshot protocols). Make it an improvement, not a regression that only benefits you the Xorg developers.


> IMHO, with hindsight, they should have done this...

FWIW, it was also obvious to many people--certainly anyone who had ever been part of one of these big refactors before, whether as the platform or the user--that this is how it should have been done when they started... they just didn't care, and then they spent a decade both directly and indirectly (by condoning the behavior) bullying people who were concerned about the process and insisting that people who even still today have perfectly working systems were/are committing some kind of cardinal sin by not embracing the one true path of Wayland, despite regressions. It is extremely difficult to find any sympathy for the people involved :/.


Because I had to look it up:

a11y = accessibility

There's some irony here, I think. =]


The overwhelming majority of people who worked on x11 are retired and a growing minority are dead.

This is the fourth incarnation of x11 and the people working on it now have nothing to do with the people who developed it.

Xorg is the custodian group who started life as a fork, of a fork, of a fork of a spinout from MIT.

Them trying to kill X11 is laughable to anyone who knows anything about its history.

Wayland on the other hand is now 18 years old and we've been told it will be good any day now for 18 years.


I mean, Wayland works fine for me. I'm using niri and an nvidia card.

Yeah. From what you read on here you would think it was totally unusable. Yet I use wayland all day every day. I use an AMD card and it even works. Sway for work, KDE Plasma for games. Everything I use works completely fine in wayland.

> The overwhelming majority of people who worked on Xorg are now developing Wayland.

I've never seen this documented.

> It is a 400K LOC behemoth of a project and it has a ridiculous amount of technical debt.

So we have people who want to create features but do not want to pay for technical debt. So.. they create more technical debt? Is there some indication that the wisdom of the crowd is particularly valuable here?

> I would have to imagine that if the Xorg developers thought they could fix Xorg, they would do that instead of making a new thing.

It seems like all the paid developers are working on Wayland while many of the volunteers are working hard to continue Xorg despite all the sponsored efforts to artificially shutter the project.

The article author's main complaint seems to be that distributions forced users to choose between one or the other when, at this point in history, there are zero good reasons to have done that.

Open source used to be about choice. Now it's about paid interests bullying you out of that choice. And Hacker News readily defends this in the name of modernity for its own sake. It's truly a bizarre outcome to me.


> > The overwhelming majority of people who worked on Xorg are now developing Wayland.

> I've never seen this documented.

What do you mean? You can look at the history of wayland on Wikipedia: it was started by Kristian Høgsberg, the person who wrote the DRI2 implementation for xorg. Other major xorg contributors like Hutterer have also been major wayland contributors.

I think the misconception is that people thought there are lots of xorg developers. That's just false; around the time when wayland was started there were maybe 10. And now there are even fewer.

> > It is a 400K LOC behemoth of a project and it has a ridiculous amount of technical debt.

> So we have people who want to create features but do not want to pay for technical debt. So.. they create more technical debt? Is there some indication that the wisdom of the crowd is particularly valuable here?

But that's not what they did?

> > I would have to imagine that if the Xorg developers thought they could fix Xorg, they would do that instead of making a new thing.

> It seems like all the paid developers are working on Wayland while many of the volunteers are working hard to continue Xorg despite all the sponsored efforts to artificially shutter the project.

Who? Looking at xorgs git there is essentially 1 developer making changes that are not related to xwayland?

> The article author's main complaint seems to be that distributions forced users to choose between one or the other when, at this point in history, there are zero good reasons to have done that.

> Open source used to be about choice. Now it's about paid interests bullying you out of that choice. And Hacker News readily defends this in the name of modernity for its own sake. It's truly a bizarre outcome to me.

You mean the choice not to work on xorg? You're welcome to use X, but you can't bully others into keeping it going for you.

The recurring theme in these comments is that the people complaining have little knowledge of X internals and have usually not done any work programming a WM, a compositor, or X or wayland libraries. The people who have done that (e.g. Rasterman, deVault...) are widely positive about wayland over xorg. It's also telling that most recent interesting desktop experiments/projects (niri, sway, hyprland...) have been happening under wayland. And AFAIK none were corporate sponsored.


> The overwhelming majority of people who worked on Xorg are now developing Wayland. Why?

CADT.


There's so much hot debate about how bad Wayland is, how incorrect it is. But there's something I respect enormously about Wayland, which is that it is so so so much less than Xorg.

It uses the kernel's graphics buffers. It uses the kernel's mode setting. These alone are humongous differentiators.

There's so many other amazing glorious ways that Wayland is less. The protocol-centricity is vastly underrated, a massive win for the bazaar that can keep seeking truth versus the (imo utterly pathetic clinging) absolutist monolith style.

It's revolting to see such persistent bitter angry low user disdain, anger. Without any acknowledgement at all. That protocols allowing multiple implementations allows constant honing in, allows for dynamic change and evolution.

Reflecting on the Hindu Trimurti, a cycle of creation/newness, stasis/pattern, and decay & rot, it's amazing how the protest no-change/stasis-only voice has such a loud undying protest going. X is never getting better, has no room to improve, cursed by its own egocentric insanity which it has recursed into far far too far: which the core devs all agree.

It's not pleasant for everyone that Wayland allows a freedom of implementation. But generally most of the protest here has fallen away: support for major features is just here, on most implementations. That competitors can compete, don't have to keep using the same base is hugely advantageous to humanity. But the protest no-change anger-only voice is so loud. Doesn't know doesn't care.

Humanity should respect systems where competition and improvement are possible. X was a single consigned fate, with no growth or improvement. The competition of Wayland is an incredible breath of fresh air, and the growth of protocol competition here is telling, to not necessarily the "everything just works and is great" desire path of the low tech-ignorant beggar class, but which has enabled so much Bazaar democratic figuring shit out, that still shares the ideas while allowing innovation within, in a way that few projects have ever enabled before. We are in a magic age of so so much, such cooperative competitive improvement, and it's just so unspoken, so missed, amid the squeaky wheels offering no actual technical critiques, unable to reflect upon the different (much better) age of possibility the bazaar model has opened us into.


This is exactly backwards. Whenever some team that is maintaining a monolith looks at the possibility of splitting it up and going with this protocol idea, they will look at Wayland as a cautionary tale of just how badly that works out in practice.

What an utterly vacuous statement saying nothing. Making no refutable claims. Typical hot air, full of nothing. Boring as fuck, nothing here.

Further, your point is spoken from the perspective of a company, a single entity. Companies are utterly unable to bank on Bazaar practices, to embrace the multitudes way of finding answers. They lack the hackerly blood to try many approaches. They are not creative enough to do anything but build their one Cathedral.

As a company, no, you should not try to build interoperable protocols to foster internal competition on. Duh, no shit. But strong command and control, while it may be good for a company, is not going to be how a much broader ecosystem finds the best paths to take.

It's incredibly impressive how much Wayland compositors compete/cooperate for better. Sway/wlroots for one example has a new Vulkan backend. They could just go try and do new things. There's protocols to implement and they made new implementations, and now there's a half dozen Wayland compositors that have new cutting edge tech they are trying out. Innovation at the edge, but working together, is the shit. Yeah it's not a model that helps the corpo's but that's because open source is searching a much wider field of options, looking much better for wins, and the cathedral model isn't going to get you any of that.

I'm still impressed what vacuous say nothing piece of shit useless Fear Uncertainty and Doubt folks can spread. This era has such a virulent pox of hatred, built around such empty words. None of these bitter words actually say anything, this whole discussion is filled with rabid useless disdain. Piss on ye, say something contestable you villainous cowards. What does you are, saying nothing, but trying to dynamite it all. A pox.

The calculus of what some team does is totally different than what open source does.


I don't understand a lot of the complaints. It asks for a remote connection? That's because of xwayland (which is x11 inside), not wayland AFAIK. Also, regarding all the comments about how that is weird on a single system: the whole X server/client architecture always felt like running on a remote system.

I actually like the approach that compositors are much more different from each other than WMs used to be, since that allows people to experiment much more. Also let's not forget that X was a plethora of different plugins and incompatibilities. The reason many didn't encounter that was that almost everyone was running xorg with all plugins. That said, I still remember the hoops one had to jump through to get transparency etc. You needed a compositor and not all compositors were compatible with all WMs (and all had different capabilities).

That said I do also wish that the protocol would evolve faster. It is my impression that if it wasn't for the wlroots people not much would have happened, especially because the gnome guys seem to rather just implement something for themselves and don't try to use or push the standard.


The general movement of UI paradigm has been from one tech to the next with a focus on backwards compat. Almost amusingly so at times, but this is how all the earlier users and use cases can most easily progress. E.g.

* hollerith cards and sundry + printer
* printing teletype
* dumb (video) terminal
* smart (cursor addressable) terminal
* images of smart terminals
* images of smart terminals with color (businesses resisted color for years)
* ... ?

And in the meantime we have an evolution of support for modelling things visually and working with more descriptive protocols - or even function-defining protocols to raise the abstraction chatting with the display server in realtime. In this, "abstracted" means something that can be sent over the network instead of using a local buffer. These are in a less strict order than foregoing...

* text, color plotters, VDST, and all that other old slow stuff
* [skipping a bit up through bitmapped greyscale graphics]
* bitmapped color graphics
* abstracted 2D graphics (-> W and X)
* abstracted 3D graphics (OpenGL + GLX)
* dynamically client-extendable remote graphics servers (NeWS, mostly 2D)
* ... ?

So here I am, waiting for the next stage in these. Hypothesizing that finally we'll get something with 3D abstracted, network graphics (display lists in GLX but accelerated with something like XCB?), where the primary display coördinate space is (x, y, z) instead of (x, y), where the client can push some code to the remote server and raise the abstraction on the fly, finally. Where maybe we'd be able to permission the objects in that space and share it among users live. Where the 2D apps would be inside the 3D space instead of the other way around. Something for the 2000s instead of familiar abilities provided in 1990.

But instead, Wayland. Wayland, which is not backwards compatible with X. Wayland, which is 2D at its heart. Wayland, another 1990 era graphics system with a super thin offering of features for actual end users (not devs) which come at substantial cost in lost X features. Wayland, which resists the one user doing things we've long thought of as normal - in the name of "security".

Wayland is not what I've been waiting for.


Yes the international keyboard support is pretty bad in both X and Wayland. For example, try using Left Shift to switch to layout 1 (while retaining its shift functionality) without patching Gnome. It's impossible.

Or, try making a virtual on-screen keyboard that would send characters that are not in the layout (for example, Greek character with US keyboard layout). Again, you cannot do that, and it's difficult to understand why virtual keyboard has to be restricted with characters printed on physical keyboard.

And if you want to use remote desktop from a computer with Greek layout to a computer with US layout... again, it's going to be difficult. X server-based remote apps would simply temporarily patch the layout and add non-existent keys there to be able to report the key press on a remote machine with different layout. xdotool, I think, used the same hack to input characters that are not in the layout.
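To make the "temporarily patch the layout" hack concrete, here is a toy sketch in Python. It only models the idea (find a spare keycode, bind the missing symbol to it, send the key, restore the map) and makes no real X11 calls. `ToyKeymap` and `type_symbol` are hypothetical names invented for this illustration, and the keycodes are merely typical values, not taken from any particular server.

```python
# Toy model of the "temporarily patch the layout" trick. No real X11 calls are made;
# ToyKeymap and type_symbol are hypothetical names used only for illustration.

class ToyKeymap:
    """Maps keycodes to the symbol they currently produce."""

    def __init__(self, mapping, min_keycode=8, max_keycode=255):
        self.mapping = dict(mapping)
        self.min_keycode = min_keycode
        self.max_keycode = max_keycode

    def keycode_for(self, symbol):
        """Return the keycode that already produces `symbol`, or None."""
        for keycode, sym in self.mapping.items():
            if sym == symbol:
                return keycode
        return None

    def find_unused_keycode(self):
        """Search from the top of the range for a keycode with no binding."""
        for keycode in range(self.max_keycode, self.min_keycode - 1, -1):
            if keycode not in self.mapping:
                return keycode
        raise RuntimeError("no spare keycode available")


def type_symbol(keymap, symbol, sent):
    """Append press/release events for `symbol` to `sent`, patching the map if needed."""
    keycode = keymap.keycode_for(symbol)
    patched = keycode is None
    if patched:
        # The symbol is not in the layout: borrow a spare keycode for it.
        keycode = keymap.find_unused_keycode()
        keymap.mapping[keycode] = symbol
    try:
        sent.append((keycode, True))   # key press
        sent.append((keycode, False))  # key release
    finally:
        if patched:
            # Restore the original layout so nothing else sees the temporary key.
            del keymap.mapping[keycode]
    return keycode


us_layout = ToyKeymap({38: "a", 39: "s"})
events = []
type_symbol(us_layout, "a", events)  # already in the layout, no patching
type_symbol(us_layout, "α", events)  # not in the layout, uses a borrowed keycode
```

The fragile part is visible even in the toy: between the patch and the restore, the layout is globally different, which is roughly why this counts as a hack rather than a design.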


>try using Left Shift to switch to layout 1 (while retaining its shift functionality) without patching Gnome. It's impossible.

It's certainly possible with X. Not sure about compatibility with Gnome, but then it's Gnome's problem, not X's.


It is possible if you write a custom program for that, but not by simply configuring X. If you configure Shift to switch layouts, you won't be able to use it as a modifier.

For comparison, in Windows you can use Ctrl+Shift or Alt+Shift to switch layouts and use shortcuts (like Ctrl+Shift+A) with these modifiers. In X, you cannot.
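For illustration, the logic such a custom program would need might look like the sketch below. This is my own assumption about the desired behavior: Left Shift selects the first layout only when it is tapped alone, and acts as a plain modifier when another key is pressed while it is held. The function name, the event format and the layout names are all hypothetical, and nothing here talks to X or a compositor.

```python
# Sketch of "tap Shift to switch layout, hold Shift to modify".
# Everything here (names, event format, layouts) is hypothetical.

def process(events, start_layout="gr", tap_layout="us"):
    """events: iterable of (key, pressed). Returns (typed, final_layout)."""
    layout = start_layout
    shift_down = False
    shift_used = False  # True once another key is pressed while Shift is held
    typed = []
    for key, pressed in events:
        if key == "LSHIFT":
            if pressed:
                shift_down, shift_used = True, False
            else:
                # Released without modifying anything: treat it as a layout switch.
                if shift_down and not shift_used:
                    layout = tap_layout
                shift_down = False
        elif pressed:
            if shift_down:
                shift_used = True
            typed.append((layout, key.upper() if shift_down else key))
    return typed, layout


typed, final_layout = process([
    ("a", True), ("a", False),            # typed in the starting layout
    ("LSHIFT", True), ("LSHIFT", False),  # lone tap: switch layout
    ("a", True), ("a", False),
    ("LSHIFT", True), ("b", True), ("b", False), ("LSHIFT", False),  # used as modifier
])
```

The decision can only be made on key release, which is one reason this is awkward to express as static keymap configuration.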


"Custom program" is a bit too serious-sounding wording for a 5-line shell script.

>in Windows you can use [...] In X, you cannot.

Well, I can. What you probably mean is that in X you don't have a GUI for this. But X is text configurable/scriptable by design. GUI is a concern of an upper level software.


At this point the Wayland project is effectively keeping desktop Linux from succeeding. It might as well have been a plant project or a strategic intelligence war from Microsoft to keep Linux on the server only.

It's a ten+ year disaster project that held desktop linux back at the precise moment of complete insanity on the part of the Windows designers with Windows 8 and the dual desktop/tiles disaster and yet-another-window-kit.

Microsoft is still pissing off its customers actively, but now we have real traction with Steam for getting gamers off of MS and onto Linux.

The opportunity is still there.


This is analogous to calling unix account separation "fragmentation". Why can't I just run all my services as root? It has worked for years!?

The answer is that it is a fragile, unmaintainable security nightmare.

Wayland has separation of concerns to fix that problem, with the tradeoffs described in the blog post.


No, this is analogous to forcing everything into separate accounts in the name of "security" and then failing to implement any way to pass data between them. It would be fine to have optional protocols on top of the core wayland protocol, and it would be fine to require a single permission prompt, but only if they actually get implemented and there's actually a way to persistently give permission. Otherwise you've just reduced the functionality of the system.

And yet unix account separation really did turn out to be overcomplicated and useless. Hosting providers were never able to separate untrusted users by user account, they either use VMs or containers or give up on offering shell access at all, and on home machines the whole effort falls prey to https://xkcd.com/1200/ .

The post shows a common issue with Wayland. The protocol is there, but each compositor handles things a bit differently, so tools like xdotool end up running into gaps or inconsistent behavior.

Wayland is improving, but there is still a difference between what the spec supports and what developers can rely on across the ecosystem.

A good look at why automation on Wayland still feels rough for some users.


I am surprised, 87 comments and so far nobody has mentioned X11Libre.

They claim 30 developers right now. Don't know if true and if they are any good, but when the time comes to update my xserver, I will have a look at them - just to show my support.

Ah aeh, Wayland, it too got pitched to me recently. But I don't see it improves on any use case I have, but actively disables functionality I need. So a big no go and pass. Why should I change anything in my single-user distro - thank you.

Some of us thought to ourselves in the 90s, I don't have to use Windows. There is something called Linux. I remember installing SUSE from 5 1/4 floppies (A LOT of them) and configuring scanlines and hoping my CRT survives the first startx command. I gave no one authority to tell me what gets deprecated and what does not. The only person who can do this is myself.

I use Linux because of people like Linus Torvalds back then and Jordan Sissel today, who do the right thing out of passion and not expectation of financial or other reward. Just for themselves and then share it, because it might be useful for others too. It's not a 9 to 5 job to them.

People like Lennart Poettering or some other kids who want to coax me into accepting their toys are a reason for me to run away as fast as possible from such shenanigans. I survived editing the scan lines, I don't need software from IBM.

Regarding Wayland and GUIs: GUIs have been much worse than command line and batch environments for automation. xdotool is kind of the best we have (basically just creating macros like in an editor for the whole system), but neither X11 nor the applications are really designed for automation. AppleScript and d-bus all kind of never really worked out. What will happen now with text-based gen-AI models is that we will go back to good old text (plus speech) interfaces. We will just tell (e.g. in a text box) the AI what we want and it will find a way to deliver whatever it is we asked for. Then finally the AI properly controls a web browser for us, but we don't need to see any of that.


Second system effect is the curse of FOSS projects. It's been that way for decades. I don't see a reliable solution for the structural problem that doesn't somehow end up like a Benevolent Dictatorship. At the end of the day, designing complex systems by committee is hard to do. Maybe there is a maximum size of a group beyond which the communication matrix between the members starts to fracture?

"that doesn't somehow end up like a Benevolent Dictatorship"

Is that a problem though? If you want to get shit done, you need someone to take responsibility for the decisions. Otherwise you get design-by-committee and endless bikeshedding and software nimbyism.

I don't see how else it could work...


I would claim the dictators -- even the "benevolent" ones -- tend to do this more often than committees, as they have more inherent power to do so: the committees tend to get stuck in backwards compatible land forever (for better or for worse). I mean, look at Larry Wall or Guido van Rossum with their respective debacles. Bjarne Stroustrup couldn't mess up in that way even if he seems to want to. As another example, HTTP only started having this problem with Google being able to railroad everyone. The only major second system effect caused by what I believe is a committee that I can easily come up with is IPv6?

IPv6 is a victim of the nature of the problem and a lot of underinformed observers. I see too many comments asking why it isn't just backwards compatible or suggesting fewer bits would make it easier.

There are real problems but really the issue is that it was a hardware and software problem wrapped into one as well as being a collective action problem.


Which Guido van Rossum debacle are you referring to? I can't think of one where he unilaterally caused a mess by dictating a hampered "second system" into life, as your comment seems to imply, but I may just be misunderstanding or being ignorant.

The closest thing I can imagine is where he actually resigned as benevolent dictator after having to mediate the walrus operator design committee/community, which is not a good example for your argument. Python 3 also does not seem to fit the bill as a "debacle" or a "second system" in the usual parlance.

I'm asking because I'm interested to learn of a significant event in Python's history I might not be aware of.


Python 3 was a debacle that nearly destroyed the Python ecosystem due to egregious incompatibilities when it was decided that, not only could they fix a few important things, but in the process they might as well throw backwards compatibility entirely away and encourage people to treat it as a new language. It was over a decade between Python 2 being declared dead and Python 3 being usable, and many of the key groups of people who actually used Python during that time actually did give up and move on: as an example, if it were not for that mess, Python would likely own web development. And the big corporate backer that helped popularize it in production -- Google -- largely did give up, and started working on Go, while the community focussed on stuff that Google thought was not just a waste of time but made their efforts constantly break. That data scientists (including at Google) appeared to give it a new life in machine learning was super lucky.

I dunno... maybe read this? (edit: I forgot to add the link, lol; I have now added it ;P.) If you weren't there at the time, maybe it is easy to pretend none of this had happened, but it was a super big deal and there were the same kinds of bullying campaigns to get people to upgrade even when stuff was clearly slower and more broken and you knew it would become easier to port later.

https://gregoryszorc.com/blog/2020/01/13/mercurial%27s-journ...

Frankly, it all started to finally turn around in the Python 3.3-3.7 timeframe, with the biggest turning points being 3.6/3.7, which is when Guido finally was cracking under community pressure against his agendas and decided to start forming a committee to manage the language, before stepping down... until just now I hadn't realized that that was probably the thing that truly saved that language.


I picked up Python around 2.5 and went through the migration to 3. Although it was not smooth and took a while on the whole, it was very painless for me, and in my environment I did not experience it as such a big almost-catastrophe as you describe it. Python is better for it, and personally I'm very grateful for the Unicode compatibility breakage they did.

I get that your experience may have been different, and I appreciate that the transition cannot be said to have gone well (despite ending well); but nevertheless I feel that using words like debacle paints an overdramatic picture that suggests an outcome far removed from where Python is today, and it does not leave much linguistic room for the multitude of possible worse occurrences that would truly deserve to be called a debacle. But that's of course just my opinion.


I guess, in my mind's eye, I see a world in which Python dominated, and that didn't happen: it didn't die the death Perl did, but, in the grand scheme of where it was going -- and even what it had been -- it has become a very niched language: we now only really see it in education, machine learning, and (sometimes) system administration (but even that has been losing to Go). It didn't entirely disappear, and it actually got where it was going (unlike Perl 6), but it seems very strange to say it "ended well"... they squandered a decade on some really bad decisions that they only started to undo around 3.3-3.7, and the ecosystem simply moved on (in some cases quite forcefully, such as Google going to great lengths to get off of Python via Go). What else can we call such a fall from grace? If not somehow a debacle it is certainly a tragedy...

Maybe I am just in a happy little Python niche, then. I'm saying this in all earnestness. Maybe Python could have been bigger, I don't know. It still seems very present, it did not die, but came out better than before.

Given the performance difference between Python and Go, and the rationale given for its invention, I'm not convinced Google would somehow have chosen Python as their blessed language, as you seem to suggest.

Anyway, we seem to have different measuring sticks for things like debacles and tragedies. :)


Srs question, I keep reading everywhere from experienced people that Wayland sucks. I need to start learning one of these stacks, should I go with Wayland or should I go with Xorg?

If I didn't know any better I would learn the Wayland API. Just like how: if I didn't know any better I would learn Swift (instead of Objective-C). But thankfully I do know better and I know to stay far away from Swift [1]. Is it the same deal with Xorg/Wayland? It seems like noobs prefer Wayland but the experts prefer Xorg.

1. https://youtu.be/ovYbgbrQ-v8?t=1456


tl;dr Wayland doesn't have a good set of universally adopted input emulation and UI automation protocols yet, which makes a portable UI automation utility with the full scope of `xdotool` impossible to write. Work remains to be done to close this gap.

The X protocols in this area were not very good, but due to there being a single viable implementation you could rely on them being present (similar to using MSIE-only features in that browser's dominant era).


In my opinion, three basic things are needed:

- Device emulation: uinput covers this; requiring root is reasonable for what it does.

- Input injection. Like XTEST, but ideally with permissions and more event types (i.e. tablet and touch events.) libei is close but I think it should be a Wayland protocol.

- UI automation: Right now I think the closest you can get is with AT-SPI2, for apps that support it. This should also be a Wayland protocol.

None of these are actually easy if you want to make a good API. (XTEST is a convenient API, but not a particularly good one. Win32 has better input emulation and UI automation features IMO.)
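As a rough illustration of what the device emulation layer deals in, here is a minimal Python sketch that builds the raw `struct input_event` records for a single key tap, the kind of data a uinput-based tool writes to the kernel. It deliberately stops short of opening `/dev/uinput` and omits the ioctl setup that creating a virtual device requires. The struct layout assumes a 64-bit Linux build, and `pack_event` and `key_tap` are hypothetical helper names, not part of any library.

```python
import struct

# Event constants as defined in linux/input-event-codes.h.
EV_SYN = 0x00
EV_KEY = 0x01
SYN_REPORT = 0
KEY_A = 30

# struct input_event, assuming 64-bit Linux:
# struct timeval (two longs), __u16 type, __u16 code, __s32 value.
INPUT_EVENT = struct.Struct("llHHi")


def pack_event(ev_type, code, value):
    """Pack one event record; the timestamp fields are simply left at zero here."""
    return INPUT_EVENT.pack(0, 0, ev_type, code, value)


def key_tap(code):
    """Bytes for press + sync + release + sync of a single key."""
    return b"".join([
        pack_event(EV_KEY, code, 1),        # key down
        pack_event(EV_SYN, SYN_REPORT, 0),  # flush the event batch
        pack_event(EV_KEY, code, 0),        # key up
        pack_event(EV_SYN, SYN_REPORT, 0),
    ])


payload = key_tap(KEY_A)
```

Note how little this level knows: it carries key codes, not characters, windows or permissions, which is why the input injection and UI automation items need their own protocols.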

Also the tangent about how crazy the compatibility layers are is weird. Yes, funny things are being done for the sake of compatibility. XWaylandVideoBridge is another example, but screen sharing is an area where Wayland is arguably better (despite what NVIDIA has to say) because you can get zero copy window and screen contents through PipeWire thanks to dmabufs.

Some of the lack of progress comes down to disagreements. libei mainly exists, by my best estimate, because the GNOME folks don't like putting things in Mutter, and don't want to figure out how to deal with moving things out of process while keeping them in protocol. (Nevermind the fact that this still has to go through Mutter eventually, since it is the guy sending the events anyways...) However, as far as I know, lack of progress on UI automation and accessibility entirely comes down to funding. It's easy to say "why not just add SetCursorPos(x, y)" and laugh it off, but attacking these problems is really quite complex. There was Newton for the UI automation part, but unfortunately we haven't heard anything since 2024 AFAIK, and nobody else has stepped up.

https://blogs.gnome.org/a11y/2023/10/27/a-new-accessibility-...

Color management is the perfect example of how a simple ask can be complicated. How hard could it really be? Well, see for yourself.

https://gitlab.freedesktop.org/wayland/wayland-protocols/-/m...

If Wayland lasts as long as X11 did, it's preposterous to not spend the time to try to get the "new" version of these things right even if it is painful in the meantime.

After all, it isn't like UI automation on Linux was ever particularly good. Anyone who has ever used AutoHotkey could've told you that.


This is a good and informative comment.

Wayland is fantastic, but it requires some external tools for some non-default (seemingly 'simple') utilization. Therein lies double work, devs scratching their itch in whatever programming language they prefer, sure. The most popular one isn't always the best. And all of that is the nature of the bazaar. But the way I regard it, the bazaar can make use of curators who make a cathedral and sell that (anyone can, in theory). In other words, this is a service problem.

I used ydotool [1] in Sway years ago, and it worked perfectly fine. I had to set up the permissions though, IIRC via some udev rule. There are also other tools which do something similar, each being slightly different and sometimes with different features or pros/cons. For example, there was this tool for just swapping buttons (wtype), one for reading the input and echoing what was being pressed (wev), and there is one doing that with a keyboard visual picture, too (wshowkeys). Basically, sircmpwn (author of Sway) wrote a lot of useful Wayland utils [2].

Then for running GUI apps remotely, there's Waypipe [3], and for running Android apps on Wayland there's Waydroid [4].

The beauty of all this, is that with Wayland you don't have to run QubesOS in order to have a somewhat secure desktop OS.

..but it did require some work to get all of this working. I already knew that the moment I went for Sway instead of Gnome or KDE.

Also, I believe the mobile Linux DE's each use Wayland, too. pmOS with say Phosh, Lomiri, Plasma Shell, you'd always be using Wayland. That all started with N9 MeeGo and SFOS, while the predecessor of N9 (N900) still used X.org.

[1] https://github.com/ReimuNotMoe/ydotool

[2] https://git.sr.ht/~sircmpwn

[3] https://gitlab.freedesktop.org/mstoeckl/waypipe

[4] https://waydro.id/


Wayland’s fragmentation is less about one problem and more about how the ecosystem grew. Each compositor implements only what it needs, so tools like xdotool run into gaps and inconsistent behavior.

The post highlights a real coordination issue. The protocols exist, but adoption is uneven and expectations differ across compositors. Users see small breaks and developers face a moving target.

Wayland is improving, especially with work from GNOME and KDE, but stronger shared conventions for automation and accessibility are still needed.

Good write-up that shows why experiences on Wayland vary so much depending on the compositor.



