My driver experience has been overwhelmingly positive, but I do tend to run older (cheaper) chipsets.
On my work computer with a Ryzen 7 "Renoir" APU the problem was not the drivers but the fact that the latest Ubuntu LTS ships with an older kernel, so I had to run the latest mainline kernel.
I have a feeling this sort of situation will become more and more problematic, especially when new architectures are released. At some point the Linux community may have to step up and make running recent kernels easier.
I'd love a fairly modern/powerful blob-free SoC. For now, the closest I've gotten is an RK3399-based SBC. It is a start. RISC-V may finally do it, but it is still a few years away, if ever.
I have also experienced the crash-on-resume issue. It only happens some of the time -- like once every few days -- but it is still pretty annoying. (Currently on kernel 5.8.x.) It might not be the highest priority for them, but it would be cool if they could fix it.
I'm on a Ryzen APU too, the Raven Ridge 2500U, and have had an equally disastrous experience on linux. I will say though in the last 6 months I have _finally_ figured out how to make the machine stable and stop randomly crashing. Here's the key, and it might help you to try too:
- Upgrade to _the very latest_ kernels. I'm not kidding, you need to start following mainline stable builds. In Ubuntu or Ubuntu-based derivatives check out the ubuntu-mainline-kernel script: https://wiki.ubuntu.com/Kernel/MainlineBuilds I'm on 5.10.4 right now but probably should upgrade. You need a _minimum_ 5.4 kernel to fix some egregious instability issues, but in general keep your kernel updated to the bleeding edge.
It's a night and day difference with these fixes in place. I still sometimes get a touchpad lockup on wake from sleep, but it's maybe 1 out of every 50 times and kind of expected for linux. No longer do I have random kernel panic-level lockups every hour. The little laptop just flies and flies with these Ryzen processors.
I really wish they would write a showcase Linux Desktop that would enable their CPUs to shine and create a market for ARM chips they may release in response to M1, but that's about as likely as a genie lamp actually existing.
My current workstation is a dual-Xeon Gold 6252 with 1 TiB of ram. The next one will be either Threadripper Pro or Epyc. AMD have just knocked it out of the park; I want to reward them.
Honest answer: medical imaging. If your data doesn't fit into ram, it's a right royal pain in the arse, and being able to do linear algebra on huge, 14-dimensional datasets and seeing if the final result actually looks like the inside of a person in real time as you fiddle with the code is a huge plus.
And yes, Gnome is still sometimes completely unresponsive...
Very cool. Is it all CPU-based or can you leverage GPGPU for that? My linear algebra is admittedly a bit rusty, but I would have imagined that GPUs would do very well in applications like that.
150 MB RSS of a single worker process (out of typically 4+)? Did the dude at least try running `free -h` before and after starting an Electron application? Multiply that by at least 5 and you're somewhat closer to the truth.
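For what it's worth, per-process RSS is a slippery metric anyway: pages shared between processes get counted once per process, so summing RSS across an app's workers overstates the total, while looking at a single worker understates it. A rough way to measure, using this shell's own PID as a stand-in (substitute the app's PIDs, e.g. from pgrep):

```shell
# Sum resident set size (KiB) across a set of PIDs. "$$" (this shell)
# is a placeholder -- substitute the Electron app's PIDs.
# Note: RSS counts shared pages in full for every process;
# PSS from /proc/<pid>/smaps_rollup apportions them fairly.
ps -o rss= -p $$ | awk '{kb += $1} END {print kb " KiB"}'
```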
But seriously -- wine should have equal performance to running a program "natively" on windows because it essentially is running the program natively, just with different system DLLs that call back to the linux kernel instead of the NT executive.
Their implementation may be slightly slower in places, but it's not a problem inherent to Wine itself, and could always be fixed.
I know there's a stigma against running win32 apps on linux, and possibly rightly so, but there really isn't a reason why Wine couldn't be a legitimate runtime environment for linux. You can make 100% open source software that targets Wine, and never needs to use or link to proprietary software.
Wine has even been ported to architectures where there have never been native windows ports, like the ppc64le port of Wine.
(Wine/win32's binary interface is also easier to intercept and automatically translate/emulate calls for non-native architectures, which is the basis of WoW64 and x86 on ARM emulation, and in Wine land, projects like Hangover https://github.com/AndreRH/hangover )
> wine should have equal performance to running a program "natively" on windows
That is not guaranteed. Windows programs and Win32 APIs are written for and optimized to run on the NT kernel, which has different performance characteristics from the Linux kernel. Some examples:
- Wine needs to emulate a case-insensitive filesystem on top of a case-sensitive one, which is less efficient than using a filesystem / kernel FS layer designed for this.
- Some locking primitives are different enough that Linux needs additional syscalls for Wine to reach the same performance when emulating the Windows ones [0]
Hmm, the filesystem thing would be a constant slowdown, and the locking/async primitives are generally better on Windows, but at least those are being worked on, both in Wine and in Linux generally. I kind of recall reading about an option for ext4 to allow optional case-insensitive operation? Ah! Recently added, it seems. It needs a pretty recent kernel and a format-time config option that I doubt any Linux distro is using yet, though [0]
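For reference, enabling it looks roughly like this (the casefold feature needs kernel 5.2+, and it is opt-in per directory, so existing installs won't have it; the device and mount point below are placeholders):

```shell
# Format with the casefold feature enabled (destroys /dev/sdXN!)
mkfs.ext4 -O casefold /dev/sdXN
# After mounting, mark an (empty) directory as case-insensitive:
chattr +F /mnt/games
```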
I just meant that, in general, Wine should be very close to as fast as Windows, since it can usually implement Win32 DLLs without much emulation/translation needed. Excepting all the bits that do need it :-)
Now that it is becoming easier to run Linux apps on Windows and maybe macOS, Linux devs should make an effort to guarantee that Linux apps run REALLY well on Windows and Mac. That way, instead of creating three native apps (Win, Mac, and Linux), you could simply target Linux.
This won't work - developers will always target the biggest platform first and then complain that it takes too much work to adapt the program to more portable APIs.
If the app had only essential features, it could have functioned well on web.
Back when Russian Facebook rival vk had an unlimited offering of pirated music, their music player on the web was the best. You have a list of songs, a search bar, and play/pause.
Nowadays Spotify web client takes a second or two to register mouse click. This is just sad.
They would have worked just fine on the web, and then some of those 150mb would be shared between all of them at least. I mean, we are talking glorified chat apps and music players, not 3d imagery or physics simulation. They even integrate pretty poorly (or not at all) with desktop systems, so really the whole Electron thing is just sad and often pointless.
I interviewed a few months ago for a HW position and they mentioned remote was fine, even post-COVID. I'd imagine SW would be more flexible but don't specifically know.
Meh... Where are the remote jobs? I bet if you open an office in Oregon (where there's tons of Intel Open Source people) you'll find a lot of candidates. Or, you know, act like the old times of 2020 and hire remote employees?
Don't let the stated location stop you from applying. Third-party HR software and outdated pre-pandemic rules force hiring managers to state a location even if they are willing to let you work from anywhere.
People don't realize how quickly hardware iterates and how many modifications and changes are made on site in the lab.
Or just how often development hardware self-bricks and needs to be recovered.
Doing hardware work remotely is a lot slower. Possible, but slower.
Edit: Or how there is that one person who is really good at assembly and if you can walk down and ask them a question it'll save you a couple hours (days) of debugging.
And then there is that one person who is really good with an oscilloscope, and while in theory you shouldn't need to decode messages by sight from waveforms, well, this one person can, and it sure is useful from time to time...
This is not my experience. Driver developers don't use oscilloscopes. Changing HW once the first prototype is done is expensive and it takes a lot of time for you to get the second iteration. Or you flip a bit in the driver to disable the new broken feature. Most of the driver development happens before the physical hardware even exists. Everything is virtual.
> Changing HW once the first prototype is done is expensive and it takes a lot of time for you to get the second iteration.
Sure, but minor fixes soldered in place aren't uncommon.
My experience is with firmware for embedded and consumer electronics, maybe video cards are sufficiently complex that none of the things I saw are even possible!
Lots of issues resolving power states, clock trees, monitoring buses, docs from suppliers were always subtly wrong, power management chips never quite worked how they were supposed to, things like that.
I do happen to work writing low-level device drivers, and I do work remotely. The hardware engineers are always one click away from me. I can give them access to my machine with another click on the machine reservation system. It is very much possible and viable to have everything remote.
I would guess, since this is AMD and low-level Linux work, there may be a good deal of interfacing with physical hardware, in addition to any other preferences they might have.
Interestingly they seem to have been optimizing the OSS driver for workstation uses lately where before the focus was on desktop/gaming applications only while the recommendation was to use amdgpu-pro for workstations. [0]
If laptops are following phones and tablets into ARM it means x64 is most likely to survive on large workstations, servers, HPC, and in data centers. It makes a ton of sense for AMD and Intel to prioritize Linux and make sure their support is first class.
In HPC the situation seems a little weird. Fundamentally, lots of HPC workloads should scale pretty well to lots of little ARM cores. If anything the x86 advantage is more that it is just a conservative field full of scientists and engineers, who are happy with their old math libraries, which grew up on x86 and are heavily tuned for the environment.
If I were AMD or Intel, I'd keep investing in HPC, but be terrified that my lead there was just one good ARM BLAS library away from evaporating.
Fujitsu's A64FX already exists; the threat is quite near. Arm on the server market is quite interesting too (Graviton2 and Ampere Altra today).
And those are big cores. For a ton of little cores, it looks less clear-cut to me because of the communication overhead. (It could work out, though it won't be exactly easy.)
Windows on Arm64 machines are all UEFI + ACPI too. :-)
(even for 32-bit Windows on Arm, back in the Windows RT times, this was the case. It's a shame that Microsoft decided to lock those down back then)
On more affordable Arm machines, Honeycomb LX2K has an official UEFI firmware (with ACPI) and NVIDIA's Jetson AGX Xavier has an official (but experimental) UEFI+ACPI option.
Those aren't complete yet in mainline when using ACPI. For the LX2K, onboard networking currently doesn't work when using mainline kernels. And for the AGX Xavier, you lose PCIe (until the quirk to support it there is merged, or the ECAM through PSCI patchset) and integrated GPU support.
When using a Windows on Arm laptop, most Qualcomm drivers only have device tree bindings, not ACPI at least at this point. As such, you might need to load a device tree at boot for a more complete experience. (you can look at repos on https://github.com/aarch64-laptops if you have one)
For SBCs, forget about almost anything not named "Raspberry Pi", for which... a third-party UEFI + ACPI firmware is available at https://rpi4-uefi.dev, which allowed the original RPi4 to be certified as SystemReady ES.
A lot of the things that actually drive making people implement UEFI + ACPI is that Windows, RHEL and VMWare ESXi require UEFI + ACPI to boot on 64-bit Arm.
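As an aside, you can check which hardware description the firmware handed your kernel; a minimal check using standard sysfs/procfs paths, nothing vendor-specific:

```shell
# Did the kernel boot with ACPI tables, a device tree, or both?
[ -d /sys/firmware/acpi/tables ] && echo "ACPI tables present"
[ -d /proc/device-tree ] && echo "device tree present"
cat /proc/cmdline   # bootloader-provided kernel arguments
```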
Don't forget about our Marvell series of boards :) While HoneyComb is our flagship for SystemsReady ES certification, We will also have our Armada 8040 and CN913x based products that will also be fully supported UEFI + ACPI systems.
Yes, the kernel has great support for ARM. Distros that need to package up the kernel, tweak the bootloader, deal with peripherals, etc.... less so on great support. ARM is a bit of the wild west with SoCs defining all kinds of esoteric and exotic hardware configs managed by the kernel's device tree infrastructure instead of a typical BIOS. If you're dealing with server workloads it matters much less (and they probably support UEFI) so stick with a distro that has good support (Ubuntu, Fedora, etc.). If you're dealing with ARM hardware directly like little SBC boards, start getting comfortable rolling your own device tree, bootloader config, etc. as you're probably going to need it. In all cases though the stock mainline kernel is fine to use.
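If you do end up hand-tweaking a device tree, the round trip with dtc is straightforward ("board.dtb" here is a placeholder for whatever blob your firmware or distro ships):

```shell
# Decompile the binary blob to editable source...
dtc -I dtb -O dts -o board.dts board.dtb
# ...edit board.dts, then recompile:
dtc -I dts -O dtb -o board-fixed.dtb board.dts
```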
If anything, I would argue HPC is one of the more flexible markets. A lot of HPC software is distributed in source form, so recompiling for a different architecture is seldom a big deal. As long as the system provides BLAS and MPI libraries, and good bang/buck, you're good.
It's kind of weird, but I don't really see so much gain from getting an ARM compared to x86. All I see is closed hardware, locked bootloaders and non-free operating systems in the future.
There's nothing that requires closed boot loaders and locked OSes on ARM, nor is there anything preventing the same on X64. Most X64 chips have embedded security processors that can and sometimes are used to do just that, while the Raspberry Pi and countless other ARM boards are fairly open.
The new M1 Mac is fairly open too in that you can boot non-MacOS OSes on it. It differs from the iPhone and iPad in this way as the Mac is made for a different market niche.
The best way to fight closed devices is to not buy them regardless of their CPU ISA.
ARM is just a different ISA with somewhat superior efficiency characteristics due to its fixed-length, easier-to-decode instruction encoding. That's about it.
ARM will definitely need a BIOS interface standard before it sees big traction in servers, desktops, and cloud. There's apparently some work going on here. As it stands there just hasn't been a big motive to create such a standard because ARM has mostly been used on embedded and mobile devices that ship with canned images anyway.
The advantage of ARM's fixed-length decoder is probably largely irrelevant in the laptop/desktop segment. The memory model probably has a much bigger effect, although that's not exactly clear yet either.
But overall people assign too much weight to ISA, especially in case of M1 and forget other aspects of the CPU design.
The fixed-length encoding matters more in those segments.
X64 decoders are really complex. I've heard 5% of the die. Unlike other parts like the ALU, SSE/AVX, AES engines, etc., the decoder is almost 100% "hot" all the time. This adds significant energy overhead vs. a simpler decoder.
The other problem is that it's hard to decode lots of X64 instructions at once without this energy cost ballooning. Apparently 4X decode is kind of a magic number. Intel cranked it to 6X in Ice Lake and newer with some magic, but beyond 6X is gonna be hard.
The Apple M1 has 8X decode, meaning that it can fully saturate 8 execution units for a lot of instruction level parallelism. Nothing in ARM makes it that hard to go higher. If there's a performance advantage to it we will see 12X, 16X, or even higher decode pipelines.
Another big win for ARM is looser memory semantics, allowing for more memory write reordering and more efficient cache designs.
Overall ARM is just easier to optimize than X64. X86/X64 is only fast because vast amounts of money has been spent optimizing these designs. If the same amount were spent optimizing ARM, we'd be way ahead of where we are now.
P.S. The only area where X64 still really shines is its big complex vector units, but these massive vector instruction sets are to some extent a hack to get around the difficulty of decoding X64 instructions. Think of huge SSE/AVX instructions as big macros. I'm sure ARM will get 256-bit and maybe even 512-bit vector extensions at some point, at which point the gap will shrink a lot in this area too.
P.P.S. SMT / hyperthreading is a hack to get around the difficulty of decoding lots of instructions in parallel. It's a way to keep a big parallel core busier. It comes with its own overhead, though, which increases power consumption, and it is a security minefield. Are there even any SMT ARM cores? It's just not as big of a win when you can add more decode width easily.
Well I see in my testing and use 20% more performance and 30% _less_ cost with AWS Graviton 2 ARM instances vs x86. The cost savings can be enormous. The bottom line of your company doesn't really care about bootloaders and OS 'freedom'.
I was thinking more of my workstation. If I lose access to free software, I might as well just leave the whole industry and stop using technology.
It is kind of important for me I do not have to use closed source blobs or operating systems. And I think I'm not the only one.
Btw, I've yet to see a Graviton instance that compiles our Rust project faster than my 3090x workstation. Maybe there is, but it costs a fortune. And what Amazon does here is it just kicks out all competition who cannot build their own CPUs. Then they can ramp up the prices.
¯\_(ツ)_/¯ I'm with you on not enjoying loss of access to my hardware and getting further and further away from the metal. But also, we gotta make a living and pay the bills... it's a competitive disadvantage with online services to be using more expensive to run instances.
In my experience and testing Rust lang is lagging a bit with ARM support. It's not that the language doesn't have good support (it does, it's great and very easy to cross compile), it's that the ecosystem around rust is _very_ x86 focused. Like I saw a simple JSON parsing library decided it would be cool to add SSE and SIMD support and now it doesn't work well on ARM (which has a totally different set of SIMD extensions). I have a strong suspicion your rust projects depend on libs with similar issues--but be ready because once those libs support similar optimizations for ARM, it's going to be super fast.
Yep, it's flying under everyone's radar right now too. In my experience Graviton 2 instances are ~20% faster and 30% _cheaper_ than the best x86 instances for typical web workloads (transform some templates, talk to a DB, do IO bound network work, etc.). It's an amazing way for nimble little SaaS startups to pluck away customers from giants by offering much cheaper prices. The little guys that build up ARM expertise are going to be great exits and pickups for the larger fish in the ecosystem too (because if your platform is an old thing with a lot of technical debt, moving it to ARM is not trivial and the expertise to do it is in wild demand).
If you're working on online services and NOT looking at or using ARM servers right now--you are in big trouble and already too late to the game.
If it is only for GPU... the drivers are already pretty good.
If it is for their Ryzen processors... please hire more. I am thinking about going back to Intel laptops after the pitiful support in the current Linux kernel.
I've experienced the following issues on my laptop (Dell G5 SE):
* Crashes on suspend/resume. [1]
* Kernel warning on boot [2]
* It crashes for me without the amdgpu.runpm=0 kernel parameter [3] (fixed?)
* Crashes when inserting and removing USB-C monitor (fixed?)
* The onboard GPU works, but the dedicated GPU crashes with vsync. (didn't test recently)
1. https://gitlab.freedesktop.org/drm/amd/-/issues/1222
2. https://gitlab.freedesktop.org/drm/amd/-/issues/912
3. https://gitlab.freedesktop.org/drm/amd/-/issues/1304
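For anyone hitting the runpm issue, that parameter goes on the kernel command line; on GRUB-based distros that looks roughly like this (paths assume Debian/Ubuntu):

```shell
# In /etc/default/grub, append to the existing line:
#   GRUB_CMDLINE_LINUX_DEFAULT="quiet splash amdgpu.runpm=0"
sudo update-grub   # regenerate grub.cfg, then reboot
# Confirm it took effect:
grep -o 'amdgpu.runpm=0' /proc/cmdline
```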