
The void type has considerable heritage, dating back all the way to ALGOL 68, and is traditionally defined as having one member:

> The mode VOID has a single value denoted by EMPTY.


Usually when I hear "robbery" it brings to mind someone stealing your phone or wallet at knife-point. Certainly not training some model on some code that involves neither violence nor depriving anyone of anything.

The concept of copyright is the fiction that information - something that can be freely modified, copied, and transmitted - is of the commodity-form: that it should be treated as if it were a single real object, inherently scarce.

It is a fiction that exists to serve some of the most hateful, mafia-like firms - your Disneys, UMGs, Getty Images, and the like.

So if the AI interests are powerful enough to deal that whole rotten system a serious blow, then I'm all for them.


Server bandwidth is a commodity. If a site gets DDOSed by LLM training, then actual humans won’t be able to access the information.


> you can technically look at some of the code that comes with it, but you can't modify it.

This is equally true of the GPLv2. The attempt to close this loophole - Tivoisation as Stallman called it - only won him a lot of scorn from Linux land and a refusal to adopt the GPLv3!


Fair; I suppose that concern is technically beyond copyleft.


> How can you implement an object capability system on WASM?

It's been well known for decades that the germ of an object capability system already exists in Unix - the file descriptor (that's why the control message for transferring them over sockets is called SCM_RIGHTS).

Capsicum was developed to make that potentiality a reality. It didn't require too much work. Since most things are represented by file descriptors, that was just a matter of tightening what you can do with the existing FDs (no going from a directory FD to its parent unless you have the right to the parent, or some parent of that parent, by another FD), and introducing process and anonymous shared memory FDs so that a global namespace is no longer needed to deal with these resources.

So WASI derives from an actually existing object capability architecture - Capsicum - one which happens to be a simple refinement of the Unix API that everyone knows, and by which every modern OS has at least been very much inspired.

https://www.cl.cam.ac.uk/research/security/capsicum/


Capsicum looks very cool, but it looks like support never got finished in Linux. It's still in FreeBSD, though - other BSDs as well? From what I understand (admittedly little), capabilities in Linux are more about granting granular permissions that would otherwise need root, not about limiting the ambient authority within one process. Seccomp can drop permissions, but again only for the whole process.

On a related note, I found Thomas Leonard's blog post (2023) on Lambda Capabilities to be a very interesting approach: https://roscidus.com/blog/blog/2023/04/26/lambda-capabilitie...


Parts of it are in Linux. Namespaces and pidfd got in, at least. PowerBoxes are in every OS these days including Linux via Flatpak.


Oh, interesting thanks! That would be like what's described here? https://docs.flatpak.org/en/latest/sandbox-permissions.html

Looking at Leonard's post from my earlier comment, I was really appreciating the ability to do this sort of restriction _within_ a single application. I know that the code I am writing is not doing anything malicious, but I'm still at the mercy of whatever dependent libraries I'm calling. (Think file parsers, for examples of code that often goes sideways.) His Eio effects library for OCaml supports Capsicum, which I could see as being awesome for any sort of multi-user server process in particular. https://github.com/ocaml-multicore/eio?tab=readme-ov-file#de...


FDs are owned by processes, not libraries, and are by themselves not sufficient to implement a sandbox. A lot of policies people want to implement in the real world can't be expressed in UNIX or with file descriptors. For instance: UNIX kernels don't understand HTTP, but restricting socket access to particular domains is a common need. Of course, all of this can be hacked on top. Another example: stopping libraries from quitting the process can't be done with fds.

Every modern OS has very much not been inspired by UNIX. Windows has little in common with it e.g. no fork/exec equivalents, the web is a sort of OS these days and has no shared heritage with UNIX, and many operating systems that ship a UNIX core as a convenience don't use its APIs or design patterns at the API level you're meant to use, e.g. an Android app doesn't look anything like a UNIX app, Cocoa APIs aren't UNIX either.


Windows has extreme similarities with Unix; if you look at a really different OS - IBM i, for instance - that becomes clear. The Windows-Unix affinity is so great that you even interact with devices through file handles via Read, Write, and IoCtl methods.

Check "Inside Windows NT" by Helen Custer, an official account. She explicitly credits the handles to Unix. That's not surprising - not only was Unix fresh on the minds of the NT developers, with quite a few of them having Unix backgrounds, but every conceptual ancestor of Windows NT was at least significantly influenced by Unix:

- VMS: The VAX/VMS team were in regular contact with Ken Thompson, and got the idea for channel IDs (= file descriptors) for representing open files and devices from him, as well as the idea of three standard channels which child processes inherit by default: input, output, error (the error one was at the time a very recent development, I think in Unix v6 or v7)

- MICA: Would have been a combined Unix and VMS compatible system.

- OS/2: FDs with read, write, ioctl again.

Even MS-DOS is already highly Unix-influenced: they brought in file descriptors in DOS 2.0 and even called the source files implementing the API "XENIX.ASM" and "XENIX2.ASM" (see the recent open source release.)

I have deliberately chosen not to make anything of the fact that Windows NT was intended to be POSIX-compatible either (and even supports fork, which WASI mercifully doesn't), because my point is that all modern general-purpose operating systems are at least very much inspired by, and deeply indebted to, Unix. I would accept that OSes that are not general-purpose may not be, and that old operating systems made in siloed environments like IBM's are fundamentally very different. IBM i is very different to Unix, and that's clear in its native APIs.

Cocoa and Android APIs don't look much like the basic Unix APIs, it's true, even if they are implemented in terms of them. WASI wants to define APIs at that lower level of abstraction. It's tackling the problem at a different level (the inter-process level) to what object capability _languages_ are tackling (the intra-process level).


Right, if you compare against IBM i then everything looks the same :)

NT might have been intended to have a POSIX personality at the very beginning of the project, but that never really happened. People who have tried to make this work over the years have always retreated licking their wounds, including Microsoft themselves. WSL1 tried to use this and failed because NT is too different from UNIX to implement a Linux personality on top, so WSL2 is just a regular VM.


Computerised CBT is already being delivered, and by considerably less sophisticated systems than LLMs. Resourcing constraints have made it very popular in the UK.


Solaris achieved some kind of integration between the ARC and the VM subsystem as part of the VM2 project. I don't know any more details than that.


I assume that the VM2 project achieved something similar to the ABD changes that were done in OpenZFS. ABD replaced the use of slab buffers for the ARC with lists of pages. The issue with slab buffers was that absurd amounts of work could be done to free memory, and a single long-lived slab object would prevent any of it from mattering. Long-lived slab objects caused excessive reclaim, slowed down the process of freeing enough memory to satisfy system needs, and in some cases prevented enough memory from being freed at all. Switching to linked lists of pages fixed that, since memory freed from the ARC upon request would immediately become free, rather than being deferred until all of the objects in the slab had been freed.


They're not obscure or niche, they are everywhere in operating systems. The linked list is probably the single most essential and the most used data structure in any kernel.


Implementing an operating system is incredibly niche and obscure.

The practices that apply to kernel development do not generally apply to very much else.


Ah yes, operating system kernels, famously super mainstream programs that almost everyone works on.


The idea of using a third-party init system has always been quite alien to the BSDs, and the same goes for almost all other Unix-like systems, which are almost all developed with a greater degree of integration within the core system. Linux is exceptional in this respect, in that it has ever had a diversity of init systems.

This war of words between the BSD community and systemd, as far as I've been able to tell, dates back to when Poettering went to the GNOME mailing list to propose making GNOME depend on systemd. He made this request with the proviso that it shouldn't necessarily be a hard dependency, so that needn't have been a problem in itself, but then he made a remark in an interview with linuxfr.org:

> I don't think BSD is really too relevant anymore, and I think that this implied requirement for compatibility with those systems when somebody hacks software for the free desktop or ecosystem is a burden, and holds us back for little benefit.

and as you can imagine this was ill-received by the BSD community.

Could systemd, or at least a useful subset of it, have been made cross-platform from the get-go? It would've taken more work. I don't think the amount of work necessary would have been particularly onerous, which I hope InitWare shows. It would have required making certain compromises like systemd being happy optionally running as an auxiliary service manager rather than as the init system.

In the end, though, Poettering has his preference to target GNU/Linux only, and he is entitled to that.


Very informative, thank you.


Systemd uses cgroups for two things: tracking processes other than direct children of the service manager, and imposing resource limits. Both can be done with other mechanisms, like kqueue's EVFILT_PROC and login classes respectively. But my experience in any case was that, when hacking up systemd to build and run under BSD, it didn't need cgroups at all for basic running. Supervision of `Type=simple` and `oneshot` services worked fine. That wasn't particularly surprising, as cgroups really aren't ideal as a tracking mechanism - under cgroups v1, the only lifetime tracking available for processes within a cgroup was the "cgroup empty" notification, and even that was unreliable and could be left undelivered! So systemd used cgroups to augment a more traditional process supervisor. That's why Poettering insisted on having it be PID 1, and got subreapers added to Linux for the per-user systemd instances, so that they could get the more traditional SIGCHLD-based notification of process exits.


Okay, but ... if you only get something that seems to work, but isn't actually reliable, what's the point?

You seem to be wrong about cgroup v1; freezing works and is sufficient to reliably kill all children. Half-killed services was one of those really annoying problems back in the dark ages of sysvinit (not the most common problem, but perhaps the hardest to detect or deal with when it did come up).


I'm saying that it did work perfectly fine and reliably for the common case of Type=simple and Type=oneshot services. To expect it to work for Type=forking services would be absurd, since no mechanism would exist to even try to keep track of them. It's just a point to illustrate that systemd is not as intimately and irretrievably integrated with Linux features as some imagine.

Freezers were never used by systemd as part of its process tracking mechanism. And cgroup emptiness notification was unreliable under cgroups v1, so that claim is not wrong. It used a horrible mechanism where a binary is launched (!) when the cgroup becomes empty - and that can fail to happen under low memory availability.

A related read is Jonathan de Boyne Pollard on cgroups: https://jdebp.uk/FGA/linux-control-groups-are-not-jobs.html


My point is that a lot of apparently "simple" services do in fact call fork internally. Just a few things I've seen:

* fork to periodically make a snapshot of server state, to avoid slowing down the main server

* spawn an external gzip to compress a log file

* spawn a handler for some file format

* spawn a daemon to actually handle some resource, which might be used by other processes too (this really should be a separate managed service, but in the anti-systemd world this is often not the case)

If everything is working fine, you'll only waste a bit of server RAM for a few seconds if you fail to kill the children alongside the parent. But the circumstances in which you want to restart the service are often not "everything is working fine".


You can do the same with a modern C compiler - extern and auto mean the same, and int is still the default type.


In C23, auto doesn't have a default type, if you write auto without a type then you get the C++ style "type deduction" instead. This is part of the trend (regretted by some WG14 members) of WG14 increasingly serving as a way to fix the core of C++ by instead mutating the C language it's ostensibly based on.

You can think of deduction as crap type inference.


With design by committee, the outcome is usually not what the people in the trenches would like to get.


Nobody in the trenches has seemed to use old-style auto in recent decades.

BTW: The right place to complain if you disagree would be the compiler vendors. In particular the Clang side pushes very much for keeping C and C++ aligned, because they have a shared C/C++ FE. So if you want something else, please file or comment on bugs in their bug tracker. Similar for other compilers.


> Nobody in the trenches seemed to use old-style auto in the last decades.

To the best of my knowledge, there was no case where "auto" wasn't redundant. See e.g. https://stackoverflow.com/a/2192761

This makes me feel better about repurposing it, but I still hate the shitty use it's been put to.


What's your better use for it, then?


Synonym for "self" when programming in Greek.


Indeed; however, many in the trenches would like a more serious take on security. Complaining achieved nothing in the last 50 years, until government agencies finally decided to step in.


This is again a problem compilers could have addressed, but didn't - mostly because the users in the trenches did not care. Instead they flocked in droves to whichever compiler optimized in the most aggressive way, rejecting everything that cost performance. So I do not have the feeling that users were really pushing for safety. They are very good at complaining, though.


GCC and Clang support asan/ubsan, which lets you trade performance for nicer behavior related to memory access and undefined behavior. Whenever I do C development for a platform that supports asan/ubsan, I always develop and test with them enabled just because of how much debugging time they save.


Yes. Ubsan you should probably also turn on in production.


It is like democracy: election results do not always reflect the needs of everyone, and some groups are more favored than others.


I think my point is that a standardization committee is not a government.


It surely looks like one from the outside.

Features only get added when there is a champion to push them forward across all hurdles (the candidate), and they are voted in by their peers (the election); at the end of a government cycle (the ISO revision), the compiler users rejoice at the new set of features.


You may have noticed that most features existed in compilers before they were standardized.


Isn't the original inclusion of the auto keyword more in line with what you'd expect from design by committee? Including a keyword which serves no purpose other than theoretical completeness?


I was talking more in general, not specific regarding auto.

Actually, I did use C compilers with a K&R C subset on home computers, where auto mattered.

Naturally they are long gone, this was in the early 1990's.


An interesting secondary meaning of "design by committee" - and the reason why what you mention happens - is "design in committee".

People can skip the usual lifecycle and feedback for an idea by jumping directly to the committee stage.


People in the trenches seem pretty happy with what the committee designed here.


It doesn't matter. The people in the trenches don't update their standard versions.

