There are three near-term products to develop on this:
- A small router (home/small office, not data center)
- A DNS server
- A BGP server
Those are standalone boxes that do specific jobs; they're security-critical, and the existing implementations have all had security problems. We need tougher systems in those areas.
I was worried a whole UNIX might be too big a project, so I gave him old-school recommendations before. Essentially, I told him to make a split system: run a tiny kernel on bare metal, run UNIXy apps against a different API, and let them communicate through some IPC mechanism. That way, as with security and MILS kernels, people can write security-critical components in isolated partitions with checks at the interfaces. This has already been done with Ada and embedded Java. Rust is the best one to try it with now.
As far as your recommendations go, I like those. I'll add a web server, Redox- or UNIX-compatible, that's efficient enough to be deployed in all these web-enabled embedded devices. The dynamic part can just be Rust plugins or something. However, just a robust Ethernet stack, networking stack, time API, and simple filesystem could be used to implement all of yours, the web server, and more. So, I encourage people building these OS projects to stick to the 80/20 rule, hitting the features almost all critical things use. Others can jump in at the application layer from there.
A web server gets big, complicated, has lots of add-on parts, and has performance constraints. Small routers, DNS servers, and BGP servers are small, closed systems that should Just Work. You want to get them working, lock the code into read-only memory, and forget them.
If you read RFC 2616 closely enough, you'll see that you don't really need to implement anything more complicated than what is in HTTP/1.0. E.g., it technically requires you to support Keep-Alive, but it also states that the server is allowed to close the connection any time it wants.
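To give an idea of how little that takes: here's a sketch of an HTTP/1.0-style response builder (the helper name is my own invention, not any project's code). Sending `Connection: close` and hanging up after each response is always legal for the server.

```rust
// Minimal sketch of an HTTP/1.0-style exchange: answer one request,
// close the connection. No Keep-Alive, no chunked encoding, no
// persistent-connection state machine.
fn http10_response(body: &str) -> String {
    format!(
        "HTTP/1.0 200 OK\r\n\
         Content-Type: text/html\r\n\
         Content-Length: {}\r\n\
         Connection: close\r\n\
         \r\n\
         {}",
        body.len(),
        body
    )
}

fn main() {
    let resp = http10_response("<h1>ok</h1>");
    // The whole "server" is: write this string to the socket, close it.
    assert!(resp.starts_with("HTTP/1.0 200 OK\r\n"));
    println!("{}", resp);
}
```

Everything stateful that makes a full HTTP/1.1 server complicated simply never enters the picture.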
What counts as a web server is quite a bit more complex, especially the implementation. Even lighttpd is non-trivial. The standards cover the tiniest core of the problem while leaving off significant issues. That might be acceptable for a web interface to a trusted computer, as I'm advocating. It's barely a web server, though.
A web server has no utility by itself, it only gains value by running content on top of it. This is where simple becomes too simple in almost every case. A DNS server however can basically start being useful as soon as it is connected to the internet.
> I'll add a web server, Redox or UNIX compatible, that's efficient enough to be deployed in all these web-enabled embedded devices.
I took that to mean something capable of being used as a web admin console for the other services. HTTP/1.0 is fully capable of that. I agree that HTTP/1.0 would be a bit anemic for those that want to support an actual site, but as a simple included way to provide an admin interface, it should be sufficient.
That was the intent and solution I had in mind. Appreciate the positive feedback as I might do it myself given all the ridiculously tiny web servers people have made.
My favorite trick I saw was one that replied directly with TCP/IP packets pre-encoded from the HTML. Cool, huh? That could even be done with highly assured tools in a safer language as part of a CMS or web build system.
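A rough sketch of that idea one layer up (the real trick went further and pre-encoded entire TCP/IP packets, checksums and all; the function and names here are made up for illustration):

```rust
// Build-step sketch: bake each page into a complete, immutable HTTP
// response buffer. At runtime the server's only job is to pick the
// right buffer and write it to the socket -- no formatting, no
// allocation, nothing for an attacker to influence.
fn pre_encode(path: &str, html: &str) -> (String, Vec<u8>) {
    let response = format!(
        "HTTP/1.0 200 OK\r\nContent-Length: {}\r\n\r\n{}",
        html.len(),
        html
    );
    (path.to_string(), response.into_bytes())
}

fn main() {
    // "Build system" output: a static table of ready-to-send responses,
    // which could then be locked into read-only memory.
    let table = vec![pre_encode("/", "<h1>home</h1>")];
    let (path, bytes) = &table[0];
    assert_eq!(path, "/");
    assert!(bytes.starts_with(b"HTTP/1.0 200 OK"));
}
```

The appeal for assurance is that all the encoding logic runs offline in the build tool, where it can be verified once, instead of on the hostile network path.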
Kbenson gets what I'm saying with a simple, static server for interfaces and such. I totally agree that a more feature-rich server, especially one tolerant of hostile networks, has all kinds of performance and structuring issues that aren't easy to solve. I had to learn that the hard way applying B3-class assurance to one I built. The FSMs and information flows piled up quicker than most would think.
In other words, like GNU HURD. That design is very very hard to make correct and fast.
Correctness suffers because the UNIX API has all sorts of interactions between different parts. This includes atomicity. It's a bear to get this right with IPC.
Performance suffers because you are unable to effectively share data structures. This too relates to the interactions between different parts of the UNIX API.
Look, there is a reason GNU HURD is slow and suffers from incompatibility. It's a cute thought experiment, dominating academia around 1990, but it's not actually fast or maintainable. Experience has proven this.
I was talking about systems like KeyKOS and GEMSOS, fielded in production before Hurd was a thing. Then systems like QNX, OKL4, BeOS, Minix 3, and others that largely removed the performance issues, often with self-healing and legacy app support. Actually, BeOS, and QNX in the Blackberry Playbook, outperformed monolithic competitors. Altogether, these long ago proved our approach builds reliable, fast-enough systems with more security.
GNU Hurd is some crap along the lines of Mach that tried to mix too many models while not leveraging lessons learned by others, as far as I can tell. It's something I hear about every year without any real field use or evaluation results. It's not representative of anything in microkernels except a bad approach.
GenodeOS is a better example, where they apply many lessons from the old school with modern components and virtualization. It's already proven for embedded, with desktops in the alpha stage.
No, those do not perform well. They are just less horrible than GNU HURD. When you apply similar optimization and implement similar functionality, monolithic kernels always win. It cannot be otherwise; think about it.
Self-healing is generally a security problem. It gives the attacker a second chance. It's also generally a failure. You might think you can restart, but there are huge problems: Instead of a crash, you may get a memory leak or hang. Hardware may be in a strange state, needing a power cycle to restart. Other things start failing once one driver is down. Most systems are unable to keep DMA from scribbling all over everything in RAM, and probably all are unable to keep it from scribbling all over a filesystem.
"No, those do not perform well. They are just less horrible than GNU HURD. When you apply similar optimization and implement similar functionality, monolithic kernels always win. It cannot be otherwise; think about it."
I do. It doesn't have to be better. It simply has to perform well enough that users accept it. Older systems did that slowly. BeOS was a great example: on 90's hardware it ran several movies, graphic animations, a song, and productivity apps all simultaneously with no slowdown. The Blackberry Playbook outperformed the iPad in responsiveness in tests I saw, with one demo running a 3D game alongside other intensive apps. It's at the point where Linus et al.'s argument about performance being too limited is ridiculous. Only users maxing out performance with little care for reliability will need a monolithic kernel on COTS hardware.
"Self-healing is generally a security problem. It gives the attacker a second chance. It's also generally a failure."
What are you talking about? There's a bit of extra attack surface due to more code and interactions. The first one, though, was implemented in the KeyKOS kernel, whose total size was around 20 Kloc. MINIX 3's Reincarnation Server is straightforward, too, given how the components it restarts are designed. It's actually easier to get this right than the reliability and security of monolithic systems, since it's simpler and smaller than they are. I mean, starting with the UNIX-Haters Handbook and such, it took monolithic UNIX (and Windows too) decades to get where they are in reliability and security. The stuff I push got most of that done in the first few years with a handful of people, plus acceptable performance. So are you arguing against microkernels getting it done, or in favor of throwing 8+ digits' worth of labor at monoliths to achieve similar results? Neither looks good in the face of the evidence.
"Tanenbaum is biased. It's time to move on"
Tanenbaum's is the most immature system on my list. I could drop everything he's ever said and done while still having the others as exemplars for mainframe, embedded, and desktop use that had acceptable or great performance with better security and/or reliability. You must have a beef with Tanenbaum or something. I respect his work and like the one presentation I watched, but I don't need it to back my claims.
"Hardware may be in a strange state, needing a power cycle to restart. Other things start failing once one driver is down. Most systems are unable to keep DMA from scribbling all over everything in RAM, and probably all are unable to keep it from scribbling all over a filesystem."
That's all interesting, except these kinds of systems, especially proprietary ones, have been in the field for years in places where failure and unpredictability had to be minimized. They worked as advertised. Security-focused ones also passed pentests and analysis by people who knew what they were doing. These are exactly the areas where monoliths, especially UNIXen and Windows, often failed or took crazy amounts of labor. Even the immature MINIX 3 is more reliable than you describe, tolerating all kinds of failures at the component level while the system stays up. Your DMA example shows you're really grasping at straws to fight microkernels, with an example that (a) represents a tiny set of failures in complex HW/SW systems and (b) still applies to monoliths, with the exact same solutions available for both styles.
Btw, the first IOMMU I found was in a system called SCOMP: a microkernel-like system that was the first to be certified to high security after, IIRC, 5 years of analysis and pentesting. Name one monolithic OS that pulled anything like that off. Don't worry, I'll wait.
By self-healing giving the attacker a second chance, I mean that it allows an unreliable attack to succeed. Consider defeating ASLR or winning a race condition. Each time the service restarts, you get a second chance to attack.
I have done a professional evaluation of an EAL6+ certified microkernel OS. There were plenty of bugs and design flaws (which I cannot reveal) and an even bigger problem. To obtain certification, most functionality is left out. The users actually need this functionality, though, so they put it in uncertified code running on the certified OS. The overall result is less secure, because each user program drags along a buggy reimplementation of what would normally be OS functionality. BTW, despite the EAL6+ nonsense, they were way behind OpenBSD and even Linux. It was that bad.
I have also been a professional kernel developer for a different microkernel OS. I assure you that maintainability is not a property of microkernels. You poke something here, and it pops out there. Good luck tracing out why, and good luck making any serious changes to the OS. The reason is that microkernels are deceptive. The individual components are simple, but they have very complex interactions. Glue isn't free. Compared to that, even Linux is trivial to understand and modify.
"Consider defeating ASLR or winning a race condition. Each time the service restarts, you get a second chance to attack."
I considered it. Those problems are handled by eliminating them with other means. Input validation, pointer/buffer/array protection, and so on are a start. Restarts are mainly for hardware faults or problems from lingering state. The concept was field-proven for reliability down to the CPU level by a certain vendor whose systems ran NonStop. Many others proved it at various levels, especially applications. Recently, academia showed it with a "micro-restarts" paper cataloging the problems that built up at every layer while showing component restarts knocked out a good chunk of them with imperceptible downtime. One of my own designs leverages what you describe in an instrumented system, automatically tainting and tracing execution after components restart enough. The idea is that the failed attacks will take me right to the vulnerability so I can patch it. This is only on paper, but CompSci teams did similar things in stuff they built.
"BTW, despite the EAL6+ nonsense, they were way behind OpenBSD and even Linux. It was that bad."
I keep hearing these things. It wouldn't surprise me if it were true, given how I called out one vendor over mislabeling what was certified and not mentioning the extra untrusted code. Forced them to change their website. Probably the same assholes, given there are only so many EAL6+ kernels out there. ;)
Yet what analysis and pentesting I've read of such assurance activities, dating back to the '60s, shows they deliver results. We have even more methods today. Whereas the CVEs and severities I get out of low-assurance software are laughably bad. It might be true that modern vendors are bullshitting their way through evaluations. That says more about evaluation politics than the methods used: they only work if applied for real. I endorse the methods most of all, old and new.
Btw, the latest from CompSci aiming at EAL7+ is the seL4 kernel. The source code for that is available. Feel free to find their vulnerabilities and show them where their models/proofs were inadequate. Whatever you find will factor into other efforts. If you find little, that would be a testament in itself, yeah? I'm neutral, as I'm interested in what the exact metrics will be for an ROI analysis.
"The users actually need this functionality though, so they put it in the uncertified code running on the certified OS. The overall result is less secure because each user program drags along a buggy reimplementation of what would normally be OS functionality. "
I agree with that one on the security front. This often happens. That's why I push for standardized core functionality in them. QNX and BeOS were again great examples there, although not designed for high security. GenodeOS is doing clean-slate stuff and pulling in components from UNIX land, and they're security-focused. So there's potential there. QNX could conceivably be redone for real security, but it's not likely to happen. This is a social problem more than a technical one. A real issue with barebones stuff, but not a fundamental one.
"I assure you that maintainability is not a property of microkernels. You poke something here, and it pops out there. "
"The reason is that microkernels are deceptive. The individual components are simple, but they have very complex interactions. Glue isn't free."
Yes, these are totally true. It's why you need different tooling for debugging them. My old technique was modeling the software as a monolith in source, with bug prevention or hunting using the same techniques as finding concurrency errors in shared-thread and/or actor models. You can also use taint-based methods that track things through the system, live or virtualized. Tanenbaum and Hansen had some other methods. There are quite a few out there in CompSci and industry.
Yet you are in for a world of hurt if you try to debug them like you debug a monolith, especially with tools designed for monoliths. I have a feeling that's what you were doing. I'm not saying there's a lot of publicly available tooling or guides on this that would have made it easier for you. This stuff, like high-assurance vs mainstream, tends to silo up, with knowledge getting obscure or lost and tools getting dusty. The tricks are prevention via your resource-sharing and/or middleware design, plus tooling that models and tracks flows as in distributed systems, with an easier subset of their assumptions.
We get this complaint enough that I think I'll try to dig up a collection of tools or methods from CompSci and proprietary sectors to recommend or further develop into something widely available. If I can find the time, that is. Got many projects I'm working on outside a demanding job. It needs to be done, though. There's no valid excuse for us hearing this in 2016 without a Github link to reply with, except bad priorities in the microkernel community.
"Compared to that, even Linux is trivial to understand and modify."
You're the first to ever tell me that lol. I've seen many people give up on high-assurance UNIX/Linux, even significant architectural changes, because of too many difficulties, largely tight coupling and legacy effects. So they ended up working at the hardware, compiler/language, or microkernel levels to solve the issues, and managed to get them solved in believable ways. Makes me think Linux wasn't so trivial. Peer review will tell over time if each issue was really solved. Meanwhile, Linux today has most of the problems it had when I reviewed it 10 years ago. More reliable and usable than before, though, with it only hosing my packages and freezing my desktop every few months instead of every few days. The backups and restores work great, though. ;)
"Most systems are unable to keep DMA from scribbling all over everything in RAM, and probably all are unable to keep it from scribbling all over a filesystem."
In what context do you mean? When would this occur?
That was burfog's comment I was quoting. It refers to the fact that direct memory access by some devices can bypass any OS or software protections. That breaks the whole security model, since memory can change arbitrarily. So the risk of attacks or leaks should be mitigated there.
A few methods follow:
1. Use non-DMA links.
2. Use trusted hardware/firmware that mediates things properly.
3. Use IOMMU to enforce access controls on DMA.
4. Use a combo of full safety in system and careful API for access to DMA features.
I used 1 and 2. A few use 4. Number 3 is the most common, with a basic version mainstreaming. Not enough, though, as complex firmware and OSes still provide attack opportunities.
Note: There's also interrupt floods and other esoteric issues to counter. So, it's the start rather than full solution. EMSEC issues too with malicious peripherals.
Thanks for your clarification. IOMMUs are a standard part of most motherboard chipsets these days; what would be the reason for not taking advantage of that and using them?
It depends on whether you trust those chips to start with. Not having the weakness, as in trustworthy I/O, is always superior to a tactic attempting to stop the weakness. The other issue is whether mere restrictions on memory accesses will help. In software, microkernels isolating where or who you could talk to were only a start: attacks could still happen via a series of compromises. Failures too, in the Byzantine failure model. Plus, with monolithic OSes, you can get full control of the system with kernel-mode attacks that further facilitate DMA attacks, where the main system looks unmodified in most operation while the malware sits in peripheral firmware.
Many issues due to how things connect and store information as burfog said. There's not just one thing. Even my microkernel recommendation is only for the start of the software part of a secure system. Actually, trustworthy CPU, ROM, bootloader and drivers are the start if we're being technical. ;)
Every example of a "fast" microkernel has either ripped out expected functionality (debug traces for example) or simply been the first to make an optimization that can be applied to monolithic kernels as well. Fundamentally, microkernels are slower. A bit of thought should make it clear that this can not be otherwise. No matter how fast you can pass a message, it's still faster to not pass a message at all. Also, the overhead of TLB misses when changing MMU mappings is huge. Microkernels can only win when they compete against badly-optimized monolithic kernels and there is no technique that can get past this fundamental truth.
It's an architecture problem. What makes QNX fast are a few basic design decisions:
- The basic interprocess communication mechanism works like a synchronous subroutine call - you call, you wait, you get data back. Most slower microkernels have unidirectional I/O as a primitive.
- This is very tightly integrated with CPU dispatching, so that calling a service which isn't currently busy is just a context switch, not a full pass through the CPU dispatcher. This and the above are what make QNX fast. If you do interprocess communication by writing to a socket without blocking, then wait for a reply by reading from one, it takes several extra trips through the CPU dispatcher to call another process. Worse, every such call can put the handoff to the new process at the end of the line for CPU time. If you're CPU bound, this kills performance, in some systems by orders of magnitude. The ability to toss control back and forth between processes at high speed is essential. (This is where Mach blew it.)
- Userspace programs can be placed in the boot image and loaded at boot time. So can shared code objects. This eliminates the temptation to put stuff in the kernel so it's available early in startup. File systems and networking are all in userspace.
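The call/wait/reply shape of that first point can be sketched with ordinary threads and channels. This is a toy model of mine, not QNX code: it captures the blocking semantics of MsgSend/MsgReceive/MsgReply, not the tight scheduler integration that makes QNX fast.

```rust
// Toy model of QNX-style synchronous IPC. A client "call" is
// send-then-block-on-reply, like a subroutine call across processes.
use std::sync::mpsc;
use std::thread;

struct Request {
    payload: u32,
    reply_to: mpsc::Sender<u32>, // one-shot reply channel
}

// Synchronous call: send the request, then block until the server replies.
fn msg_send(server: &mpsc::Sender<Request>, payload: u32) -> u32 {
    let (reply_tx, reply_rx) = mpsc::channel();
    server
        .send(Request { payload, reply_to: reply_tx })
        .unwrap();
    reply_rx.recv().unwrap() // blocks, like QNX MsgSend
}

fn main() {
    let (tx, rx) = mpsc::channel::<Request>();
    // Server loop: receive, do the work, reply (MsgReceive/MsgReply).
    thread::spawn(move || {
        for req in rx {
            req.reply_to.send(req.payload * 2).unwrap();
        }
    });
    assert_eq!(msg_send(&tx, 21), 42);
}
```

Contrast this with unidirectional socket-style IPC: there, the client writes, gets rescheduled, the server is scheduled, reads, writes back, and the client is scheduled again, with every handoff going through the full dispatcher queue.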
We keep the same IPC mechanism. We compile the filesystem process right into the kernel. Having done this, we can now avoid half of the IPC mechanism. We enter the "microkernel" just once now, instead of twice, and we leave it once instead of twice. Since the filesystem is now in the "microkernel", we don't need to switch MMU state and have a TLB invalidate. This is a huge win. Now let's repeat this design change for the disk driver, the network stack, the network hardware driver, and all the rest. Performance keeps getting better. This, BTW, is pretty much what most Mach systems ended up doing. They became microkernel in marketing only. The final step is to clean up the code, and then you have a normal monolithic kernel.
Let's also look at things from the other perspective. You could add the QNX IPC mechanism to any monolithic kernel. AFAIK, Solaris Doors might even qualify. Well, there you go. You can move things to use it whenever you are willing to sacrifice speed and maintainability. If this is so good, why haven't people done it? Hmmm.
Performance is great until those things crash my system. That stuff still happens with graphics drivers on my Linux distros. I know it's not necessary, because it doesn't happen on the microkernel systems, and even Windows dodges a lot of it with their SLAM toolkit.
"Let's also look at things from the other perspective. You could add the QNX IPC mechanism into any monolithic kernel. "
Congratulations: you've just reinvented security kernels w/ legacy support from the 80's-90's plus modern separation kernels w/ legacy support of the 2000's. Here's an example to support your point that our model is better with microkernels, user-mode drivers, and monolithic API's in isolated partitions:
Even just putting the drivers and a few critical components in partitions can work wonders. That's what Nizza-like architectures such as Turaya and Genode do. Their TCBs are many times smaller than UNIX's, with acceptable performance. You don't even notice it on the laptops running commercial ones (e.g., INTEGRITY-178B, LynxSecure, VxWorks MILS). Plus, there are around a billion mobile phones running OKL4, mainly for baseband or legacy isolation alongside Android or Windows Mobile. Notice how your smartphone is so much slower than older ones that didn't do that? Wait, you thought it was faster and better than the last one? Exactly. :)
IMO microkernels are a dead end; to get better performance you have to shove stuff into kernelspace, while for, say, an exokernel the opposite is mostly true.
If we're talking tech, then your post couldn't be more wrong given my BeOS and QNX examples. Performance was equal to or better than the monoliths of the time. BeOS especially destroyed the competition in concurrency performance due to its architecture. QNX basically runs at hardware speed, with real-time properties and POSIX support. BeOS disappeared due to the Microsoft monopoly, with Haiku making an OSS clone. QNX was at $40 million a year in revenue when Blackberry bought it. Green Hills and VxWorks are doing OK, too, with VxWorks making more than QNX per quarter. Both have desktops virtualizing Windows, Linux, etc. on microkernels with Gbps throughput.
I don't see why we keep getting these theoretical counters given the proven results of microkernel performance in the field. Tell me why microkernels are too slow when they can only do this on 90's era hardware:
Our side produced highly reliable and secure systems plus high-performance systems. It was always done by a small group with little time. The monoliths took a decade and thousands of man-hours to do the same. It's up to you people to justify why those hours were well spent.
"while for say an exokernel the opposite is mostly true."
I've seen the VxWorks code. VxWorks is not a microkernel.
That BeOS demo did not heavily use privileged interactions. Mostly it showed computation which is the same on any OS. The best thing it showed was a process scheduler which was good at giving priority to things that a user would care about. A more interesting test would be serving files or building software.
I think one should be careful not to read too much into CVE numbers. People aren't exactly trying to mess with KeyKOS, Haiku, QNX, and other weird things. Few people want to bother. None of the Linux problems are inherently specific to monolithic design. The best you could say is that you might have a sandbox that makes things more difficult for the attacker. On the other hand, restarting means you give attackers more chances to succeed.
The best thing you can say is that a bug in kernel code that hoses my whole system is several times less likely to happen. Suddenly, hackers or faults have to work through the components' information flows. You keep ignoring that in your analyses. That's also why I brought up CVEs: it's impossible that the microkernels had as many in kernel mode, just by code size. There's still plenty to be found in privileged processes, but POLA and security checks are way easier when the memory model is intact.
Btw, one person here who wrote about the QNX desktop demo mentioned doing productivity stuff while compiles ran in the background with no lag. So there's that use case, except not for BeOS. The link below will show you BFS was more like a combo of a NoSQL DB, files, and a streaming server:
Due to its nature, compilation and build systems are about the slowest things you can do on it. I've seen numbers ranging from 2.5x to 20x slower than Linux, but they didn't share specs. I'd swap out the magic filesystem for a simpler one on a development box. BeOS was aimed at creating, editing, and viewing streaming media, though, and did that very well.
Re sandbox more difficult
No kidding! That's the entire point: get it right or make it harder to beat at least. Monoliths on mainstream hardware are amusement parks with free rides and victims everywhere for attackers. Microkernels on COTS hardware and even modular, typed monoliths on POLA hardware are a series of sandboxes with adult supervision during play and movement. Quite a difference in number of problems showing up and damage done.
Re more chances to succeed
You keep repeating this too, without evidence. Attackers need vulnerabilities to succeed. They'll either know some to use ahead of time or they won't, if we're talking OS compromise. A flaw in one module lets them take one module, no matter how many restarts. A flaw in two modules with a flow between them means they'll get in on the first try. This is why you design it so each flow, and each individual op on it, follows the security policy.
The only time restarts give attack opportunities is if you're using probabilistic tactics (e.g., ASLR) or they're waiting for an intermittent failure (e.g., MMU errata). Any high-assurance system had better not exclusively rely on such tactics (ever), and should account for the latter (e.g., immunity-aware programming).
All in all, anything you've said about microkernel systems applies to monoliths in various ways. One model just limits system-hosing faults and hacks a lot better. The question is do you want to accept that risk to squeeze out max performance or eliminate that risk with acceptable performance? Microkernels choose risk reduction while mainstream monoliths choose performance.
An exokernel is definitely not a microkernel: one provides abstraction via (usually) server processes, the other via a library, which is vastly cheaper overhead-wise. I do understand that there will be a need for some sort of IPC, just not to the extent of a microkernel.
A microkernel is an abstraction over hardware with minimal code and API. An exokernel is a form of microkernel, since it has those properties; it just does things very differently from most microkernels. Hence a name for that style.
Mach was a microkernel that tried to do a bit too much. Performance and security stayed horrible. Other designs had acceptable-to-great performance or security. So it's not representative of microkernels in general, despite being an interesting research platform back in its day.
Now, Darwin starts with Mach, then basically adds BSD and the graphics stack onto it in kernel mode. So stuff that would have had user-mode isolation, and performance penalties due to Mach bloat, has no penalty but less protection.
So OS X is clearly not a microkernel system so much as a monolith incorporating a microkernel. Windows similarly has a microkernel near its foundation, for organizational purposes I think. Linux is really modular inside, similar to microkernels, but the similarity clearly ends there. So there are lots of hybrid results where the monolith and microkernel styles are blended a bit for a compromise of benefits.
Can you elaborate on what you mean by "CPU dispatcher"? I am not familiar with the term and have not heard it before. Is this something specific to certain SoC designs? I've never heard mention of this in x86 architectures.
Modern desktop computers are what, a million times faster than counterparts from the 1980's? I will take the TLB misses and reduced efficiency. It's time for safety and reliability to take center stage. Microkernels seem like a great design for that, much moreso than monolithic kernels.
L4Linux proved that this can be done quite effectively, and at the time, it performed better than hypervisors like Xen that were growing in popularity. What GNU Hurd is trying to do is decompose the monolithic UNIX kernel host into smaller services, and that's where the complexity comes in.
If experience has proven anything, it is that outside desktop and server OSes, which are built on the legacy of existing infrastructure, microkernels rule the embedded and real-time space.
I'm in the midst of implementing DNSSEC right now; I've got RRSIG and DNSKEY validation back to the root. I'm working on negative query validation next, NSEC and NSEC3. Maybe a new release in a few weeks after that is supported.
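For flavor, here's the fixed 12-byte header every DNS message starts with (RFC 1035, with the DNSSEC AD/CD flag bits from RFC 4035). This is an illustrative sketch with my own names; the actual project's types differ.

```rust
// The fixed DNS header: six big-endian u16 fields.
struct DnsHeader {
    id: u16,
    flags: u16,
    qdcount: u16, // questions
    ancount: u16, // answer records
    nscount: u16, // authority records
    arcount: u16, // additional records
}

fn be16(b: &[u8], off: usize) -> u16 {
    u16::from_be_bytes([b[off], b[off + 1]])
}

fn parse_header(b: &[u8]) -> Option<DnsHeader> {
    if b.len() < 12 {
        return None; // malformed packets get rejected, not trusted
    }
    Some(DnsHeader {
        id: be16(b, 0),
        flags: be16(b, 2),
        qdcount: be16(b, 4),
        ancount: be16(b, 6),
        nscount: be16(b, 8),
        arcount: be16(b, 10),
    })
}

impl DnsHeader {
    // AD ("authentic data"): set by a validating resolver when the
    // answer passed DNSSEC validation.
    fn authenticated(&self) -> bool {
        self.flags & 0x0020 != 0
    }
}

fn main() {
    // QR|RD|RA|AD flags = 0x81a0, one question, one answer.
    let pkt = [0x12, 0x34, 0x81, 0xa0, 0, 1, 0, 1, 0, 0, 0, 0];
    let h = parse_header(&pkt).unwrap();
    assert_eq!(h.id, 0x1234);
    assert!(h.authenticated());
}
```

The hard parts of DNSSEC (signature verification, the chain of trust to the root, NSEC/NSEC3 denial of existence) all sit on top of wire parsing like this, which is exactly where C implementations historically picked up their memory-safety CVEs.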
Having used Rust for this, what are your thoughts on how Rust fares for such tasks? Is coding this in Rust better than using C? Do you think you're having fewer problems in the end product than if you had used C? What kinds of issues annoy you about Rust here? Does the promise of Rust hold up? I'm trying to decide whether to use Rust for network programming and want to find out how Rust feels for these kinds of projects.
Honestly, comparing (safe) Rust to C is like comparing Java to C. Rust is a higher-level language than C, so writing in Rust means (especially for security and network programming) that I don't have to worry about initialization of memory, or allocation, or deallocation. That's where all my bugs came from in C.
In terms of use, I'd say there are still gaps in the number of available libraries, and features of those. I had to add some options to the OpenSSL Rust interface for instance. The non-blocking IO library, mio, is very solid and portable! And I want to play with rotor as a higher level abstraction.
If you use Rust instead of C, you will have fewer memory-related bugs, and you'll have more portable code than standard C. Rust makes for happy low-level programmers :)
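For example, here are a couple of the C bug classes that safe Rust shuts off by construction (my own toy example, not project code):

```rust
// Safe Rust rules out the classic C parsing bugs: no uninitialized
// reads (a buffer is allocated and initialized in one step) and no
// silent out-of-bounds access (indexing is checked; `.get` returns
// an Option instead of reading past the end).
fn nth_byte(buf: &[u8], i: usize) -> Option<u8> {
    buf.get(i).copied() // out-of-range is a None, not an overread
}

fn main() {
    let buf = vec![0u8; 4]; // allocation + initialization together
    assert_eq!(nth_byte(&buf, 3), Some(0));
    assert_eq!(nth_byte(&buf, 4), None); // in C, this is the CVE
}
```

The deallocation side is the same story: the Vec is freed exactly once when it goes out of scope, so double-frees and leaks from early-return error paths don't come up.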
IRONSIDES uses SPARK to show freedom from exceptions and single-packet DoS. A translation to Rust would probably preserve most of those properties. I also thought a parallel development between Ada 2012/SPARK and Rust would be interesting, to see if one set of checkers catches something the other misses in the same piece of software.
My understanding is that Rust doesn't have so much to offer Ada/SPARK. I haven't looked that hard at Rust, though, so maybe I've overlooked some of its features.
You have to jump through the same sort of hoops in Rust as you do C++ to get the sort of basic type safety found in Ada/SPARK. Rust doesn't so much have a focus on safety and correctness in general, as a focus on memory safety.
Admittedly, Ada/SPARK doesn't make the kinds of guarantees that Rust does about memory. However, accessibility checks help prevent dangling pointers and the use of memory subpools can avoid the need for explicit frees.
Additionally, Ada offers some features like the ability to return a dynamically sized array as if it were on the stack which helps reduce how often memory needs to be manually managed. I believe SPARK may not allow this sort of thing though.
I'm also not sure why you would want to abandon the actually verified properties of Ironsides in the hope that those properties remained in an unverified port.
"My understanding is that Rust doesn't have so much to offer Ada/SPARK. I haven't looked that hard at Rust though, so maybe I've overlooked some of it's features."
I liked the juxtaposition of that haha. It kind of negates anything about Rust in your counterpoint. Your points on Ada/SPARK are still worthwhile.
"Ada offers..."
No doubt. It was systematically designed to reduce the existence or impact of flaws throughout software, and they kept this up in the extensions. SPARK straight up proves the absence of them in a subset. So, it's my gold standard for safe systems programming. Rust is the newcomer, with interesting additions inspired by safer-than-C imperative and functional languages. Its confusing scheme for memory and concurrency protection is said to be quite effective. It might end up better than Ada over time, or just different tradeoffs. The important thing, though, is that it has what Ada/SPARK will likely never have: large-scale adoption, with uptake by big companies and users who will add to its ecosystem. The Ada FOSS community is barely there, with the biggest deliverables coming from AdaCore.
Remember Gabriel's Worse is Better? The real lesson of it is that certain things get adoption better than others; best to bet on them. Rust is in that category a bit, but with Right Thing elements. So, it's worth investing in for the long-term benefit of IT.
"I'm also not sure why you would want to abandon the actually verified properties of Ironsides for a hope those properties remained in an unverified port."
I'm not necessarily suggesting that. There's a reason that OpenBSD and some proprietary vendors keep their system working on several compilers. There's a reason NASA project here used 4 static analysis tools. There's a reason B3/A1 systems kept detailed, formal specs side-by-side with code with both checked. The reason is that different tools catch different types of problems. Further, a problem detected in one might apply to the other. Now, a semantic difference exists between Ada/SPARK and Rust. Yet, there's usually a subset and/or style one can use where results from one about memory or control issues will apply to the other.
So, I said do it in parallel: both the Ada/SPARK mix and Rust. It displays how each language handles the problem. It lets the analysis tools of each check for problems that might exist in both. The tool version of many eyeballs, except it works consistently. ;) It also means that compiler optimization problems in one might not affect the other, which can help with hotfixes while a compiler team develops guidance or patches. It also provides two implementations, which can help with adoption and support. There's an issue of maintaining parity, but the simpler version is: write it in Ada/SPARK, with the Rust release getting uptake.
Those are the lines I was thinking along. Rust is popular and better than usual. Ada is equal or better in capabilities but not going anywhere. Leverage both to get the benefits of both, maybe defaulting to the Rust release for adoption. As far as Ada vs Rust, I'd like a less biased party who knows Rust's protections and features through and through to do a comparison for me. I have a reference listing each issue Ada tackles and how. I can probably dig up one for SPARK 2014, too. I would love to see what the mainstream answer to Ada can do vs the baseline it created.
Note: You should get a real account if you're a programmer AND know Ada/SPARK. Seriously under-represented here.
"I liked the juxtaposition of that haha. It kind of negates anything about Rust in your counterpoint."
I almost removed that at the end. I wasn't trying to say much about Rust except that it wasn't predicated on this idea of quickly spinning up new, conceptual types with specific constraints like Ada/SPARK. That sort of thing is really either infused throughout a language, or it isn't used. It doesn't take a great deal of familiarity to see that much.
Actually, just for this response I went and looked into it a little more. Rust has a tiny bit of syntax sugar for creating structs with positional members. So, you can at least wrap things in a struct a little bit easier than in C++, as you don't need to name the internal value. That's really not enough.
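For reference, the sugar in question is the tuple struct. A hedged sketch (names made up) of why it helps a little but, as said, isn't enough on its own:

```rust
// A tuple struct wraps a value in a new type without naming the field.
struct Port(u16);

// Compare Ada, where `type Port is range 0 .. 65535;` carries its
// constraint everywhere. Here the wrapper alone enforces nothing beyond
// type identity; any range checks are up to you.
fn open(p: Port) -> String {
    format!("opening port {}", p.0)
}

fn main() {
    // open(8080) would not compile: a bare u16 is not a Port.
    assert_eq!(open(Port(8080)), "opening port 8080");
}
```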
Really I mostly wanted to show that Ada also has some means of helping with memory safety - while also being focused on safety/correctness in general. I suppose on that front (and while I'm pushing Ada anyway) I should also mention Ada has a built-in concurrency model, although I suspect you may already be aware of this.
"Ada FOSS community is barely there with biggest deliverables coming from AdaCore."
Yeah, this is the biggest bugbear when it comes to Ada. For what it's worth, there has been some new life on that front and people are working on improving things. So far it's still very, very early days though. The biggest news so far is Gnoga, a web-socket based, uh, web framework that's usable with Gtk WebViews. It will take a while, but hopefully this trend will continue and the Ada FOSS community will be able to rouse itself.
If you can't tell, I'm not quite ready to count Ada as a lost cause. If it is, and we are looking at running with a new, safe language, at a minimum we shouldn't be forced to take some serious steps back. I'd have loved to see Rust focus on general safety and correctness rather than just memory safety, but it didn't. Losing all that is a rather bitter pill to swallow.
"Ironsides..."
Totally reasonable, and I think it would be very interesting to see.
"Far as Ada vs Rust..."
I've been really hoping to find a good comparison for some time now. I know Rust is able to make stronger guarantees about memory than Ada/SPARK can, but it isn't able to handle every case. I'm really curious to see how well Ada/SPARK is able to handle both what Rust can prove, and what it can't. I also don't know if Rust offers any more imprecise safety nets for the cases it can't prove. Finally, I know nothing about how Rust handles concurrency, so that would be another great comparison to see.
"Note: You should get a real account if you're a programmer AND know Ada/SPARK. Seriously under-represented here."
I might. The plan was for this to be a rather temporary account, but it seems to have stuck.
"wasn't predicated on this idea of quickly spinning up new, conceptual types with specific constraints like Ada/SPARK. That sort of thing is really either infused throughout a language, or it isn't used. It doesn't take a great deal of familiarity to see that much."
I agree that such techniques need to be infused throughout the language. I just don't know enough Rust to comment. I do love Ada's existential types. That's a very simple technique that can knock out all kinds of issues, esp numeric conversions, that other languages have to go out of their way to avoid. Quite a few things like that in the language.
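For readers without Ada, a hedged sketch of approximating an Ada range type in Rust. This is an emulation through a checked constructor, not a built-in constraint like Ada's `type Percent is range 0 .. 100;`, which is part of the point being made above:

```rust
use std::convert::TryFrom;

// A bounded numeric type. Ada checks the range on every assignment;
// here the guarantee comes from only allowing construction via TryFrom.
#[derive(Debug, PartialEq, Clone, Copy)]
struct Percent(u8);

impl TryFrom<u8> for Percent {
    type Error = &'static str;
    fn try_from(v: u8) -> Result<Self, Self::Error> {
        if v <= 100 {
            Ok(Percent(v))
        } else {
            Err("out of range 0..=100")
        }
    }
}

fn main() {
    assert!(Percent::try_from(42).is_ok());
    assert!(Percent::try_from(101).is_err()); // rejected at the boundary
}
```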
" I suppose on that front (and while I'm pushing Ada anyway) I should also mention Ada has a built-in concurrency model, although I suspect you may already be aware of this."
There was a myth among some users of Rust that it was the first to have safe concurrency. I bust that here and elsewhere, citing Hansen's Concurrent Pascal, Eiffel's SCOOP, and Ada's Ravenscar, in that order. SCOOP is the most exciting given its pedigree (Meyer et al) and the CompSci research into it. I've seen people formally verify (read: fix) it, prove it livelock/deadlock-free, and eliminate performance penalties. Wild stuff. Rust is the latest with safe concurrency, but not the first. Some CompSci person really needs to do a detailed comparison of these, as that might be insightful.
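For readers who haven't seen it, a minimal sketch of Rust's flavor of safe concurrency (my example, not drawn from any of the systems named above): data is moved into the worker thread and results come back over a channel, so touching the data from two threads at once simply doesn't compile.

```rust
use std::sync::mpsc;
use std::thread;

// Ownership is transferred into the spawned thread; the channel is the
// only way back. The data-race freedom that Ravenscar and SCOOP get
// through restriction happens here at compile time.
fn sum_in_worker(data: Vec<i32>) -> i32 {
    let (tx, rx) = mpsc::channel();
    let handle = thread::spawn(move || {
        // `data` is owned by this thread now; the parent can't touch it.
        tx.send(data.iter().sum::<i32>()).unwrap();
    });
    let total = rx.recv().unwrap();
    handle.join().unwrap();
    total
}

fn main() {
    assert_eq!(sum_in_worker(vec![1, 2, 3, 4]), 10);
}
```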
"Yeah, this is the biggest bugbear when it comes to Ada. For what it's worth, there has been some new life on that front and people are working on improving things. So far it's still very, very early days though."
The very, very early days of getting an OSS community around a language that's about three decades old. Compare: Rust met little resistance building its community despite going up against entrenched C, and one spontaneously emerged around Julia. You see why I have little to no hope for Ada? The only place I see a resurgence is the business sector, where professionals that Get Shit Done (TM) might use it for long-term, mission-critical apps. Aside from what we like about it, it also has the advantage of being designed for readability, easy integration, and future-proofing. It delivered all that for decades straight. Just say: "Look, you'll be stuck with this app for decades. You want it written in COBOL, Microsoft C++, or AdaCore Ada/SPARK?" Ok, let's be honest: Delphi Pascal was a contender and my recommendation, given ease of learning for disposable IT staff. ;)
"If it is, and we are looking at running with a new, safe language, at a minimum we shouldn't be forced to take some serious steps back."
I feel you there. The good news is that the Rust team takes feedback so long as it's productive. They're quite active here on HN. You'd have to learn Rust more thoroughly so you could say exactly what Ada has that it lacks. Maybe suggest how they'd add that without breaking current code, as the language is in stable mode. They might bring it up to parity there.
"Finally, I know nothing about how Rust handles concurrency, so that would be another great comparison to see."
Operating systems aren't that big, and if you know your stuff, I'm sure it's not too hard to pull off. Here's an example I borrowed from StackExchange:
According to cloc run against 3.13, Linux is about 12 million lines of code. 7 million LOC in drivers/, 2 million LOC in arch/, and only 139 thousand LOC in kernel/. (http://unix.stackexchange.com/a/223753)
Edit: Would be nice to have the numbers for the latest Minix release for comparison. Does anybody know how big their core team is? (The kernel is something like 15k LOC IIRC.)
Btw, if anyone is interested in learning how to write an OS, and interested in Rust, check out https://intermezzos.github.io/. It's a "learning OS" in Rust, with a companion book. Incomplete, but a good start.
Project lead here. I haven't had time in the last few weeks to work on this, so the book is pretty far behind the kernel itself at the moment. Playing the long game. I was hoping to do some more today, in fact...
I'm reading through now, and would love to contribute. I've been trying to break into this space (systems programming, especially OS and compilers) for a long time and the biggest hurdle for me has been to actually do it. I really appreciate that this project is focused on explaining just enough to jump in, complemented with nods to further resources that can add on to whatever knowledge the reader brings.
Steve, thanks for all your work on this and with Rust. Your attitude toward teaching and your writing ability go a long way in encouraging people to learn and in having the lesson be worthwhile.
The "everything is a URL" concept is interesting. QNX uses pathnames in a somewhat similar way; programs with the privilege to do so can register to own some portion of the pathname space, and requests with such pathnames go to that program. The kernel has no idea what pathnames mean. This gets file systems out of the kernel. It looks like Redox is doing something similar, but not enough of the documentation is written yet for me to tell quickly.
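A toy sketch of the idea described above, not QNX's or Redox's actual API: programs register a prefix of the path space, and the kernel just routes each request to whoever owns the longest matching prefix, with no idea what the names mean.

```rust
use std::collections::HashMap;

// Handlers are registered per path prefix, as user-space programs
// would be. The router knows nothing about what the paths mean.
struct Router {
    handlers: HashMap<String, fn(&str) -> String>,
}

impl Router {
    fn new() -> Self {
        Router { handlers: HashMap::new() }
    }
    fn register(&mut self, prefix: &str, h: fn(&str) -> String) {
        self.handlers.insert(prefix.to_string(), h);
    }
    fn open(&self, path: &str) -> Option<String> {
        // Longest-prefix match, like resolving "/dev/ser1" to a serial driver.
        self.handlers
            .iter()
            .filter(|(p, _)| path.starts_with(p.as_str()))
            .max_by_key(|(p, _)| p.len())
            .map(|(_, h)| h(path))
    }
}

fn serial_driver(path: &str) -> String {
    format!("serial: {}", path)
}

fn main() {
    let mut r = Router::new();
    r.register("/dev/ser", serial_driver);
    assert_eq!(r.open("/dev/ser1"), Some("serial: /dev/ser1".to_string()));
    assert_eq!(r.open("/net/tcp"), None); // nobody owns this part of the namespace
}
```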
Warning: I have no idea about systems programming.
I understand that the amount of lines of driver code comes from the variety of devices. But still, it looks completely unbalanced, when compared to the kernel code itself. So I have some questions here:
1) Shouldn't there be common interfaces / abstractions for most of the devices?
2) If they exist, could they be improved somehow?
3) A bit unrelated, but how fun / interesting is it to develop driver code?
1/2. Yes there are. Network devices, file systems... most of the things that could be abstracted are done. However, take into account that every device has its special quirks and configurations. Only very generic devices can have generic drivers. But once you get to specific devices, you need specific drivers that can deal with special features and non-standard communications. See for example the number of vendors of ethernet drivers (https://github.com/torvalds/linux/tree/master/drivers/net/et...). And that is just a small part of the code.
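To make the "common interface" point concrete, a hedged Rust sketch (names and register behavior entirely made up, not Linux's actual driver model): the kernel codes against one trait, and each vendor's quirks live behind it.

```rust
// The one interface the rest of the kernel sees.
trait NetDevice {
    fn name(&self) -> &str;
    fn transmit(&mut self, frame: &[u8]) -> Result<usize, &'static str>;
}

struct GenericNic {
    sent: usize,
}

impl NetDevice for GenericNic {
    fn name(&self) -> &str {
        "eth0"
    }
    fn transmit(&mut self, frame: &[u8]) -> Result<usize, &'static str> {
        // A real driver would poke DMA rings here; each chip's special
        // quirks stay inside this impl, invisible to the kernel.
        self.sent += frame.len();
        Ok(frame.len())
    }
}

// Generic code that works with any vendor's driver.
fn send_all(dev: &mut dyn NetDevice, frames: &[&[u8]]) -> usize {
    frames.iter().copied().filter_map(|f| dev.transmit(f).ok()).sum()
}

fn main() {
    let mut nic = GenericNic { sent: 0 };
    let frames: [&[u8]; 2] = [b"abc", b"de"];
    assert_eq!(send_all(&mut nic, &frames), 5);
}
```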
3. Interesting? A lot. You get to learn how the low-level parts of your system work, and you deal with complex structures and interactions between kernel, hardware, and userspace.
But the most interesting thing is the programming ability you need. You have to really understand the code, be able to mentally follow the execution paths (which are not simple, believe me), and write as few bugs as possible. Why? Because you don't have many tools. No debuggers, no unit tests... My main debugging tool is a set of debug macros that print to the kernel log. Debugging is a real pain in the ass, and crashes can lead to a system restart, so I have to try to get it right as soon as possible.
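For flavor, a minimal sketch of the debug-macro approach (my illustration), printing to stderr here where a kernel would append to a log ring buffer:

```rust
// Formats one log line; separated out so it can be tested directly.
fn klog_line(file: &str, line: u32, msg: &str) -> String {
    format!("[{}:{}] {}", file, line, msg)
}

// The macro captures the call site automatically; in a kernel you'd
// compile this away entirely in release builds.
macro_rules! kdebug {
    ($($arg:tt)*) => {
        eprintln!("{}", klog_line(file!(), line!(), &format!($($arg)*)));
    };
}

fn main() {
    let irq = 7;
    kdebug!("got interrupt {}", irq);
    assert_eq!(
        klog_line("irq.rs", 40, "got interrupt 7"),
        "[irq.rs:40] got interrupt 7"
    );
}
```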
A header file for OpenGL 4 compatibility will run to 50 KLOC. The jump table and normalization routines will be twice that. That's more LOC than Linux's kernel/ directory, and all it does is SIGBUS.
I would guess that most of the size comes from history.
Like, once it seemed a good idea to have it like that, and later people found out there were a bunch of things missing. Now many of them have to be implemented on the driver side over and over again.
Just making a standards compliant implementation takes a huge quantity of code. And with a standard this complex people inevitably get it wrong, so you'll need workarounds for all the broken devices too.
Not just with hardware interfaces, but even in pure software... I wrote an LMS whose core was used by a Fortune 100 company and a few airlines for nearly a decade... I literally left out about a third of the SCORM spec, and implemented one piece badly... it wasn't until about 8 years in that the missing piece was even an issue, and the part I got wrong never became one.
In the end, you understand as much as you can, implement what you have to, and do your best to get through it... and even then, someone will mess things up on some end or another... to this day, I'm surprised that SCORM was synchronous... hell, it feels like they're half the reason XHR has a sync option.
I feel the same way when looking at terminal emulators... sigh, so many things to implement to get something useful, even if you're only looking to get a small subset working.
1. There are, but they only exist in software. You have to write a different glue layer for virtually every device. And sometimes the device's model doesn't really fit your driver's.
2. Not... really. Standards are supposed to do that. Many hardware manufacturers don't really implement them in a proper manner, and "whoever designed this flash controller should die a slow, painful death" is not something users care about.
3. If the device is well-documented and relatively standard-abiding, it's extremely fun if you're passionate about it and relatively painless if you're not. Otherwise, it's about as fun as digging a tunnel under Mordor with a toothpick.
Even then, most standards have parts that are obtuse to say the least, others that aren't really needed for a given application and others that are just open to some interpretation or otherwise not clearly defined...
In the end, people tend to do the best they are able to in a given situation. Often that means ignoring some parts, and making assumptions for others so that your product can ship instead of waiting over a year for a standards body that might clarify something.
2) ... but the meat of a driver is putting the right values in the right registers of some chip, and sometimes work around bugs in said chip, or sometimes take into account that popular variant that's almost the same that the original but not quite... It takes a lot of boilerplate code.
3) It depends on your definition of fun. A driver by definition is a middle man between the OS and the hardware; so programming there is a lot about doing what you're told (and sometimes get punished harshly because you misunderstood something), either by the kernel documentation or by the datasheet of the chip you talk to.
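A hedged sketch of that "right values in the right registers" work, with the hardware mocked as plain memory so it can run anywhere (the register layout is entirely made up; on real hardware these would be volatile accesses at a fixed MMIO address):

```rust
// Made-up register layout for a fake UART.
const REG_CTRL: usize = 0x0;
const REG_STATUS: usize = 0x1;
const CTRL_ENABLE: u32 = 1 << 0;

struct FakeUart {
    regs: [u32; 4],
}

impl FakeUart {
    fn write_reg(&mut self, off: usize, val: u32) {
        // On real hardware: core::ptr::write_volatile at base + off.
        self.regs[off] = val;
        // The mock mimics the chip reacting to the write.
        if off == REG_CTRL && val & CTRL_ENABLE != 0 {
            self.regs[REG_STATUS] = 1; // chip reports "ready"
        }
    }
    fn read_reg(&self, off: usize) -> u32 {
        self.regs[off]
    }
}

fn init(uart: &mut FakeUart) -> bool {
    uart.write_reg(REG_CTRL, CTRL_ENABLE);
    // Per the (imaginary) datasheet: poll STATUS until bit 0 is set.
    uart.read_reg(REG_STATUS) & 1 == 1
}

fn main() {
    let mut u = FakeUart { regs: [0; 4] };
    assert!(init(&mut u));
}
```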
I worked at a company that made POS terminals on the OS development team. We had a proprietary OS based loosely on UNIX that was ~100KLOC (kernel). It was largely designed by 3-4 core developers and maintained by ~12 embedded software engineers (in addition to drivers, apps, UI, etc).
Ours didn't have a GUI. It took us less than a year to write and the team was never larger than 6 people. It's less complicated than it seems at first sight.
There are a lot of things about machines that you need to understand, but the technology itself is relatively well-understood and sane practices are encouraged.
It blows my mind that fewer than twenty people can string up something non-trivial with Node.js and co. Operating systems are trivial in comparison.
As somebody who develops significantly complex systems in Node.js and considers operating systems to largely be black magic, I'd love to have a chat about this some day and compare notes :)
I second Minix as Linus started with it when figuring out how to write Linux. Also, Dijkstra's THE system, Niklaus Wirth's Oberon books with source code, and Hansen's stuff are all educational. Wirth and Hansen have a simplicity focus that makes it easier.
Matter of fact, Wirth's people make OS's with type-safe code that are so straight forward I've always recommended starting with an AOS or A2 Bluebottle port to your language of choice.
@gravypod if your college library doesn't have it you should ask them if they can get it through interlibrary loan; most colleges (in the USA) participate in this program which gives you access to virtually every book ever printed.
"Operating System Concepts" (aka "the dinosaur book") by Silberschatz, Galvin and Gagne is a classic. It's very approachable, compares real operating systems' design choices, and covers some theory.
Appreciate the recommendation. I gotta relearn a bunch of stuff due to a brain injury wiping it out. A simple x86 OS in modern C will fill in plenty of blanks. :)
Your website is interesting but a tad hard to follow due to the "organization" scheme. ;) So, what OS project did you work on in that first sentence and do you have a link to it?
Sadly, I don't have any link -- it was a commercial and proprietary soft real-time system, mainly for embedded use (we made it as portable as we could, but never ported it to something truly general-purpose like x86). It was functional and reliable more than it was fancy. Certainly nowhere near the scope of Redox. But it had preemptive multi-tasking, a driver model somewhat akin to Linux's (but without loadable modules) and some networking support, which we managed to fit in a few KB of RAM.
I'm saying all this in the past tense because the company was sold off a few years ago and I have no idea what happened to the IP, if anyone is still developing it and so on.
I only wanted to mention it because operating systems have this intimidating reputation, which is not only a little unfair, but seems to hold people back. I've seen a lot of very good programmers thinking they couldn't be part of a team that develops one, when it was in fact well within their possibilities.
And, historically, this has been true for a very long time. Many applications that run on operating systems are far, far more difficult to program than the system they're running on, or at least require vastly more knowledge in order to get right than writing the OS.
> Your website is interesting but a tad hard to follow due to the "organization" scheme. ;)
The joy I get out of doing it again vastly outweighs the pain of writing HTML by hand, but barely :-).
Good work. Not quite an OS like what they're trying to build, though. I've always thought these discussions could benefit from different categories of OS like single vs multi-user, UNIX-like vs non, embedded vs desktop vs server, local vs distributed, and so on. A soft-real-time, single-user OS is much easier to write than a UNIX alternative that's similar in capabilities. So, one can certainly write a RTOS more easily than many applications on servers or desktops but the statement seems apples to oranges applied to mainstream perception of OS's (aka desktops, servers, or iOS/Android).
"The joy I get out of doing it again vastly outweighs the pain of writing HTML by hand, but barely :-)."
Haha. I can't talk given how disorganized my stuff is. Hell, I don't even have a blog: I have text files and PDF's I send to people that ask. Gopher is on another level compared to how I do things.
Nonetheless, it's good to know people that did embedded systems as I'm slowly accumulating knowledge on that sort of thing in my exploration of secure hardware/software systems. Really wish I spent more time on embedded and HDL's back when my brain worked reliably. That's where best bang-for-buck in security and reliability are. The more people write openly on such topics the better. So, consider organizing anything you have on the magic you used to do stuff like that web server.
Side note. I'm peripherally gaining information on doing 8- or 16-bit software for microcontrollers. Hell, I even have some docs (and maybe HDL source) on a 1-bitter although with 8-bit ALU. Do you have any resources I can give to new people showing the tricks people use to reliably do complex or fast stuff with them? SymbOS would probably be high-end of that but I intend to use them in peripheral controllers, monitoring, and such.
> Good work. Not quite an OS like what they're trying to build, though.
Absolutely, and that's why we managed to get something that worked despite being a team at least six times smaller, one that also had to work on other projects :-). Redox is an order of magnitude more complex, because it's trying to solve problems that are an order of magnitude more complex. I just want to dispel some of the "black magic" air around these things.
> Side note. I'm peripherally gaining information on doing 8- or 16-bit software for microcontrollers. Hell, I even have some docs (and maybe HDL source) on a 1-bitter although with 8-bit ALU. Do you have any resources I can give to new people showing the tricks people use to reliably do complex or fast stuff with them? SymbOS would probably be high-end of that but I intend to use them in peripheral controllers, monitoring, and such.
You mean re. digital design, or microcontroller software?
If it's the latter, I'm not sure what to recommend... most of what I know is stuff that every programmer knows + experience you get out of working on embedded devices (and, of course, failing a lot). Fundamentally, programming these devices is no different than programming other computing devices (save, perhaps, for the Harvard architecture, but that's fairly inconsequential), you just work based on other assumptions regarding what's acceptable in terms of failure, performance degradation and so on.
I guess related resources that might help a lot would be:
* Jean Labrosse has a series on his uC/OS kernels. I think the latest is uC/OS-III. I haven't read it, but Labrosse's work in general is extraordinary, so this book may be what you're looking for.
* If you don't mind Ada, Building Parallel, Embedded, and Real-Time Applications with Ada is a pretty good book on the subject, too. I've skimmed it and at least the aspects related to reliability are well-treated.
* If you haven't already read it, van der Linden's "Expert C Programming" is still a very good read. I don't doubt that languages such as Rust and Go are the future for programming higher-end systems, but I don't see C going away on the lower-end in the next 10-15 years, and frankly, I don't think Rust solves too many of the problems I encounter on these systems.
* For dealing with resource-constrained systems in general, Bitsavers.org has exceptional stuff, but it requires some digging and extrapolation. Old computers are a semi-serious hobby of mine -- it's fun, but I also learned a lot of interesting things from studying old systems.
Good recommendations may come from Ganssle's (www.ganssle.com) Embedded Muse, too.
Oh, speaking of which...
> I have text files and PDF's I send to people that ask.
What's the quality of that book in terms of instruction and examples for a modern audience? If it's good, I might try to get a copy even for a teaching or demonstration language in safe languages. Easier for people to learn than Ada or Rust.
The first problem is the traditional use of uppercase for keywords, which many dislike, though any suitable IDE can do the "convert as you type" kind of thing.
The chapters about multi-threading, file IO and graphical system are interesting and it contains multiple references to contemporary work like Topaz.
But while it is very interesting from historical perspective, I am not so sure if it would be useful for modern audiences.
Yeah, embedded was what I was talking about, and trial & error was what I was worried about. What I've seen people write up is a lot more difficult than software in general. I remember one case with severe application issues when running from instant-on that were eventually traced to the PLLs not being synced up. They put in a delay to let them warm up, and the program worked fine. I've seen some material on immunity-aware programming discussing similar issues. A comprehensive, free collection of tips like that would be a boon to hobbyists and pros alike.
re book recommendations
Thank you. I'm already on Ganssle's newsletter. Good stuff in it. I've been published in one or two. Ancient hardware factored into my recommendations for two:
Ganssle confirmed that one SoC was taking the approach where it had a badass ARM Cortex plus an M0 for interrupts and such. Its claims were like a watered-down version of Channel I/O. Cheap, too. :)
Btw, here's your 1-bitter. The link saying datasheet and VHDL design has the PDF's and code. The manual might have tricks worth remembering on other architectures.
I already have over 10,000 papers from CompSci on my computer, a subset I read instead of skimmed. I keep planning to go through BitSavers to find more interesting stuff but afraid of accumulating useless junk. I just keep putting it off. I was able to pull interesting data on some interesting systems. They even had detailed books on OpenVMS drivers and internals. Something I'd have had to pay for when trying to clone its legendary reliability. Similar stuff on Tandem's NonStop architecture, patents on which should be expired by now or close. I'm keeping an eye there.
My own work was research and occasional work in high-assurance systems. I focused on security as it was hardest and most important with safety or predictability next. I have all the key papers from the past with many of today's best. The lessons learned papers had the most wisdom showing me every step of the way how they tried, failed, and/or succeeded in specific ways. I applied those lessons to my own work. After a brain injury, I'm like INFOSEC's Jason Bourne where I don't remember shit but what's left kicks in to help on forums and such. My main role is evangelist of high-assurance and old wisdom to embed it into more projects. Successes are few but keep it worthwhile. Best, recent example is Tinfoil Chat: Ottela applied every bit of feedback we gave him on Schneier's blog to make one badass design that deserves a rewrite in systems language.
I'll send you my stuff later today. Some people find it useful. Especially since the advice pre-empted the Snowden leaks defeating around 90% of their attacks. Funny that security "pro's" still argue with the shit while pushing what got defeated. In high assurance, it's mandatory to learn from the past. In retrospect, I wish I did lots of embedded or digital design before my memory loss as hardware/software interface is where best results are at. The analog and RF levels, too. I've mostly completed a secure ASIC design methodology and RAD strategy, though, so that will come as quickly as CompSci decides on right HW architecture. :)
It's no rush :-). I'm always looking forward to this kind of stuff. Nothing keeps one's mind fresh the way someone else's well-informed opinions do.
A long time ago, I thought about doing something similar to Dijkstra's letters -- bringing a few colleagues together and beginning to circulate small notes whenever we had something interesting and cohesive enough that it might be worth putting into writing. I don't remember what stopped me, but I still think this is a great way to keep innovation alive. Perhaps it's an idea that I ought to revisit :-).
> Yeah, embedded was what I was talking about and trial&error was what I was worried about.
I'm worried about this, too, and it occasionally drives me insane to see how many people have a "well, let's just get something working and see what happens" approach. It's not just the aversion to doing even some simple math first that worries me; it's the fact that I see a lot of people doing this with no regard to how they're going to "see what happens". No serious test methodology, no attempt to at least document assumptions first. It's a wonder we're not at the point where a computer kills someone every day yet.
Trial and error is a natural way to learn things, but it should generally be done just once, ideally by as few persons as possible. We're... not only are we not there yet, we're doing the precise opposite of it.
I've thought about writing down some of these things, but I realized a lot of the "trial and error" spirit by which I learned them still lurks in my understanding of them. And progressing past that is, I've learned, anything but trivial.
> MC14500
Ah, I remember reading about that! I don't remember in what context but I'm sure I've seen the page you mentioned before. It may have been in the context of an article describing OISCs and other minimalistic CPU architecture.
I think these concepts would be great to revisit in the context of the latest developments in microelectronics. One could have thousands of MC14500s on a single chip with today's technology; granted, they could not all talk to each other at the same time due to the limits of interconnects, and not all could be independently interfaced with the outside world, but a hierarchical architecture built out of reliable nodes (and with plenty of room for redundancy) might at least be worth investigating.
Even if we go past the realm of on-chip, things have changed dramatically lately. A workstation built out of a hundred Raspberry Pi Zero-grade devices is pretty much on the conceivable side, if not necessarily on the "good idea" side. (The RPi is anything but my favourite system but it's a good example in terms of price, capabilities etc.)
"A long time ago, I thought about doing something similar to Dijkstra's letters -- bringing a few colleagues together and beginning to circulate small notes whenever we had something interesting and cohesive enough that it might be worth putting into writing."
Interesting idea. They sort of do that already with ACM letters, online articles, and so on. If anything, we have so much of this going on that each community silos. That's partly why high assurance INFOSEC and the ITSEC field are two different things. ;) I do occasionally bring up the idea of creating a site hosting top papers and developments in software/security engineering that only invites people that do the research or job. Just people that know shit with a track record and something to bring. They can read the papers, discuss what's in them in moderated forum, and so on. Quite a few like the idea but it will be hard to bootstrap.
"No serious test methodology, no attempt to at least document assumptions first. It's a wonder we're not at the point where a computer kills someone every day yet."
People scratch an itch by modifying an embedded RTOS's kernel. Yeah, I'm amazed we're all still here too.
"Trial and error is a natural way to learn things, but it should generally be done just once, ideally by as few persons as possible. We're... not only are we not there yet, we're doing the precise opposite of it."
Exactly. One example was Burroughs including stack protection in their CPUs. That should've shown up in Intel's stuff as soon as stack attacks became more prevalent. Instead, you see all this trial-and-error research into tactics that all got beaten. Anything but actually managing one's stack, or modifying a CPU to do so. Intel eventually put it in their off-brand, but good, Itanium CPU. I know Secure64's OS uses it, but I don't know about the rest. People are still countering stack attacks with tactics to this day, despite it being solved in 1961.
Ask me about setuid if you want another one even more clever.
"One could have thousands of MC14500s on a single chip with today's technology; granted, they could not all talk to each other at the same time due to the limits of interconnects, and not all could be independently interfaced with the outside world, but a hierarchical architecture built out of reliable nodes (and with plenty of room for redundancy) might at least be worth investigating."
Funny you say that, because that was one of my first thoughts. Look up KiloCore for one of those. My other thought was reimplementing the Thinking Machines 65,000-CPU design on one or a few chips. The chips were barely functional: just ALUs or whatever. They still did amazing things in genetic algorithms. One 256-core, 8-bit design on 500nm accelerated neural networks well in the past, too. So 8- and 16-bit MPPs on a chip are conceivably useful to this day.
Wirth and a few others went to Xerox to see their personal computer with GUI that Steve Jobs also envied. They were told they couldn't buy it. "So, I built one..." (Wirth)
Lilith was a custom computer with all kinds of hardware. The mouse eventually inspired Logitech. The software included a safe language (Modula-2), its compiler, an OS (Medos-2), a relational DB, and some other stuff. The trick was a safe language, modularity, and simplicity in implementation details. Two or three people over a few years.
Later, that turned into the Oberon series of languages, compilers, and OSes. The extensions or ports usually took one or two students months to a year or two to pull off. The last one they released, IIRC, was the A2 Bluebottle OS. Primitive, but usable. I keep telling people to rewrite it in other safe C alternatives to save them work. Students got it done fast, so I'm sure hobbyists could too.
it's actually a pretty common task in undergrad CS programs for an individual or a small group (3-4 students) to write an *nix-like OS from the ground up, including most of the core utils, scheduler, process management, etc.
This is obviously much more full featured (includes a GUI), so this is actually a very nicely sized team for the task.
It's actually much easier than it sounds; we had an undergrad write an entire windowing system, on top of implementing the rest of the OS, for the undergraduate operating systems course at UW in a single quarter (~2.5 months).
Also relating to the whole thread, I think these kinds of statements are dangerous. What is a "windowing system" here? There is a huge spectrum. If a windowing system were so easy, why has Wayland taken such a long time? It's best to give a link to the project so people can see what problems were actually solved and how useful it is in practice.
I have been thinking hard about a simple plain text file format and associated tools for the last couple months. Sounds nowhere near as impressive. Yet I hope the end result could eventually make a practical difference.
I've been looking through some of the kernel code [1] out of curiosity, and I'm very surprised to see that most components have almost no internal documentation — minimal file and class comments, and even fewer inline comments. Is this typical of code for something as complex and central as an OS kernel? Are the developers planning to go back later and add documentation, or is the expectation that anyone who might need to work with this code will find its structure and details intuitive?
Yes, I like to read the sources of big projects as a relaxing activity, and the documentation is almost never written in a way that lets a newbie immediately catch up. The best projects will have a README.md in every directory that explains what a component or namespace is about, or nice big documentation headers. In my experience, though, if you really want to contribute, the lack of documentation is only a small hurdle. If the code is well written, then just following the code from the entry point onward will give you a very good idea of what's going on. All the files and their names should make sense within an hour or two.
In my opinion documentation is most important for external consumers of APIs who don't want to spend time grokking the code, and for explaining design decisions. For contributors it would be code quality.
On your question of how typical a lack of system-internal documentation is in systems projects: in my experience it's unfortunately more common than it should be. There are a bunch of usage and design patterns that cannot be expressed regardless of the language you are using. I know very little about Rust; I work mostly in Java, which is safer than many other languages (less flexible than interpreted or dynamic languages, less dangerous than C/C++). Even Java, whose interfaces and strict typing can save you some of the more mundane unit tests and input validation code, has a bunch of design patterns on top of the core language. Projects would serve themselves well to document their code thoroughly, so that new contributors can look at any of it and reasonably modify it to fix a bug or add a feature.
In Java, the JVM is the OS. Every object is allocated on the heap, and the language doesn't give developers direct control over more optimized ways of allocating objects. In Rust, you can specify whether an object lives on the stack or on the heap; the compiler knows where an object goes out of scope, and thus is no longer needed, and inserts code at that location to drop the object and free its memory. Java, by contrast, waits for the garbage collector to reclaim objects.
In a nutshell, Java's engineering effort is directed into the JVM (the JIT) to provide high performance when applications run. Rust's engineering effort, on the other hand, is directed at the compiler figuring out where and when an object goes out of scope. No matter how intricately it's attached to other objects or passed around between functions, it will figure it out.
That's not really how it works. Rust's rules for when an object is freed are very simple, basically the same as C++'s: they don't depend on references or anything. The complicated part is that the compiler makes sure no references to an object exist after this predetermined point in the program.
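A minimal sketch of that fixed drop point, using a hypothetical `Tracked` type that logs its own destruction (all names here are illustrative, not from any real codebase):

```rust
use std::cell::RefCell;
use std::rc::Rc;

// A type that records when it is dropped, showing that Rust frees
// values at a statically known point (end of scope), with the drop
// call inserted by the compiler rather than run by a collector.
struct Tracked {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn drop_order() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _outer = Tracked { name: "outer", log: Rc::clone(&log) };
        {
            let _inner = Tracked { name: "inner", log: Rc::clone(&log) };
        } // _inner is freed exactly here, at the closing brace
        log.borrow_mut().push("between scopes");
    } // _outer is freed here
    let result = log.borrow().clone();
    result
}

fn main() {
    assert_eq!(drop_order(), vec!["inner", "between scopes", "outer"]);
}
```

The borrow checker's job is then just to guarantee that no reference to `_inner` outlives that closing brace.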
Great news! Now it's worth learning Rust just to keep up with this project.
Does Redox run only in QEMU, or also in VirtualBox? I tried it with all devices (USB, network, etc.) disabled. Installation from the ISO to /dev/sda worked fine, but after reboot I got a hangup: "IDE Primary Master: Unknown Filesystem". Do I have to format the hard disk image with ZFS before installation?
First I have to install the current Rust version. Are there any MD5/SHA256 checksums or GPG signatures to verify the integrity? Rust's download page doesn't provide anything like that.
The only odd part is that the modules (drivers) themselves are not referenced by URL, but only by a simple word (in the example "port_io").
Also, I wonder how we could combine drivers. For example, (theoretically) instead of using "https" as driver, we could compose it as "HTTP over TLS over TCP", and change any of those subcomponents as desired. With URLs this might become clumsy.
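As a hedged sketch of what such composition could look like, assuming hypothetical `Transport` layers (none of these types are actual Redox APIs), each scheme could simply wrap another:

```rust
// Illustrative only: composing "https" as HTTP over TLS over TCP,
// where any layer can be swapped without touching the others.
trait Transport {
    fn describe(&self) -> String;
}

struct Tcp;
impl Transport for Tcp {
    fn describe(&self) -> String {
        "tcp".to_string()
    }
}

// Each higher layer is generic over whatever carries it.
struct Tls<T: Transport>(T);
impl<T: Transport> Transport for Tls<T> {
    fn describe(&self) -> String {
        format!("tls over {}", self.0.describe())
    }
}

struct Http<T: Transport>(T);
impl<T: Transport> Transport for Http<T> {
    fn describe(&self) -> String {
        format!("http over {}", self.0.describe())
    }
}

fn main() {
    let stack = Http(Tls(Tcp));
    println!("{}", stack.describe()); // http over tls over tcp
}
```

The open question the comment raises is how to name such a composed stack in a URL without it becoming clumsy.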
I don't know about Redox, but Phil's Rust OS (http://os.phil-opp.com/) seems to manage to minimize unsafe Rust and only use it to create some low level OS abstractions that can be safely used. It's pretty neat.
From the book[1]: A quick grep gives us some stats: The kernel has 16.52% unsafe code, a 50% improvement in the last three weeks. Userspace has roughly ~0.2%.
Some compelling projects (e.g., open GPUs) failed for lack of interest. The concepts behind those projects did not fail; most of the projects were simply isolated. The best time for open hardware is yet to come, and Redox could accelerate that.
Rust's "unsafe" is not about the proprietariness of the hardware. It's about performing lower-level operations that can only be expressed with the Rust compiler's safety checks disabled.
The point I'm trying to make is that open hardware could also be programmed in Rust, reducing the number of "unsafe" blocks in user applications. Software for proprietary hardware is usually written in unsafe C/C++.
I think what you're trying to say is that architecture-specific code will be isolated in unsafe blocks. That's not necessarily true, as a lot of safe code can definitely benefit from knowledge of the underlying metal. I'm thinking of scheduling in particular.
"unsafe" is a superset of "safe" Rust. There are exactly 3 things you can do in unsafe code that you can't do in safe code: access/modify a mutable static variable, dereference a raw pointer, and call other unsafe functions. That's it.
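A minimal illustration of those three operations (all names here are made up for the example):

```rust
// A mutable static: touching it is one of the unsafe operations.
static mut COUNTER: u32 = 0;

// Calling this function requires `unsafe` at the call site.
unsafe fn bump() -> u32 {
    // Reading/writing a mutable static is itself an unsafe operation.
    unsafe {
        COUNTER += 1;
        COUNTER
    }
}

fn read_raw(x: &i32) -> i32 {
    let p = x as *const i32; // creating a raw pointer is safe...
    unsafe { *p }            // ...dereferencing it is not
}

fn main() {
    let n = unsafe { bump() }; // calling an unsafe fn
    assert_eq!(n, 1);
    assert_eq!(read_raw(&42), 42);
}
```

Everything outside those `unsafe` blocks is still checked by the compiler as usual.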
No inline assembly? No ability to manipulate the bits of a pointer for stuff like alignment in a memory manager?
This is truly awful. It means you need to carry around a C compiler for the low-level parts. You could also use assembly, but then you're forced to write whole functions in assembly.
There is something else missing AFAIK. It's not quite as critical, but it sure helps: bitfields
This was done rather badly in C, reducing portability by not letting the programmer fully specify the layout. Normally we mostly ignore portability; for x86, gcc and Visual Studio are compatible.
Imagine writing an x86 emulator which might run on hardware of either endianness. In theory, bitfields are perfect for implementing the GDT, LDT, and IDT. Bitfields are also great for pulling fields out of opcodes. Unfortunately, bitfield layout in C is implementation-defined. The same trouble hits when parsing a file, for example a Flash animation file.
One should be able to specify spans of bytes with chosen endianness and bit order, then subdivide each span into fields. Normally each bit should belong to exactly one field, with an error if violated, but it should be possible to define overlapping fields if the programmer insists. Fields should then be able to be joined into larger fields, even if they come from different byte spans. This allows handling split fields such as the x86 descriptor's base and limit or the PowerPC opcode SPR encoding.
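Absent such a facility, the usual workaround is explicit shifts and masks. As a sketch, joining the split base and limit fields of a legacy 8-byte x86 segment descriptor (layout per the Intel SDM) might look like this in Rust:

```rust
// The descriptor's 32-bit base is scattered across bytes 2-4 and
// byte 7; the 20-bit limit across bytes 0-1 and the low nibble of
// byte 6. Joining them by hand is exactly the boilerplate a proper
// bitfield facility would eliminate.
fn descriptor_base(desc: [u8; 8]) -> u32 {
    (desc[2] as u32)
        | ((desc[3] as u32) << 8)
        | ((desc[4] as u32) << 16)
        | ((desc[7] as u32) << 24)
}

fn descriptor_limit(desc: [u8; 8]) -> u32 {
    (desc[0] as u32)
        | ((desc[1] as u32) << 8)
        | (((desc[6] & 0x0F) as u32) << 16)
}

fn main() {
    // base = 0xDEADBEEF and limit = 0xFFFFF, scattered per the layout.
    let desc: [u8; 8] = [0xFF, 0xFF, 0xEF, 0xBE, 0xAD, 0x00, 0x0F, 0xDE];
    assert_eq!(descriptor_base(desc), 0xDEAD_BEEF);
    assert_eq!(descriptor_limit(desc), 0xF_FFFF);
}
```

Because the byte indices and shifts are spelled out, this code is independent of the host's endianness and bitfield conventions, which is the portability point being made above.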
Lack of bitfield support and lack of a "restrict" keyword are probably the two biggest things holding me back from Rust right now.
Rust automatically adds the appropriate 'restrict' annotations to `&mut T` pointers. Or rather, it generally does, but I think an LLVM bug made us take it off temporarily. The point is, this isn't something you annotate in Rust like you do in C; you use the type system and the compiler handles it where appropriate. (It's more than just `&mut T`: UnsafeCell, for example, will also cause the annotation to _not_ happen.)
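A small sketch of what the compiler gets to assume: because the borrow checker rejects overlapping `&mut` borrows, a function can treat its two mutable arguments as non-aliasing (the function name here is illustrative):

```rust
// The compiler may assume `a` and `b` never point at the same value,
// so it can, e.g., keep *a in a register across the write to *b --
// the same optimization C's `restrict` enables, but guaranteed by
// the type system instead of a programmer promise.
fn add_both(a: &mut i32, b: &mut i32) -> i32 {
    *a += 1;
    *b += 1;
    *a + *b
}

fn main() {
    let mut x = 1;
    let mut y = 2;
    assert_eq!(add_both(&mut x, &mut y), 5);
    // add_both(&mut x, &mut x) would not compile: two simultaneous
    // mutable borrows of x are rejected by the borrow checker.
}
```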
While I'm not sure this fits with Rust's existing concepts of safety and correctness, it would be interesting (and not just for RT purposes) to have a language in which one could mark functions as <= some complexity (given the usual simplifying assumptions) or provably terminating, and have the compiler throw an error if it can't prove that those constraints are met. Does something like this exist?
The closest thing that comes to mind is total functional languages- the compiler doesn't deal with complexity classes, but it does prove termination. It's mostly used in dependently-typed languages and theorem provers so they can prove the type checker will terminate.
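For a taste of what a totality checker accepts, here's a sketch in Lean, where every definition must be shown terminating (the names are illustrative):

```lean
-- In Lean every definition must be total; structural recursion on
-- the first argument lets the checker prove termination automatically.
def myAdd : Nat → Nat → Nat
  | 0,     n => n
  | m + 1, n => (myAdd m n) + 1

-- By contrast, something like `def loop (n : Nat) : Nat := loop n`
-- would be rejected: no decreasing measure can be found.
#eval myAdd 2 3  -- 5
```

As the comment notes, this proves termination but says nothing about complexity classes; that would need a richer analysis on top.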
...and consequently limited driver support. Copyleft is permissive. There's no need for non-copyleft licensing unless you want restrictive proprietary licensing somewhere or sometime.
"Limited driver support", here, means ... you don't have access to, and the right to fork, the source code?
As an old engineer, to me, limited driver support always meant: "not that many drivers". You seem to mean: "I can't fork".
Or am I mis-reading you?
Many of those of us in the new-new world of next-gen system-integrators-acting-like-new-software-product-developer types don't always have sole control over the drivers in the/our stack necessary to deliver key features to key clients.
The "hard" open-source position of the copyleft crowd incentivizes old-school pragmatic management to take a "why bother" stance: instead of open-sourcing 80% and getting yelled at because it isn't 100%, just go with 0%.
Which is sad. And unnecessary. They won't take on the very real business risk that comes when you treat liberally with zealots.
I like the home page: unusually good choice of attributes for a security or reliability-focused OS. The path usually not taken. So, for those familiar, what milestones has the project achieved since we last discussed it here?
Note: Best bang-for-buck will be getting solid networking, filesystem, time API, and crypto lib in there. People can crank out purpose-built appliances or VM's for all Internet or Web servers with... not ease but easier. 80/20 rule always best for OSS projects to get adoption & contributions up. ;)
I've seen this project a few times - very impressive work! It'll be awesome to see if they can get it self-hosting (right now, I think you need to compile with a different OS).
Why create a new Unix-like OS? Sure, have a Unix compatibility layer, but there are so many ways to improve on Unix. Seems stupid to repeat the mistakes of the past.
It is a shame that OS development is so dependent on toolchains that whatever focus an OS may have, it excludes other areas of development. For example, I very much like Genode. But redoing Genode in Rust would probably be harder than making a Unix clone. Safe language or safe APIs: it seems like you have to choose one.
I'm also curious how to contribute to the GUI. I'll trawl the github in the meantime but if anyone happens to know a good point to start I'd love to hear.
That hasn't stopped anyone from naming tons of Rust libraries after elements and chemistry terms, just like the true etymology of "Python" hasn't stopped people from naming things after snakes. :P
There are three near-term products to develop on this:
- A small router (home/small office, not data center)
- A DNS server
- A BGP server
Those are standalone boxes which do specific jobs, they're security-critical, and the existing implementations have all had security problems. We need tougher systems in those areas.