One of the many problems was trying to limit the use of Optane to Intel devices. They should have manufactured and sold Optane memory and let other players build on top of it at a low level.
Which “Optane memory”? The NVMe product always worked on non-Intel. The NVDIMM products that I played with only ever worked on a very small set of rather specialized Intel platforms. I bet AMD could have supported them about as easily as Intel, and Intel barely ever managed to support them.
The consumer "Optane memory" products were a combination of NVMe and Intel's proprietary caching software, the latter of which was locked to Intel's platforms. They also did two generations of hybrid Optane+QLC drives that only worked on certain Intel platforms, because they ran a PCIe x2+x2 pair of links over a slot normally used for a single x2 or x4 link.
Yes, the pure-Optane consumer "Optane memory" products were at a hardware level just small, fast NVMe drives that could be used anywhere, but they were never marketed that way.
Exactly. I happen to have an all-AMD setup here, and buying my first Optane devices was a gamble, because I had no idea if they'd work. The only reason I ever did is that they got cheap at one point and I could afford the gamble.
That uncertainty couldn't have done the market any favors.
I feel like this is proving my point. You can’t read “Optane” and have any real idea of what you’re buying.
Also… were those weird hybrid SSDs even implemented in actual hardware? Or were they part of the giant series of massive kludges in the “Rapid Storage” family, where some secret sauce in the PCIe host lied to the OS about what was actually connected, so that an Intel driver could replace the OS’s native storage driver (NVMe, AHCI, or perhaps something worse depending on the generation) and implement all the actual logic in software?
It didn’t help Intel that some major storage companies started selling very, very nice flash SSDs in the meantime.
> were those weird hybrid SSDs even implemented by actual hardware, or were they part of the giant series of massive kludges
They were definitely part of the series of massive kludges. But aside from the Intel platforms they were marketed for, I never found a PCIe host that could see both of the NVMe devices on the drive. Some hosts would bring up the x2 link to the Optane half of the drive, some hosts would bring up the x2 link to the QLC half of the drive, but I couldn't find any way to get both links active even when the drive was connected downstream of a PCIe switch that definitely had hardware support for bifurcation down to x2 links. I suspect that with appropriate firmware hacking on the host side, it may have been possible to get those drives fully operational on a non-Intel host.
Why on Earth did Intel implement this as a 2x2 device? They could have implemented multiple functions or they could have used a PCIe switch or they could have exposed their device as an NVMe device with multiple namespaces, etc. (I won’t swear that all of these would have worked nicely. But all of them would have performed better than arbitrarily splitting the link in half.)
Maybe they didn’t own any of the IP for the conventional SSD part and couldn’t make it play ball?
The Optane side of the drive used the same x2 controller as the pure-Optane cache drives. The NAND side used a Silicon Motion controller, same as their consumer QLC drives of the era. They almost literally just crammed their two existing consumer products onto one PCB and shipped it. Intel was never interested enough in the consumer applications of Optane to design a good, useful SSD controller around it, and they weren't going to let a third-party like Silicon Motion make an Optane-compatible controller.
I’m curious to know more about your setup! Which switches do you prefer? What hardware are you using for proxmox? And what does your network look like?
For the switches, I'm considering replacing them all with 2.5 Gbit but don't see the need yet. Currently I have a TL-SG1016DE as the core switch. The main Proxmox servers are 3 used Dell 1U servers I bought on eBay. Each has 256GB ECC RAM, 2x 8-core CPUs, and 4x Gbit Intel NICs. I flashed the PERC card into IT mode so it acts as a plain HBA and ZFS in Proxmox has direct access to the disks. If I were to buy them today I'd look for R720s or newer. I got mine for about $800 USD each. They're overkill, but provide a lot of capacity. They're also unnecessary; you can ignore them and only consider the rest of this comment. They're the most expensive, hottest, and loudest devices on the network.
I have a separate tower, an old 9th-gen Intel box, that provides the large ~50TB ZFS NFS server. It used to be an Intel Atom, but that finally died after 10 years, so I moved the drives to a gaming PC I had lying around. Over the years, the nicest thing about ZFS and Proxmox has been that the drives are fully independent of the hardware and OS they're attached to. Now I just pass the devices through Proxmox to a Debian VM and they come up just like they did before.
Regarding the rest of the network, let's move from the edge in toward the 3x 1U servers and the NFS storage box. I have 1 Gbit symmetric fiber from Ziply. The ONT has Cat5 running to 1 of the 4 Gbit ports on an Intel Atom C2758 box. The other 3 ports are bridged together in Proxmox to act as a switch; it kind of looks like an EdgeRouter 4 if you squint at the ports. This C2758 only runs a single VM, OpenWRT. The nice thing is I can take snapshots before upgrades, and upgrade or replace the hardware easily.
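For reference, bridging spare NICs like that in Proxmox is just a few lines in /etc/network/interfaces. A minimal sketch, assuming the three spare ports show up as enp2s0 through enp4s0 (the names are made up; the OpenWRT VM attaches to vmbr0 and handles all addressing itself):

```
auto vmbr0
iface vmbr0 inet manual
        bridge-ports enp2s0 enp3s0 enp4s0
        bridge-stp off
        bridge-fd 0
```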
The OpenWRT VM is the most critical thing in the whole network. I try to manage it simply: I have some shell scripts that copy the /etc/config files into place and restart services, for a simple IaC setup.
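That copy-and-restart approach can be sketched in a few lines of POSIX shell. This is a hedged illustration, not the actual script: the directory names are stand-ins (defaulting to temp dirs so the sketch runs anywhere), and the service restarts are shown as comments since they only make sense on the router itself.

```shell
#!/bin/sh
# Sketch: push versioned config files into place, OpenWRT-style.
# SRC would be a checkout of the config repo; DEST would be /etc/config.
SRC="${SRC:-$(mktemp -d)}"
DEST="${DEST:-$(mktemp -d)}"

# Seed one example file so the loop has something to copy.
printf 'config dnsmasq\n' > "$SRC/dhcp"

for f in "$SRC"/*; do
    cp "$f" "$DEST/$(basename "$f")"
done

# On the real router you would then restart the affected services, e.g.:
#   /etc/init.d/dnsmasq restart
#   /etc/init.d/network reload
echo "synced configs into $DEST"
```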
The main services OpenWRT provides are:
1. WAN DHCP client; my ISP doesn't offer static IPs.
2. A one-minute cron job that makes sure the A record for home.example.com is correct. *.home.example.com is a CNAME to home.example.com, which simplifies configuration and TLS cert management.
3. HAProxy runs on OpenWRT, listening on 0.0.0.0:80 and 0.0.0.0:443. Extremely valuable for SNI routing of TLS connections. I moved the LuCI web UI to alternate ports, which is simple to do via config.
4. dnsmasq provides DHCP and DNS for the main and guest VLANs.
5. OpenWRT is configured as a WireGuard server. Each WireGuard client device is allocated a dedicated IP in a separate 192.168.x/24 subnet. This has been great for source-IP-based access control, which I'll cover below. WireGuard clients connect to home.example.com.
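The WireGuard piece is just a normal OpenWRT interface plus one peer section per device. A minimal /etc/config/network sketch; the keys, subnet, and port here are placeholders, not my real values:

```
config interface 'wg0'
        option proto 'wireguard'
        option private_key '<server-private-key>'
        option listen_port '51820'
        list addresses '192.168.9.1/24'

config wireguard_wg0
        option description 'laptop'
        option public_key '<client-public-key>'
        list allowed_ips '192.168.9.2/32'
```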
That's it for OpenWRT. The key lesson I've learned is that running HAProxy on OpenWRT is incredibly valuable. All inbound L4 connections hit it first, but crucially it does not handle TLS certificates. It only forwards TCP connections based on the SNI in the client hello. HAProxy is also configured to use the PROXY protocol to preserve source IP addresses, which has been great for access control.
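A minimal haproxy.cfg sketch of that pattern, with a made-up backend address: TCP mode, route on the SNI peeked from the client hello, and tag the forwarded connection with PROXY protocol so the source IP survives.

```
frontend tls_in
    mode tcp
    bind 0.0.0.0:443
    tcp-request inspect-delay 5s
    tcp-request content accept if { req_ssl_hello_type 1 }
    # No TLS termination here; just inspect the SNI and route.
    use_backend talos_gw if { req_ssl_sni -m end .home.example.com }

backend talos_gw
    mode tcp
    # send-proxy-v2 preserves the original source IP for the gateway.
    server istio 192.168.1.10:443 send-proxy-v2
```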
Most TLS connections are forwarded to a single-node Talos VM running on another Proxmox host. This VM runs Cilium, Istio, and the Gateway API. The Istio Envoy gateway is configured to accept PROXY protocol connections, which means AuthorizationPolicy resources work as expected. By default, only connections coming from the local subnets or the WireGuard subnet are allowed. OpenWRT does hairpin NAT, so this works just fine; all sources connect to the WAN IP regardless of whether they're internal or external.
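That default-allow-internal behaviour can be expressed in a single Istio AuthorizationPolicy. A sketch with assumed subnets; note that `remoteIpBlocks` is the field that matches the client IP recovered from the PROXY protocol header:

```yaml
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: allow-local-and-wireguard
  namespace: istio-system
spec:
  action: ALLOW
  rules:
    - from:
        - source:
            remoteIpBlocks:
              - 192.168.1.0/24   # LAN (assumed subnet)
              - 192.168.9.0/24   # WireGuard clients (assumed subnet)
```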
I don't do much with Kubernetes yet; most of the traffic is forwarded on to another VM running Portainer, where most of my backend services live. The Kube VM does handle Certificate and AuthorizationPolicy resources though, using cert-manager and Istio. This has been nice: I don't need to configure each service for TLS or access control in a bespoke way, it's all in one place.
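Centralizing TLS then comes down to one cert-manager Certificate per name (or a single wildcard). For example, with an assumed issuer name and namespace:

```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: home-wildcard
  namespace: istio-system
spec:
  secretName: home-wildcard-tls
  dnsNames:
    - "*.home.example.com"
  issuerRef:
    name: letsencrypt     # assumed ClusterIssuer name
    kind: ClusterIssuer
```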
The only other thing to note is the Dell 1U servers have 3 of their 4 Gbit NICs aggregated into LACP bonds. Similar to the Atom router, they're configured as a bridge in Proxmox, and I use them for the Ceph data plane. 9 of the 16 ports on that TL-SG1016DE are just for Ceph, and I'm able to get close to 600 MiB/sec reads (yes, megabytes), which is pretty neat given 1 Gbit interfaces.
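For the curious, the bond-plus-bridge on each Dell node looks roughly like this in /etc/network/interfaces (NIC names and the Ceph subnet are invented for illustration):

```
auto bond0
iface bond0 inet manual
        bond-slaves eno1 eno2 eno3
        bond-mode 802.3ad
        bond-xmit-hash-policy layer3+4

auto vmbr1
iface vmbr1 inet static
        address 10.10.10.11/24
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
```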
That's about it. Overall I'm trying to eliminate VLANs, but it still makes sense to have them for Ceph and for a guest wifi network.
Edit: Lastly I've maintained a home lab for 25 years and this is the best iteration yet. All of the trade-offs feel "right" to me.
What effect, if any, did you notice on sensor outputs from heat generated by other system components? e.g., constant temperature offset after reaching thermal steady state
We have not tested this properly (see my other comment on lab testing), but so far we have observed no effect during normal operation and only saw a slight increase when charging the device (~1 °C). We took great care to place the sensors far away from the power-management circuitry. Luckily, the display cutout also helps here.
Don’t look to large, well-known registrars. I would suggest that you look for local registrars in your area. The TLD registry for your country/area usually has a list of the authorized registrars, so you can simply search that for entities with a local address.
Disclaimer: I work at such a small registrar, but you are probably not in our target market.
I miss the days when Network Solutions had a permanent option to switch/sign-up with a PGP key, binding all future communications and change requests to it.
I forget how they handled key expiration/revocation...
Since you asked, I use Cloudflare for my registrar. I can’t really say if it’s objectively better or worse than anybody else, but they seemed like a good choice when Google was in the process of shutting down its registrar service.
I use Cloudflare for everything I can and then currently use Namecheap for anything it doesn't support. I haven't tried Porkbun mostly because I'm okay with what I have already.
After Google ditched Domains I moved to Route53. I guess the only downside is that it doesn't handle some TLDs?
What you want from a registrar is that it keeps existing for many years and is resilient to social engineering, and AWS seems like the next best thing to Google, which you famously can't even talk to for a social engineering attempt. I expect AWS account management to be almost as good as Gaia, but I don't really know how hard social engineering is.
I recently had to bail on Gandi. I had a special requirement, being Canadian, in that I didn't want to use a registrar in the USA. I found a Canadian registrar that seemed to have the technical stuff reasonably worked out (many don't) and had easy-to-understand pricing.
Not sure; I use DNSimple for DNS and wrote my own little service to update my A record. No IPv6 in my corner of the world, so I haven't checked for AAAA record support.
CF sells domains at cost, so you're not going to beat them on price, but the catch is that domains registered through them are locked to their infrastructure: you're not allowed to change the nameservers. They're fine if you don't need that flexibility and they support the TLDs you want.
Not really; it was a gimmick. They used standard forecast post-processing techniques to bias-correct global/regional weather models. There is virtually no evidence they actually used device data in this process.
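For anyone unfamiliar, the simplest of those post-processing techniques is additive bias correction: subtract the model's mean error over a trailing window of past forecast/observation pairs. A toy sketch, with invented numbers:

```python
def bias_correct(raw_forecast, past_forecasts, past_observations):
    """Subtract the mean historical forecast error from a raw model forecast."""
    errors = [f - o for f, o in zip(past_forecasts, past_observations)]
    mean_bias = sum(errors) / len(errors)
    return raw_forecast - mean_bias

# Suppose the model has been running ~2 degrees warm over the last few days:
past_f = [21.0, 23.0, 19.0]  # model forecasts
past_o = [19.0, 21.0, 17.0]  # station observations
print(bias_correct(25.0, past_f, past_o))  # 23.0
```

Real model output statistics (MOS) schemes regress on many predictors rather than a single running mean, but this is the core idea, and nothing about it requires data from users' devices.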