This is a big, general problem with CI providers I don't hear talked about enough: because they charge per-minute, they are actively incentivized to run on old hardware, slowing builds and milking more from customers in the process. Doubly so when your CI is hosted by a major cloud provider that would otherwise have to scrap these old machines.
I wish this were only a theoretical concern, a theoretical incentive, but it's not. GitHub Actions is slow, and GitLab suffers from a similar problem: their hosted SaaS runners are on GCP n1-standard-1 machines. The oldest machine type in GCP's fleet, the n1-standard-1 is powered by a variety of dusty old CPUs Google Cloud has no other use for, from Sandy Bridge to Skylake. Sandy Bridge is a 12-year-old CPU.
There are workloads where newer CPUs are dramatically faster (e.g. AVX-512), but in general the difference isn't huge. Most of what the newer CPUs get you is more cores and higher power efficiency, which you don't care about when you're paying per-vCPU. Which vCPU is faster, a ten-year-old Xeon E5-2643 v2 at 3.5GHz or a two-year-old Xeon Platinum 8352V at 2.1GHz? It depends on the workload. Which has more memory bandwidth per core?
But the cloud provider prefers the latter because it has 500% more cores for 50% more power. Which is why the latter still goes for >$2000 and the former is <$15.
> Which vCPU is faster, a ten-year-old Xeon E5-2643 v2 at 3.5GHz or a two-year-old Xeon Platinum 8352V at 2.1GHz? It depends on the workload.
It really does not depend on the workload when the workloads we're talking about are, by and large, bounded to 1 vCPU or less (CI jobs, serverless functions, etc.). Ice Lake cores are substantially faster than Ivy Bridge; the 8352V will be faster in practically any workload we're talking about.
However, I do agree with this take if we're talking about, say, Lambda functions. The reason is that the vast majority of workloads built on Lambda are bounded by IO, not compute, so newer core designs won't yield a meaningful improvement in function execution time. Put another way: is a function executing in 75ms instead of 80ms worth paying 30% more? (I made these numbers up, but it's the illustration that matters.)
CI is a different story. CI runs are only bound by IO for the smallest of projects; downloading that 800MB node:18 base Docker image takes some time, but it can very easily and quickly be dwarfed by everything that happens afterward. This is not a controversial opinion; "the CI is slow" is such a meme of a problem at engineering companies nowadays that you'd think more people would have the sense to look at the common denominator (the CI hosts suck) and not blame themselves (though often there's blame to go around). We've got a project that builds locally on an M2 Pro, docker pull and push included, in something like 40 seconds; the CI takes 4 minutes. It's the crusty CPUs; it's slow networking; it's the "step 1 is finished, wait 10 seconds for the orchestrator to realize it and start step 2".
And I think we, the community, need to be more vocal about this when speaking about platforms that charge by the minute. They are clearly incentivized to leave it shitty. It should even surface in discussions about, for example, the markup of Lambda versus EC2. A 4096MB Lambda function would cost $172/mo if run 24/7, back-to-back. A comparable c6i.large: $62/mo, about a third the price. That's bad enough on the surface, and we need to be cognizant that it's even worse than it initially appears, because Amazon runs Lambda on whatever they have collecting dust in the closet; people still report getting Ivy Bridge and Haswell cores sometimes, in 2023. The better comparison is probably a t2.medium at $33/mo, a 5-6x markup.
This isn't new information; Lambda is crazy expensive, blah blah blah; but I don't hear that dimension brought up enough. Calling back to my previous point: is a function executing in 75ms instead of 80ms worth paying 30% more? Well, we're already paying 5-6x more; the fact that it doesn't execute in 75ms by default is abhorrent. Put another way: if Lambda, and other serverless systems like it such as hosted CI runners, enable cloud providers to keep old hardware around far longer than performance improvements say they should, the markup should not be 500%. We're doing Amazon a favor by using Lambda.
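The cost comparison above is easy to sanity-check. A quick back-of-the-envelope script (prices are my assumption of us-east-1 on-demand rates circa 2023; Lambda request charges and the free tier are ignored):

```python
# Back-of-the-envelope Lambda vs EC2 monthly cost comparison.
# Assumed us-east-1 prices circa 2023; request charges and free tier ignored.
LAMBDA_GB_SECOND = 0.0000166667   # $/GB-second for x86 Lambda
SECONDS_PER_MONTH = 30 * 24 * 3600  # 30-day month

lambda_monthly = 4.0 * LAMBDA_GB_SECOND * SECONDS_PER_MONTH  # 4096 MB = 4 GB
c6i_large_monthly = 0.085 * 730     # $/hr * ~hours per month
t2_medium_monthly = 0.0464 * 730

print(f"lambda 24/7: ${lambda_monthly:.0f}/mo")        # ~ $173
print(f"c6i.large:   ${c6i_large_monthly:.0f}/mo")     # ~ $62
print(f"t2.medium:   ${t2_medium_monthly:.0f}/mo")     # ~ $34
print(f"markup vs t2.medium: {lambda_monthly / t2_medium_monthly:.1f}x")
```

Which lands right in the 5-6x range against the t2.medium, before even accounting for the older silicon Lambda may put you on.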
> It really does not depend on the workload when the workloads we're talking about are, by and large, bounded to 1 vCPU or less (CI jobs, serverless functions, etc.). Ice Lake cores are substantially faster than Ivy Bridge; the 8352V will be faster in practically any workload we're talking about.
If you were comparing e.g. the E5-2667v2 to the Xeon Gold 6334 you would be right, because they have the same number of cores and the 6334 has a higher rather than lower clock speed.
But the newer CPUs support more cores per socket. The E5-2643v2 has 6, the Xeon Platinum 8352V has 36.
To make that fit in the power budget, it has a lower base clock, which eats a huge chunk out of Ice Lake's IPC advantage. Then the newer CPU has around twice as much L3 cache, 54MB vs. 25MB, but that's for six times as many cores. You get 1.5MB/core instead of >4MB/core. It has just over three times the memory bandwidth (8xDDR4-2933 vs. 4xDDR3-1866), but again six times as many cores, so around half as much per core. It can easily be slower despite being newer, even when you're compute bound.
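Those per-core figures can be derived straight from the spec sheets. A rough sketch (peak theoretical DDR bandwidth at 8 bytes per transfer per channel; measured bandwidth will be lower):

```python
# Per-core L3 cache and peak memory bandwidth from the spec-sheet figures above.
# DDR peak bandwidth = transfers/s * 8 bytes, per channel (theoretical ceiling).
GB = 1e9

e5_2643v2_l3_per_core = 25 / 6     # MB of L3 per core, ~4.2
plat_8352v_l3_per_core = 54 / 36   # MB of L3 per core, 1.5

ddr3_1866_per_chan = 1866e6 * 8 / GB   # ~14.9 GB/s per channel
ddr4_2933_per_chan = 2933e6 * 8 / GB   # ~23.5 GB/s per channel

e5_bw_per_core = 4 * ddr3_1866_per_chan / 6      # 4 channels / 6 cores, ~10 GB/s
plat_bw_per_core = 8 * ddr4_2933_per_chan / 36   # 8 channels / 36 cores, ~5.2 GB/s

print(f"L3/core: {e5_2643v2_l3_per_core:.1f} MB vs {plat_8352v_l3_per_core:.1f} MB")
print(f"BW/core: {e5_bw_per_core:.1f} GB/s vs {plat_bw_per_core:.1f} GB/s")
```

So despite being nine generations newer, the 8352V gives each core roughly a third of the cache and half of the memory bandwidth.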
> We've got a project that builds locally on an M2 Pro, docker pull and push included, in something like 40 seconds; the CI takes 4 minutes. It's the crusty CPUs; it's slow networking; it's the "step 1 is finished, wait 10 seconds for the orchestrator to realize it and start step 2".
Inefficient code and slow hardware are two different things. You can have the fastest machine in the world that finishes step 1 in 4ms and still be waiting 10 full seconds if the system is using a timer.
But they're operating in a competitive market. If you want a faster system, patronize a company that provides one. Just don't be surprised if it costs more.
Lambda is good for bursty, typically low-activity applications, where it just wouldn't make sense to have EC2 instances running 24x7. Think of some line-of-business app that gets a couple of requests every minute or so. Maybe once a quarter there will be a spike in usage. Lambda scales up and just handles it. If requests execute in 50ms (unlikely!) or 500ms, it just doesn't matter.
Not quite sure I follow, but: I built an ASP.NET API and deployed it to Lambda, and it cost $2/mo; when it started to get more traffic and the cost got to $20/mo, I moved it to a t4g instance.
When I moved it, I didn’t need to make any code changes :) I just made a systemd file and deployed it.
For this to be true IPC would have to have stagnated for 10 years which is not the case. Look at Agner's instruction tables for different uarchs and compare.
I sure hope not; GitHub is supposed to be mature infrastructure at this point, where most if not all changes going into production are very well tested, and nothing gets deployed and released without multiple people verifying it's correct.
Besides, Microsoft surely has a 24/7 watch over its infrastructure, even on weekends; it's a huge company.
> Besides, Microsoft surely has 24/7 watch of their infrastructure, even on weekends
"watching" with a dedicated team vs "waking up everyone in engg because things are on fire" are two very different things.
Besides, size doesn't work that way. The larger the organization and the more complex the product is, the higher the chance some unexpected interaction will occur. There are processes and automation that can mitigate this, but one can never be completely certain.
What's your pager rotation like? I want to say you have follows-the-sun, and so your on-call shifts are 12-hours long and you swap with a team on the other side of the world from you so you can get said sleep, but I don't want to just assume that.
Why does everyone need to be in the room? I have a groomed backlog and can talk to people async as needed. We also record meetings if you missed them and depending on the context of the meeting or importance, we'll hold timezone friendly meetings for everyone as required.
Less "on" hours then? Even Google has diurnal patterns when there's a lower amount of traffic simply due to the fact that humans are unevenly distributed across the Earth's surface. And Google does code freezes for the holidays where they don't deploy at all.
It seems like BuildJet is competing directly with GitHub on price (GitHub has bigger runners available now, pay per minute), and GitHub will always win because Microsoft owns both GitHub and Azure, so I'm not sure what BuildJet's USP is; I worry they will get commoditized and then lose their market share.
Actuated is hybrid, not self-hosted. We run actuated as a managed service and scheduler, you provide your own compute and run our agent, then it's a very hands-off experience. This comes with support from our team, and extensive documentation.
It's amazing to see a big company throw so much engineering effort at this, while for the majority of CI users, just getting a 2x-faster CI machine achieves the same outcome at much lower cost.
Speaking from experience working on CI at a large company: I'm sure they've "just got a 2x CI machine" about 6 times already. At some point you can't just burn more money and you need to optimise.
Computers are usually way cheaper than people. The difference would just be CapEx vs OpEx. That being said, they must be burning zillions of cycles rebuilding code that hasn't changed by using a monorepo.
I've been thinking about this for a while, and it seems to apply to a lot of things; serving an HTTP request on cheap EC2 instances won't come close to doing it on a dedicated server with great single-thread performance.
So even though you can more easily horizontally scale and handle infinite requests, the latency of each request will be much poorer than if you were just running on better hardware.
FWIW there's a cap to how much perf you can extract from an instance. We use r6 32-vCPU / 64GB RAM machines for our builders; we can't really 2x that again from a price standpoint, lol.
These companies often have thousands of machines in their CI fleets. It really can be cheaper to pay engineers to optimize rather than just buying more or bigger instances.
What makes you think they haven't already done so? If they're running Jenkins and/or Buildkite, they're managing their own runners, so they're not jumping from GitHub Actions runners to 8/16-core machines.
I've heard lots of people talking about Estonia, but can you name any international remote-first startup that became successful and raised money with an Estonian entity? I can't, and this makes me somewhat suspicious :)
I don't know of remote-first companies, but I know of many Estonian companies that have raised a lot of money and been incredibly successful.
To name a few:
- Bolt
- Wise (TransferWise)
- Pipedrive
I have a long history with e-Residency and know many e-Residents (e-Resident since 2015, 3 companies, member of EERICA.ee). Happy to chat about it. Email in bio.
I was an e-resident too but had to become resident in Estonia, because if you are abroad and distribute yourself a dividend (as a sole entrepreneur), your host country will consider your Estonian company local; it then constitutes a permanent establishment, and that becomes extremely complicated.
Of course, but this is the case for almost all jurisdictions of record vs jurisdictions of taxation. Few countries want to allow a company to operate 100% in their borders without extracting some degree of taxation. (In fact, this is basic OECD taxation doctrine.)
My situation is complex, but, generally, the advantages of Estonian registration are found in drastically simplified and lower-cost business registration and processing (vs, say, a GmbH in Germany or Switzerland with high share capital and accounting costs) or ease of operation for digital nomads or fully remote companies. An OÜ isn't for everyone in every life situation, but when it fits, it tends to work really well.
A LLC or C Corp in the US could work just as well (or better), depending on the situation.
Just to clarify: since you are only an e-resident of Estonia, are all your three Estonian companies paying the corporate income tax (CIT) in your personal country of residence?
Since Switzerland is currently the only country in Europe without CFC rules[1], it seems that an Estonian company can only be managed from Estonia or Switzerland without being considered as local for tax purposes in another jurisdiction.
This means my situation pretty much does not apply to anyone else, and the tax residency question is very complex in my case, as well. I won't bore you with details. :)
Sure. But since you have a lot of experience with the Estonian e-residency, maybe you know any real life examples where managing an Estonian company without personally being an Estonian tax resident made sense?
In Germany, it reduces the overhead of owning a company dramatically (the overhead of a German GmbH is quite a bit more than just taxes, including mandatory registrations, etc.).
In Switzerland, it reduces necessary share capital from 25'000CHF to 2,500EUR (which can be deferred).
In the US, it probably doesn't make sense unless your situation is very special.
As a digital nomad, it gives you a clear business home while traveling the world (and running everything online is essential).
Estonian accounting and business management is all electronic, so you can essentially run your business 99% in Estonia and do yearly tax reports in your country of residence, all while reducing the day-to-day complexity of running your business significantly. (No more paper or faxing!)
For non-EU residents, an Estonian entity gives them a clear legal path to marketing and selling in the EU with proper VAT, etc. reporting.
Every one I know has had slightly different reasons while Estonia made sense for them. I'm also purposefully not addressing the intangibles of registering a business in a well-functioning, well-regulated, forward-looking jurisdiction, which is a large component for many of my friends. (We care about Estonia, like its ideals, and want to see it succeed on a global scale.)
I second that. There is a growing number of startups and increasing capital pouring into Estonia (hence the largest inflation rate in Europe).
Wise and Bolt are good examples.
This is a very naive reimplementation of the C# version. I managed to reduce the runtime for the same file from 5.7 seconds to just 800ms:
using var file = File.OpenRead("file.bin");
var counter = 0;
var sw = Stopwatch.StartNew();
var buf = new byte[4096];
while (file.Read(buf, 0, buf.Length) > 0)
{
    foreach (var t in buf)
    {
        if (t == '1')
        {
            counter++;
        }
    }
}
sw.Stop();
Console.WriteLine($"Counted {counter:N0} 1s in {sw.Elapsed.TotalMilliseconds:N4} milliseconds");
This code has a bug that can cause the count to be overreported. The last `file.Read` may only partially fill the buffer, but this code will look for 1s in the entire buffer.
(This bug won't affect the performance comparison, but I was just reminded of how error-prone these kinds of APIs can be vs the PHP/Python route of having the library function just allocate and return a new buffer each time.)
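A minimal fix is to use the byte count that `Stream.Read` returns and only scan that many bytes. A sketch of the corrected loop from the snippet above (timing omitted):

```csharp
using var file = File.OpenRead("file.bin");
var counter = 0;
var buf = new byte[4096];
int n;
// Read returns the number of bytes actually read; only scan that many,
// since the tail of buf may hold stale data from the previous iteration.
while ((n = file.Read(buf, 0, buf.Length)) > 0)
{
    for (var i = 0; i < n; i++)
    {
        if (buf[i] == '1')
        {
            counter++;
        }
    }
}
Console.WriteLine($"Counted {counter:N0} 1s");
```

Note that `Read` is also allowed to return fewer bytes than requested mid-stream (not just at EOF), so the loop must not assume a full buffer on any iteration.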
Most of the latency comes from the network layer. My naive guess is they probably switched from a standard Ethernet setup to an InfiniBand setup to achieve 600µs of total latency.
Unfortunately, "China" holds grudges (in quotes, because I suspect it might not even be the government, but different officials wanting to appear as good citizens to the government, and definitely not Chinese people), so it won't be that easy: Simon will likely need to step down (though he's kinda being diplomatic in only "wanting answers", not putting particular blame on anyone, but he's still pretty adamant).
Witness the NBA-Houston Rockets case, which caused Houston Rockets games to be blacked out for 15 months, even though Morey moved on from Houston late in 2020 after tweeting in support of the Hong Kong protests early in the year.
What did CCTV do? Ban the 76ers' NBA games; the 76ers are Morey's current employer.
https://buildjet.com/for-github-actions/blog/a-performance-r...