> I think if you are working at a scale where this is even ridiculously relevant then you will prefer to manually specify required libraries and maintain a closer understanding of control flow at all times.
I work at a scale where this is extremely relevant, and autoloading is a godsend. Manually including files is not only error-prone, but is tedious and requires development effort [time] that would be best utilized elsewhere.
> [...] this particular approach ('autoload') costs too much and delivers too little to be generally praiseworthy
Costs too much? Have you not implemented it? It's not hard at all to implement, even if you have an unorthodox file & class naming convention. The only caveat is that you can only have 1 class per file (which hopefully you'd agree is good practice in the first place).
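For anyone who hasn't seen one, here is a minimal sketch of such an autoloader, not the one from my codebase. It assumes one class per file, with the file named after the class, and uses PHP's `spl_autoload_register()`. The `Greeter` class, the temp directory, and the `$loaded` log are hypothetical and exist only to make the demo self-contained.

```php
<?php
// Hypothetical layout for the demo: one class per file, file named after the class.
$classDir = sys_get_temp_dir() . '/autoload_demo_' . uniqid();
mkdir($classDir);
file_put_contents(
    $classDir . '/Greeter.php',
    '<?php class Greeter { public function hello() { return "hello"; } }'
);

// Records which classes the autoloader actually had to load.
$loaded = [];

spl_autoload_register(function ($class) use ($classDir, &$loaded) {
    // Assumed convention: namespace separators map to directory separators.
    $file = $classDir . '/' . str_replace('\\', '/', $class) . '.php';
    if (is_file($file)) {
        require $file;
        $loaded[] = $class;
    }
});

$greeter = new Greeter(); // class is unknown here, so the autoloader runs
$again   = new Greeter(); // class is already defined; the autoloader is not called
echo $greeter->hello(), "\n";
```

An unorthodox naming convention only changes the line that builds `$file`; the rest stays the same.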
Performance increases can be pretty large, too, especially at scale. Consider class A, which in some methods requires access to class B. You _could_ put the include inside those methods, but I can guarantee you that you'll see a noticeable performance hit if that method is called in iteration (we're talking a few seconds' worth of time if iterated 1000+ times). That's non-trivial. So, what do you do? You could move it to the top of the file, so that it will only be required once. However, the problem here is that you end up with 5-10+ includes at the top of every file. So when you require that one file, you're now requiring an additional 5-10+ files, each with their own possible set of includes. I've seen cases where requiring a single file resulted in 100+ files being required due to the cascading. With an autoloader, that 100+ goes back to 1.
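A rough sketch of the class A / class B situation, under the same one-class-per-file assumption. `Report` and `Exporter` are made-up names standing in for A and B. The dependency is only loaded the first time a method that needs it runs, and calling that method in a loop does not load it again.

```php
<?php
// Hypothetical classes standing in for "class A" and "class B".
$dir = sys_get_temp_dir() . '/lazy_demo_' . uniqid();
mkdir($dir);
file_put_contents($dir . '/Report.php', '<?php
class Report {
    public function title() { return "Q3"; }
    // Only this method needs Exporter.
    public function export() { return (new Exporter())->run($this->title()); }
}');
file_put_contents($dir . '/Exporter.php', '<?php
class Exporter {
    public function run($title) { return "exported " . $title; }
}');

spl_autoload_register(function ($class) use ($dir) {
    $file = $dir . '/' . $class . '.php';
    if (is_file($file)) {
        require $file;
    }
});

$report = new Report();
$report->title();

// Exporter has not been touched yet, so its file has not been loaded.
$exporterLoadedEarly = class_exists('Exporter', false);

// The first call loads Exporter.php; the remaining calls reuse the defined class.
for ($i = 0; $i < 1000; $i++) {
    $last = $report->export();
}
$exporterLoadedLate = class_exists('Exporter', false);

var_dump($exporterLoadedEarly, $exporterLoadedLate);
```

Neither file contains an include statement, so loading `Report` does not cascade into loading anything else.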
> whilst harming simplicity
This couldn't be further from the truth. You will have less LoC overall, less developer time spent trying to get the perfect set of includes so that you only require them once and only when they're needed. You will have fewer bugs due to the problems with manual includes, which again, means more time for developers to actually do stuff that matters.
> I work at a scale where this is extremely relevant
Really? You run Facebook or something? Got any figures?
> autoloading is a godsend. Manually including files is not only error-prone, but is tedious and requires development effort [time] that would be best utilized elsewhere.
Hang on. How many files are you including? Generally you only need to include one, right. Your framework, maybe an external library. Then you can call in whatever modules you need, as you need them. Right? Or are you using some weird kind of framework without modules?
> > [...] this particular approach ('autoload') costs too much and delivers too little to be generally praiseworthy
> > [...] whilst harming simplicity
> Costs too much? Have you not implemented it? It's not hard at all to implement, even if you have an unorthodox file & class naming convention. The only caveat is that you can only have 1 class per file (which hopefully you'd agree is good practice in the first place).
To clarify, the overheads I was referring to were (1) conceptual; less "WTF" for new developers to deal with (2) complexity; code-flow is both simpler and more explicit.
> Performance increases can be pretty large, too, especially at scale. Consider...
If you can provide some demo code, I'd love to see it. I think you will find largely it's just a result of having a dumb framework or being ultra-contrived in your example to an unrealistic degree. I also think you'll find that second-guessing require_once (a language-level feature) is not going to be faster than simply using it.
> Removing require_once in favour of __autoload shows
> one of the biggest performance improvements in my
> entire application - I shaved off roughly 220
> milliseconds by removing about 15 (or so) calls to
> require_once in my bootstrap.php file. And that's
> with APC enabled, and a decent sized realpath.cache
> (and .ttl).
Presumably this lazy loading approach starts being faster once your codebase exceeds a certain size. Everything you've said so far indicates that you tend to work on smaller, fire-and-forget PHP projects, so maybe this just doesn't resonate with your experience and I can certainly respect that.
No matter what your gut feeling about language-level features may be, I know that at my day job, we measured a clear performance boost when we removed require_once and switched completely to autoloading. For those of us working on very large, long-term PHP codebases, autoloading really is a good thing and I hope you can respect that in return.
Good answer. I'm not saying that skipping code you don't use won't shave off milliseconds; I'm just not convinced that the problem is being adequately identified. The problem honestly sounds like the codebase itself. And if that's something built with Zend, then perhaps that doesn't reflect well on Zend's innate structure, and Zend's only effective way around this is through 'autoload'.
> The problem honestly sounds like the codebase itself.
Any reasonably sized codebase is going to load a lot of files. The fact that your projects are small enough for this not to be an issue is no reflection on the quality of everyone else's work.
The subject of the conversation: "Given that (a) we need to load files and (b) loading files takes time ... do we, considering performance: (1) use an autoloader on some mega-framework; or (2) be explicit and require directly?"
I feel this comment contributes nothing to the conversation but a baseless personal accusation.
On the subject of the conversation, we can either let the computer do the work for us or waste time specifying it manually, which is nearly impossible to do as effectively.
The code is already explicit enough; requiring directly is pointless busy-work that is more likely to be wrong.
> Really? You run Facebook or something? Got any figures?
100 total devs, over 10 years of development, with the codebase spanning 5k files and over 1 million LoC.
> Hang on. How many files are you including? Generally you only need to include one, right. Your framework, maybe an external library. Then you can call in whatever modules you need, as you need them. Right? Or are you using some weird kind of framework without modules?
This is a homebrew framework, but you're missing the point: regardless of how you kick it off, the framework itself still has to include files left and right. And modules or not, some code somewhere still has to include supporting files.
For every single file in your codebase, you will have at least one include call to it somewhere, possibly many more. It's silly to look at this as a "how many includes do you need to launch a request", because any simpleton can make it so that all you see is a single include, but that doesn't stop the fact that you must still manually include each and every other file -- at some point in the framework -- before you can use it.
> (1) conceptual; less "WTF" for new developers to deal with
I'll point to the supporting evidence provided by others here that shows that pretty much every modern framework uses autoloading. Thus, I wouldn't say it's a WTF, rather, the opposite is probably more true: who manually includes files anymore?
> (2) complexity; code-flow is both simpler and more explicit.
Removing include calls does not make the code more complex nor harder to follow, any decent IDE will let you control-click into the definition of classes. I'd argue that less LoC is a good thing (so long as we're not talking about "clever code", which this isn't).
> If you can provide some demo code, I'd love to see it.
Or, you know, you can read what I said as I perfectly explained the situation. But I'll play along.
Note that test.php was a completely empty file (and thus no parsing required); in reality, those files are often 1000+ lines of code and take much longer to parse.
Even with this simple example, the non-require version is 2000x faster than the require version. This is also in a perfect world, e.g. there are no other files also being required that would overflow the realpath cache, nor is there really any other OS activity that will hamper my tests. In the wild, there could be 50-100 requests being served concurrently, which would affect the system calls even more.
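The original snippet isn't reproduced in this thread, so the following is only a hedged sketch of the kind of micro-benchmark being described, not the code that produced the 2000x figure. It assumes an empty `test.php` (created here in the temp directory) and 1000 iterations. The `timeIt()` helper is hypothetical, and the `test1`/`test2`/`test3` names are an inference from the results quoted later in the thread.

```php
<?php
// Assumption: an empty file to require, as described above.
$file = sys_get_temp_dir() . '/test_' . uniqid() . '.php';
file_put_contents($file, '');

// Hypothetical helper: runs $body $iterations times and returns elapsed seconds.
function timeIt(callable $body, int $iterations = 1000): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $body();
    }
    return microtime(true) - $start;
}

// test1: no include at all (baseline loop overhead).
$test1 = timeIt(function () {});

// test2: require inside the loop, so the file is opened and compiled every time.
$test2 = timeIt(function () use ($file) { require $file; });

// test3: require_once inside the loop; the file is already loaded, so each
// call only pays for the "have I seen this path?" check.
$test3 = timeIt(function () use ($file) { require_once $file; });

printf("test1: %.7f\ntest2: %.7f\ntest3: %.7f\n", $test1, $test2, $test3);
unlink($file);
```

Absolute numbers will vary a lot with the opcode cache, the realpath cache, and the filesystem, so treat any single run as indicative at best.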
> I think you will find largely it's just a result of having a dumb framework or being ultra-contrived in your example to an unrealistic degree.
It's a dumb framework, but considering that it's a legacy project that has survived 10 years of organic growth and is still paying my salary (I've been here less than 1/2 of that time), it's not getting re-written anytime soon. But, weren't we talking about autoloading?
> 100 total devs, over 10 years of development, with the codebase spanning 5k files and over 1 million LoC.
OK, so your codebase is certainly large, ungodly large, and frankly worrisome, but that alone isn't all that interesting. Is it possible to feed my curiosity and supply queries/sec? :)
> ... any simpleton can make it so that all you see is a single include, but that doesn't stop the fact that you must still manually include each and every other file -- at some point in the framework -- before you can use it.
No, you only need to include the files you actually use. When you use them.
There's a strong argument to be had for being explicit, especially once performance becomes a concern.
The thread here is about arguing for autoload on performance grounds, and my opposition to such.
> pretty much every modern framework uses autoloading.
Pretty much every framework is crap, and a very bad way to build larger, complex systems, and maintain them over time. The majority of them are CMS and conventional website oriented, and apply poorly to other use cases. But you are perfectly welcome to limit your perceptions based upon what you consider to be modern and popular frameworks if you wish; it's entirely possible that - as I suspect - my experiences going beyond the above are a bit abnormal.
> Removing include calls does not make the code more complex nor harder to follow, any decent IDE will let you control-click into the definition of classes. I'd argue that less LoC is a good thing (so long as we're not talking about "clever code", which this isn't).
If you are looking at a block of code and the execution flow is dependent on an entire framework's dependency resolution mechanism (autoload or similar), with inexplicit dependencies, then it seems to be an obtuse perspective to argue that the code has not lost the property of being explicit.
> Even with this simple example, the non-require version is 2000x faster than the require version. This is also in a perfect world, e.g. there are no other files also being required that would overflow the realpath cache, nor is there really any other OS activity that will hamper my tests. In the wild, there could be 50-100 requests being served concurrently, which would affect the system calls even more.
First of all, clarity of code is far more important than performance in almost every case. If you really want to justify the lack of explicit dependencies on a performance basis, my point is that it's not possible.
> with this simple example, the non-require version is 2000x faster than the require version
Err, here's what I got.
test1: 0.0000250
test2: 0.0177090
test3: 0.0062518 (same as test2 but with require_once instead of require)
Yes, the overhead is larger; no, it doesn't really mean anything. However you include, you still have to include. Not including is faster than require_once(), which is faster than require(). Nothing shocking there. I believe that if you add APC and php-cgi and/or tmpfs as a mountpoint (vs. raw disk), this will decrease markedly, with no requirement to alter code.
I remain unconvinced.
Props on the custom framework vibe, IMHO every time I've looked, what's out there prebuilt is all crap (not relevant for anything but web-only, OO-centric, high overhead both performance and cognitive).