The performance graph is deceptive for two reasons: (1) Leaf with CuDNN v3 is a little slower than Torch with CuDNN v3, yet the bar for Leaf is positioned to the left of the one for Torch, and (2) there's a bar for Leaf with CuDNN v4, but not for Torch.
It's good to see alternatives to Torch, Theano, and TensorFlow, but it's important to be honest with the benchmarks so that people can make informed decisions about which framework to use.
And I don't believe the first point counts as deceptive; the bars are ordered by Forward ms, not by the sum of Forward and Backward. In both CuDNN v3 and v4, Leaf is faster than Torch by that metric (25 vs 28 for v4, 31 vs 33 for v3).
Yes, on their site they post Torch CuDNN v4 as faster than Leaf [0]. Seems exciting for an early release.
Can it get much faster than something like Torch? I would think that if CuDNN accounts for most of the computation time, it would be hard to see big improvements. Perhaps go the route of Neon and tune your GPGPU code like crazy [1, 2], or of MXNet and think about distributed computing performance [3].
> Leaf with CuDNN v3 is a little slower than Torch with CuDNN v3, yet the bar for leaf is positioned to the left of the one for Torch
I think that's because they're sorting by forward time rather than forward+backward. That would also explain why in the Alexnet benchmark Tensorflow (cuDNN v4) is to the left of Caffe (cuDNN v3) despite having a much taller bar overall.
That's rarely a problem where MRV (multiple return values) makes sense. Though you could always fix it by using anonymous structs whose fields are both named and positional (similar to Python's namedtuples).
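A minimal sketch of the namedtuple idea mentioned above, with hypothetical names (`DivResult`, `divmod_named`) chosen just for illustration:

```python
from collections import namedtuple

# A lightweight record whose fields are accessible both by name and
# by position -- the "named and positional" property described above.
DivResult = namedtuple("DivResult", ["quotient", "remainder"])

def divmod_named(a, b):
    q, r = divmod(a, b)
    return DivResult(q, r)

res = divmod_named(17, 5)
assert res.quotient == 3 and res.remainder == 2  # access by name
assert res[0] == 3 and res[1] == 2               # access by position
q, r = res                                       # unpacks like a plain tuple
```

The caller can ignore the names entirely and unpack positionally, which is why this avoids the usual fragility of multiple return values.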
I'm guessing that's the point at which the working set exceeds the L1 cache size. You can see a few more subtle dips in the performance graph at later points; these correspond to working set spilling out of the L2 and L3 caches.
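As a back-of-envelope check, assuming typical desktop cache sizes (these numbers are assumptions, not from the article) and 8-byte elements, the dips would be expected at roughly these working-set sizes:

```python
# Estimate where an array's working set spills out of each cache level.
# Cache sizes below are assumed, typical desktop values.
L1, L2, L3 = 32 * 1024, 256 * 1024, 8 * 1024 * 1024  # bytes (assumed)
ELEM = 8  # bytes per element, e.g. a double

for name, size in [("L1", L1), ("L2", L2), ("L3", L3)]:
    print(f"{name}: dip expected near {size // ELEM} elements")
```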
I've had great success with Blaze, despite the fact that it has received little publicity compared to alternatives like Eigen, Armadillo, etc. Blaze is consistently the leader of the pack in benchmarks, and even outperforms Intel MKL on the Xeon E5-2660 (the CPU for which the benchmark results are shown).
I've used Blaze for machine learning applications, where I've relied on the performance of elementwise operations and dense matrix multiplication on a single machine (the results advertised in the benchmark). Eigen has more functionality, but in my experience is not always optimized as well as Blaze. Neither has support for distributed computing, but I believe this is a problem that HPX is trying to address: https://github.com/STEllAR-GROUP/hpx
That's because direct solvers can't scale. If you want to solve a large (distributed over hundreds of nodes) sparse linear algebra problem as fast as possible, decades of research have been poured into efficient techniques (Krylov methods, Multigrid, preconditioners) for solving them iteratively.
Can't scale in a weak, strong, or asymptotic complexity sense? And for what sorts of problems (I assume you're thinking of 2D and 3D PDEs discretized with local basis functions)?
Yes, I'm thinking of discretizations of elliptic 2D/3D PDEs. They don't scale in the weak or strong sense, and they can't achieve O(n log n) asymptotic complexity due to fill-in from Cholesky/LU-style factorizations.
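For contrast, the iterative approach avoids factorization (and hence fill-in) entirely: a Krylov method only needs matrix-vector products. Here's a plain conjugate-gradient sketch in NumPy on the classic 1-D Poisson test matrix; this is the textbook unpreconditioned method, not what you'd use in production (there you'd reach for a preconditioned library solver such as PETSc or scipy.sparse.linalg):

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A using only
    matrix-vector products -- no factorization, so no fill-in."""
    x = np.zeros_like(b)
    r = b - A @ x          # initial residual
    p = r.copy()           # initial search direction
    rs = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p   # new direction, A-conjugate to the old ones
        rs = rs_new
    return x

# 1-D Poisson matrix (tridiagonal, SPD): a minimal elliptic test case.
n = 100
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x = conjugate_gradient(A, b)
assert np.allclose(A @ x, b)
```

In a distributed setting the `A @ p` product is the only communication-heavy step, which is what makes these methods scale where direct factorization doesn't.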
If you want to generate LaTeX from Markdown, you can use Pandoc. Pandoc has various extensions to regular Markdown (including inline math, tables, etc.), so this gives you some flexibility when producing more complicated types of documents. In fact, Pandoc converts from Markdown to LaTeX to PDF when you choose PDF as the output format.
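A minimal sketch of invoking Pandoc from Python (file names here are hypothetical; assumes `pandoc` is on your PATH):

```python
import pathlib
import shutil
import subprocess

# A tiny Markdown file using one of Pandoc's extensions (inline math).
pathlib.Path("notes.md").write_text("# Title\n\nInline math: $e^{i\\pi} = -1$\n")

# Markdown -> standalone LaTeX; swap "-o notes.tex" for "-o notes.pdf"
# to let Pandoc run the Markdown -> LaTeX -> PDF pipeline itself.
cmd = ["pandoc", "notes.md", "--from=markdown", "--to=latex",
       "--standalone", "-o", "notes.tex"]

if shutil.which("pandoc"):
    subprocess.run(cmd, check=True)
else:
    print("pandoc not found; would run:", " ".join(cmd))
```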
This is an important statement and should be upvoted more. Case in point: "the Weierstrass approximation theorem states that every continuous function defined on a closed interval [a, b] can be uniformly approximated as closely as desired by a polynomial function."
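Spelled out formally, the quoted theorem says:

```latex
\forall f \in C[a,b],\; \forall \varepsilon > 0,\; \exists\, p \in \mathbb{R}[x] :\quad
\max_{x \in [a,b]} \lvert f(x) - p(x) \rvert < \varepsilon .
```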
> but a migration of research interest away from neural nets seemed increasingly promising, and today, the migration seems largely complete.
What are you talking about? Deep learning is one of the hottest areas of research today, and a lot of it has to do with neural networks. NN's are the state of the art in several domains. Case in point: http://image-net.org/challenges/LSVRC/2014/results. All of the top entries use convolutional networks; in fact, almost all of the entries do.
The fact that the loss function represented by a neural network can be highly nonconvex is what makes them so effective in the domains in which they are used. See this presentation by Yann LeCun for more info: http://www.cs.nyu.edu/~yann/talks/lecun-20071207-nonconvex.p...
"ML theory has essentially never moved beyond convex models, the same way control theory has not really moved beyond linear systems. Often, the price we pay for insisting on convexity is an unbearable increase in the size of the model, or the scaling properties of the optimization algorithm ... This is not by choice: nonconvex models simply work better.
Have you tried acoustic modeling in speech with a convex loss? ... To learn hierarchical representations (low-level features, mid- level representations, high-level concepts....), we need “deep architectures”. These inevitably lead to non-convex loss functions."
This isn't to say that NN's are going to solve all our problems, but to say that there has been a shift in interest away from NN's is absurd.
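The nonconvexity is easy to exhibit directly: permuting a network's hidden units leaves its function (and loss) unchanged, so any asymmetric minimizer has an equally good "twin", and the midpoint between the two fits worse, which a convex loss would forbid. A toy NumPy sketch (two-unit net and data invented for illustration):

```python
import numpy as np

# Tiny one-hidden-layer net: f(x) = v1*tanh(w1*x) + v2*tanh(w2*x)
def predict(theta, x):
    w1, w2, v1, v2 = theta
    return v1 * np.tanh(w1 * x) + v2 * np.tanh(w2 * x)

def mse(theta, x, y):
    return np.mean((predict(theta, x) - y) ** 2)

x = np.linspace(-2, 2, 50)
theta_a = np.array([1.0, -2.0, 1.0, 0.5])  # one minimizer
y = predict(theta_a, x)                    # data it fits exactly (loss 0)

# Swapping the two hidden units yields a different parameter vector
# that computes the identical function, hence also zero loss.
theta_b = np.array([-2.0, 1.0, 0.5, 1.0])
mid = (theta_a + theta_b) / 2

# A convex loss would require mse(mid) <= (mse(a) + mse(b)) / 2 == 0.
print(mse(theta_a, x, y), mse(theta_b, x, y), mse(mid, x, y))
```

Since the loss at the midpoint is strictly positive while both endpoints are at zero, the loss surface cannot be convex.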
Parent might be living in the recent past. There was a migration away from NNs in the 90s/early 00s, then Hinton and other people brought it back to life...with a vengeance :)
Exactly. The history of NNs is full of ups and downs, and they are becoming increasingly popular again in the form of Deep Learning, thanks to increasing cloud processing power and advancements by Hinton and others. Most of the traditional criticism of NNs relates to shallow nets, but deeper and far more complex structures, like those in animal brains, have not been explored enough.