No Phd, No Problem: New Schemes Teach the Masses to Build AI (economist.com)
245 points by jkuria on Oct 28, 2018 | hide | past | favorite | 124 comments


The one thing I see over and over and over again, in these articles and associated discussion, is failure to recognize the distinction between "ML research" and "applied ML". You always get people in the threads yelling "you can't do ML without graduate level knowledge of measure theory, real analysis, information theory, ..." and so on, ad infinitum. And that is very likely true for most ML research. But OTOH, you can absolutely take off-the-shelf libraries, plus the level of knowledge you would get from, say, taking Andrew Ng's Coursera class, and create value for your company using ML. No measure theory or linear algebra needed.

And of course all this happens along a spectrum from "requires the least theory/math" to "requires the most theory/math". It's not a strictly binary dichotomy.

So is the person completing the fast.ai course going to be the inventor of the next great new NN training algorithm? Maybe not. Probably not, even. But are they going to be able to apply ML to solve real problems? Yeah, most likely.


Fast.AI is the best course out there if you would like to get into the applied side of deep learning. However, I wish people would be more careful about their claims. It's one thing to mechanically apply the same techniques to the same problems. It's quite another thing to compose known techniques in a new way to solve new problems. And still quite different to invent brand new techniques to solve previously unsolvable problems. The first category is "practitioner", the second is "engineer" and the third is "scientist".

To transition from practitioner to engineer, one needs to know low-level details, including directly relevant math and popular prior art. One of the distinguishing differences between a practitioner and an engineer is that the latter needs to be able to do systematic debugging to identify and fix the problem. I often ask folks during interviews how they would proceed when their ML model to do X doesn't work as expected. This usually reveals the difference between practitioner and engineer with high accuracy.

To transition from engineer to scientist, one needs to know a much broader body of prior art and math beyond what is directly relevant. A typical interview that would differentiate between engineer and researcher would be to discuss some known unsolved problem and see if the person can elaborate on details of prior art, opinions of top researchers, wins achieved so far and directions for future efforts.

This is again not to say that practitioners can't do debugging OR that engineers can't make groundbreaking innovations. Faraday, Edison, the Wright Brothers - all either started out as practitioners or engineers and successfully attacked some of the hardest unsolved problems of their time, never obtaining formal academic credentials. On the other hand, far more academics have broken new ground than non-academics. At the end of the day it's all probabilities. Claiming that a PhD is no longer required to do "AI" doesn't reveal all these subtleties, and it marginalizes the tools and foundation that a PhD equips oneself with for this journey.


Andrew Ng's course definitely requires some linear algebra knowledge, and explains how things work under the hood. Same goes for fast.ai

I think even that level of knowledge is not always necessary. Often, just having the intuitive understanding can be enough to get great results, e.g. understand that this is a black box that takes features in, predicts things, and you have to help it by giving good features; or understand that word2vec builds vector representations based on words that co-occur in the same context.
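To make the word2vec intuition concrete, here's a toy, stdlib-only sketch (the corpus and window size are made up): represent each word by counts of the words appearing near it, and words used in similar contexts come out with similar vectors.

```python
from collections import Counter

# Toy sketch of the word2vec intuition: represent each word by counts of
# the words that co-occur with it in a small window. Words used in
# similar contexts ("cat"/"dog") end up with similar vectors.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "stocks fell on the news",
    "stocks rose on the report",
]

def context_vectors(sentences, window=2):
    vectors = {}
    for sentence in sentences:
        words = sentence.split()
        for i, word in enumerate(words):
            context = words[max(0, i - window):i] + words[i + 1:i + 1 + window]
            vectors.setdefault(word, Counter()).update(context)
    return vectors

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in set(a) | set(b))
    norm = lambda v: sum(x * x for x in v.values()) ** 0.5
    return dot / (norm(a) * norm(b))

vecs = context_vectors(corpus)
# "cat" and "dog" share their contexts; "cat" and "stocks" barely do
print(cosine(vecs["cat"], vecs["dog"]) > cosine(vecs["cat"], vecs["stocks"]))  # True
```

Real word2vec learns dense vectors with a shallow neural network rather than raw counts, but the co-occurrence signal it exploits is the same.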


Speaking of learning/reviewing linear algebra, I wrote the NO BULLSHIT guide to LINEAR ALGEBRA which covers all the material from first year in a very concise manner.

preview: https://minireference.com/static/excerpts/noBSguide2LA_previ...
condensed 4-page tutorial: https://minireference.com/static/tutorials/linear_algebra_in...
reviews on amazon: https://www.amazon.com/dp/0992001021/noBSLA


Hey, I just finished your book. Loved it! I am out of uni already and used it to brush up.


Nice! Did you work through some of the exercises and problems? Don't be a tourist in the land of math!

If you don't want to bother with pen-and-paper (which is the best, but takes very long), you should at least try some problems using the computer-assisted approach. See sample solutions to the first few chapters: https://github.com/minireference/noBSLAnotebooks


Linear algebra and even matrix calculus are undergrad-level math.


>Andrew Ng's course definitely requires some linear algebra knowledge,

Extremely basic. Anyone with an undergrad engineering degree will have covered what's needed for the course. His course was almost patronizing, calling people who knew the basics of calculus "advanced students".


I got halfway through his course and it seemed to do a decent enough job of teaching it as you go. The main thing I struggled with was the mathematical notation.


Throwing ML at something and seeing what sticks is how one ends up with systems that no-one understands and which have horrible impacts in the real world.


Do you understand how a hammer works? I mean, do you really understand how a hammer works?

Throwing X at Y and seeing what sticks is literally how life got from single-cell organisms to "The Apprentice".

We'll keep doing it that way, it's going to be fantastic.


Is such a course enough to be able to discern what data to best collect for this and that ML model?


> Is such a course enough to be able to discern what data to best collect for this and that ML model?

For any arbitrary range of values for $DATA_TO_COLLECT and $ML_MODEL? Maybe not. For many real world scenarios? I'd say yes. Heck, sometimes the data you have available simply "is what it is" and there is no question of "what data to best collect".

Let me also reiterate that learning to use ML is a continuum... and I would not posit that one could take Andrew's class, or the fast.ai class, and claim to be "done" with their ML education. To me either (or both) of those is just step one on a journey that could continue more or less forever (or until ASI comes along and enslaves the human race, a la The Matrix).

Personally I took Andrew's (original) Coursera course, but since then I've continued to study Linear Algebra and Calculus through other forums, have been working through several other ML books, and have a whole litany of stuff queued up to study, up to and including the aforementioned Measure Theory, Complex Analysis, etc. I definitely value that stuff, I just think you can accomplish some useful tasks before you get to that level.


You would at least be able to say: No, we won't train an LSTM on 5000 datapoints (unless there's some heavy transfer learning going on).

Once you get your hands dirty, and need to debug unexpected issues, understanding what happens under the hood becomes more important.
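The LSTM-on-5000-datapoints point is easy to make concrete with a back-of-envelope count. The input and hidden sizes below are hypothetical; the formula is the standard per-layer LSTM weight count, 4(h(h + x) + h), covering the four gates.

```python
# Rough parameter count for a single LSTM layer: each of the four gates
# has an h x (h + x) weight matrix plus a bias vector of length h.
def lstm_params(input_size, hidden_size):
    return 4 * (hidden_size * (hidden_size + input_size) + hidden_size)

params = lstm_params(input_size=100, hidden_size=128)
print(params)        # 117248 weights for one modest layer
print(params / 5000) # ~23 weights per training example -- a recipe for overfitting
```

With an order of magnitude more parameters than examples, the network can memorize the training set without learning anything that generalizes, which is exactly why you'd reach for transfer learning instead.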


I often make this point (if you don't have a grad-level understanding...). And people get pissy. Same with statistics ("if you don't have a PhD in stats, you don't understand stats").

--

I had someone contact me for a "consulting" position, to replace their rule-based insurance claim denial system, with "ML". Contract-work, nobody in-house to review its performance, no insight to the problems of the domain and they wanted to throw ML at it to "solve their problem".

When I refused and pointed out that it would be highly irresponsible to contract out this type of work, particularly given its life-or-death implications, the CEO got angry. He told me it's "post-claims processing" and it isn't life or death. To which I said: bullshit; your denial of claims is going to directly influence how doctors practice their treatment. There is a direct feedback cycle. The fact that you don't see it makes it even more dangerous.

They simply didn't understand how HIGHLY inappropriate it was to just throw random ML at a problem, particularly as a one-off consulting project (with no in-house expertise).

^ that's real-world.

Personally I don't think you should be allowed anywhere near ML UNLESS you have that PhD in computer science. I don't even think you should be allowed to HIRE people for ML until you fully understand the hazards of letting a computer control critical decisions.

So yeah, with respect to your distinction of "ML research" and "applied ML".

"NO."


> Personally I don't think you should be allowed anywhere near ML, UNLESS you have that PHD in computer science

A lot of ML research is done by PhDs in other fields. I did research that focused on developing compact group invariant features (for neural networks) for predicting local atomic energies in materials science, and a few mathematicians I follow did work on developing convolutional neural networks that utilize Clebsch–Gordan coefficients to generalize the translational invariance to other symmetries.

On the contrary, a lot of CS machine learning research is heavily application focused (generate such-and-such new thing using a GAN). If anything, mathematicians are the ones who understand machine learning at the very deepest level of theory. This isn't to say there aren't many theory-focused CS presentations/publications; I'm just refuting your point that highly theoretical machine learning research is purely the domain of CS PhDs.


This seems like it misses the point. Obviously there are professional numerical analysts, statisticians, mathematical physicists, etc. who have sufficient background and interest to keep up with cutting edge research and do solid work in machine learning. The argument is not that everyone needs a CS degree per se, but rather that you shouldn’t have your excel guy who just went through a machine learning MOOC but has no further training or deeper understanding try to apply machine learning to life-or-death problems.


> shouldn’t have your excel guy who just went through a machine learning MOOC but has no further training or deeper understanding try to apply machine learning to life-or-death problems.

Sure, but most problems aren't life-or-death. They're mundane problems related to improving a business process for a widget manufacturer, etc.


> Personally I don't think you should be allowed anywhere near ML, UNLESS you have that PHD in computer science.

I'd rather have people have a PhD in logic or ethics, so they hopefully won't make, for example, racist programs without even thinking about it. Unfortunately, so far I don't have a lot of faith in computer science as a field when it comes to ethics, as it's all about the technical challenge. That something can be built does not mean it should be built.


> So yeah, with respect to your distinction of "ML research" and "applied ML". "NO."

Interesting. I don't see anything in the story above that supports that position. There's nothing about the scenario you described that would be affected by having, or not having, a PhD in CS. If anything, as somebody else pointed out, this is more of a question of philosophy or ethics.

It's also a pretty niche example, which is not representative of the kinds of things that ML can be used for. If you want somebody to work on pricing optimization or a recommender system for your e-commerce site, you really don't need somebody who's doing cutting edge ML research.


For the people saying that incomplete knowledge of theoretical foundations will lead to dangerous or "bad" AI - go read some papers from respected academics and 'industry leaders'. Incomplete knowledge is something we have to deal with one way or the other.

For the people saying that people won't be able to do anything useful after a 7 week course, go look at what beginner programmers are creating these days. There are so many resources out there now. I am constantly surprised at the impressive and practically valuable scripts that programmers are writing after a several week course. I was excited to get a hangman game working weeks after first learning to code - but now it's not uncommon to build multi-player games or complicated, good looking web apps with less experience. We've definitely moved a level of abstraction higher in the last decade - it just wasn't as clear cut as MIPS to Java.

I think the highest value created here might be in the really non-sexy "oh so that's what 'AI' means" revelations that more and more people are having every day. Before I spent two years studying ML I had the same crazy ideas around AGI that many people with no understanding have. And I got super lost in the same non-productive discussions about what AI might do, what it could currently do, and what it all actually meant. If courses like fast.ai can get a critical mass of people to understand core ML concepts like classification and overfitting and accuracy and precision and recall I think we'd collectively get closer to stopping the hype train and focusing attention on where advancement is actually possible and currently happening.

I know dozens of people taking this course, from all walks of life, and I am convinced that they're all gaining useful knowledge, and are likely to benefit society in some way as well. Maybe they're not going to invent GANs from scratch, but neither is every one who learned what an if statement is going to turn into Linus. Gate keeping isn't cool. This course is (though it actively claims that it isn't).


> For the people saying that incomplete knowledge of theoretical foundations will lead to dangerous or "bad" AI - go read some papers from respected academics and 'industry leaders'.

It's not models that would necessarily be bad when made by something like "script-machine-learners". What has a lot of potential for a lot of badness is taking these models as given by "magic". Machine learning is essentially a giant exercise in piecing-together correlations by sort-of clever, sort-of brute-force methods. What's hard is knowing the proper application areas and the improper application areas.
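That point is easy to demonstrate with made-up data: generate enough purely random "features" and some will correlate strongly with any target by chance alone, which is exactly the trap of treating mined correlations as magic.

```python
import random

random.seed(0)

# Pearson correlation, computed by hand to stay stdlib-only
def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# 200 random "features" with 20 observations each, and a random target:
# none of them has any real relationship with the target.
target = [random.random() for _ in range(20)]
features = [[random.random() for _ in range(20)] for _ in range(200)]

best = max(abs(pearson(f, target)) for f in features)
print(best > 0.4)  # True: a "strong" correlation found in pure noise
```

Knowing that this happens, and how to guard against it (holdout data, multiple-comparison corrections, domain sanity checks), is the part that the "script-machine-learner" tends to miss.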


This kind of looks like submarine marketing ala pg [1]. But I'm very interested in this topic nevertheless. Has anyone taken this course? Or if not, would anyone be able to recommend the current best resources for acquiring useful AI / machine learning skills, for the average software engineer?

[1] http://www.paulgraham.com/submarine.html


Hi, I wrote the piece. No PR was involved. I’ve been following fast.ai for a while and I thought it was part of an interesting trend.

My basic rule with stories is that they must come from me casting my attention out into the world, not via a bidding war for it in my inbox. I run a whitelist and never read inbound PR. I only email comms people when I a) can’t find a person’s details b) the person asks me to


Does the economist have a standard definition of AI they run with, and/or do you? I find a lot of technologies get bundled together/labeled AI and it makes it hard to discuss or analyze such a broad swathe of things that could be AI depending on how the org has labeled them.


I've taken the course. It's the best out there. Jeremy and Rachel do a phenomenal job and they're taking a contrarian stance to a lot of others in how they encourage anyone with even a little bit of programming experience to go for it. It's also a free course so it's not like they'd make tons of cash on covert advertising operations...


Have you been able to apply the learnings to any problems at work, or to switch jobs?


From taking the course I realized that I wanted to do deep learning rather than what I was going into otherwise (likely some other kind of data science or fintech). I quit my job to go harder on deep learning studying and apply to jobs, but I ended up getting excited about doing a startup.

Now I'm working on Stowbots (stowbots.com) and trying to bring the power of deep learning to everyone's internet browsing.


Awesome. Good luck!


I started with Andrew Ng's Coursera class. There are 5 different courses and I went through the first course and 3/4 through the 4th on conv nets. I would say he does a really good job of explaining the basics but the code is a little dated as AI is moving so fast.

I'm a few weeks into Fast.ai and the professor keeps it high level and has more focus on application than theory. You jump straight into working on image classification, which is broken down into clear, concise steps that give you some confidence moving forward. If you do take the course, make sure to plug in your own data and get a feel for how everything works before jumping to the next video.

I would say it's good to have some knowledge of the basics, so watch a few weeks' worth of Andrew's course while it's still free (7-day trial). Skip over the programming assignments because they are too low level and verbose. Then when you think you have a good sense of how things work, jump over to Fast.ai and start coding.


> would anyone be able to recommend the current best resources for acquiring useful AI / machine learning skills, for the average software engineer?

EdX has a good one; I wrote about my experience of it here: https://gaius.tech/2018/08/08/microsoft-professional-program...

The tooling and libraries are very good now; assuming you already have the data in good shape, you could do something commercially useful with Keras in a day or two from first picking it up. There is still quite a lot of low-hanging fruit around, but these skills are well on their way to becoming commoditized, so I don’t know how much longer that will last.


What is the course you are referring to? This is a pay walled article.



fast.ai is known for extensive marketing here on HN.


Can you be specific?

fast.ai has no revenue at all, no investment dollars at all, no marketing team, and no PR budget. It's entirely self-funded out of my pocket personally in order to help people. The courses are all free, there are no ads, and the software is all open source.

Neither I nor anyone else at fast.ai has ever asked anyone to comment on an HN post, or to post anything to HN.

edit: Just noticed that last time fast.ai was mentioned on HN you made a similar claim. @dang and I both responded to you then. So this is not just some misunderstanding, but repeated and intentional falsehoods. https://news.ycombinator.com/item?id=18096807


They may just be genuinely unaware of how organically successful you guys are. But maybe I'm being overly charitable.


Or maybe many here (such as me) have done and loved the fast.ai course


An often untold story these days is that you still need specialist domain knowledge and a lot of your own data to make good use of fast.ai’s very clever lessons. Being able to achieve state-of-the-art results by copy-pasting and modding any of the Dogs vs Cats image classifiers out there, e.g. from Keras/Tensorflow, fast.ai/PyTorch, PyImageSearch/OpenCV etc., is worth almost nothing without your own business or research case, your own data and your own targets / metrics.


I see both arguments in this thread and elsewhere.


I disagree that you don't need a PhD, or some kind of 3-4 years of focused work on a particular topic in AI, to become an expert.

But the question is: are all PhDs equal? In my view many students are fraudulently awarded PhDs without enough rigorous work; previously those wouldn't get hired, but because of the AI hype they do. Also, many older PhDs brand themselves as AI experts even though they don't know much.

Hiring still runs on hype, and many biases (including gender bias) exist in industry (e.g. Facebook's Mark does not like to hire people over 30).


I don't think I really disagree with you with respect to becoming an expert, but what you think of as becoming an expert may be different from what others are thinking.

Personally, what I see is a lot of AI becoming commoditized. There was a time when you had to have a fairly strong understanding of compilers to program anything complicated. These days you can use a high-level programming language and never get down to the level of the compiler if you don't want to.

If someone wants to make lots of money or change the world with AI, my advice would not necessarily be to start with a PhD. It would be to focus on understanding data, getting very good with a library, and building apps that are useful to people.

If you do that, then getting additional knowledge about the mathematics furthers your career, but your career doesn't block on acquiring that knowledge.

If you do this, you probably won't have a great chance of working on a team at Google or Facebook improving the implementation of the AI infrastructure. Just as you probably wouldn't get a job at one of these companies optimizing the compiler if you didn't have an academic background or years of experience in compilers. But you could still work on other teams in those companies and make as much or more than those people and have a more direct impact than just making it marginally faster.

I have a PhD in pure mathematics, and I don't do ML. Getting a PhD is a particular journey, and it's not for everyone. Also, the idea of getting a PhD as a credential for industry seems a little odd to me, but that may be my personal bias.


Branching off of what you were saying, being an ML expert requires a high level of math typically only seen in academia, aka PhD required. Being an expert in using ML requires you to be an expert in your own field, and understanding your specific problem. Whether that requires a PhD is dependent on the field, but at the PhD level you end up collecting and analyzing significant amounts of data, and through that, understanding how to apply ML to that type of data, if applicable.

PhD for industry is useful for research and development positions in a variety of fields. Just like companies see bachelor's degrees as proof you can stick with something for 4 years and complete it, PhD degrees are proof you can develop and implement research procedures on a long time scale.


This article isn't about becoming an AI expert. It's about learning enough AI to be able to build applications that help the world.

Having taken Fast.ai's Deep Learning course, I can confidently say that their course is enough to help a software engineer with no previous AI experience build extremely powerful real-world AI applications.

Jeremy Howard (the cofounder of Fast.ai, the former president of Kaggle, and the former #1 Kaggle competitor) only has a B.A. in Philosophy. One student who started the Fast.ai course as a violinist is now working as a researcher at Google Brain. It may have been true in the past that you need many years of work to become competent at building AI, but that's not true anymore.


I think there’s a difference between mechanical sympathy and really knowing what’s going on mathematically.

IMO, parent is right. It's going the way computing did, i.e. most degree programmes basically teach you how to program. Many degrees don't even require maths anymore.

Hype drives demand, demand drives hiring... and always at the top there's a bunch of guys that don't know anything about ML at all :)


Yes, but couldn't the same be said about many aspects of software engineering? If you want to optimize a database query or some application code, yes, you need to understand what's going on under the hood, but to produce something useful you only need a fraction of that knowledge.


Yes, that's quite right IMO. Software has been a craft for ages. And I don’t mean it disparagingly, but the number of devs that can think from first principles, know the difference between computing and programming, or get “close to the metal”, is relatively small.

I think the same is true of marketing. The number of marketers that are also good researchers is tiny.

And so on.

The common denominator is short supply, high demand pressure, and in all I’m thankful for it because it keeps the wolf from the door for many of us.

But.. it leaves the deeper science/art (whatever yours be) untapped , and it adds so much noise to the market that gold becomes difficult to sell for peanuts.


Software seems to encourage shallow knowledge these days, with a constant churn of new frameworks / databases or whatever the latest trend is.


Why not less focused work over more years? The argument makes zero sense to me.

Of course a lot of study and also applying what you learned are important. But having a PhD says very little about a person's ability to do data science or machine learning work. Especially since most phds have an extreme focus on their narrow field.


The deep learning resurgence is hardly 4-5 years old. Also, to become an expert you need more focused work.


As a PL PhD, I've seen many of my peers go into ML. They actually have competence in it; it turns out many PhDs are just smart and curious (e.g. people like Jeff Dean, who also has a PL background).

I haven’t gotten into it myself, but that is more of an interest issue.


"it turns out many phds are just smart and curious" So are many non-PhDs, my friend :)


Yep. Having a PhD is just one indicator.


I disagree, and there is no data to support this.


What are you disagreeing with? Having a PhD is an indicator that you at least got a PhD. Better if it’s a good school or they know your advisor. You can’t take it for much more than that, but it isn’t an empty achievement either.

Also, there is no data for a lot of things.


It takes a special kind of something to think that there's no correlation between intelligence and having a PhD.


Just to confirm, PL means “programming language”?


Yes. Includes those who work on compilers, formal verification, functional languages, OO languages, and so on. Not the hottest topic these days (compared to ML).


From my experience, to be a good applied AI researcher / implementer it’s not very important that you have a PhD in a specific field but it is important that you have experience working with and debugging complex problems. A PhD is usually a good way to gain such experience as you will work on a challenging problem and have access to people that can teach you how to debug / get unstuck when you’re stuck.

Often I see people without such experience getting stuck on something when building a ML model and not being able to get unstuck as they lack the ability to properly debug the issue without external help.

I think it’s absolutely possible to teach people how to write models in Keras/Tensorflow etc. but IMHO it won’t do them much good unless they also learn to effectively debug their models.


Well, why would you need a PhD? A lot of applied engineering doesn't require you to be doing cutting edge things. You still need to spend time learning existing things in the field, but that is a lot less uncertain than finding something new to write about.

Also a lot of places where recent ML type stuff is useful requires some knowledge other than ML. For instance in quantitative finance this stuff might be useful, so a lot of people who've done finance will do a short course to complement existing skills rather than taking several years out to do research.


A similar HN submission came up a few days ago about how MOOCs can be a replacement for a PhD: https://news.ycombinator.com/item?id=18293418

Data-oriented MOOCs like Andrew Ng’s Coursera course on Machine Learning and fast.ai’s course on Deep Learning are good academic introductions to the theory and terminology behind data science and other related fields. Although MOOCs have many practice problems to solve, they don’t make you an expert in the field capable of handling messier real-world problems, nor claim to do so. (my longer blog post on the subject: https://minimaxir.com/2018/10/data-science-protips/)

More importantly, actually getting the job nowadays is near impossible without a Masters/PhD due to the competition. (the statistical trick with MOOC/boot camp job placement is that the candidate often has a Masters/PhD in a different field, not necessarily in AI)


I don't think MOOCs can come even close to be a replacement for a PhD (and from what I understand, neither do you). If a candidate who learned with MOOCs can apply for a job which previously "required" a PhD, then that requirement was simply misguided in the first place. A PhD in (AI, CS, stats or whatever else) does not teach you how to be a good all-around data-scientist, it teaches you how to conduct scientific research. In AI this means either developing or improving algorithms, theory work or applying AI to one particular problem for four years. That kind of expertise is not needed for most DS jobs and never was.

In my opinion however it is a good rule of thumb to assume that someone with a PhD in a relevant field will become a good data scientist after an adjustment period.


> If a candidate who learned with MOOCs can apply for a job which previously "required" a PhD, then that requirement was simply misguided in the first place.

Therein lies the problem. A lot of people I've talked with in leadership positions but without a statistical background believe that data science/AI requires a PhD, and since there's a healthy supply of candidates with the PhDs, there isn't much reason to reevaluate that position.

(that's more for traditional job positions; obviously research positions will benefit more from a PhD.)


The problem is that when you put people without formal training in charge of building and training models, you also get results that match - models with horrible biases, models that don't reflect reality at all, models that are worse than random guesses.

Running through a bunch of tutorials and then taking a deep learning toolkit (e.g. Tensorflow or Keras) and starting to feed data into it won't teach you squat about the importance of having a representative sample, about removing biases from the data, about correctly handling outliers, about doing some basic statistical analysis on the data to see whether they are even relevant to the problem at hand and so on and so forth.

Or even how to build a questionnaire/experiment so that you don't get only a load of expensive garbage instead of data out of it.

This is what a doctorate and the associated research training generally give you. Of course, it is not anything that couldn't be learned without doing a formal PhD, publishing research papers and defending a thesis but it is typically the background you won't get from these MOOCs or various online (and offline) "data science" trainings.
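The representative-sample point above can be shown with a deliberately tiny, made-up example: a convenience sample that over-represents one group produces a wildly wrong estimate, and nothing in the model-fitting step itself will warn you.

```python
from statistics import mean

# Made-up population: 20% of cases are positive (label 1), 80% are not.
population = [1] * 200 + [0] * 800

# A convenience sample that over-represents the positive group -- say,
# data collected only from cases that happened to get escalated for review.
biased_sample = [1] * 150 + [0] * 100

print(mean(population))     # 0.2 -- the true base rate
print(mean(biased_sample))  # 0.6 -- what a model trained on this sample sees
```

A model fit to the biased sample would happily converge, score well on a held-out slice of the same biased data, and still be wrong about the world, which is exactly the kind of failure that research training teaches you to look for.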


Great answer. What if you learned basic stats, calculus etc during a masters or BSci degree?


My take on this is what's needed to succeed in DS/ML in a commercial setting is a) domain knowledge b) the diligence to acquire, clean, and generally pre-process the data and c) some fairly basic knowledge of ML.

Someone with amazing ML skills but no knowledge of the problem domain will fail. Someone without domain knowledge will be unable to assemble and curate the data they need and will be unable to make any sense of the output. That is the Achilles heel of ML rockstars who think it's just a matter of getting enough data into their training set and the NN will do the rest.


Personally I liked Andrew Ng’s course much more. While fast.ai seems to have some practical gems hidden in it, there's just so much noise surrounding them.

So much time is wasted just by stating over and over again why their top-down approach is so great. And even more on how to install this and that, dealing with the command line, setting up stuff, using AWS etc. - things the "coders" the course is targeted at should be able to do anyway.

I always found myself jumping around the videos to find the useful parts (with the help of the time markers in the wiki) but then dropped it every time and did Andrew Ng’s course in combination with "the" deep learning book instead. They just have much less noise.


I did 3 years of PhD work in computer vision, then dropped out of the PhD program to work in finance, eventually found my way back to a career in deep learning for image processing and NLP and some other smaller stats problems in causal inference.

My undergrad degree was a very advanced pure math curriculum as well, so I had already done multiple years worth of linear algebra, measure theory and measure theoretic probability theory even before the PhD work.

I think in terms of being effective at utilizing machine learning or statistics in a company that creates products, there is absolutely no value whatsoever, emphatically none, not even in terms of mathematical thinking, formalism or ability to grok research publications, associated with measure theory, measure theoretic probability, formal derivations of common ML algorithms or optimization problems, theoretical topics in convergence, etc. None.

The absolute most critical thing you need is skepticism - a default assumption that your algorithms are not working until proven otherwise. After that you need a great understanding of all the complex failure cases that are possible, which includes tons of things that business people will not think of, from multi-collinearity to mode collapse to unsound reasoning based on p-values to overfitting to missing data treatments and so on.
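To make one of those failure cases concrete: near-duplicate predictors (multi-collinearity) silently destabilize coefficient estimates, and a condition-number check is a cheap way to catch it. A toy sketch with made-up data:

```python
import numpy as np

# Two predictors where the second is almost a copy of the first
# (synthetic data, purely for illustration).
rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=1e-6, size=100)  # nearly collinear with x1
X = np.column_stack([x1, x2])

# An enormous condition number of X'X is a red flag: tiny perturbations
# of the data can swing the fitted coefficients wildly.
cond = np.linalg.cond(X.T @ X)
print(f"condition number: {cond:.3e}")
```

A model can still fit such data with a great loss value, which is exactly why the business people in the room won't notice anything is wrong.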

If you can grok basic linear algebra and algorithms, can assemble modern machine learning library components efficiently and have good judgment about statistical fallacies and unsound statistical reasoning, then it does not matter what other credential you have at all, period.

In fact, I have worked with very decorated PhD level ML researchers who had such horrible programming skills that it was nearly impossible to incorporate their work into actual products. I’ve also worked with decorated PhD level ML researchers who did not understand basic things about general statistics outside the scope of loss function optimization, for example like topics in MCMC sampling, or cases where reasoning about a model’s goodness of fit needs to holistically consider residual analysis, outlier analysis, posterior predictive checking and plausible effect sizing from literature reviews. They argued and argued that purely optimizing log-loss (with appropriate controls for overfitting) should always be the best model, which is just very naive.

The people saying these things had PhDs in top programs, many publications and conference presentations, and usually considerable software engineering skills.

Truly, credentials in ML really don’t mean anything. It’s about work experience and what you know about pragmatically analyzing statistical problems in the service of product development, and academic training is just not a very important part of this.


I see a lot of companies asking how they can make use of AI/ML because they think they're supposed to. Aside from the sort of recommenders we've been doing forever, it's mostly just niche applications. ChatBots are already waning. I don't think we need all that many AI experts right now.


Chatbots need to go away faster - I've almost managed to punt on having to deal with them this cycle, if they get back into the trough of disillusionment soon, I'll be in the clear. Most people just want a super dumb dialog tree, but that's not sexy and whitepaper friendly.


Yeah. I did a voice doodad for a client not too long ago and it was just a whizzy version of a classic IVR system. Worked really well and no one was the wiser.


Just some thoughts, never took the course:

These AI-Bootcamps can, I would expect, only complement existing skills. I think they are a bit misleading, since I expect the typical day to day work for the graduates to be more Data-Sciency than ML-engineer (I would expect the graduates to be quickly out of breath in a real ML-(research?) engineer position). The challenges in many Data-Science roles lie mostly in understanding the data and not in the algorithms. Domain-Knowledge is also very important in many Data-Science (especially non-big data related) roles.

I think the day to day tasks of a lot of data-scientists are not what you usually associate with "AI".

Any real experience to validate/invalidate my thoughts?


Why is PhD valued so much for engaging in R&D? Pretty much any labs (Intel labs, Amazon Lab126, etc) require PhD. What is objectively beneficial for having a PhD in doing R&D? Is it the ability to conduct research - methods, process, analysis, discipline? Or is it the fact that they have knowledge in that particular field?

Every time I speak to someone with a PhD at work (I work at a top 10 tech company in the US), in 100% of the cases in my experience they do not engage in anything related to their thesis.

So why is there this arbitrary blockade for having a PhD in R&D/Labs?


> in 100% of the cases in my experience they do not engage in anything related to their thesis

This is missing the point of a PhD, which is to develop a capacity for doing science (or more generally hard thinking in the relevant domain). Most of my colleagues in neuroscience are physicists.


>This is missing the point of a PhD, which is to develop a capacity for doing science (or more generally hard thinking in relevant domain)

These days, it is more about the ability to publish. A few weeks ago I had a conversation with a friend of mine who is wrapping up his PhD. He pointed out that not one of his colleagues is concerned whether anyone can reproduce their work. They use a home grown simulation suite which only they have access to, and is constantly being updated with the worst software practices you can think of. No one in their team believes that the tool will give the same results they did 4 years ago. The troubling part is, no one sees that as being a problem. They got their papers published, and so the SW did its job.

(Should not be surprising to anyone who spends a lot of time in engineering PhD programs).


But those AI programs that started this discussion would have teams of engineers to do that SW correctly.

One interpretation of science is just making theories and testing them with experiments, no solid software engineering required. Another says reproducibility is important but the current incentive structure is not there yet.


Exactly this; this is also why people with multiple PhDs are incredibly rare (and viewed as odd ducks). The expectation is you've been trained to do "science" if you have a PhD, and thus a second one is redundant.


I hadn’t thought about the multi PhD case.. I’ve never met one. I’d expect someone to do a post doc even if it’s far afield from their PhD.


Spot on. Note also that the person taking on that second PhD took up a spot that could have been useful for someone else. Odd duck indeed.


Exactly. I have met plenty of PhDs that have switched fields from Physics to biology for example.


Because having a PhD is proof that you can do research, and there are so many PhDs available that Intel has absolutely no motivation to increase their hiring risk one iota by extending their net past that market. Even if a BA was only 0.000001% less likely to pan out than a PhD it would not be worth it. The only way this could change would be if the number of PhDs available to Intel was drastically reduced, which might happen if the president gets his way with immigration, or if the now commonly repeated message "getting a PhD will not be worth ten years when measured by career progress" sinks in to US students.


> Because having a PhD is proof that you can do research, and there are so many PhDs available that Intel has absolutely no motivation to increase their hiring risk one iota by extending their net past that market. Even if a BA was only 0.000001% less likely to pan out than a PhD it would not be worth it.

There's such a shortage of STEM graduates that PhDs get hired to do stuff you can actually do with a MOOC (or to carry a pager at Google).


I know way too many unemployed physicists to believe for one second that there is a shortage of STEM graduates. The shortage must be of some other trait or skill.


Are they trying to stay in Physics? I know ex-physicists who have switched to biology or programming.


How far does this go though? Suppose someone has a PhD in, say, social anthropology, and they did ethnographic research for their dissertation. They can do research, but it isn't really math-y. If they do a boot camp to get the ML/Stats stuff, does that count? Or does "PhD" really mean "PhD in STEM?"


A PhD in STEM, of course.


Given the salary difference between the two on average, if the BA was five nines as likely to be as good as the PhD you'd hire two of the BAs and you'd make out like gangbusters.


What salary difference? At most of the majors, PhDs are paid roughly equivalent to somebody with just a few years of industry experience.


A PhD is a degree in doing R&D, with an emphasis on the R. A thesis is just a demonstration that this set of skills was acquired in a discipline.

Just like how people check out your GitHub repository to see if you are a capable coder, people check out your refereed publication record to determine if you can do research. It's possible to build up a record of publications without a PhD, but it is a hard skill to acquire on your own.


Most PhDs are all R and no D. And you don’t necessarily need to build up a record of publications to do real world R&D outside of academia. In my experience having a PhD is just a crude filter for employers, because what else are you going to rely on? But unless the PhD thesis was in the exact same area as the real world R&D (quite rare), you can assume that a newly minted PhD will not be hitting the ground running.


Sure, but parent is talking specifically about heavy-R research labs (he mentions Intel labs, Amazon Lab126, etc), not more D-oriented data science or ML positions.


A lot of successful researchers lack PhDs; some of them are considered at the top of the ML field. It isn't really hard to acquire on your own, you just have to spend time at it (the main advantage of doing a PhD is time and mentorship).


A PhD is meant to give you the discipline you need to read literature properly, write your own proposal to expand the knowledge in your niche, devise and follow a sound protocol to test your hypotheses, execute experiments and simulations keeping sensitivity at bay, discuss your results against your own hypothesis and the accepted wisdom in your field. You do not need a PhD to do research, many advancements are even originated from non-PhD researchers, but you probably need a PhD to function properly in a rigid research environment.


This is a good answer.

I say this because most applied research is not done in a manner that would meet any kind of academic standards.

When you look at this history of the Valley, it has more to do with dropouts than PhDs - rather, it has more to do with 'those doing' than 'getting credentials'.

I almost feel that for the most part PhDs belong in Academia, and that whatever high-end work we do in industry should be called something else.


A PhD is not only about solving some specific problem. It is more about learning how to solve problems, and more importantly, choosing what problems to solve.

Having a PhD should mean that the person is capable of making consistent progress in ill-defined problems without much external support. Of course not all PhDs can do this, and you don't need a PhD to gain these skills, but they correlate strongly enough that it makes sense to just filter some jobs by whether or not someone has a PhD.


> Is it the ability to conduct research - methods, process, analysis, discipline?

I think this is largely the reason, that a PhD is seen as a semi-reliable guide to general research ability, though of course it's neither necessary nor sufficient, but there is a rough correlation.


I wonder if this could be taught as a 6 month course instead of having to grind through 4+ years of your life researching something obscure, impractical, and largely driven by funding incentives.

I feel like teaching research methods would be way more efficient - just without the experience. But we hire BS/MS engineers all the time without industry experience, so why wouldn't a research-focused Masters be sufficient?


> I wonder if this could be taught as a 6 month course instead of having to grind through 4+ years of your life researching something obscure impractical and largely driven by funding incentives.

4+ years of PhD (incl. a track record of solid publications) prove that you have the intellect, motivation and discipline to pull off a multi-year research project, similar to the ones employers expect you to work on when they hire you. It seems like an excellent filter for a research job.


Maybe. Assuming it's a research-focussed MA. Still, the problem is that research needs to be on something, and so they do also need to acquire knowledge at the same time (otherwise they're likely to be a glorified lab tech), so the things an MA can generally tackle are different from those a PhD can tackle.

That said, our department sends MAs into industry as well. Further our department as well as our University has been promoting undergraduates getting hands-on experience in doing research.


Getting a PhD is mostly about doing research. So, it's directly relevant.


I’d like to do research but it’s impossible to get into these labs because having a PhD is an absolute non-negotiable requirement.


It is not "absolute" as you say; you can get into most research labs with an MS if you are an exceptional candidate. My advice, if you really, really want to be a scientist in a major research lab, is to just bite the bullet and do a PhD.


You don’t just “do research” (nowadays) without having had training, which is precisely the point of a PhD.


a) it's not absolute

b) you could apply to phd programs and get a phd.

But yes, it's impossible to get into R&D labs if you've never done research. You can self-teach and publish in top conferences on your own (this is probably much harder than you think it is but is not impossible!), or you can defer income to get an education from one of the many structured educational programs designed to teach you how to do research.

I'm not sure why you expect industry R&D labs (as a researcher, that is) to hire people who do not already know how to do the job. They are companies, not guilds or universities. Software Engineering positions certainly require people to already know how to program, after all. How is this any different?


There is a significant difference in the way a person with a PhD approaches unsolved problems vs a person without a PhD.

A person with a PhD will typically like to invest a huge chunk of time to digest much of the prior art. They want to systematically understand what worked and what didn't in the past. They then bring their unique understanding of a small set of techniques/cross-domain expertise to combine with prior art to attack the problem. It is important for them to craft well-designed experiments and lean on tools like mathematics to guide their next steps. Their goal is to make sure anonymous reviewers of their work are satisfied with the proof they present of their hypothesis.

A person without a PhD typically doesn't want to take a lot of time going through years' worth of prior art. Their approach usually boils down to developing intuition via initial random search and then iterative experiments to strengthen that intuition. If this doesn't work out over a period of time, they will probably want to move on to something else instead of keep creating the next little insignificant bump of progress. The frame of mind is to get to near-instant gratification, as opposed to possibly spending all of your life on something that wouldn't make any impact in the real world but might build useful steps for future generations.

It is hard to say one approach is better than the other for all classes of problem. However, using the first approach requires certain training, rigor, discipline and, most importantly, a frame of mind - which we call a PhD. It is only recently, and in a few fields like AI, that research careers have become desirable from a financial perspective. In most other fields things are very different, and you would rarely see anyone demanding to be a theoretical mathematician or quantum physicist, for example, without a relevant PhD, while also accepting a modest income with no promotions or stock options.

However, it has become clear that for many advanced problems the random search approach typically maxes out quickly, because the search space becomes prohibitively large and developing correct intuitions requires climbing knowledge pyramids of epic proportions. For example, the invention of transistors, GPS or techniques like initialization or dropout in deep learning wouldn't have been possible without the PhD-style multi-generation approach and concrete mathematical understanding.

Having said that, I'm against hard-liners who create rules that being a researcher requires a PhD and that one without it cannot innovate or do creative work. Many complex and most valuable innovations have been done by non-PhDs such as Faraday, Edison and the Wright Brothers - interestingly, in all of these cases people with academic credentials gloriously failed working on the same turf, for a longer time and with better funding (Davy, Langley etc.).


My sibling comments miss something important: people actually do something for those 5+ years. They don't just sit on their hands. They're publishing papers, collaborating with respected researchers, building a personal network/reputation, developing new tools and techniques, writing peer reviews, contributing to grant proposals, giving talks, teaching, etc.

I.e., they are actually doing the exact job that they are now being hired to do.

> Is it the ability to conduct research - methods, process, analysis, discipline? Or is it the fact that they have knowledge in that particular field?

When you apply to R&D labs, you are competing against people who spent at least five years:

* building research infrastructure (which includes, but is not limited to, knowledge about "methods, process, analysis, discipline"),

* building a concrete track record of producing publishable research,

* building a professional network and some amount of respect for your previous research output, and

* gaining experience designing a multi-year research project (from writing/helping write a convincing proposal all the way through publishing, evangelizing, and managing relationships with people who want to build on your work).

Do you have 5+ first author publications in good conferences? Have you contributed to R&D software? Do you have multiple influential collaborators who have a high opinion of you and are likely to help popularize/contribute to/build upon your research goals? Have you written and followed through on grant proposals?

If "yes" to all those things, you should apply for these research positions! Even without a phd, you are a good candidate.

If all you have is a six month MOOC, or some R&D software development but no experience with the full research process, then sorry, but there are people who have already demonstrated they can successfully design and execute on research agendas. You might still be a good candidate, but preferring strictly more experienced and better qualified candidates who have demonstrated, empirically, that they can perform the full job is not an "arbitrary blockade".

> Every time I speak to someone with a PhD at work (I work at a top 10 tech company in the US), in 100% of the cases in my experience they do not engage in anything related to their thesis

"people I have worked with" is a huge selection bias.

IME, in CS this is almost always either a) not true, or b) a conscious choice on the part of the phd holder (and is more of a pivot than a "totally unrelated thing"; you just might not see that there's a deep connection between the thing that the person was doing before and what they're doing now).


Tell us the names of the PhD research groups that were critical in the success of Cisco, Microsoft, AirBnB, Uber, Oracle, Salesforce, Apple, Google (PageRank is a simple idea, not rocket science)?

Maybe Intel, but then there are arguments there as well.

The kind of 'R' that PhDs are trained to do is fairly academic and pure. It takes so long for that research to meet applied reality in the competitive sense. I know that Google has made some improvements with voice recognition etc., but it's not hugely material; it hasn't changed their business one bit.

I can't even think of a single important startup these days that was founded on key research.

Of course these people are important, and fairly pure 'R' is important, but the world is far, far more applied than most people like to think.

The kinds of 'professional' aspects you describe simply aren't necessary in most scenarios.

Also - remember that a lot of what researchers churn out isn't necessarily valuable, a lot of the results are not reproducible, and that 'publishing, giving talks, having a network' are not in and of themselves hyper relevant to making progress.

In many ways I feel that PhDs are to 'applied R&D' what MBAs are to business - it's all very academic :)


Frankly, your post isn't relevant to this thread.

Major R&D research labs exist to do the "R" in R&D. Many of them are explicitly not doing research that is intended to become a new business/product.

There are many models here, but the important point is that if you want to start a new business (unit), an industrial research lab is probably not a good fit. They mostly exist for other reasons.

If parent wants to start a new company or business unit, he should do that. But in that case, why is he applying to industrial R&D labs? He should be applying to y combinator instead...

Your post is a bit like saying that it's reasonable for a personal trainer to get a job as a doctor at a hospital because exercise is more important than surgery to most people's health. I.e., you're saying true things but kind of completely misses the point of the sorts of organizations we are talking about (research labs in large companies).

(I'll leave aside the ridiculous irony that you included Google in that list... in any case, each of those companies has benefited significantly from their internal R&D orgs.)


"Frankly, your post isn't relevant to this thread."

We're talking about the value of Phd's to high tech companies, so it's relevant.

Tell me how Facebook, AirBnB, Uber, Microsoft, Salesforce have materially gained from their R&D labs? They would be essentially the same companies without 'research style' R&D. Windows and MS Office do not require PhDs to design; surely there are material contributions made by those who hold a PhD, but the more classical R&D aspects have not been hugely relevant.

For some silicon companies it's probably true, and possibly in the medical field, wherein clinical research etc. is essential part of the product approach ... but for most of Silicon Valley style companies - it's not.

Companies tend to hire lots of PhDs after they are rich and flush with cash - not the other way around.

"it's reasonable for a personal trainer to get a job as a doctor at a hospital"

No - you need to be a doctor to do 'doctoring'; training to be a doctor is 100% relevant to the practice of actually being a doctor. But you definitely do not need to be a doctor, or even have PhDs on staff to make companies or products that are material relevant to people's wellbeing or fitness.

PhD-level, near-pure R&D is not essential to the vast majority of even tech startups.

We're on HN, i.e. a Y Combinator site - an entity that is really at the nexus of early stage investment. Care to run down the list of YC investments and see which ones are based on some kind of materially relevant PhD level research - and that are also successful? Because unfortunately there are very few.

I know of one offhand, Lyrebird - one of the more popularly significant investments that YC has made in AI. They make a really cool product that can take maybe a minute or so of your speech, and then re-create a text-to-speech voice that sounds similar to the recorded speaker. It's pretty cool, but not that great. It's been a while now; they have no real product that anyone cares about aside from its novelty, they don't have any real revenue, and they probably won't. Sadly, they will be acquired by some BigCo R&D shop and hopefully make some money for themselves. They are making great contributions to science, but they might as well be doing it out of U. Montreal, from whence they came.

Most R&D done by BigCos is just a little more applied than such research done at universities, but the results of such work benefit the commons as much as they do the companies anyhow. In a way, it's almost benevolent spending on the part of BigCos - somewhere to park their cash, maintain prestige. It's important in the 'general' context, but often not very directly related to the bottom line of said companies.


> But you definitely do not need to be a doctor, or even have PhDs on staff to make companies or products that are material relevant to people's wellbeing or fitness.

Again, research labs are not hiring people to build products. They are not even hiring people to do advanced development in most cases (and when they do, those supporting engineer roles typically don't require phds). They are hiring people to do heavy-R research.

> We're talking about the value of Phd's to high tech companies, so it's relevant.

No, we are talking about whether these companies should hire phds into their already existing research labs. Not even generally, but specifically (e.g., Intel Labs and Amazon).

The decision that these labs should do heavy-R research was already made for us by VPs or C-level folks. These labs already exist, their mandate is already spelled out, and now the question is "why won't they hire me to execute on that mandate?"

If parent disagrees with that mandate, why would he be interested in getting a job in one of these labs?

The question of whether these places should have heavy-R research labs in the first place is entirely orthogonal to this thread. Why? Because parent already stated he wants to work within these existing research organizations.

It sounds like you think typical industrial research labs are bad investments. That's fine. I disagree. More importantly, presumably, the parent commenter disagrees. After all, if he did agree with you, why the hell would he want to work in an R&D lab?

If parent agreed with you, why the hell would he be pining for a job in one of these labs? He wouldn't. He would just apply to YC.

So, again, litigating whether or not these heavy-R research labs should exist in the first place is an important topic but is not relevant to this thread, which is about the question: "why don't research orgs that have a corporate mandate to do heavy-R research hire non-phds?".

Extremely off-topic, as I've stated multiple times now, but maybe it will help: YC has HARC and (indirectly) OpenAI. Those are much more analogous to corporate research labs than YC investments. Are they wastes of money? Good question; but one thing is certain: if you think so, then you probably don't want a job at one :-)


For Google that would be Larry and Sergey. There's probably an untold number of PhDs that were and are working at Cisco, Microsoft, Oracle, Uber and Apple as well.


For the same reason there's an arbitrary blockade against awarding a PhD to someone with no formal training from a university.


The challenge with machine learning today isn't that it doesn't work -- many organizations have successfully applied it to their problems -- but rather many people struggle to use it.

There is massive potential if we can just make it easier to use the machine learning techniques that have already been proven to work. PhDs are useful if you're trying to use the state of the art, but that's not what the masses need to benefit from machine learning.

The scikit-learn library is a great example of this. It provides a clean fit()/predict() API that developers can leverage in their applications with little understanding of how the implementation of each algorithm works.
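For instance, the whole workflow fits in a few lines; the toy data and choice of estimator here are just illustrative:

```python
# Minimal sketch of scikit-learn's fit()/predict() pattern.
from sklearn.linear_model import LogisticRegression

X = [[0.0], [1.0], [2.0], [3.0]]  # one feature per sample
y = [0, 0, 1, 1]                  # binary labels

model = LogisticRegression()
model.fit(X, y)                   # estimate parameters from the data
preds = model.predict([[0.1], [2.9]])
print(list(preds))
```

Every estimator in the library exposes the same two calls, which is exactly what lets a developer swap algorithms without touching the rest of the application.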

Another area ripe to be made easier for new practitioners of machine learning is feature engineering. Without proper feature engineering it is difficult to create accurate models.

That is why I work on an open source Python library for automated feature engineering called Featuretools (https://github.com/featuretools/featuretools/). It can help when your raw data is still too granular for modeling or is spread across multiple tables. We have several demos you can run yourself to apply it to real datasets here: https://www.featuretools.com/demos.
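For intuition, here is a hand-rolled version (plain pandas, not Featuretools itself; the table and column names are made up) of the kind of per-entity aggregation that automated feature engineering generates from a too-granular transactions table:

```python
import pandas as pd

# Transaction-level data: too granular to feed to a model directly.
transactions = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 2],
    "amount": [10.0, 30.0, 5.0, 5.0, 20.0],
})

# Roll it up to one row per customer; these columns become model features.
features = transactions.groupby("customer_id")["amount"].agg(
    ["count", "mean", "max"]
)
print(features)
```

Automated tools enumerate many such aggregations (and compose them across related tables) so you don't have to write each one by hand.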

In the future, I expect even more tools to emerge to help with things like defining a specific prediction problem and extracting labeled training examples, frameworks for robust testing, etc.


just curious, can it select sets of features from predefined sets of features? like knowing which of the tuples (A,B,C), (D,E,F), (G,H,I) is the best tuple?



The comparison of AI to the Internet doesn’t really make sense: “nerds” are still the ones building out applications; only difference is that those applications can be used by a wider audience. The comparison could be that “nerds” build out AI systems, and that they’re used by wider audiences—this is what is already happening; look at assistants like Siri or Alexa, recommendation systems like YouTube’s, etc.

I think what this article is getting at is the notion that AI will become “professionalized”: the role of “AI specialist” will be as commonplace as software engineer; the salary premium will close. All of this is banking, however, on the idea that AI systems will be needed and implemented by most companies—which I disagree with.


Weird that they talk about this topic without mentioning market leaders such as https://www.datacamp.com/ and https://www.udacity.com/. Coursera also has some excellent courses. And of course Galvanize also teaches a course about data science.


Unpaywalled on Chinese website :)

http://www.tianfateng.cn/20740.html


Sara Hooker, who is prominently featured in this article as a success story, has rebutted it by pointing out that she has spent over 4 years in machine learning:

https://medium.com/@sarahooker/slow-learning-d9463f6a800b


There’s a difference between “building things with ML/AI” and “building ML” just like you don’t need to know how to design microprocessors to buy and use a computer. These articles always fail to report on the difference between the two.


Why not! A bit of benefit multiplied by a large number of applicants should pay some dividend for some time.

But there is a big difference between the craft and the science ... and PhD or not, relatively few practitioners have the chops for the science.


Finally. I never got why they had this limit. ML, even if you know the theory and the mathematics, isn't beyond the reach of undergrads. And if you black-box it, it can easily be done.


The only thing a Ph.D. does is teach you how to formulate and defend a thesis, according to traditional philosophical tenets, and possibly using the statistics and relevant courses you took while getting a masters. It doesn't take you further or deeper into your chosen path of study. It enables you to publish research articles, for when/if you become a professor, and to learn how to navigate ridiculous politics within University departments. That's it. If you want to build AI, then go learn how to do it. Don't waste your time getting a PhD. Your time will be much better spent and rewarded.


I don't have a Ph.D and I'm not pursuing one, but when you say "It doesn't take you further or deeper into your chosen path of study", that says to me that you don't understand what a Ph.D is...


Then how can you derive that opinion? I understand all too well what a PhD is, for several credible reasons. It's a doctorate of Philosophy. The reason for that degree being in Philosophy, rather than Marketing or whatever the chosen path of study is, is that what you're learning to do is create a theory, evaluate it in a philosophical manner using theorems, and defend it, whether it pans out or not. It's to teach the candidate how to do research and present it in research journals, which is what makes a Ph.D. more valuable than, say, an instructor or an adjunct professor, who doesn't need a Ph.D but can also teach graduate-level courses, as I have. Instructors at a University know as much as or more about the subject they teach as a Ph.D. And they often do know more. In some instances, much more. But they don't have the other two responsibilities tenure-track professors have, which are research and service, and which set tenure-track professors apart in salary and prestige. It's not a fair system, but schools don't represent the real world in any shape or form, so that's just how it is.


I know this isn't particularly constructive but, before reading did anyone else think they were making a new mass-market dialect of scheme/lisp to teach people AI programming?



