Why are very few schools involved in deep learning research? Why are they still hooked on Bayesian methods?

First, this question assumes that every university should have a "deep learning" person. Deep learning is mostly used in vision (and to a lesser extent NLP), and many universities don't have such researchers, so they wouldn't have a deep learning researcher either.

One thing that people often forget is that academics have long careers (thanks to tenure, this is by design). So if you hire a bunch of researchers now who do deep learning, they're going to be around for decades. Academia tends to be conservative, so it's not going to stock up on deep learning researchers just because it's cool today. If this were the norm, CS departments would be full of fuzzy logic researchers hired in the 90s.

There's nothing magical about deep learning. It's one tool of many (including Bayesian methods, discriminative methods, etc.) you should have in your toolbox. Departments try to hire bright people, not those who slavishly follow every fad. Obviously, there will be more faculty who do deep learning in the near future. (If Facebook, Google, and Baidu don't all hire them first, that is.)

That said, there are lots of folks working in this area. Of the schools mentioned in the question, there's Noah Smith at UW and Katrin Erk at Texas. Other places (off the top of my head) that work in this area: UMass, JHU, Maryland, NYU, Montreal, Michigan, and TTI. I'm more upset that Princeton and Caltech (where I did my PhD and undergrad) don't have professors in CS who do language research. That's the bigger crime in my opinion, and is correlated with their lack of deep learning folks.

Blatant self-promotion ... Colorado has three folks working in this area: me, Mike Mozer, and Jim Martin.

Updated Mon. 11,170 views. Asked to answer by Nishant Prateek.

Cui Caihao, PhD Candidate in CS & IT

There is no conflict between these two methods. Deep learning and Bayesian methods are both useful machine learning tools for solving real-world problems. Deep learning allows computational models composed of multiple layers to learn representations of data with multiple levels of abstraction. In effect it is an automatic feature extractor, which can save a lot of engineering effort and domain expertise.

Bayesian methods are also used in some parts of deep learning, such as Bayesian nets. Some schools may look like they aren't involved in deep learning research, but they actually share the same knowledge base and philosophy in this area. Anyone who is good at machine learning or statistical learning will have no trouble doing research on deep learning.

Here is a paper about deep learning published last month in Nature: Page on nature.com. The authors are very famous right now, my friend: if you meet someone doing research in AI or ML who tells you they have never heard of any of them, you have an obligation to wake them up, LOL~

Here is a reply from Yann LeCun | Facebook

Written Mon. 1,362 views.

Jane Lee, Data mining for businesses and manage... (more)

I just want to quote Yann LeCun's answer on Facebook: https://www.facebook.com/yann.le...
The key ideas are: first, there's no opposition between "deep" and "Bayesian". Second, it takes time to acquire the skills and talent needed to be proficient in deep learning research.

Written 1am. 388 views.

There was a lot of hype in the 80s around what we now call "shallow" neural networks. I don't know why, but bio-inspired models in artificial intelligence seem to follow a cycle of popularity and discontent, whereas pure statistical methods seem to be less hyped but more constant in popularity.

Anyway, they are not so distant. The basic component of Hinton's deep belief network is the restricted Boltzmann machine, which is a flavour of the Boltzmann machine, which is a probabilistic model.
Statistically speaking, you can always see the state of a neuron as conditioned on the state of its inputs. The whole network state can be described in a probabilistic fashion.
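
To make that concrete, here is a rough sketch of how a hidden unit in a binary restricted Boltzmann machine turns on with a probability that depends on its inputs: P(h_j = 1 | v) = sigmoid(b_j + sum_i v_i * W_ij). The function names, weights, and biases below are made-up toy values for illustration, not taken from any particular paper or library.

```python
import math
import random


def sigmoid(x):
    """Logistic function, maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probabilities(visible, weights, hidden_bias):
    """Return P(h_j = 1 | v) for every hidden unit of a binary RBM.

    weights[i][j] connects visible unit i to hidden unit j.
    """
    probs = []
    for j, bias in enumerate(hidden_bias):
        activation = bias + sum(v * weights[i][j] for i, v in enumerate(visible))
        probs.append(sigmoid(activation))
    return probs


def sample_hidden(visible, weights, hidden_bias, rng):
    """Draw a binary hidden state: each unit fires with its conditional probability."""
    probs = hidden_probabilities(visible, weights, hidden_bias)
    return [1 if rng.random() < p else 0 for p in probs]


# Hypothetical toy model: 3 visible units, 2 hidden units.
weights = [[0.5, -1.0],
           [0.0, 2.0],
           [-0.5, 1.0]]
hidden_bias = [0.0, -1.0]
visible = [1, 0, 1]

probs = hidden_probabilities(visible, weights, hidden_bias)
sample = sample_hidden(visible, weights, hidden_bias, random.Random(0))
print(probs)   # the first entry is exactly 0.5, because its activation is 0
print(sample)  # one random draw of the hidden state
```

The point is only that the "neuron" here is a conditional Bernoulli variable, so the same object can be read as a network unit or as a piece of a probabilistic model.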

What is universally important for artificial intelligence is linear algebra (vector spaces), calculus (gradient descent), and probability theory (Bayes). Be worried only when these topics are neglected... :)
Also, I really see graph theory as a common feature of all advanced models in AI.

Piero,
PhD quitter who still loves neural models

Written Mon. 662 views.

I'm actually quite disturbed by the current use of the term. It reminds me of all the "high level" stuff in the 1980s, which wasn't really high level in any absolute sense, just relatively high compared to what preceded it. Now we have something being called "deep" just because it's a bit heavier than something else and "learning" just because it's a fashionable word to use. Why is everybody working toward a job in marketing these days?