It suggests that because neural networks cannot properly separate the classes, they often mush them together in input-space, so the "panda" region of image-space has loads of other classes speckled through it. If you purposely find those speckles you can make it think that a panda is a vulture.
That blog mentions adding smoothness constraints, so that you can't suddenly go from panda to vulture with a tiny change in an image. But I wonder if an easier solution (ok, hack) is just to add different noise to the image 10 times or so, run it through the network and then combine the results.
Essentially you'd be doing Monte Carlo sampling of the image space around the input image. Or kind of blurring the image space.
Just an idea anyway. I don't really know what I'm talking about.
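For what it's worth, the averaging idea above can be sketched in a few lines. This is a toy illustration, not a tested defense: the classifier here is a stand-in random linear softmax, where a real setup would use the trained convnet's prediction function.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in classifier: a fixed random linear softmax over flattened pixels.
# In practice this would be the trained convnet's predict function.
W = rng.normal(size=(10, 224 * 224))

def predict_probs(image):
    """Return class probabilities for a flattened image."""
    logits = W @ image.ravel()
    e = np.exp(logits - logits.max())
    return e / e.sum()

def smoothed_predict(image, n_samples=10, sigma=0.05):
    """Average predictions over n_samples noisy copies of the image,
    approximating a Monte Carlo blur of the input space."""
    probs = [predict_probs(image + sigma * rng.normal(size=image.shape))
             for _ in range(n_samples)]
    return np.mean(probs, axis=0)

image = rng.random((224, 224))
smoothed = smoothed_predict(image)
```

The choice of noise scale `sigma` matters: too small and the speckles survive the averaging, too large and the legitimate signal is drowned out.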
My understanding from reading a few 'adversarial example' papers is that it's also possible to talk about the classifier as having a poor margin between classes. An acceptable output classifier for the convolutional net would be to place the decision boundary arbitrarily close to the known examples, but this doesn't make much sense given how large the semantic differences actually are.
There are multiple options to try to enhance the distance to the decision boundary (such as the adversarial examples referenced in the post), but I think the recent work from Microsoft (I think it was published Dec 2015) on replacing pooling layers in the convolutional neural net with something analogous to random forests might be the best option so far. More or less, each decision tree will end up pushing values to 0/1 in the non-linear region, which mitigates some of the concern about overly linear systems, and it places the decision boundary at a somewhat arbitrary location between classes. In aggregate, when these locations are combined, the resulting classifier has a larger margin without explicitly needing adversaries. So instead of sampling the image space, you're effectively sampling the classifier space.
If you want, I can dig up the citation, but searching for deep neural nets and random forests should get you to the paper all the same.
"(...) we introduce a stochastic and differentiable decision tree model, which steers the representation learning usually conducted in the initial layers of a (deep) convolutional network. (...)"
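The "stochastic and differentiable" routing in that quote can be sketched concretely. Below is a hypothetical depth-2 soft decision tree: each internal node routes left with probability sigmoid(w·x) instead of a hard threshold, so the whole thing is differentiable and could sit on top of a convnet's features. All the weights here are random placeholders, not the paper's actual model.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
dim, n_classes = 8, 3

# One weight vector per internal node of a depth-2 tree (root + 2 children).
w_root, w_left, w_right = rng.normal(size=(3, dim))

# Each of the 4 leaves holds a class distribution.
leaf_logits = rng.normal(size=(4, n_classes))
leaf_probs = np.exp(leaf_logits)
leaf_probs /= leaf_probs.sum(axis=1, keepdims=True)

def tree_predict(x):
    p_root = sigmoid(w_root @ x)   # P(go left at root)
    p_l = sigmoid(w_left @ x)      # P(go left at left child)
    p_r = sigmoid(w_right @ x)     # P(go left at right child)
    # Probability of reaching each of the four leaves, left to right.
    reach = np.array([p_root * p_l,
                      p_root * (1 - p_l),
                      (1 - p_root) * p_r,
                      (1 - p_root) * (1 - p_r)])
    # Output is a mixture of leaf distributions, weighted by reach probability.
    return reach @ leaf_probs

x = rng.normal(size=dim)
probs = tree_predict(x)
```

Because every operation is smooth, gradients flow through the routing decisions back into whatever network produced `x`.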
From the papers, training on images with random noise in them didn't help. But intentionally creating these adversarial images and training on them did help. And even improved the performance of the net in general.
I'm not sure if this provides any insight, but I think it's interesting to visualize this type of thing for the much simpler case of a two-class problem in two dimensions:
On the left are the training vectors, color-coded by label; the background color-codes the probability output by the learned network at each point in the plane. On the right is the gradient field of the network's output corresponding to the blue class. The gradient field shows, at any point, the local direction of greatest increase toward the blue class.
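A minimal version of that visualization can be reproduced with a plain logistic regression on two 2-D blobs (a linear model rather than the network in the figure, so this is only a sketch). The gradient of P(blue) with respect to the input has the closed form p(1-p)·w, which is what a quiver plot of the field would show.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two-class 2-D toy data: red blob at (-1, 0), blue blob at (+1, 0).
X = np.vstack([rng.normal((-1, 0), 0.3, size=(100, 2)),
               rng.normal((+1, 0), 0.3, size=(100, 2))])
y = np.array([0] * 100 + [1] * 100)   # 1 = "blue"

# Logistic regression trained by plain gradient descent.
w, b = np.zeros(2), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= 1.0 * (X.T @ (p - y)) / len(y)
    b -= 1.0 * np.mean(p - y)

def blue_prob(pt):
    return 1.0 / (1.0 + np.exp(-(pt @ w + b)))

def blue_gradient(pt):
    """Gradient of P(blue) w.r.t. the input point: p(1-p) * w."""
    p = blue_prob(pt)
    return p * (1 - p) * w

# Evaluate the field over a grid; this is what plt.quiver would draw.
grid = np.array([(gx, gy) for gx in np.linspace(-2, 2, 9)
                          for gy in np.linspace(-2, 2, 9)])
field = np.array([blue_gradient(pt) for pt in grid])
```

For a linear model the gradient points in the same direction (+x, toward blue) at every point in the plane, which is exactly the "everywhere deeper into blue means bluer" behavior the next comment notices.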
Interesting. So most of the training points for Blue were in the "bulge" near where they were surrounded by the Red class, but at almost every point, the network believes that moving further "deeper" into the Blue "zone" increases the "blueness".
I wonder if simulated saccading would address this problem?
Rank points in the image by entropy, and re-run the neural network centered on each of those points, with a radially increasing Gaussian blur applied to the subimage (approximating the acuity fall-off of the human visual field).
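One pass of that foveated view could be approximated as below. This is a rough sketch of the blur step only (the entropy ranking and re-classification are omitted), blending a sharp and a blurred copy of the image by distance from the fixation point; `fovea_radius` and the blur sigma are made-up parameters.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur via 1-D convolutions along each axis."""
    radius = int(3 * sigma)
    xs = np.arange(-radius, radius + 1)
    k = np.exp(-xs**2 / (2 * sigma**2))
    k /= k.sum()
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, rows)

def foveate(img, cx, cy, fovea_radius=20.0):
    """Sharp at the fixation point (cx, cy), increasingly blurred with
    distance, loosely mimicking human visual acuity."""
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    dist = np.hypot(yy - cy, xx - cx)
    weight = np.clip(dist / fovea_radius - 1.0, 0.0, 1.0)  # 0 = sharp, 1 = blurred
    return (1 - weight) * img + weight * gaussian_blur(img, sigma=3.0)

rng = np.random.default_rng(0)
img = rng.random((64, 64))
view = foveate(img, cx=32, cy=32)
```

The classifier would then be re-run on each foveated view and the predictions combined, somewhat like the noise-averaging idea earlier in the thread but with saccade-structured samples.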
The Imagenet-based networks already have bounding boxes to identify the region of interest. It's important, because many of the training images have multiple "things" in them, but in the '12 dataset, only one label. It's not as much "feature engineering" as "object discrimination in the training set".
That's not feature engineering, that's just a label for the task of interest (object localization).
Blanking out areas that are "not of interest" I would consider substantial feature engineering (unless the task you were training a net for was explicitly to find interesting vs. uninteresting areas).
There is a lot of stuff the human eye can't see that is very interesting. One of the challenges in medical imaging is getting accurate labelling of images. For many labelling tasks we see large inter- and intra-observer variability. We have both the problem that humans see something that is not interesting and the problem that they miss something that is interesting.
I currently work on estimating emphysema extent in CT lung scans. Emphysema can be very diffuse and it is not possible to label individual pixels, so instead we try to learn the local emphysema pattern from a global label. Neural networks are interesting for this problem because they learn the features, but it is also a "problem" because the features might not make physical sense, which could make it hard to transfer the model and convince clinicians that they should use it.
For that kind of task, you might want to filter out other things.
We should just be realistic.
We want to take a real image, except it might be tinkered with, and make the neural net tell us what we see in it, except we also want it to see what we can't see, and we want it to answer as accurately as possible, except we also want a short and definitive answer.
We also kind of want it to admit that image always contains more than one thing, but kind of don't.
Cool! So now I have this Python notebook running (I guess) on docker, and it tells me:
[I 08:37:21.591 NotebookApp] Writing notebook server cookie secret to /.local/share/jupyter/runtime/notebook_cookie_secret
[I 08:37:21.757 NotebookApp] Serving notebooks from local directory: /neural-nets
[I 08:37:21.758 NotebookApp] 0 active kernels
[I 08:37:21.759 NotebookApp] The IPython Notebook is running at: http://0.0.0.0:8888/
[I 08:37:21.759 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
What next? Should I be looking for a python notebook tutorial, or a docker tutorial, or both?
The author lists some commands to run, but it isn't clear where to type those commands. All I see is a terminal with the above output.
I don't know anything about Docker, but to use IPython Notebook you need to open it in a browser; that command line starts a local web server. "http://0.0.0.0:8888/" means it is listening on all network interfaces, so try http://localhost:8888/ in your browser. Usually Notebook opens the browser for you automatically when it starts; I guess Docker interferes with that (and you may need to make sure the container's port is published to the host).
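To make that concrete: the container's port has to be published to the host for the browser to reach it, and the author's commands would be typed in a second shell inside the container. The image name below is a placeholder, not the actual image from the article.

```shell
# Publish the container's notebook port, then browse to http://localhost:8888
docker run -p 8888:8888 some-neural-nets-image

# In another terminal, open a shell inside the running container
# (this is where the author's commands would be typed):
docker exec -it <container-id> /bin/bash
```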
I have a vague suspicion, without evidence, that many of these systems are trained to overfit their training sets horrendously.
I won't claim that all networks overfit the data, but I suspect the methods tend to be prone to overfitting more than other methods that have far fewer parameters to optimise.
this doesn't really matter in order to generate a bunch of fun pictures and visualisations.
Maybe, it's hard to say. I remember when my kids were really young they used to see really random things in clouds and sand or whatever - just like a neural net would.
Convnets are based on the structure of the visual cortex, so there are likely some low-level similarities. It's the high-level processing that I doubt.
Well we don't know that for sure. No one has been able to run backpropagation through the human brain, and find the exact right set of inputs that would fool us.
We know the human brain doesn't learn via backpropagation of errors. Biology can't implement a "global" algorithm of that sort, it needs a local, cell-to-cell learning rule.
Regardless, this has nothing to do with how the brain learns, but rather the function it has learned. Even if a neural network somehow used entirely local learning rules, it could still be exploited with this method.
This method is basically just fuzzing with a really efficient method to do it quickly. But in theory you could try every possible set of inputs into the human eye, and it's quite possible you would find images like this, where only slight changes to the inputs cause entirely different outputs.
But we can't try every possible set of inputs to human eyes. So we don't actually know how fragile human brains are. I suspect that brains use similar tricks to artificial neural networks, and learn similar functions.
Exactly. These neural nets have no concept of what a dog, vulture or trash can are, what properties they have, or even what they are like as three-dimensional objects. Many people reading about image classifiers assume that the networks are performing the same task humans do in the same way, but they are actually worlds apart.
The primary difference is that we have brains that do other things besides classifying images. We know things about the world that help us clarify ambiguous visual input. I think going forward that we're going to improve these things by linking more, different kinds of neural networks together.
Quite interesting. Sounds like he's subtly modifying the images at the points where the various classes are sensitive.
I don't think any humans are going to be fooled by this procedure, so what algorithm do people use to classify? There must be other algos out there that are more plausible as explanations for what humans (and animals) use?
> I don't think any humans are going to be fooled by this procedure, so what algorithm do people use to classify? There must be other algos out there that are more plausible as explanations for what humans (and animals) use?
Personally, it seems very plausible to me that humans and animals use generative modeling (which was behind the "human-level concept learning" paper published this month) rather than discriminative (like typical neural networks).
This implies human eyes are extremely robust. But the truth is our eyes can be tricked very easily too.
For what it's worth, of all the intelligent things the human brain performs, the visual system is the system that is closest to what neural nets are doing.
There are a number of obvious tricks that humans can use and simple NN classifiers can't.
For example, humans have access to facts, which allow for better use of context information. E.g. I know that pandas live in trees, and that the queen wears a crown.
It does appear that humans do better with limited training data. I have probably seen fewer pandas than the NN in the article, but I can identify them more reliably. It probably comes down to humans' superior ability to generalize training examples.
The way I understand it, humans simply use more than just a single neural network. Humans do more than just compare arrays of pixels to other arrays of pixels we've seen before. We have neurons that distinguish shapes, we have neurons that distinguish faces, we have neurons that distinguish textures, etc.
So for a human the question is more like: Does it have a shape like a Panda, does it have a face like a panda, does it have fur like a panda, and does it have coloring similar to a panda? Then it probably is a panda.
I also don't believe all of these systems are trained neural networks, but rather some of them are evolved instincts, hardwired and (near) unchangeable during a lifetime. Perfected over millions of years.
> I don't think any humans are going to be fooled by this procedure, so what algorithm do people use to classify? There must be other algos out there that are more plausible as explanations for what humans (and animals) use?
Well that's the thing, it's a bit of a philosophical paradox/conundrum actually :)
See, you can subtly tweak these images up to the point where an artificial neural network is really sure that it's a vulture and not a panda. It scores really high on "vulture", not just slightly over the threshold; it's possible to push it far into "most definitely a vulture" territory.
And--someone correct me if I'm wrong--I think I remember from other research that if you tweak an image into another category using the gradients of one neural net, it still scores very high on that other category if you try to classify it with a different neural net.
While our own biological neural network ... well, as you look at the image, you're really sure that it's a panda and not a vulture.
Now imagine the opposite. Let's assume we can fool our own biological neural networks in a similar manner. What would this experience be like?
Imagine you observe an image that to your eyes (or visual cortex) most definitely looks like a panda. Except it "actually", "really" is a picture of a vulture.
The question then becomes, if it looks like a panda to our biological/human neural networks, who or what is the authority that can say "no you are mistaken, this is actually a picture of a vulture, it just seems like a panda to your brain"?
Because it will fool human neural networks, most people will agree "yup looks like a panda to me".
So if all the humans are wrong, who gets to say what it "really" is a picture of? Maybe an artificial neural network? :-) Because we already have one of those. It's in the article. There's that picture that to our puny mistaken human neural nets definitely looks like a panda. But the artificial neural net in the article sees the "really real truth" and tells us that no, it really is a picture of a vulture, really, with mathematical certainty. ("wake up sheeple!", etc ;-) )
There are tons of anecdotes i remember from AI classes about neural nets learning something other than what you expected them to learn from your training set.
For example, one story involved training a classifier to recognize an overhead image with tanks vs without. It turns out it ended up learning which days were sunny and which were overcast.
This sort of thing happens when training people from examples too: From the mundane cases in school, to the AA587 crash in Queens NYC.
Nice sleuthing! I don't even recall where I first heard this story myself. The paper was an interesting read and still largely applicable even now in this 3rd or 4th coming of NN's.
It's kind of confusing, but table 2 shows what percent of these adversarial images generated on one network worked on another. It varies quite a bit, and many networks aren't similar enough to each other for it to work reliably. But there is definitely some degree of generalization.
You would lose information from the texture of the image that way. It's already downscaled to 224 pixels square, so you'd really be restricting the amount of data available to the function.
I should probably begin reading author names when starting to read articles - knowing this was written by Julia Evans would have clued me in on how fun a read this was going to be, and motivated me to cut down on the open-tab-procrastination hours!
> So now we’ve seen the network do a correct thing, and we’ve seen it make an adorable mistake by accident (the queen is wearing a shower cap ).
This is such a Julia sentence, including the suddenly appropriate emoji, it should probably have clued me in.
Edit: HN is breaking my heart with its anti-emoji stance, so (re)read the article to see the quote in its full glory.
This is a great example of why computer scientists get a bad rep.
Zero understanding of something so simple, presented as black magic. And a lot more time spent fiddling with the thing like a toddler playing with a toy than would be required to read two or three articles that explain it correctly.