Say what you will about Zuckerberg, but this is a really cool project! Steps he took:
1. Connect his home devices (lights, thermostat, doors, music player, TV, CCTV, etc.) to his computer, so he can turn them on and off and, in general, automate his house from his PC.
2. Add an NLP component to his computer, so he could text instructions to it and it would apply the correct automation task. This can learn preferences too and be told about mistakes so it does better in the future.
3. Add face recognition, so it can recognise people at his door and automatically open it if they are expected.
4. Add speech recognition, by creating a phone app that is constantly on and listening for his voice. Speech is then converted to automation tasks as well.
All in all, an interesting way to spend 100 hours. Calling it AI is a bit of a stretch though; this is basically linking up a bunch of inaccurate sensory parsers, coupled with some limited machine learning, connected to a bunch of if statements. Still a long way off from Jarvis, who is basically a genius by human standards!
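To make the "bunch of if statements" bit concrete, the glue layer is roughly this shape. A minimal Python sketch; the `home` object and the intent names are hypothetical stand-ins for whatever automation API and parsers are actually in use:

    # Hypothetical glue layer: `intent` and `args` come from the NLP/speech
    # parsers; `home` stands in for whatever home-automation API is in use.
    def dispatch(intent, args, home):
        if intent == "lights_on":
            home.lights.on(room=args.get("room", "all"))
        elif intent == "set_temperature":
            home.thermostat.set(args["degrees"])
        elif intent == "play_music":
            home.speakers.play(artist=args.get("artist"))
        elif intent == "open_door":
            # only when the face recogniser flagged the visitor as expected
            if args.get("visitor_expected"):
                home.front_door.unlock()
        else:
            home.phone.notify("Sorry, I don't know how to do that yet.")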
Mark replied to a comment along these exact lines. Quote below:
Mark Zuckerberg: That's the thing about AI. It's sort of like magic. We only call things AI that we don't understand yet. Once we understand something, it's just math.
But if you'd asked someone 30 years ago if a computer system that you could talk to, that could see your friends and let them in, that could learn your listening habits and figure out when to make you toast, if that was intelligent, then most people would have said that would be AI. Now that we know how to do it, it's just math.
In my parents' generation, and the generation that came after them (the one my siblings belong to, Gen X), I still think this differentiation between concepts of fictional AI was present. Consider for example the computer systems in Star Trek, and especially The Next Generation, and contrast how they worked with how Commander Data was depicted. The computer systems on Star Trek can be programmed using natural language, and by the 23rd century they're even capable of piloting an entire starship that would normally require a crew of hundreds. Yet these computer systems never engage in the kind of inner exploration (or "character development") that Commander Data goes through every episode. This is what we're seeing here: Jarvis is more like a primitive version of the USS Enterprise's central computer, whereas "true" artificial intelligence, the kind that makes complex decisions on its own and has the capability of self-awareness, still remains fiction.
Writers, at least, understood that there were different "flavors" of artificial intelligence. I have a funny feeling that at least science fiction enthusiasts also knew that AI would be separated in this way if it ever came to fruition.
Pardon my nerding out, but I think the difference between the computer system and Data was the degree of self-determination given to each system. When the holodeck was instructed to create an opponent capable of defeating Data, it created a sentient Moriarty AI. The computer system was fully capable of being a true AI; it was just never instructed to be.
Moreover, they make the point in one of the episodes (3x06, "Booby Trap", IIRC) that the ship's computer's intelligence is INTENTIONALLY limited. The reason that Data is unique isn't simply due to his status as a fully sapient being. It's because of his positronic brain.
The ship's computer being limited to turning on the lights is just one of many points in Star Trek where the writers intentionally avoid using the fictional technology to its full capabilities in order to keep the plot exciting and suitable for a mainstream audience. Transporters and replicators are another example: Suitably used they could be terrifying weapons.
> Transporters and replicators are another example: Suitably used they could be terrifying weapons.
One thing I was always wondering is why it took as long as ST:VOY to see transporters being used as a way to deliver armed torpedoes directly into the target.
In later seasons of Stargate SG-1 they sidestepped this problem cleverly - when humans received transporter technology from a more advanced species, they had non-overridable safeguards[0] (and an inspector on-board) preventing them from being used as a weapons delivery platform.
[0] - which, in the best fashion of this series, were overridden several times anyway. A lesson for those who think that if something is banned, it won't be used.
Star Trek, for all its technology, was always a story about humans. Even Data's journey is about as human-centric as you can get. In that way even the recent series have mirrored an older age of science fiction ("using science to opine on the human condition", etc.).
The Culture novels managed to convince me that a story about true AI needs to look not just a little different, but fundamentally different. Not least by answering: why are there still humans around?
And yet 30 years ago we could do all that with a whole lot of if statements. It would just return false positives a lot more. A couple of others have replied saying, as you did, that this is 'AI', but it's not. As soon as you try to use it for something it wasn't programmed for, it fails. It doesn't even understand what it is doing right now. This isn't me 'moving the goalposts' either; we have literally never achieved anything close to AI. It baffles me that people think this is even remotely AI, to be honest.
It's a very long read, but it will give you a much better overview of AI and the different "calibres" than I could possibly give in an HN comment.
What Mark has used here is "Artificial Narrow Intelligence" (ANI), the first of three levels/calibres of AI. Level 2 is "Artificial General Intelligence" (AGI), otherwise known as "Human-Level AI". And level 3 is "Artificial Superintelligence" (ASI), which Nick Bostrom defines as "an intellect that is much smarter than the best human brains in practically every field, including scientific creativity, general wisdom and social skills."
*Sigh*, this is why I don't usually get involved in arguments about whether something is considered AI or not; it just devolves into people trying to define and redefine what the word 'AI' even means.
Final word: I follow this stuff closely and read the research articles, and what we currently have is really great, but it's still just pulling the wool over your eyes. These techniques and algorithms trick you into thinking they are intelligent when in fact they are anything but. What we have now is a parlour trick in comparison to what the brain is doing. When true AI actually comes along, no one will doubt it.
Superintelligence will be achieved by:
a) Training a system to trick the smartest people on the planet that it's smart, and then
b) Asking it to prove itself by solving the Riemann hypothesis.
I was unaware that we knew how human intelligence works. </sarcasm>
If we can build systems that can fool us into thinking they're intelligent (which I will agree that, outside of limited scopes, we have not), then we've built intelligence.
What I see we have now is lots of bricks, often overly problem-specific (and it's important to stress this: a lot of things that get called "AI" in the news are really lots of compute thrown at super-specific heuristics). To call it AI, we still need to connect them together into a system that can use those components to learn and solve problems it didn't see before, without bricking itself in the process.
> parlour trick in comparison to what the brain is doing
That's a good turn of phrase, because it speaks to the real rub between algorithmic insufficiency and hardware insufficiency. From my reading, it seems fair to say that "strong general AI" will require some product of both to be realized and that our current hardware capabilities likely fall short of the minimum requirement (even with optimal algorithms).
Were they imagining exascale computing when Quicksort was dreamed up in 1959? No. But does that make it any less important?
According to [1] there was a system in 1968 that “consistently outperformed humans when presented with the same recognition tasks”.
As someone who was into programming and electronics in the eighties, I also feel a need to object whenever people say things like “we thought it was AI until we could make a computer do it, and now it’s just math” (paraphrasing).
We (those of us interested in the subject back then) had a pretty good idea about what would be possible in the future (because of faster machines and more storage) and what would not be possible in the future (unless a breakthrough was made in understanding intelligence).
Now, I do think we underestimated how fast things would develop, and especially the resources people have access to today via the internet, but none of what I have seen today would have been classified as AI by myself back in the eighties.
I also mucked around with computers in the 80s and agree with most of that. I guess a lot of whether stuff is AI comes down to how you define it. The Oxford English Dictionary has:
>[mass noun] The theory and development of computer systems able to perform tasks normally requiring human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages.
So obviously, by that definition, speech recognition and Zuck's type of face recognition are AI. I note the '68 system was hybrid and required humans to measure the faces and enter data for where the features are, so it was more about analysing numbers.
But as a data point, you could do it 10 years ago. On a cheap desktop PC. And without Internet connection or cloud computing.
Source: around 2006 I made a Star Trek-like music control system with the Microsoft Speech API and an electret microphone hand-soldered to a long cable and an audio jack. With ~30 minutes of training on my voice it worked as reliably in a noisy environment as Google works for me now.
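For comparison, the modern equivalent is a few lines with an off-the-shelf library. A rough sketch using Python's SpeechRecognition package (the phrase table is made up, and PocketSphinx stands in for the offline engine; this is not what I used in 2006):

    import speech_recognition as sr  # pip install SpeechRecognition pocketsphinx

    # Made-up phrase-to-command table for a music controller.
    COMMANDS = {"play music": "play", "stop music": "stop", "next track": "next"}

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source)  # helps in noisy rooms
        audio = recognizer.listen(source)

    # PocketSphinx runs fully offline, like the old desktop speech APIs did.
    heard = recognizer.recognize_sphinx(audio).lower()
    for phrase, command in COMMANDS.items():
        if phrase in heard:
            print("would send command:", command)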
I think what's particularly great about this, in a way, is that it's not too remarkable. I'm not knocking it, but it's awesome to be around when creating a natural-language, speech-controlled, available-anywhere system with face recognition to automate your own home is something that can be done in 100 hours.
I agree with the sentiment; the hours, of course, are just the assembly time. When people ask me the role of an architect, I try to impress upon them that well-architected subsystems will be readily composable into a variety of forms and uses. That amplifies the millions of hours of research and development into a tool that can be wielded by a motivated individual or team to tremendous advantage.
95% of the time, good software creation is really just construction, not engineering. There are great paradigms and solutions to the majority of problems. There's room for design and UX, but you shouldn't need to custom engineer the walls out of carbon fiber or some aerospace composites unless you're launching the house to Mars. The engineering aspect really only comes into play when no components meet your needs.
The construction metaphor is full of great wisdom for software developers to glean. And I'm not just saying that in hopes I didn't waste all those hours watching History Channel documentaries. :-D
For me a well designed house is actually, you know, about the design. And whilst the construction is important, thanks to standards it is largely a solved problem.
The analogy to software is apt. Well designed, well architected software is so important. Anyone can write syntactically proper code. Far harder to design a system that has just the right amount of extensibility, performance, elegance etc to meet the requirements without going over the top.
Ask owners of beautifully designed Frank Lloyd Wright houses how happy they are with the leaky roofs that plague them. Wright didn't understand construction, which led to many practical flaws when people tried to actually build his designs.
When you're talking about civil/mechanical engineering, good engineering is 'just construction' too. Your job is to take well known, tried-and-tested techniques and combine them in a safe and reliable way to produce a design which meets all of your requirements and can be built for your target price.
Engineering isn't really a place for more than very limited creativity. That place is R&D.
I think the remarkable bit is that he did this while being the CEO of Facebook. There's not a huge number of corporate CEOs who still code at all, and have time to put in a big project like this. And as someone who's worked with writing some of the same sort of code, it's really enjoyable to see someone drastically more talented making many of the same observations and encountering similar challenges.
Oh come on, it was clearly both. Try to tell us this excerpt is not an advertisement for the Facebook SDK:
> I'd learn a lot about the state of AI this year, but I didn't realize I would also learn so much about what it's like to be an engineer at Facebook. And it's impressive.
> My experience of ramping up in the Facebook codebase is probably pretty similar to what most new engineers here go through. I was consistently impressed by how well organized our code is, and how easy it was to find what you're looking for -- whether it's related to face recognition, speech recognition, the Messenger Bot Framework [messenger.com/platform] or iOS development...
The rest of that paragraph goes through all the other developer tools you can sign up for in the facebook ecosystem, including convenient links. Then it continues:
> One of our values is "move fast". That means you should be able to come here and build an app faster than you can anywhere else, including on your own. You should be able to come here and use our infra and AI tools to build things it would take you a long time to build on your own.
It was a neat project, not denying that, but there is no free lunch. The CEO of a Fortune 500 does not write up a 5-10 min read without plugging his interests.
You can enjoy what he's doing, but also recognize what it is.
I think it's worth pointing out that he started his mobile efforts by talking about how easy it was to make a Facebook Messenger bot, something they're heavily pushing... and then said shortly thereafter, to do what he wanted, he had to switch to making his own app, because Messenger couldn't do what he wanted.
If a Google exec wrote this post, it would be about how amazing a Google product was as a solution to the problem, and that would be the end of it. But here's Mark saying "you know what, Messenger is great but I needed more". And maybe that could turn into some pretty big advice for his Messenger team on what to add next, but he seemed to be pointing out where the industry (including his own company) could do better.
I don't think he intended it to be marketing, but kudos to him for being able to spin it as a way to promote Facebook too.
I see it more as R&D than marketing, personally. His personal experiments may well hint at new directions in which he wants to take his company. Depending on how you view Facebook, this could be terrifying or exciting.
One way I see it is: Facebook knows a lot about your (online) social interactions. Home-based AI would need to know a lot about your private behaviors and preferences. If they marry those two worlds, that's a heck of a lot of behavioral information to have. Google has a big lead in this area already, so if Facebook goes this direction too, it will be interesting to see how it plays out.
> it's awesome to be around when creating a natural-language, speech-controlled, available-anywhere system with face recognition to automate your own home is something that can be done in 100 hours
It's not creating. It's just copy-pasting and gluing.
The actual creation of these systems took decades of research.
I don't think calling it AI is a stretch. It's a system with sensors, actuators, a knowledge base, and a ruleset; plus it can learn and adapt its ruleset based on success/failure. That's the textbook definition of AI.
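In rough Python pseudocode, that textbook sense-think-act loop looks something like this (every object here is a hypothetical stand-in, not anything from Mark's post):

    # Hypothetical sense-think-act loop; every object here is a stand-in.
    while True:
        percept = sensors.read()             # cameras, mics, thermostat...
        action = rules.best_match(percept)   # knowledge base + ruleset
        actuators.execute(action)            # lights, locks, speakers...
        feedback = get_user_feedback()       # "no, wrong room", thumbs-up...
        if feedback is not None:
            rules.update(percept, action, feedback)  # learn and adapt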
You added one thing in the list that makes me agree - "it can learn and adapt".
Jarvis in my house uses sensors, actuators, lights, switches, etc - but mine is not AI. It only does what I tell it how to do. I'm pretty happy with the results, but it can't learn a thing.
The latest thing I "taught" my Jarvis to do is stream the security camera to the TV with a voice command. Now if I only had time to look into how to tell it to stop...
That being said, I don't think the ability to learn and adapt is required. This is entirely a statement on semantics, but if I made a tic-tac-toe playing program that had every possible game state pre-programmed in, I would not hesitate to call it a game-playing AI, even though it lacks the ability to learn.
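For what it's worth, here's a sketch of exactly that kind of non-learning game player in Python: memoized minimax, which amounts to pre-computing the best move for every reachable game state. No learning anywhere, yet most people would happily call it a game-playing AI:

    from functools import lru_cache

    LINES = [(0,1,2), (3,4,5), (6,7,8), (0,3,6), (1,4,7), (2,5,8), (0,4,8), (2,4,6)]

    def winner(board):
        for a, b, c in LINES:
            if board[a] != "." and board[a] == board[b] == board[c]:
                return board[a]
        return None

    @lru_cache(maxsize=None)  # this cache IS the "every game state" table
    def best_move(board, player):
        """Return (score, move) for `player` to play on `board`, a 9-char
        string of 'X', 'O' and '.' squares; score is +1/0/-1."""
        w = winner(board)
        if w:  # the previous player just won
            return (1 if w == player else -1), None
        moves = [i for i, cell in enumerate(board) if cell == "."]
        if not moves:
            return 0, None  # draw
        other = "O" if player == "X" else "X"
        options = []
        for m in moves:
            score, _ = best_move(board[:m] + player + board[m+1:], other)
            options.append((-score, m))  # opponent's gain is our loss
        return max(options)

    print(best_move(".........", "X"))  # -> (0, 0): tic-tac-toe is a draw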
In the general programming industry over the last several decades, "artificial intelligence" has also applied to simple decision-making algorithms, even as simple as playing checkers. Arthur Samuel [1] was renowned for his work on a checkers AI circa 1959.
As a follow-on to the above comment, in AI there is something called expert systems, which is just a set of IF-THEN rules which emulate the reasoning of a human expert: https://en.wikipedia.org/wiki/Expert_system
Learning/inferencing/adaptation not included or required.
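A toy version in Python, to show just how little is involved (the facts and rules are invented for illustration):

    # Toy expert system: IF-THEN rules applied to a working memory of
    # facts until nothing new fires. The rules themselves are invented.
    RULES = [
        (lambda f: "motion_at_door" in f and "face_recognised" in f,
         "unlock_door"),
        (lambda f: "motion_at_door" in f and "face_recognised" not in f,
         "notify_owner"),
        (lambda f: "owner_home" in f and "after_sunset" in f,
         "lights_on"),
    ]

    def infer(facts):
        derived = set(facts)
        changed = True
        while changed:  # forward chaining: loop until a fixed point
            changed = False
            for condition, conclusion in RULES:
                if condition(derived) and conclusion not in derived:
                    derived.add(conclusion)
                    changed = True
        return derived

    print(infer({"motion_at_door", "face_recognised"}))  # includes "unlock_door"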
This is something I like about AI in the context of video games; the technology has improved, the complexity has increased, but the meaning of AI in this context has stayed constant.
The same can't really be said of AI in most other contexts, where the goalposts continually seem to move to whatever is currently just beyond our reach.
> Add face recognition, so it can recognise people at his door and automatically open it if they are expected
I wonder if you could open his door with an A4 printout of his face. That's the sort of thing I'd worry about when creating a system like this.
Edit: Forgive me HN, it looks like a lot of people have wondered this. I only commented so high up because I commented before reading all the comments! Go upvote the more creative/indepth versions of this question rather than this one :)
There's a decent way to counter this method: use one of Intel's close-range RGBD cameras. I think they go by RealSense.
Now, do facial recognition with the depth component. "Oh look, paper. How quaint... NOPE!"
You could also log all the people coming by on a known-unknown axis. You could train the system so that all they need do is look at the door, and they get the access they need. Or it could instead default to "pass through to app for the user to decide".
Don't think of this as a huge monolithic system, but instead as dozens of component parts that can work together. Then the problem is certainly tractable. And then adding things like Google Now, Alexa, Siri, and similar is no longer damn near impossible.
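To make the depth idea concrete: a printout is essentially planar, while a real face has centimetres of relief, so the check can be a few lines. A sketch with NumPy, assuming a depth image in millimetres and a face box from whatever detector you already run:

    import numpy as np

    def looks_flat(depth, face_box, min_relief_mm=15):
        """Anti-printout check: `depth` is a depth image in millimetres
        (e.g. from a RealSense), `face_box` the detected face rectangle.
        A sheet of paper has almost no nose-to-cheek relief."""
        x, y, w, h = face_box
        region = depth[y:y + h, x:x + w].astype(float)
        region = region[region > 0]  # drop missing-depth pixels
        if region.size == 0:
            return True  # no depth data at all: treat as suspicious
        relief = np.percentile(region, 95) - np.percentile(region, 5)
        return relief < min_relief_mm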
I particularly like Node-RED for this very project. It is JavaScript; the big positive side effect, however, is easy integration with all sorts of Node.js APIs.
Ooh, you could do all that and, instead of a flat-out denial-based system, fall back to a fingerprint-scanning doorbell. Probably a weird conversation to have in the pub: "Alright lads, I'm gonna need you all to scan your fingerprints for me."
I already wrote a system that can take new facial inputs, classify you as "new user", and put your data in a database. All you need do is go up to the door. I can easily view the timecode and enter your details (name, phone#, email) at a later date.
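Not my actual code, but the shape of it is a few lines with the off-the-shelf face_recognition library (the enrollment-to-database part is elided; `known_encodings` stands in for it):

    import face_recognition  # the popular dlib-based library

    known_encodings = []  # in the real system these live in the database

    def classify_visitor(image_path, tolerance=0.6):
        image = face_recognition.load_image_file(image_path)
        encodings = face_recognition.face_encodings(image)
        if not encodings:
            return "no_face"
        matches = face_recognition.compare_faces(
            known_encodings, encodings[0], tolerance=tolerance)
        if any(matches):
            return "known_user"
        # unseen face: enroll it so details can be attached later
        known_encodings.append(encodings[0])
        return "new_user"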
Indeed. Security isn't a binary. There's plenty of layers. And we went from
"Print picture" to
"Compute Sfm 3d object from thousands of photos, and have access to a 3d printer that large that can print human-sized heads, and paint it appropriately in the correct color-space as to trigger the right facial recognition"
It's then trivial to add a 'blink' and 'smile' feature. We have public, free Haar and LBP cascades for those. It would be a few lines to add a check that requests those two actions.
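Roughly like this with OpenCV's stock cascades (a sketch; the caller would compare the counts across frames to see the blink or smile actually happen):

    import cv2

    # OpenCV ships ready-made cascades for faces, eyes and smiles.
    base = cv2.data.haarcascades
    face_cascade = cv2.CascadeClassifier(base + "haarcascade_frontalface_default.xml")
    eye_cascade = cv2.CascadeClassifier(base + "haarcascade_eye.xml")
    smile_cascade = cv2.CascadeClassifier(base + "haarcascade_smile.xml")

    def face_features(gray_frame):
        """Return (eye_count, smile_count) for the first detected face;
        a blink shows up as the eye count dropping to zero across frames,
        a smile as the smile count going positive."""
        for (x, y, w, h) in face_cascade.detectMultiScale(gray_frame, 1.3, 5):
            roi = gray_frame[y:y + h, x:x + w]
            eyes = eye_cascade.detectMultiScale(roi)
            smiles = smile_cascade.detectMultiScale(roi, 1.7, 20)
            return len(eyes), len(smiles)
        return None  # no face in this frame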
"Then you have to print an articulated skull and musculature that activate the appropriate smile/blink responses, with the correct coloring across the face."
And then hope that the video feed of the disembodied head trying to pass as human isn't just sent to the butler or a family member for "quality control"...
A somewhat surprising feature of suburbia in the US (for those of us who come from less safe countries) is that someone like Mark could ever have unbarred windows at street level directly facing a public street. In many places, this kind of recognition system would be at the outer gate of the complex, surrounded by thick concrete walls ending in an electrified fence or similar [1]. At that point, the face recognition does become the weaker link.
Also, even in US suburbia, smashing the window is likely going to trigger a burglar alarm and get the police called to the location, whereas bypassing the face recognition system will not.
[1] For a non-billionaire house: thinner concrete walls ending in broken glass or decorative-yet-functional metal barbs, with the outer wall shared by a 10-20 home neighborhood.
> Calling it AI is a bit of a stretch though; this is basically linking up a bunch of inaccurate sensory parsers, coupled with some limited machine learning, connected to a bunch of if statements.
The bar for "AI" (as distinct from "Jarvis-level AI") isn't high. Heck, video games have had "AIs" since the 1950s: https://en.wikipedia.org/wiki/Artificial_intelligence_(video... Not sure if everyone agrees with that definition, but's it the one in common use.
If so, then Machine Learning is just predicting the next possible result using the statistical probability of previously studied data, and could not qualify as AI either.
AI is being broken down into so many parts, some of them invisible to us, that either we abuse the term to the point of oversimplification, reclassify it as common sense, and redraw the finish line of where AI starts; or we start calling everything we do a version of AI, from computers turning off the lights to balancing the controls in the first Apollo missions.
If you're in a position to. It sounds like he plugged into Facebook's face recognition, for example, and I don't think that's available as an API request ('is this one of my friends')?