AIY Projects: Vision Kit announced - build your own intelligent camera

Keen Google fans will recognise the similarity between the Vision Kit and Google Clips, a recently announced smart camera.

The Vision Kit enables makers to build a similar device, but that’s just the start. AIY Projects is all about developing hackable AI kits for makers that they can integrate into their own projects.

“We look at how do we get these technologies in people’s hands in ways that are easy,” says Jess Holbrook, AIY Projects UX Research Lead.

“AI and machine learning are up on this pedestal,” he explains. Google wants to show makers that they can build pretty impressive stuff themselves. As with earlier AIY Projects, Google is interested to see what ideas makers come up with to use AI to “solve problems for themselves, their families, and their communities.”

AIY Projects Vision Kit bonnet

The Vision Kit adds an advanced AI hardware board to the Raspberry Pi. Developed by Google, it’s called the Vision Bonnet and sports a powerful Movidius MA2450 vision processing chip.

The chip acts as a “neural network accelerator,” says Billy Rutledge, Director of AIY Projects at Google. “In the case of Vision Kit, we are moving forward in a pretty big leap and running the AI neural network on the accessory board itself.”

This is in contrast to the earlier AIY Projects Voice Kit, which relied on Google Cloud infrastructure for voice recognition and natural language processing.

“We have developed a deep learning inference acceleration engine that we’re running on the chip,” explains Kai Yick from the AIY Projects team. “It’s 60 times faster than trying to do it on a Raspberry Pi 3.”


The projects you build will operate independently of a network connection, making for a more versatile piece of equipment. Because all images are processed locally on the device, the Vision Bonnet also helps keep captured images private.

Build your own Google Clips kit

Once you’ve assembled the Vision Kit, there are a number of neural network programs you can run. The first neural network is a “person, cat, and dog detector,” reveals Peter Malkin, Software Lead at AIY Projects. It detects if a person, cat, or dog is in the frame.

The second neural network is focused on facial emotions. It will detect happiness, sadness, and other sentiments.

The third neural network can identify 1,001 common objects, like a cup, an orange, or a chair. A label displays the object’s name and the neural network’s level of confidence in its inference.
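To give a feel for how a classifier's output becomes a name plus a confidence level, here is a minimal Python sketch. It is not the Vision Kit's software: the labels, the raw scores, and the helper names `softmax` and `top_labels` are all made up for illustration, and it assumes the network produces one raw score per object class.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the largest score for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_labels(scores, labels, k=3):
    """Return the k most likely (label, confidence) pairs, best first."""
    probs = softmax(scores)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical labels and raw scores, for illustration only.
LABELS = ["cup", "orange", "chair", "keyboard"]
RAW_SCORES = [2.0, 4.5, 0.5, 1.0]

for name, confidence in top_labels(RAW_SCORES, LABELS, k=2):
    print(f"{name}: {confidence:.0%}")
```

A real model would have around a thousand classes rather than four, but the idea is the same: rank the classes by probability and show the best few with their confidence.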


In the future, the hope is that users will be able to modify these neural networks. You could take the cat, dog, and person model and modify it to look for rabbits, then build a project that works with a rabbit hutch, for instance.

Google can’t wait to see what makers have in store for the AIY Projects Vision Kit. Pre-orders for the Vision Kit will begin in December at Micro Center.

The initial run is going to be very limited. If you’re interested, we advise you to order a kit as soon as it’s available. Sign up to our newsletter (in the footer of this page) for news.
