Newsfeeds from around the industry
Google Research Blog
The latest news on Google Research.

  • DeepDream - a code example for visualizing Neural Networks
    Posted by Alexander Mordvintsev, Software Engineer, Christopher Olah, Software Engineering Intern and Mike Tyka, Software Engineer

    Two weeks ago we blogged about a visualization tool designed to help us understand how neural networks work and what each layer has learned. In addition to gaining some insight into how these networks carry out classification tasks, we found that this process also generated some beautiful art.
    Top: Input image. Bottom: output image made using a network trained on places by MIT Computer Science and AI Laboratory.
    We have seen a lot of interest and received some great questions, from programmers and artists alike, about the details of how these visualizations are made. We have decided to open source the code we used to generate these images in an IPython notebook, so now you can make neural-network-inspired images yourself!

    The code is based on Caffe and uses available open source packages, and is designed to have as few dependencies as possible. To get started, you will need Caffe and the other dependencies listed in the notebook.

    Once you’re set up, you can supply an image and choose which layers in the network to enhance, how many iterations to apply and how far to zoom in. Alternatively, different pre-trained networks can be plugged in.

    It'll be interesting to see what imagery people are able to generate. If you post images to Google+, Facebook, or Twitter, be sure to tag them with #deepdream so other researchers can check them out too.

  • Google Computational Journalism Research Awards launch in Europe
    Posted by Andrea Held, Google University Relations & Matt Cooke, Google News Lab Europe

    Journalism is evolving fast in the digital age, and researchers across Europe are working on exciting projects to create innovative new tools and open source software that will support online journalism and benefit readers. As part of the wider Google Digital News Initiative (DNI), we invited academic researchers across Europe to submit proposals for the Computational Journalism Research Awards.

    After careful review by Google’s News Lab and Research teams, the following projects were selected:

    SCAN: Systematic Content Analysis of User Comments for Journalists
    Walid Maalej, Professor of Informatics, University of Hamburg
    Wiebke Loosen, Senior Researcher for Journalism, Hans-Bredow-Institute, Hamburg, Germany
    This project aims to develop a framework for the systematic, semi-automated analysis of audience feedback on journalistic content, in order to better reflect the voice of users, reduce the analysis effort, and help journalists generate new content from user comments.

    Event Thread Extraction for Viewpoint Analysis
    Ioana Manolescu, Senior Researcher, INRIA Saclay, France
    Xavier Tannier, Professor of Computer Science, University Paris-Sud, France
    The goal of the project is to automatically build topic "event threads" that will help journalists and citizens decode claims made by public figures, in order to distinguish between personal opinion, communication tools and deliberate distortions of reality.

    Computational Support for Creative Story Development by Journalists
    Neil Maiden, Professor of Systems Engineering
    George Brock, Professor of Journalism, City University London, UK
    This project will develop a new software prototype to implement creative search strategies that journalists could use to strengthen investigative storytelling more efficiently than with current news content management and search tools.

    We congratulate the recipients of these awards and we look forward to the results of their research. Each award includes funding of up to $60,000 in cash and $20,000 in computing credits on Google’s Cloud Platform. Stay tuned for updates on their progress.

  • Inceptionism: Going Deeper into Neural Networks
    Posted by Alexander Mordvintsev, Software Engineer, Christopher Olah, Software Engineering Intern and Mike Tyka, Software Engineer

    Artificial Neural Networks have spurred remarkable recent progress in image classification and speech recognition. But even though these are very useful tools based on well-known mathematical methods, we actually understand surprisingly little of why certain models work and others don’t. So let’s take a look at some simple techniques for peeking inside these networks.

    We train an artificial neural network by showing it millions of training examples and gradually adjusting the network parameters until it gives the classifications we want. The network typically consists of 10-30 stacked layers of artificial neurons. Each image is fed into the input layer, which then talks to the next layer, until eventually the “output” layer is reached. The network’s “answer” comes from this final output layer.
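
    As a toy illustration of "gradually adjusting the network parameters", here is a minimal sketch that trains a single artificial neuron by gradient descent. It is not the network discussed in this post: the data set, learning rate, step count and function names are all assumptions chosen for illustration, and real networks stack many such units in layers and learn from millions of examples.

```python
import math


def sigmoid(z):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Output of one artificial neuron for input vector x."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train(examples, steps=2000, lr=0.5):
    """Show the examples repeatedly, nudging the parameters a little each time."""
    weights, bias = [0.0] * len(examples[0][0]), 0.0
    for _ in range(steps):
        for x, label in examples:
            # Gradient of the log loss with respect to the neuron's pre-activation.
            error = predict(weights, bias, x) - label
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


# Hypothetical toy data: the label is 1 only when both inputs are on.
EXAMPLES = [([0.0, 0.0], 0), ([0.0, 1.0], 0), ([1.0, 0.0], 0), ([1.0, 1.0], 1)]
weights, bias = train(EXAMPLES)
```

    After training, `predict(weights, bias, x)` is above 0.5 only for the input `[1.0, 1.0]`, which is the classification the toy data asks for.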

    One of the challenges of neural networks is understanding what exactly goes on at each layer. We know that after training, each layer progressively extracts higher and higher-level features of the image, until the final layer essentially makes a decision on what the image shows. For example, the first layer may look for edges or corners. Intermediate layers interpret the basic features to look for overall shapes or components, like a door or a leaf. The final few layers assemble those into complete interpretations: these neurons activate in response to very complex things such as entire buildings or trees.

    One way to visualize what goes on is to turn the network upside down and ask it to enhance an input image in such a way as to elicit a particular interpretation. Say you want to know what sort of image would result in “Banana.” Start with an image full of random noise, then gradually tweak the image towards what the neural net considers a banana (see related work in [1], [2], [3], [4]). By itself, that doesn’t work very well, but it does if we impose a prior constraint that the image should have similar statistics to natural images, such as neighboring pixels needing to be correlated.
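
    The idea of tweaking a noise image toward a class while keeping neighboring pixels correlated can be sketched with a toy example. Everything here is an assumption for illustration: the "network" is replaced by a linear template score on a one-dimensional image, and the natural-image prior is replaced by a simple penalty on differences between neighboring pixels. It shows the mechanism only, not the method used to produce the images in this post.

```python
import random


def class_score(image, template):
    """Toy stand-in for the network's class output: a linear template match."""
    return sum(t * p for t, p in zip(template, image))


def roughness(image):
    """Sum of squared differences between neighboring pixels."""
    return sum((b - a) ** 2 for a, b in zip(image, image[1:]))


def roughness_gradient(image):
    """Gradient of roughness() with respect to each pixel."""
    grad = [0.0] * len(image)
    for i in range(len(image) - 1):
        diff = image[i + 1] - image[i]
        grad[i] -= 2 * diff
        grad[i + 1] += 2 * diff
    return grad


def ascend(image, template, steps=200, lr=0.05, prior_weight=1.0):
    """Gradient ascent on class_score minus prior_weight * roughness."""
    image = list(image)
    for _ in range(steps):
        rough_grad = roughness_gradient(image)
        # The gradient of the linear score with respect to the image is the template.
        image = [p + lr * (t - prior_weight * g)
                 for p, t, g in zip(image, template, rough_grad)]
    return image


random.seed(0)
TEMPLATE = [0.0, 0.2, 0.6, 1.0, 0.6, 0.2, 0.0, 0.0]  # hypothetical "class" pattern
noise = [random.uniform(-1, 1) for _ in TEMPLATE]
plain = ascend(noise, TEMPLATE, prior_weight=0.0)   # no prior: noise survives
smooth = ascend(noise, TEMPLATE, prior_weight=1.0)  # prior: neighbors correlated
```

    Both runs raise the class score, but only the run with the prior suppresses the initial noise, so `roughness(smooth)` ends up far below `roughness(plain)`.
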
    So here’s one surprise: neural networks that were trained to discriminate between different kinds of images have quite a bit of the information needed to generate images too. Check out some more examples across different classes:
    Why is this important? Well, we train networks by simply showing them many examples of what we want them to learn, hoping they extract the essence of the matter at hand (e.g., a fork needs a handle and 2-4 tines), and learn to ignore what doesn’t matter (a fork can be any shape, size, color or orientation). But how do you check that the network has correctly learned the right features? It can help to visualize the network’s representation of a fork.

    Indeed, in some cases, this reveals that the neural net isn’t quite looking for the thing we thought it was. For example, here’s what one neural net we designed thought dumbbells looked like:
    There are dumbbells in there all right, but it seems no picture of a dumbbell is complete without a muscular weightlifter there to lift them. In this case, the network failed to completely distill the essence of a dumbbell. Maybe it’s never been shown a dumbbell without an arm holding it. Visualization can help us correct these kinds of training mishaps.

    Instead of exactly prescribing which feature we want the network to amplify, we can also let the network make that decision. In this case we simply feed the network an arbitrary image or photo and let the network analyze the picture. We then pick a layer and ask the network to enhance whatever it detected. Each layer of the network deals with features at a different level of abstraction, so the complexity of features we generate depends on which layer we choose to enhance. For example, lower layers tend to produce strokes or simple ornament-like patterns, because those layers are sensitive to basic features such as edges and their orientations.
    Left: Original photo by Zachi Evenor. Right: processed by Günther Noack, Software Engineer
    Left: Original painting by Georges Seurat. Right: processed images by Matthew McNaughton, Software Engineer
    If we choose higher-level layers, which identify more sophisticated features in images, complex features or even whole objects tend to emerge. Again, we just start with an existing image and give it to our neural net. We ask the network: “Whatever you see there, I want more of it!” This creates a feedback loop: if a cloud looks a little bit like a bird, the network will make it look more like a bird. This in turn will make the network recognize the bird even more strongly on the next pass and so forth, until a highly detailed bird appears, seemingly out of nowhere.
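
    The "I want more of it" feedback loop can be sketched as gradient ascent on the strength of a layer's activations. This is a minimal sketch under stated assumptions: the layer is a tiny hand-written ReLU layer over a four-pixel "image", the objective is the sum of squared activations, and the step is normalized by the mean gradient magnitude. The weights, names and step size are hypothetical.

```python
def layer_activations(weights, image):
    """ReLU activations of a toy layer: one row of weights per feature detector."""
    return [max(0.0, sum(w * p for w, p in zip(row, image))) for row in weights]


def activation_energy(weights, image):
    """How strongly the layer responds: the sum of squared activations."""
    return sum(a * a for a in layer_activations(weights, image))


def dream_step(weights, image, step_size=0.1):
    """Nudge the image so that whatever the layer already detects gets stronger."""
    acts = layer_activations(weights, image)
    # Gradient of half the activation energy with respect to each pixel.
    grad = [sum(weights[j][i] * acts[j] for j in range(len(weights)))
            for i in range(len(image))]
    scale = sum(abs(g) for g in grad) / len(grad)
    if scale == 0.0:
        return list(image)  # the layer detects nothing, so there is nothing to amplify
    return [p + step_size * g / scale for p, g in zip(image, grad)]


# Hypothetical layer with two feature detectors over a 4-pixel image.
WEIGHTS = [[1.0, -1.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 1.0]]
image = [0.3, 0.1, 0.2, 0.0]
history = [activation_energy(WEIGHTS, image)]
for _ in range(5):
    image = dream_step(WEIGHTS, image)
    history.append(activation_energy(WEIGHTS, image))
```

    Each pass makes the layer respond more strongly than the last, so `history` increases at every step. That is the loop in which a faint bird-like cloud becomes steadily more bird-like.
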
    The results are intriguing: even a relatively simple neural network can be used to over-interpret an image, much as we enjoyed watching clouds and interpreting their random shapes when we were children. This network was trained mostly on images of animals, so naturally it tends to interpret shapes as animals. But because the data is stored at such a high abstraction, the results are an interesting remix of these learned features.
    Of course, we can do more than cloud watching with this technique. We can apply it to any kind of image. The results vary quite a bit with the kind of image, because the features present in the input bias the network towards certain interpretations. For example, horizon lines tend to get filled with towers and pagodas. Rocks and trees turn into buildings. Birds and insects appear in images of leaves.
    The original image influences what kind of objects form in the processed image.
    This technique gives us a qualitative sense of the level of abstraction that a particular layer has achieved in its understanding of images. We call this technique “Inceptionism” in reference to the neural net architecture used. See our Inceptionism gallery for more pairs of images and their processed results, plus some cool video animations.

    We must go deeper: Iterations

    If we apply the algorithm iteratively on its own outputs and apply some zooming after each iteration, we get an endless stream of new impressions, exploring the set of things the network knows about. We can even start this process from a random-noise image, so that the output is purely a product of the neural network, as seen in the following images:
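
    The iterate-then-zoom loop can be sketched as follows. This is a minimal sketch, assuming a nearest-neighbor center zoom on a small grid of pixel values. The `brighten` function is a hypothetical stand-in for the network's enhancement step, and the frame count and zoom factor are arbitrary.

```python
def zoom_center(image, factor):
    """Magnify around the image center by `factor` using nearest-neighbor sampling."""
    height, width = len(image), len(image[0])
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    out = []
    for y in range(height):
        src_y = min(height - 1, max(0, int(round(cy + (y - cy) / factor))))
        row = []
        for x in range(width):
            src_x = min(width - 1, max(0, int(round(cx + (x - cx) / factor))))
            row.append(image[src_y][src_x])
        out.append(row)
    return out


def dream_sequence(image, enhance, frames=3, factor=1.5):
    """Enhance, record the frame, zoom in, and feed the result back in."""
    result = []
    for _ in range(frames):
        image = enhance(image)
        result.append(image)
        image = zoom_center(image, factor)
    return result


def brighten(image):
    """Hypothetical stand-in for the network's enhancement step."""
    return [[min(1.0, p * 1.1) for p in row] for row in image]


START = [[0.0, 0.0, 0.0, 0.0],
         [0.0, 0.5, 0.5, 0.0],
         [0.0, 0.5, 0.5, 0.0],
         [0.0, 0.0, 0.0, 0.0]]
FRAMES = dream_sequence(START, brighten, frames=3, factor=2.0)
```

    Each frame keeps the original size, so the frames can be played back as an animation that appears to dive into the image.
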
    Neural net “dreams”, generated purely from random noise using a network trained on places by MIT Computer Science and AI Laboratory. See our Inceptionism gallery for hi-res versions of the images above and more (images marked “Places205-GoogLeNet” were made using this network).
    The techniques presented here help us understand and visualize how neural networks are able to carry out difficult classification tasks, improve network architecture, and check what the network has learned during training. They also make us wonder whether neural networks could become a tool for artists, a new way to remix visual concepts, or perhaps even shed a little light on the roots of the creative process in general.

  • New ways to add Reminders in Inbox by Gmail
    Posted by Dave Orr, Google Research Product Manager

    Last week, Inbox by Gmail opened up and improved many of your favorite features, including two new ways to add Reminders.

    First up, when someone emails you a to-do, Inbox can now suggest adding a Reminder so you don’t forget. Here's how it looks if your spouse emails you and asks you to buy milk on the way home:
    To help you add Reminders, the Google Research team used natural language understanding technology to teach Inbox to recognize to-dos in email.
    And much like Gmail and Inbox get better when you report spam, your feedback helps improve these suggested Reminders. You can accept or reject them with a single click:
    The other new way to add Reminders in Inbox is to create Reminders in Google Keep. They will appear in Inbox with a link back to the full note in Google Keep.
    Hopefully, this little extra help gets you back to what matters more quickly and easily. Try the new features out, and as always, let us know what you think using the feedback link in the app.

  • Google Computer Vision research at CVPR 2015
    Posted by Vincent Vanhoucke, Google Research Scientist

    Much of the world's data is in the form of visual media. In order to utilize meaningful information from multimedia and deliver innovative products, such as Google Photos, Google builds machine-learning systems that are designed to enable computer perception of visual input, in addition to pursuing image and video analysis techniques focused on image/scene reconstruction and understanding.

    This week, Boston hosts the 2015 Conference on Computer Vision and Pattern Recognition (CVPR 2015), the premier annual computer vision event comprising the main CVPR conference and several co-located workshops and short courses. As a leader in computer vision research, Google will have a strong presence at CVPR 2015, with many Googlers presenting publications in addition to hosting workshops and tutorials on topics covering image/video annotation and enhancement, 3D analysis and processing, development of semantic similarity measures for visual objects, synthesis of meaningful composites for visualization/browsing of large image/video collections and more.

    Learn more about some of our research in the list below. If you are attending CVPR this year, we hope you’ll stop by our booth and chat with our researchers about the projects and opportunities at Google that go into solving interesting problems for hundreds of millions of people. Members of the Jump team will also have a prototype of the camera on display and will be showing videos produced using the Jump system on Google Cardboard.

    Tutorials:
    Applied Deep Learning for Computer Vision with Torch
    Koray Kavukcuoglu, Ronan Collobert, Soumith Chintala

    DIY Deep Learning: a Hands-On Tutorial with Caffe
    Evan Shelhamer, Jeff Donahue, Yangqing Jia, Jonathan Long, Ross Girshick

    ImageNet Large Scale Visual Recognition Challenge Tutorial
    Olga Russakovsky, Jonathan Krause, Karen Simonyan, Yangqing Jia, Jia Deng, Alex Berg, Fei-Fei Li

    Fast Image Processing With Halide
    Jonathan Ragan-Kelley, Andrew Adams, Frédo Durand

    Open Source Structure-from-Motion
    Matt Leotta, Sameer Agarwal, Frank Dellaert, Pierre Moulon, Vincent Rabaud

    Oral Sessions:
    Modeling Local and Global Deformations in Deep Learning: Epitomic Convolution, Multiple Instance Learning, and Sliding Window Detection
    George Papandreou, Iasonas Kokkinos, Pierre-André Savalle

    Going Deeper with Convolutions
    Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich

    DynamicFusion: Reconstruction and Tracking of Non-Rigid Scenes in Real-Time
    Richard A. Newcombe, Dieter Fox, Steven M. Seitz

    Show and Tell: A Neural Image Caption Generator
    Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan

    Long-Term Recurrent Convolutional Networks for Visual Recognition and Description
    Jeffrey Donahue, Lisa Anne Hendricks, Sergio Guadarrama, Marcus Rohrbach, Subhashini Venugopalan, Kate Saenko, Trevor Darrell

    Visual Vibrometry: Estimating Material Properties from Small Motion in Video
    Abe Davis, Katherine L. Bouman, Justin G. Chen, Michael Rubinstein, Frédo Durand, William T. Freeman

    Fast Bilateral-Space Stereo for Synthetic Defocus
    Jonathan T. Barron, Andrew Adams, YiChang Shih, Carlos Hernández

    Poster Sessions:
    Learning Semantic Relationships for Better Action Retrieval in Images
    Vignesh Ramanathan, Congcong Li, Jia Deng, Wei Han, Zhen Li, Kunlong Gu, Yang Song, Samy Bengio, Charles Rosenberg, Li Fei-Fei

    FaceNet: A Unified Embedding for Face Recognition and Clustering
    Florian Schroff, Dmitry Kalenichenko, James Philbin

    A Mixed Bag of Emotions: Model, Predict, and Transfer Emotion Distributions
    Kuan-Chuan Peng, Tsuhan Chen, Amir Sadovnik, Andrew C. Gallagher

    Best-Buddies Similarity for Robust Template Matching
    Tali Dekel, Shaul Oron, Michael Rubinstein, Shai Avidan, William T. Freeman

    Articulated Motion Discovery Using Pairs of Trajectories
    Luca Del Pero, Susanna Ricco, Rahul Sukthankar, Vittorio Ferrari

    Reflection Removal Using Ghosting Cues
    YiChang Shih, Dilip Krishnan, Frédo Durand, William T. Freeman

    P3.5P: Pose Estimation with Unknown Focal Length
    Changchang Wu

    MatchNet: Unifying Feature and Metric Learning for Patch-Based Matching
    Xufeng Han, Thomas Leung, Yangqing Jia, Rahul Sukthankar, Alexander C. Berg

    Inferring 3D Layout of Building Facades from a Single Image
    Jiyan Pan, Martial Hebert, Takeo Kanade

    The Aperture Problem for Refractive Motion
    Tianfan Xue, Hossein Mobahi, Frédo Durand, William T. Freeman

    Video Magnification in Presence of Large Motions
    Mohamed Elgharib, Mohamed Hefeeda, Frédo Durand, William T. Freeman

    Robust Video Segment Proposals with Painless Occlusion Handling
    Zhengyang Wu, Fuxin Li, Rahul Sukthankar, James M. Rehg

    Ontological Supervision for Fine Grained Classification of Street View Storefronts
    Yair Movshovitz-Attias, Qian Yu, Martin C. Stumpe, Vinay Shet, Sacha Arnoud, Liron Yatziv

    VIP: Finding Important People in Images
    Clint Solomon Mathialagan, Andrew C. Gallagher, Dhruv Batra

    Fusing Subcategory Probabilities for Texture Classification
    Yang Song, Weidong Cai, Qing Li, Fan Zhang

    Beyond Short Snippets: Deep Networks for Video Classification
    Joe Yue-Hei Ng, Matthew Hausknecht, Sudheendra Vijayanarasimhan, Oriol Vinyals, Rajat Monga, George Toderici

    Workshops:
    THUMOS Challenge 2015
    Program organizers include: Alexander Gorban, Rahul Sukthankar

    DeepVision: Deep Learning in Computer Vision 2015
    Invited Speaker: Rahul Sukthankar

    Large Scale Visual Commerce (LSVisCom)
    Panelist: Luc Vincent

    Large-Scale Video Search and Mining (LSVSM)
    Invited Speaker and Panelist: Rahul Sukthankar
    Program Committee includes: Apostol Natsev

    Vision meets Cognition: Functionality, Physics, Intentionality and Causality
    Program Organizers include: Peter Battaglia

    Big Data Meets Computer Vision: 3rd International Workshop on Large Scale Visual Recognition and Retrieval (BigVision 2015)
    Program Organizers include: Samy Bengio
    Includes speaker Christian Szegedy - “Scalable approaches for large scale vision”

    Observing and Understanding Hands in Action (Hands 2015)
    Program Committee includes: Murphy Stein

    Fine-Grained Visual Categorization (FGVC3)
    Program Organizers include: Anelia Angelova

    Large-scale Scene Understanding Challenge (LSUN)
    Winners of the Scene Classification Challenge: Julian Ibarz, Christian Szegedy and Vincent Vanhoucke
    Winners of the Caption Generation Challenge: Oriol Vinyals, Alexander Toshev, Samy Bengio, and Dumitru Erhan

    Looking from above: when Earth observation meets vision (EARTHVISION)
    Technical Committee includes: Andreas Wendel

    Computer Vision in Vehicle Technology: Assisted Driving, Exploration Rovers, Aerial and Underwater Vehicles
    Invited Speaker: Andreas Wendel
    Program Committee includes: Andreas Wendel

    Women in Computer Vision (WiCV)
    Invited Speaker: Mei Han

    ChaLearn Looking at People (sponsor)

    Fine-Grained Visual Categorization (FGVC3) (sponsor)

All content and images copyright Search News Central 2014
SNC is a Verve Developments production, the Forensic SEO Specialists- where Gypsies roam.