- Lens Blur in the new Google Camera app
Posted by Carlos Hernández, Software Engineer
One of the biggest advantages of SLR cameras over camera phones is the ability to achieve shallow depth of field and bokeh effects. Shallow depth of field makes the object of interest "pop" by bringing the foreground into focus and de-emphasizing the background. Achieving this optical effect has traditionally required a big lens and aperture, and therefore hasn’t been possible using the camera on your mobile phone or tablet.
That all changes with Lens Blur, a new mode in the Google Camera app. It lets you take a photo with a shallow depth of field using just your Android phone or tablet. Unlike a regular photo, Lens Blur lets you change the point or level of focus after the photo is taken. You can choose to make any object come into focus simply by tapping on it in the image. By changing the depth-of-field slider, you can simulate different aperture sizes, to achieve bokeh effects ranging from subtle to surreal (e.g., tilt-shift). The new image is rendered instantly, allowing you to see your changes in real time.
Lens Blur replaces the need for a large optical system with algorithms that simulate a larger lens and aperture. Instead of capturing a single photo, you move the camera in an upward sweep to capture a whole series of frames. From these photos, Lens Blur uses computer vision algorithms to create a 3D model of the world, estimating the depth (distance) to every point in the scene. Here’s an example -- on the left is a raw input photo, in the middle is a “depth map” where darker things are close and lighter things are far away, and on the right is the result blurred by distance:
Here’s how we do it. First, we pick out visual features in the scene and track them over time, across the series of images. Using computer vision algorithms known as Structure-from-Motion (SfM) and bundle adjustment, we compute the camera’s 3D position and orientation and the 3D positions of all those image features throughout the series.
Once we’ve got the 3D pose of each photo, we compute the depth of each pixel in the reference photo using Multi-View Stereo (MVS) algorithms. MVS works the way human stereo vision does: given the location of the same object in two different images, we can triangulate the 3D position of the object and compute the distance to it. How do we figure out which pixel in one image corresponds to a pixel in another image? MVS measures how similar they are -- on mobile devices, one particularly simple and efficient way is computing the Sum of Absolute Differences (SAD) of the RGB colors of the two pixels.
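To make the matching step concrete, here is a toy sketch of SAD-based patch comparison in Python (the patch representation and function names are illustrative, not the on-device implementation, which operates on raw image buffers):

```python
def sad(patch_a, patch_b):
    """Sum of Absolute Differences between two equal-sized RGB patches.

    Each patch is a list of (r, g, b) tuples; a lower score means the
    patches look more alike."""
    return sum(abs(ca - cb)
               for pa, pb in zip(patch_a, patch_b)
               for ca, cb in zip(pa, pb))


def best_match(patch, candidates):
    """Index of the candidate patch most similar to `patch` under SAD.

    In MVS, the candidates would be patches sampled from a second image,
    one per hypothesized depth."""
    return min(range(len(candidates)), key=lambda i: sad(patch, candidates[i]))
```

The winning candidate's position, combined with the camera poses recovered by SfM, determines the triangulated depth.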
Now it’s an optimization problem: we try to build a depth map where all the corresponding pixels are most similar to each other. But that’s typically not a well-posed optimization problem -- you can get the same similarity score for different depth maps. To address this ambiguity, the optimization also incorporates assumptions about the 3D geometry of a scene, called a "prior,” that favors reasonable solutions. For example, you can often assume two pixels near each other are at a similar depth. Finally, we use Markov Random Field inference methods to solve the optimization problem.
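To illustrate how such a prior shapes the solution: the full problem requires 2D MRF inference, but on a single scanline (a 1D chain) the same kind of energy, a per-pixel matching cost plus a penalty when neighboring labels differ, can be minimized exactly with dynamic programming. A toy sketch follows; the names and the linear penalty are illustrative choices, not the actual Lens Blur energy:

```python
def scanline_depths(unary, smooth_weight=1.0):
    """Pick one depth label per pixel along a scanline by minimizing
    matching cost plus a smoothness prior, via dynamic programming.

    unary[p][d] is the photo-consistency cost (e.g. a SAD score) of
    giving pixel p depth label d. The prior adds smooth_weight * |d - d'|
    for neighboring labels d and d', encoding the assumption that nearby
    pixels usually lie at similar depths."""
    n, k = len(unary), len(unary[0])
    cost = list(unary[0])          # best total cost of each label at pixel 0
    back = []                      # backpointers for recovering the labeling
    for p in range(1, n):
        ptr, new = [], []
        for d in range(k):
            prev = min(range(k),
                       key=lambda dp: cost[dp] + smooth_weight * abs(d - dp))
            new.append(unary[p][d] + cost[prev] + smooth_weight * abs(d - prev))
            ptr.append(prev)
        cost = new
        back.append(ptr)
    # Backtrack from the cheapest final label.
    d = min(range(k), key=lambda x: cost[x])
    labels = [d]
    for ptr in reversed(back):
        d = ptr[d]
        labels.append(d)
    return labels[::-1]
```

With a strong prior, a pixel whose matching cost disagrees with its neighbors gets pulled toward their depth; with no prior, each pixel simply takes its cheapest label, however noisy.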
Having computed the depth map, we re-render the photo, blurring each pixel by an amount that depends on its depth, the simulated aperture, and its position relative to the focal plane. Pixels on the focal plane stay sharp, and the amount of blur grows with each pixel's distance from that plane. This is all achieved by simulating a physical lens using the thin lens approximation.
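A common form of the thin-lens blur computation looks like the following sketch (the post doesn't give the exact formula Lens Blur uses; the units and parameterization here are illustrative):

```python
def blur_diameter(z, z_focus, focal_length, aperture):
    """Circle-of-confusion diameter for a point at depth z under the
    thin lens approximation, with the lens focused at depth z_focus.

    aperture and focal_length are in the same units as the result.
    Points on the focal plane (z == z_focus) get zero blur, and blur
    grows with the simulated aperture and with distance from the plane."""
    return aperture * (focal_length / (z_focus - focal_length)) * abs(z - z_focus) / z
```

Rendering then spreads each pixel over a disc of this diameter; sweeping the aperture parameter is what the depth-of-field slider effectively does.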
The algorithms used to create the 3D photo run entirely on the mobile device, and are closely related to the computer vision algorithms used in 3D mapping features like Google Maps Photo Tours and Google Earth. We hope you have fun with your bokeh experiments!
- Sawasdee ka, Voice Search
Posted by Keith Hall and Richard Sproat, Staff Research Scientists, Speech
Typing on mobile devices can be difficult, especially when you're on the go. Google Voice Search gives you a fast, easy, and natural way to search by speaking your queries instead of typing them. In Thailand, Voice Search has been one of the most requested services, so we’re excited to now offer users there the ability to speak queries in Thai, adding to over 75 languages and accents in which you can talk to Google.
To power Voice Search, we teach computers to understand the sounds and words that build spoken language. We trained our speech recognizer to understand Thai by collecting speech samples from hundreds of volunteers in Bangkok, which enabled us to build this recognizer in just a fraction of the time it took to build earlier models. The volunteers were asked to read popular queries in their native tongue in a variety of acoustic conditions, such as in restaurants, on busy streets, and inside cars.
Each new language often requires our research team to tackle new challenges, and Thai was no exception:
- Segmentation is a major challenge in Thai: the Thai script has no spaces between words, so it is harder to tell where one word ends and the next begins. We therefore created a Thai segmenter to help our system recognize words better. For example, ตากลม can be segmented as ตาก ลม or as ตา กลม. We collected a large corpus of text, asked Thai speakers to manually annotate plausible segmentations, and then trained a sequence segmenter on this data, allowing it to generalize beyond the annotated examples.
- Numbers are an important part of any language: when the string “87” appears on a web page, we need to know how people would say it. As with over 40 other languages, we included a number grammar for Thai, which tells us that “87” is read as แปดสิบเจ็ด.
- Thai users often mix English words, such as brand or artist names, into both spoken and written Thai, which adds complexity to our acoustic, lexicon, and segmentation models. We addressed this by introducing ‘code switching’, which allows Voice Search to recognize when different languages are being spoken interchangeably and adjust phonetic transliteration accordingly.
- Many Thai users leave out accents and tone markers when they search (e.g., โน๊ตบุก instead of โน้ตบุ๊ก, or หมูหยอง instead of หมูหย็อง), so we created a special algorithm to restore accents and tone markers, ensuring that in the majority of cases our Thai users see properly formatted text in their search results.
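To see why segmentation is hard, consider brute-force enumeration of candidate splits. The production system uses a trained sequence segmenter rather than a dictionary, but this Python sketch (with an invented toy lexicon) shows the ambiguity it has to resolve:

```python
def segmentations(text, lexicon):
    """Enumerate every way to split `text` into words from `lexicon`.

    A trained segmenter scores candidates like these and picks the most
    plausible one; here we just enumerate them."""
    if not text:
        return [[]]
    results = []
    for end in range(1, len(text) + 1):
        word = text[:end]
        if word in lexicon:
            results += [[word] + rest
                        for rest in segmentations(text[end:], lexicon)]
    return results
```

For the example above, both readings come out: `segmentations("ตากลม", {"ตา", "ตาก", "กลม", "ลม"})` yields `[['ตา', 'กลม'], ['ตาก', 'ลม']]`.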
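A number grammar encodes verbalization rules. As a rough illustration (covering only 1 through 99 and greatly simplifying the real grammar): Thai reads the tens digit before the units digit, uses the irregular form ยี่ for a tens digit of 2, and เอ็ด for a trailing 1:

```python
# Thai words for the digits 1-9 (index 0 is unused).
DIGITS = ["", "หนึ่ง", "สอง", "สาม", "สี่", "ห้า", "หก", "เจ็ด", "แปด", "เก้า"]

def thai_number(n):
    """Verbalize an integer from 1 to 99 in Thai."""
    assert 1 <= n <= 99
    tens, units = divmod(n, 10)
    words = ""
    if tens == 1:
        words = "สิบ"                  # 10-19: bare "sip"
    elif tens == 2:
        words = "ยี่สิบ"                # irregular tens digit 2
    elif tens:
        words = DIGITS[tens] + "สิบ"
    if units == 1 and tens:
        words += "เอ็ด"                 # irregular trailing 1
    elif units:
        words += DIGITS[units]
    return words
```

Here `thai_number(87)` produces แปดสิบเจ็ด, matching the reading given above.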
We’re particularly excited that Voice Search can help people find locally relevant information, ranging from travel directions to the nearest restaurant, without having to type long phrases in Thai.
Voice Search is available for Android devices running Jelly Bean and above. It will be available for older Android releases and iOS users soon.
- Making Blockly Universally Accessible
Posted by Neil Fraser, Chief Interplanetary Liaison
We work hard to make our products accessible to people everywhere, in every culture. Today we’re expanding our outreach efforts to support a traditionally underserved community -- those who call themselves "tlhIngan."
Google's Blockly programming environment is used in K-12 classrooms around the world to teach programming. But the world is not enough. Students on Qo'noS have had difficulty learning to code because most of the teaching tools aren't available in their native language. Additionally, many existing tools are too fragile for their pedagogical approach. As a result, Klingons have found it challenging to enter computer science. This is reflected in the fact that less than 2% of Google engineers are Klingon.
Today we launch a full translation of Blockly in Klingon. It incorporates Klingon cultural norms to facilitate learning in this unique population:
- Blockly has no syntax errors. This reduces frustration, and reduces the number of computers thrown through bulkheads.
- Variables are untyped. Type errors can too easily be perceived as a challenge to the honor of a student's family (and we’ve seen where that ends).
- Debugging and bug reports have been omitted; our research indicates that in the event of a bug, Klingon students prefer the entire program to just blow up.
Get a little keyboard dirt under your fingernails. Learn that although ghargh is delicious, code structure should not resemble it. And above all, be proud that tlhIngan maH. Qapla'!
You can try out the demo here or get involved here.
- Celebrating the First Set of Google Geo Education Awardees and Announcing Round Two
Posted by Dave Thau, Senior Developer Advocate
Google's GeoEDU Outreach program is excited to announce the opening of the second round of our Geo Education Awards, aimed at supporting qualifying educational institutions that are creating content and curricula for their mapping, remote sensing, or GIS initiatives.
If you are an educator in these areas, we encourage you to apply for an award. To celebrate the first round of awardees, and give a sense of the kind of work we have supported in the past, here are brief descriptions of some of our previous awards.
Nicholas Clinton, Tsinghua University
Development of online remote sensing course content using Google Earth Engine
Nick is building 10 labs for an introductory remote sensing class, covering topics including electromagnetic radiation, image processing, time series analysis, and change detection. The labs are currently being taught, and materials will be made available when the course is complete. From Lab 6:
“Let's look at some imagery in Earth Engine. Search for the place 'Mountain View, CA, USA.' What the heck is all that stuff!? We are looking at this scene because of the diverse mix of things on the Earth's surface.
Add the Landsat 8 32-day EVI composite. What do you observe? Recall that the more vegetative cover, the higher the index. It looks like the 'greenest' targets in this scene are golf courses.
Let's say we don't really care about vegetation (not true, of course!), but we do care about water. Let's see if the water indices can help us decipher our Mountain View mystery scene.”
Dana Tomlin, University of Pennsylvania
Geospatial Programming: Child's Play
Dana is creating documentation, lesson plans, sample scripts, and homework assignments for each week in a 13-week, university-level course on geospatial programming. The course uses the Python computer programming language to utilize, customize, and extend the capabilities of three geographic information systems: Google’s Earth Engine, ESRI’s ArcGIS, and the open-source QGIS.
Declan G. De Paor, Old Dominion University
A Modular Approach to Introducing Google Mapping Technologies into Geoscience Curricula Worldwide
Declan's award supports senior student Chloe Constants who is helping design Google Maps Engine and Google Earth Engine modules for existing geoscience coursework, primarily focused on volcanic and tectonic hazards, and digital mapping. Declan and Chloe will present the modules at faculty development workshops in person and online. They see GME/GEE as a terrific way to offer authentic undergraduate research experiences to non-traditional geoscience students.
Mary Elizabeth Killilea, New York University
Google Geospatial Tools in a Global Classroom: “Where the City Meets the Sea: Studies in Coastal Urban Environments”
Mary and the Global Technology Services team at NYU are developing a land cover change lab using Google Earth Engine. NYU has campuses around the world, so their labs are written to be used globally. In fact, students in four campuses around the globe are currently collecting and sharing data for the lab. Students at their sites analyze their local cities, but do so in a global context.
One group of students used Android mobile devices to collect land use data in New York's Battery Park, while others in the same course collected these points in Abu Dhabi. Upon collection, the observations were automatically uploaded, mapped, and shared.
Scott Nowicki and Chris Edwards, University of Nevada, Las Vegas
Advanced Manipulation and Visualization of Remote Sensing Datasets with Google Earth Engine
Scott and Chris are taking biology, geoscience, and social science students on a field trip to collect geological data, and are generating screencast tutorials to show how these data can be queried, downloaded, calibrated, manipulated and interpreted using free tools including Google Earth Engine. These tutorials may be freely incorporated into any geospatial course, and all the field site data and analyses will be publicly released and published, giving a full description of what features are available to investigate, and how best to interpret both the remote sensing datasets and ground truth activities.
Steven Whitmeyer and Shelley Whitmeyer, James Madison University
Using Google Earth to Model Geologic Change Through Time
Steven and Shelley are building exercises for introductory geoscience courses focusing on coastal change, and glacial landform change. These exercises incorporate targets and goals of the Next Generation Science Standards. They are also developing tools to create new tectonic reconstructions of how continents and tectonic plates have moved since Pangaea breakup. Some of the current animations are available here and here.
We hope this overview of previous award recipients gives you a sense for the range of educational activities our GeoEDU awards are supporting. If you are working on innovative geospatial education projects, we invite you to apply for a GeoEDU award.
- Making Sense of MOOC Data
Posted by Julia Wilkowski, Staff Instructional Designer
In order to further evolve the open education system and online platforms, Google’s course design and development teams continually experiment with massive open online courses (MOOCs). At the Association for Computing Machinery’s recent Learning@Scale conference in Atlanta, GA, several members of our team presented findings about our online courses. Our research focuses on learners’ goals and activities, as well as on self-evaluation as an assessment tool. In this post, I will present highlights from our research and describe how we’ve applied it to our current course, Making Sense of Data.
Google’s five online courses over the past two years have provided an opportunity for us to identify learning trends and refine instructional design. As we posted previously, learners register for online courses for a variety of reasons. During registration, we ask learners to identify their primary goal for taking the class. We found that just over half (52.5%) of 41,000 registrants intended to complete the Mapping with Google course; the other half aimed to learn portions of the curriculum without earning a certificate. Next, we measured how well participants achieved those goals by observing various interaction behaviors in the course, such as watching videos, viewing text lessons, and completing activities. We found that 42.4% of 21,000 active learners (those who did something in the course other than register) achieved the goals they selected during registration. Similarly, for our Introduction to Web Accessibility course, we found that 56.1% of 4,993 registrants intended to complete the course. Based on their interactions with course materials, we measured that 49.5% of 1,037 active learners achieved their goals.
Although imperfect, these numbers are more accurate measures of course success than completion rates. Because students come to the course for many different reasons, course designers should make it easier for learners to meet a variety of objectives. Since many participants in online courses may just want to learn a few new things, we can help them by releasing all course content at the outset of the course and enabling them to search for specific topics of interest. We are exploring other ways of personalizing courses to help learners achieve individual goals.
Our research also indicates that learners who complete activities are more likely to finish the course than peers who complete none. Activities include auto-graded multiple-choice or short-answer questions that encourage learners to practice skills from the course and receive instant feedback. In the Mapping with Google course, learners who completed at least sixty percent of course activities were much more likely to submit final projects than peers who finished fewer activities. This leads us to believe that, as course designers, we should pay more attention to creating effective, relevant activities than to polishing course content. We hypothesize that learners also use activities’ instant feedback to decide whether they should spend time reviewing the associated content. In this scenario, learners could benefit from encountering activities before the course content.
As technological solutions for assessing qualitative work are still evolving, an active area of our research involves self-evaluation. We are also intrigued by previous research showing the links between self-evaluation and enhanced metacognition. In several courses, we have asked learners to submit projects aligned with course objectives, calibrate themselves by evaluating sample work, then apply a rubric to assess their own work. Course staff graded a random sample of project submissions then compared the learners’ scores with course staff’s scores. In general, we found a moderate agreement on Advanced Power Searching (APS) case studies (55.1% within 1 point of each other on a 16-point scale), with an increased agreement on the Mapping projects (71.6% within 2 points of each other on a 27-point scale). We also observed that students submitted high quality projects overall, with course staff scoring 73% of APS assignments a B (80%) or above; similarly, course staff evaluated 94% of Mapping projects as a B or above.
What changed between the two courses that allowed for a higher agreement with the mapping course? The most important change seems to be more objective criteria for the mapping project rubric. We also believe that we haven’t given enough weight to teaching learners how to evaluate their own work. We plan to keep experimenting with self-evaluation in future courses.
Since we are dedicated to experimenting with courses, we have not only applied these findings to the Making Sense of Data course, but we have also chosen to experiment with new open-source software and tools. We’re exploring the following aspects of online education in this class:
- Placing activities before content
- Reduced use of videos
- Final project that includes self-reflection without scores
- New open-source technologies, including authoring the course using edX Studio and importing it into cbX (running on Google’s App Engine platform), as well as Oppia explorations
We hope that our research and the open-source technologies we’re using will inspire educators and researchers to continue to evolve the next generation of online learning platforms.