Modular visual question answering via code generation

Posted by Sanjay Subramanian, PhD student, UC Berkeley, and Arsha Nagrani, Research Scientist, Google Research, Perception Team

Visual question answering (VQA) is a machine learning task that requires a model to answer a question about an image or a set of images. Conventional VQA approaches need a large amount…

Shared by Google AI Technology July 7, 2023

Pic2Word: Mapping pictures to words for zero-shot composed image retrieval

Posted by Kuniaki Saito, Student Researcher, Google Research, Cloud AI Team, and Kihyuk Sohn, Research Scientist, Google Research

Image retrieval plays a crucial role in search engines. Typically, their users rely on either image or text as a query to retrieve a desired target image. However, text-based retrieval has…

Shared by Google AI Technology July 6, 2023

A principled approach to evolving choice and control for web content

We’re kicking off a public discussion across the web and AI communities to develop new machine-readable means to provide web publisher choice and control.

Original source: blog.google/technology/ai/

Unlocking the AI-powered opportunity in the UK

AI is the most profound technology that humanity is working on today. It’s a critical part of solving big societal challenges and helping to make our everyday lives bett…

Original source: blog.google/technology/ai/

Leonardo da Vinci: Inside a genius mind

28 institutions from around the world join forces to showcase Leonardo da Vinci’s unparalleled legacy, blending art, science, and AI innovation.

Original source: blog.google/technology/ai/

Announcing the first Machine Unlearning Challenge

Posted by Fabian Pedregosa and Eleni Triantafillou, Research Scientists, Google

Deep learning has recently driven tremendous progress in a wide array of applications, ranging from realistic image generation and impressive retrieval systems to language models that can hold human-like conversations. While this progress is very exciting, the widespread use…

Shared by Google AI Technology June 29, 2023

On-device diffusion plugins for conditioned text-to-image generation

Posted by Yang Zhao and Tingbo Hou, Software Engineers, Core ML

In recent years, diffusion models have shown great success in text-to-image generation, achieving high image quality, improved inference performance, and expanding our creative inspiration. Nevertheless, it is still challenging to efficiently control the generation, especially with conditions that…

Shared by Google AI Technology June 29, 2023

Unifying image-caption and image-classification datasets with prefix conditioning

Posted by Kuniaki Saito, Student Researcher, Cloud AI Team, and Kihyuk Sohn, Research Scientist, Perception Team

Pre-training visual language (VL) models on web-scale image-caption datasets has recently emerged as a powerful alternative to traditional pre-training on image classification data. Image-caption datasets are considered to be more “open-domain” because they…

Shared by Google AI Technology June 27, 2023

Preference learning with automated feedback for cache eviction

Posted by Ramki Gummadi, Software Engineer, Google, and Kevin Chen, Software Engineer, YouTube

Caching is a ubiquitous idea in computer science that significantly improves the performance of storage and retrieval systems by storing a subset of popular items closer to the client based on request patterns. An important algorithmic…

Shared by Google AI Technology June 23, 2023

Our support for early warning systems

Google’s Yossi Matias and WMO Director of Infrastructure Anthony Rea discuss the Early Warnings For All Initiative.

Original source: blog.google/technology/ai/