AudioLM: a Language Modeling Approach to Audio Generation

Posted by Zalán Borsos, Research Software Engineer, and Neil Zeghidour, Research Scientist, Google Research

Generating realistic audio requires modeling information represented at different scales. For example, just as music builds complex musical phrases from individual notes, speech combines temporally local structures, such as phonemes or syllables, into words and

Shared by Google AI Technology October 6, 2022

Large Motion Frame Interpolation

Posted by Fitsum Reda and Janne Kontkanen, Google Research

Frame interpolation is the process of synthesizing in-between images from a given set of images. The technique is often used for temporal up-sampling to increase the refresh rate of videos or to create slow motion effects. Nowadays, with digital cameras

Shared by Google AI Technology October 4, 2022

Quantization for Fast and Environmentally Sustainable Reinforcement Learning

Posted by Srivatsan Krishnan, Student Researcher, and Aleksandra Faust, Senior Staff Research Scientist, Google Research, Brain Team

Deep reinforcement learning (RL) continues to make great strides in solving real-world sequential decision-making problems such as balloon navigation, nuclear physics, robotics, and games. Despite its promise, one of its limiting factors

Shared by Google AI Technology September 27, 2022

TensorStore for High-Performance, Scalable Array Storage

Posted by Jeremy Maitin-Shepard and Laramie Leavitt, Software Engineers, Connectomics at Google

Many exciting contemporary applications of computer science and machine learning (ML) manipulate multidimensional datasets that span a single large coordinate system, for example, weather modeling from atmospheric measurements over a spatial grid or medical imaging predictions from

Shared by Google AI Technology September 22, 2022

View Synthesis with Transformers

Posted by Carlos Esteves and Ameesh Makadia, Research Scientists, Google Research

A long-standing problem in the intersection of computer vision and computer graphics, view synthesis is the task of creating new views of a scene from multiple pictures of that scene. This has received increased attention [1, 2, 3]

Shared by Google AI Technology September 21, 2022

FindIt: Generalized Object Localization with Natural Language Queries

Posted by Weicheng Kuo and Anelia Angelova, Research Scientists, Google Research, Brain Team

Natural language enables flexible descriptive queries about images. The interaction between text queries and images grounds linguistic meaning in the visual world, facilitating a better understanding of object relationships, human intentions towards objects, and interactions with

Shared by Google AI Technology September 20, 2022

Google at Interspeech 2022

Posted by Cat Armato, Program Manager, Google

This week, the 23rd Annual Conference of the International Speech Communication Association (INTERSPEECH 2022) is being held in Incheon, South Korea, representing one of the world’s most extensive conferences on research and technology of spoken language understanding and processing. Over 2,000 experts

Shared by Google AI Technology September 17, 2022

Robust Online Allocation with Dual Mirror Descent

Posted by Santiago Balseiro, Staff Research Scientist, Google Research, and Associate Professor at Columbia University, and Vahab Mirrokni, Distinguished Scientist, Google Research

The emergence of digital technologies has transformed decision making across commercial sectors such as airlines, online retailing, and internet advertising. Today, real-time decisions need to be repeatedly

Shared by Google AI Technology September 16, 2022

PaLI: Scaling Language-Image Learning in 100+ Languages

Posted by Xi Chen and Xiao Wang, Software Engineers, Google Research

Advanced language models (e.g., GPT, GLaM, PaLM and T5) have demonstrated diverse capabilities and achieved impressive results across tasks and languages by scaling up their number of parameters. Vision-language (VL) models can benefit from similar scaling to address

Shared by Google AI Technology September 15, 2022