Alternating updates for efficient transformers

Posted by Xin Wang, Software Engineer, and Nishanth Dikkala, Research Scientist, Google Research

Contemporary deep learning models have been remarkably successful in many domains, ranging from natural language to computer vision. Transformer neural networks (transformers) are a popular deep learning architecture that today comprises the foundation for most tasks…

Shared by Google AI Technology November 7, 2023

Zero-shot adaptive prompting of large language models

Posted by Xingchen Wan, Student Researcher, and Ruoxi Sun, Research Scientist, Cloud AI Team

Recent advances in large language models (LLMs) are very promising as reflected in their capability for general problem-solving in few-shot and zero-shot setups, even without explicit training on these tasks. This is impressive because in…
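
To make the zero-shot vs. few-shot distinction concrete, here is a minimal sketch that assembles both kinds of prompt for a toy question. The question, exemplars, and prompt wording are illustrative placeholders, not the adaptive prompting method described in the post.

    # Illustrative sketch: zero-shot vs. few-shot prompts for the same question.
    # The question and exemplars are made up; the resulting strings can be sent
    # to any text-completion LLM API.
    question = "A store has 3 boxes with 8 apples each. How many apples in total?"

    # Zero-shot: only an instruction and the question, with no worked examples.
    zero_shot_prompt = f"Answer the question.\n\nQ: {question}\nA:"

    # Few-shot: a few worked examples precede the question so the model can
    # infer the expected format in context.
    exemplars = [
        ("Q: 2 bags hold 5 oranges each. How many oranges?",
         "A: 2 * 5 = 10. The answer is 10."),
        ("Q: 4 shelves hold 6 books each. How many books?",
         "A: 4 * 6 = 24. The answer is 24."),
    ]
    few_shot_prompt = (
        "Answer the question.\n\n"
        + "\n\n".join(f"{q}\n{a}" for q, a in exemplars)
        + f"\n\nQ: {question}\nA:"
    )
    print(zero_shot_prompt)
    print(few_shot_prompt)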

Shared by Google AI Technology November 2, 2023

Looking back at wildfire research in 2023

Posted by Yi-Fan Chen, Software Engineer, and Carla Bromberg, Program Lead, Google Research

Wildfires are becoming larger and affecting more and more communities around the world, often resulting in large-scale devastation. Just this year, communities have experienced catastrophic wildfires in Greece, Maui, and Canada to name a few. While…

Shared by Google AI Technology October 25, 2023

Batch calibration: Rethinking calibration for in-context learning and prompt engineering

Posted by Han Zhou, Student Researcher, and Subhrajit Roy, Senior Research Scientist, Google Research

Prompting large language models (LLMs) has become an efficient learning paradigm for adapting LLMs to a new task by conditioning on human-designed instructions. The remarkable in-context learning (ICL) ability of LLMs also leads to efficient…
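
The excerpt ends before the method itself, but the general calibration idea the post addresses can be illustrated briefly: estimate a per-class bias in the model's in-context scores and remove it before predicting. The snippet below is a generic sketch of that idea using random placeholder scores; it is not the post's exact batch calibration procedure.

    import numpy as np

    # Generic score-calibration sketch: given raw per-class scores from an LLM
    # over a batch of test inputs, estimate a per-class bias as the batch mean
    # and subtract it before taking the argmax. The scores here are random
    # placeholders, not real model outputs.
    rng = np.random.default_rng(0)
    raw_scores = rng.normal(size=(16, 3))          # 16 examples, 3 candidate labels
    bias = raw_scores.mean(axis=0, keepdims=True)  # estimated per-class bias
    calibrated_scores = raw_scores - bias
    predictions = calibrated_scores.argmax(axis=1)
    print(predictions)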

Shared by Google AI Technology October 13, 2023

Re-weighted gradient descent via distributionally robust optimization

Posted by Ramnath Kumar, Pre-Doctoral Researcher, and Arun Sai Suggala, Research Scientist, Google Research

Deep neural networks (DNNs) have become essential for solving a wide range of tasks, from standard supervised learning (image classification using ViT) to meta-learning. The most commonly used paradigm for learning DNNs is empirical risk minimization (ERM), which…
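
For reference, ERM minimizes the uniform average of per-example losses, while re-weighting schemes replace that uniform average with data-dependent weights. The objectives below state the standard ERM formula and a generic weighted variant; the weighted form only illustrates the re-weighting idea and is not necessarily the exact objective derived in the post.

    % Empirical risk minimization over n training examples (x_i, y_i),
    % with model f_theta and loss \ell:
    \min_{\theta} \ \frac{1}{n} \sum_{i=1}^{n} \ell\big(f_\theta(x_i), y_i\big)

    % A generic re-weighted variant with per-example weights w_i >= 0 and
    % \sum_i w_i = 1; DRO-style schemes typically give larger weight to
    % higher-loss examples:
    \min_{\theta} \ \sum_{i=1}^{n} w_i \, \ell\big(f_\theta(x_i), y_i\big)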

Shared by Google AI Technology September 28, 2023

Google Research embarks on effort to map a mouse brain

Posted by Michał Januszewski, Research Scientist, Google Research

The human brain is perhaps the most computationally complex machine in existence, consisting of networks of billions of cells. Researchers currently don’t understand the full picture of how glitches in its network machinery contribute to mental illnesses and other diseases, such…

Shared by Google AI Technology September 26, 2023

Distilling step-by-step: Outperforming larger language models with less training data and smaller model sizes

Posted by Cheng-Yu Hsieh, Student Researcher, and Chen-Yu Lee, Research Scientist, Cloud AI Team

Large language models (LLMs) have enabled a new data-efficient learning paradigm wherein they can be used to solve unseen new tasks via zero-shot or few-shot prompting. However, LLMs are challenging to deploy for real-world applications…

Shared by Google AI Technology September 21, 2023