7 posts tagged with "Transformers"

Staggered Attention Mechanism - Long Input Transformers

August 24, 2022 · 10 min read

Upcoming Applied Scientist Amazon Science

Transformers are at the heart of NLP in the current scenario. They are making great strides and producing state-of-the-art results in diverse domains ranging from Computer Vision to Graph NNs.

In this post, we will dive into the details of the Staggered Attention Mechanism introduced in the paper Investigating Efficiently Extending Transformers for Long Input Summarization by Jason Phang Yao Zhao and Peter J. Liu, researchers at Google Brain.

Huggingface Accelerate to train on multiple GPUs.

August 3, 2022 · 5 min read

Vishnu Subramanian

Founder @ Aceus.com

PyTorch is a simple and stable framework for building deep learning solutions. Its simplicity offers freedom and control over the complete code.

While using PyTorch is fun 😁, sometimes we have to do a lot of things manually. Some of the things we may want to do are

DeBerta is the new King!

July 5, 2022 · 11 min read

Tanul Singh

Global Ambassador at Aceus.com | AI Research @ LevelAI

NLP’s State completely changed when in 2018, researchers from Google open-sourced BERT (Bi-Directional Encoder Representation From Transformers).

Huggingface 🤗 is all you need for NLP and beyond

May 26, 2022 · 31 min read

Atharva Ingle

Kaggle 4x Expert || Weights and Biases Dev Expert

Natural Language Processing is one of the fastest-growing fields in Deep Learning. NLP has completely changed since the inception of Transformers. Later on, variants of Transformer architecture where-in only the encoder part was used (BERT) cracked the transfer learning game in NLP. Now, you can download a pre-trained model from the internet which is already trained on huge amounts of data and has the knowledge of language and use it for your downstream tasks with a bit of fine-tuning.

Longformer — The Long Document Transformer

February 18, 2022 · 8 min read

Tanul Singh

Global Ambassador at Aceus.com | AI Research @ LevelAI

Transformer-Based Models have become the go-to models in about every NLP task since their inception, but when it comes to long documents they suffer from a drawback of limited tokens. Transformer-Based Models are unable to process long sequences due to their self-attention which scales quadratically with the sequence length.

Training Large NLP Models Efficiently with DeepSpeed Hugging Face

February 1, 2022 · 5 min read

Tanul Singh

Global Ambassador at Aceus.com | AI Research @ LevelAI

With the recent advancements in NLP, we are moving towards solving more and more sophisticated problems like Open Domain Question Answering, Empathy in Dialogue Systems, Multi-Modal Problems, etc but with this, the parameters associated with the models have also been rising and have gone to the scale of billions and even Trillions in the largest model Megatron.

Rate Severity of Toxic Comments using RoBERTa in PyTorch Lightning

December 8, 2021 · 15 min read

Ishan Dutta

Machine Learning Engineer - Metadome AI | Global Ambassador Aceus.com

Introduction

In this article, we will walk through a baseline model for the Jigsaw Rate Severity of Toxic Comments Competition on Kaggle. The goal of the competition is to rank relative ratings of toxicity between comments.

Introduction​

Introduction