5 posts tagged with "transformers"


Harshil Shah · 14 min read

Re:infer's machine learning algorithms are based on pre-trained Transformer models, which learn semantically informative representations of sequences of text, known as embeddings. Over the past few years, Transformer models have achieved state of the art results on the majority of common natural language processing (NLP) tasks.

But how did we get here? What has led to the Transformer being the model of choice for training embeddings? Over the past decade, the biggest improvements in NLP have been due to advances in learning unsupervised pre-trained embeddings of text. In this post, we look at the history of embedding methods, and how they have improved over time.
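
To make this concrete before diving into the history, here is a minimal sketch of producing an embedding. It assumes the Hugging Face transformers library and a generic BERT model, rather than Re:infer's own pipeline: the model encodes a sentence, and mean-pooling its token outputs yields a single fixed-size vector.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Minimal sketch (illustrative, not Re:infer's pipeline): embed a sentence
# with a generic pre-trained Transformer by mean-pooling its token outputs.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("I'd like to update my billing address.", return_tensors="pt")
with torch.no_grad():
    token_states = model(**inputs).last_hidden_state  # (1, seq_len, 768)
embedding = token_states.mean(dim=1)                  # (1, 768) sentence vector
# Texts with similar meanings map to nearby vectors, which is what
# downstream tasks such as classification and clustering exploit.
```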

This post will:

  • Explain what embeddings are and how they are used in common NLP applications.
  • Present a history of popular methods for training embeddings, including traditional methods like word2vec and modern Transformer-based methods such as BERT.
  • Discuss the weaknesses of embedding methods, and how they can be addressed.

Harshil Shah · 11 min read

Re:infer's machine learning models use an architecture called the Transformer, which over the past few years has achieved state of the art results on the majority of common natural language processing (NLP) tasks. The go-to approach has been to take a pre-trained Transformer language model and fine-tune it on the task of interest.

More recently, we have been looking into 'prompting', a promising family of methods that is rising in popularity. These involve directly specifying the task in natural language for the pre-trained language model to interpret and complete.
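
As a minimal sketch of the mechanics, assuming the Hugging Face transformers library, a generic GPT-2 model and an invented example message (none of which are Re:infer's actual setup), a prompted sentiment task might look like this:

```python
from transformers import pipeline

# Illustrative sketch: the task is written out in natural language and the
# model's completion of the text is read off as the prediction.
generator = pipeline("text-generation", model="gpt2")

prompt = (
    "Classify the sentiment of the following message as positive or negative.\n"
    "\n"
    "Message: I still haven't received my refund after three weeks.\n"
    "Sentiment:"
)
completion = generator(prompt, max_new_tokens=1)[0]["generated_text"]
print(completion)  # ideally ends with " negative"
```

A small model like GPT-2 will not answer such prompts reliably; the point is only that the task is expressed as text to complete, with no task-specific training.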

Prompt-based methods have significant potential benefits, so should you use them? This post will:

  • Illustrate the difference between traditional fine-tuning and prompting.
  • Explain the details of how some popular prompt-based methods work.
  • Discuss the pros and cons of prompt-based methods, and provide our recommendation on whether or not to use them.

Harshil Shah · 11 min read

This is the second post in a two-part series on how to make state of the art NLP more efficient, exploring modifications to the popular but computationally demanding Transformer-based language modelling techniques.

The previous post:

  • Explained why the Transformer’s self-attention mechanism has a high computational workload.
  • Presented alternative attention mechanisms which are more efficient to run without significantly compromising performance.

This post will:

  • Explore methods which train small models to reproduce the outputs of large models.
  • Explain how to fine-tune language models efficiently.
  • Provide our recommendations for scenarios in which to use the different efficient Transformer approaches.

Harshil Shah · 10 min read

Business runs on communications. Customers reach out when they need something. Colleagues connect to get work done. At Re:infer, our mission is to fundamentally change the economics of service work in the enterprise—to unlock the value in every interaction and make service efficient and scalable. We do this by democratising access to state of the art NLP and NLU.

Specifically, Re:infer models use deep learning architectures called Transformers. Transformers facilitate huge improvements in NLU performance. However, they are also highly compute intensive, both in training the models to learn new concepts and in using them to make predictions. This two-part series will look at multiple techniques to increase the speed and reduce the compute cost of using these large Transformer architectures.

This post will:

  • Present a brief history of embedding models in NLP.
  • Explain why the Transformer’s self-attention mechanism has a high computational workload (see the sketch after this list).
  • Review modifications to the traditional Transformer architecture that are more computationally efficient to train and run without significantly compromising performance.
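
To make that cost concrete, here is a minimal NumPy sketch of single-head scaled dot-product self-attention; the dimensions are illustrative assumptions. The (n, n) score matrix is what makes the computation quadratic in sequence length.

```python
import numpy as np

# Minimal sketch of single-head scaled dot-product self-attention.
# Every token attends to every other token, so the score matrix has
# n^2 entries for a sequence of length n.
n, d = 512, 64                        # sequence length, head dim (illustrative)
Q = np.random.randn(n, d)             # queries
K = np.random.randn(n, d)             # keys
V = np.random.randn(n, d)             # values

scores = Q @ K.T / np.sqrt(d)         # (n, n) pairwise attention scores
scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
weights = np.exp(scores)
weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
output = weights @ V                  # (n, d) attended representations
# Time and memory both grow as n^2: doubling the sequence length
# quadruples the work. Efficient attention variants target this term.
```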

The next post will look at additional computational and approximation techniques that yield further efficiency gains. It will:

  • Explore distillation techniques, where smaller models are trained to approximate the performance of larger models (see the sketch after this list).
  • Explain efficient fine-tuning techniques, where parameter updates are restricted.
  • Provide our recommendations for when to use each of these methods.
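
As a rough sketch of the distillation idea, a generic distillation loss can be written in a few lines of PyTorch. The helper below is an illustrative assumption, not Re:infer's implementation: the small student model is trained to match the large teacher model's temperature-softened output distribution.

```python
import torch
import torch.nn.functional as F

# Generic knowledge-distillation loss (illustrative, not Re:infer's code):
# the student is trained to match the teacher's softened outputs.
def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # A temperature > 1 softens both distributions, exposing the teacher's
    # relative preferences between classes, not just its top prediction.
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    # KL divergence from student to teacher, rescaled so gradient
    # magnitudes stay comparable across temperatures.
    return F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * temperature ** 2

# Example usage with random logits: a batch of 8 examples, 3 classes.
loss = distillation_loss(torch.randn(8, 3), torch.randn(8, 3))
```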

Harshil Shah · 21 min read

Businesses run on communication: customers reach out when they want something, and colleagues communicate to get work done. Every message counts. Our mission at Re:infer is to unlock the value in these messages and to help every team in a business deliver better products and services efficiently and at scale.

With that goal, we continuously research and develop our core machine learning and natural language understanding technology. The machine learning models at Re:infer use pre-training, unsupervised learning, semi-supervised learning and active learning to deliver state of the art accuracy with minimal time and investment from our users.

In this research post, we explore a new unsupervised approach to automatically recognising topics and intents, along with their taxonomy structure, in a communications dataset. The aim is to improve both the quality of the insights we deliver and the speed with which they are obtained.