Category: Uncategorized
-
Continuing from the previous post, let’s now dive into the second half of the transformer – the decoder. If you recall, the decoder takes the contextualized embeddings that the encoder produced from the input sequence. It then generates the output sequence, one token at a time. Let’s break down how this process unfolds, step by…
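That token-by-token generation can be sketched as a greedy decoding loop. Note that `next_token_scores` below is a hypothetical stand-in for a real decoder’s forward pass over the encoder output and the tokens generated so far – the toy scoring rule exists only to make the loop runnable:

```python
# A minimal sketch of autoregressive (token-by-token) decoding.

def next_token_scores(encoder_output, generated):
    # Hypothetical stand-in for a decoder forward pass. Toy rule:
    # always favour the token after the last one, wrapping around a
    # tiny 5-token vocabulary (token 4 plays the end-of-sequence role).
    vocab_size = 5
    last = generated[-1]
    return [1.0 if tok == (last + 1) % vocab_size else 0.0
            for tok in range(vocab_size)]

def greedy_decode(encoder_output, bos=0, eos=4, max_len=10):
    """Repeatedly pick the highest-scoring next token until EOS."""
    generated = [bos]
    for _ in range(max_len):
        scores = next_token_scores(encoder_output, generated)
        next_tok = max(range(len(scores)), key=scores.__getitem__)
        generated.append(next_tok)
        if next_tok == eos:
            break
    return generated

print(greedy_decode(encoder_output=None))  # [0, 1, 2, 3, 4]
```

Real decoders score the whole vocabulary with a neural network, and often sample from the distribution rather than always taking the maximum, but the outer loop has this shape.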
-
Continuing from the previous post, let’s dive into the first section of the transformer – **the encoder**. As we discussed, the encoder embeds the input tokens, uses positional encoding and attention to imbue the token embeddings with relevant meaning, and passes the modified embeddings to the decoder. We’ve already covered how token embeddings work, so…
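One encoder ingredient mentioned above, positional encoding, can be computed directly from the sinusoidal formula in “Attention Is All You Need.” A minimal sketch (the tiny `d_model` of 8 is just for illustration):

```python
import math

def sinusoidal_position(pos, dim, d_model=8):
    """One entry of the sinusoidal positional encoding:
    sin for even dimensions, cos for odd dimensions."""
    angle = pos / (10000 ** (2 * (dim // 2) / d_model))
    return math.sin(angle) if dim % 2 == 0 else math.cos(angle)

def positional_encoding(seq_len, d_model=8):
    # One row per position; each row is added element-wise
    # to the corresponding token embedding.
    return [[sinusoidal_position(pos, d, d_model) for d in range(d_model)]
            for pos in range(seq_len)]

pe = positional_encoding(seq_len=4)
print(pe[0])  # position 0: alternating 0.0 (sin 0) and 1.0 (cos 0)
```

Because each position gets a distinct pattern of sines and cosines, the model can tell “the first token” apart from “the fifth token” even though attention itself is order-blind.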
-
Introduced in the seminal paper “Attention Is All You Need,” the transformer revolutionized the world of natural language processing (NLP) and supercharged the progress of today’s LLMs. Let’s take a look at how it works, focusing on transformers used for causal language modelling. Causal refers to the property of depending only on prior and current…
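That “prior and current only” property is typically enforced with a causal attention mask. A minimal sketch of what such a mask looks like:

```python
def causal_mask(seq_len):
    # mask[i][j] is True when position i may attend to position j –
    # only prior and current positions are allowed (j <= i).
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]

# Visualize: each row is a position; "x" marks positions it can see.
for row in causal_mask(4):
    print("".join("x" if allowed else "." for allowed in row))
```

In practice the mask is applied by setting disallowed attention scores to negative infinity before the softmax, so future tokens receive zero weight.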
-
Continuing from the first part, let’s look at some more of OpenAI’s model naming conventions. Pro: While mini models sacrifice performance for improved speed and decreased costs, pro-class models do the opposite. They are optimized for accuracy and better reasoning, and as such, are slower and more expensive. Thus, they are more suited to mission-critical…
-
As OpenAI models have progressed over the years, you might have heard of the new models being released via headlines or in passing. But beyond the base GPT versions, the naming conventions probably seem rather confusing. 4o, .5, turbo? What does it even mean? Let’s take a look, starting with the basics. Base GPT –…
-
Continuing from the last post, here are some more prompt engineering techniques. Tree of Thought: Rather than being a way to enhance a prompt, Tree of Thought is more of a prompting framework. You break down your task into intermediate steps and repeat a step multiple times. If a step seems like it is in…
-
Sometimes you may feel like you can’t get an LLM chatbot to do quite what you want. Sometimes, this can be resolved by improving your prompt. Let’s take a look at prompt engineering, the art of constructing an effective prompt! Prompts and prompt engineering A prompt is simply the textual input you give generative AI…
-
When we previously discussed embeddings, we talked about being able to imbue an embedding with relevant semantic meaning from the surrounding context. For example, in the sentence “I spoke to Mark but he …”, an LLM would like to know what the embedding for “he” refers to. The method that makes this possible is called attention.…
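At the heart of attention is the scaled dot-product computation, softmax(QKᵀ/√d_k)·V. Here is a minimal pure-Python sketch over plain lists – real implementations use tensor libraries, batching, and multiple heads, but the arithmetic is the same:

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V."""
    d_k = len(K[0])
    out = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in K]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out.append([sum(w * v[d] for w, v in zip(weights, V))
                    for d in range(len(V[0]))])
    return out

# Toy example: the query matching the first key pulls its output
# mostly from the first value vector.
Q = [[1.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 0.0], [0.0, 1.0]]
print(scaled_dot_product_attention(Q, K, V))
```

In a transformer, Q, K, and V are all linear projections of the token embeddings, which is how “he” can look back over the sequence and pull in meaning from “Mark.”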
-
When you read any kind of text, you’re able to quite naturally understand what’s written, without giving it much active thought. Take a look at someone learning a new language, however, and you’ll see that when they try to read a sentence, they do so by breaking it down – usually word-by-word, and sometimes breaking…
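LLMs break text down in a comparable way, via tokenization. A hypothetical greedy longest-match tokenizer over a hand-picked toy vocabulary gives the flavour – real tokenizers such as BPE learn their vocabularies from data, but the matching idea is similar:

```python
def greedy_tokenize(text, vocab):
    """Split text into the longest vocabulary pieces, left to right.
    Falls back to single characters for anything not in the vocab."""
    tokens, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try the longest piece first
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            tokens.append(text[i])  # unknown character: keep it as-is
            i += 1
    return tokens

# Toy vocabulary, chosen by hand for this example.
vocab = {"read", "ing", "un", "der", "stand", " "}
print(greedy_tokenize("understand reading", vocab))
# ['un', 'der', 'stand', ' ', 'read', 'ing']
```

Notice how unfamiliar words get split into smaller, reusable pieces – much like a language learner sounding out a sentence word-by-word.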
-
If you’re unfamiliar with how Large Language Models (LLMs) – such as those behind chatbots like ChatGPT – work, you may wonder how they can seemingly understand what you’re saying. More impressively, even if you don’t convey your intentions very well, they can often pick up on what you want to say! Behind these feats…