Large Language Models

NLP Terminology

Zero-shot classification: Provide only the candidate labels and classify inputs without retraining the model
Few-shot classification: Provide a few labelled examples in the prompt and classify inputs without retraining the model
Summarisation
- Extractive: Select representative pieces of text
- Abstractive: Generate new text
Sentiment Analysis
Translation
Question answering
Foundation models: Trained on text generation tasks such as predicting the next token in a sequence
Instruction-following models: Tuned to follow (almost) arbitrary instructions or prompts
Prompt Hacking
- Prompt injection: Adding malicious content to a prompt to manipulate the model's output
- Prompt leaking: Extracting sensitive information from the model or its prompt
- Jailbreaking: Bypassing moderation rules
Searching and Sampling
- Search: Given the tokens generated so far, pick the next token deterministically, based on likelihood.
  - Greedy search (default): Pick the single most likely next token at each step.
  - Beam search: Extends greedy search by following several candidate sequences in parallel and keeping the most likely ones. The parameter num_beams sets how many sequences are tracked.
- Sampling: Given the tokens generated so far, pick the next token by sampling from the predicted distribution of tokens.
  - Top-K sampling: The parameter top_k modifies sampling by limiting it to the k most likely tokens.
  - Top-p sampling: The parameter top_p modifies sampling by limiting it to the smallest set of most likely tokens whose cumulative probability reaches p.
- For more background on search and sampling, see this Hugging Face blog post.
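
The difference between greedy search, top-k sampling and top-p sampling can be sketched on a single next-token distribution. This is a minimal illustration in plain Python, not the Hugging Face implementation: the function names (greedy, top_k_filter, top_p_filter, sample) and the toy distribution are assumptions made up for this example.

```python
import random

def greedy(probs):
    """Greedy search: return the single most likely token."""
    return max(probs, key=probs.get)

def top_k_filter(probs, k):
    """Keep the k most likely tokens and renormalise their probabilities."""
    kept = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in kept)
    return {tok: p / total for tok, p in kept}

def top_p_filter(probs, p):
    """Keep the smallest set of most likely tokens whose mass reaches p."""
    kept, mass = [], 0.0
    for tok, prob in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept.append((tok, prob))
        mass += prob
        if mass >= p:
            break
    total = sum(prob for _, prob in kept)
    return {tok: prob / total for tok, prob in kept}

def sample(probs, rng):
    """Sampling: draw one token according to the given distribution."""
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]

# Hypothetical next-token distribution predicted by a model.
probs = {"cat": 0.5, "dog": 0.3, "fish": 0.15, "rock": 0.05}
rng = random.Random(0)

print(greedy(probs))                        # always the same token
print(sorted(top_k_filter(probs, 2)))       # candidates left after top-k
print(sorted(top_p_filter(probs, 0.9)))     # candidates left after top-p
print(sample(top_k_filter(probs, 2), rng))  # random draw from the top-k set
```

Greedy search is deterministic, while the two filters only narrow the candidate set before a random draw. Beam search is left out because it needs scores for whole sequences rather than for one step.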

FAISS (video): Vector library for indexing embeddings and running similarity search over them
Chroma DB (video): Vector database for generating, storing and searching embeddings. It can be combined with an LLM to build a retrieval-augmented generation (RAG) pipeline
Pinecone (video): Cloud-based vector database
Weaviate (video): Vector database
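
What these tools have in common is nearest-neighbour search over embedding vectors. The sketch below shows the idea as an exact, brute-force cosine-similarity search in plain Python. It does not use the API of any of the tools above, and the document ids and three-dimensional vectors are made up for illustration. Real embeddings have hundreds of dimensions, and vector libraries typically add approximate indexes to stay fast at scale.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(index, query, k=2):
    """Return the ids of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(vec, query), doc_id)
              for doc_id, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]

# Hypothetical embeddings; in practice an embedding model produces these.
index = {
    "doc_cats": [0.9, 0.1, 0.0],
    "doc_dogs": [0.8, 0.3, 0.1],
    "doc_tax": [0.0, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]

print(search(index, query, k=2))
```

In a RAG setup, the documents returned by such a search would be added to the LLM prompt as context.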