-AI Terminology and Concepts Cheat Sheet


This article presents a cheat sheet of terms and concepts related to Artificial Intelligence (AI). Much of it assumes an LLM context. Some content is copied directly from Wikipedia or from Google Gemini output. Note that I have no formal education or expertise regarding AI; this effort definitely turned into a number of rabbit holes. Assume AI as the context for each of these terms and their explanations. The terms can be somewhat circular in use and hence appear in no particular order. If you have any terms or content to add or correct, please comment. I have spent too much time on this already, but I would be happy to pass the markdown to anyone who might want to work on it in any way.

Update 20.Dec.2025: I moved this article to here:


  • Artificial Intelligence (AI)
  • Artificial General Intelligence (AGI)
  • Artificial Super Intelligence (ASI)
  • Chatbot
  • Large Language Model (LLM)
    • An LLM is just one type of AI system.
    • Large: Trained on a huge data set, with many Parameters.
    • Language: Human language interface: input text (Prompt), output text.
    • Model: The trained Neural Network structure that performs the task (includes the Parameters). The model processes and outputs text in units called Tokens.
    • Predictive text generation: Repeatedly predicts the most probable next word in a sequence, given all the preceding words (see the generation sketch at the end of this entry).
    • Based on a Neural Network architecture, often the Transformer architecture.
    • Identifies patterns that represent grammar, semantics, context, and meaning
    • Typically Generative.
    • Can maintain Context over long sequences of interactions.
    • Applications include Chatbots, search engines, code generation, and customer service.
    • https://en.wikipedia.org/wiki/Large_language_model
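
A minimal sketch of the predictive text generation loop described above. The `model` and `tokenizer` objects and the `eos_id` attribute are hypothetical placeholders; a real LLM predicts sub-word Tokens rather than whole words, and usually samples rather than always taking the single most probable token.

```python
import numpy as np

def generate(model, tokenizer, prompt, max_new_tokens=50):
    """Greedy next-token generation: repeatedly append the most probable next token."""
    tokens = tokenizer.encode(prompt)          # text -> list of token ids (hypothetical tokenizer)
    for _ in range(max_new_tokens):
        logits = model(tokens)                 # one score per token in the vocabulary (hypothetical model)
        probs = np.exp(logits - np.max(logits))
        probs /= probs.sum()                   # softmax -> probability of each possible next token
        next_token = int(np.argmax(probs))     # greedy choice: the single most probable token
        tokens.append(next_token)
        if next_token == tokenizer.eos_id:     # stop at the end-of-sequence token
            break
    return tokenizer.decode(tokens)            # token ids -> text
```
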
  • Foundation Model
  • Few-Shot Learning
  • Prompt
    • Natural language text instructions used to direct Generative AI.
  • System Prompt, System Instructions
    • Invisible instructions given to the model by the developers or application builders to set its persona or behavior.
  • Prompt Engineering
  • Prompt Injection
  • Attention Mechanism
    • Determines the importance of each component in a sequence relative to the other components in that sequence (a minimal sketch follows this entry).
    • Computational cost grows quadratically with the length of the input sequence, which is the primary technical reason for the Context Window limit.
    • https://en.wikipedia.org/wiki/Attention_(machine_learning)
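
A minimal NumPy sketch of scaled dot-product attention, one common form of the mechanism described above. The n-by-n score matrix computed between every pair of positions is the source of the quadratic cost; real Transformers add learned projections and multiple attention heads.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K, V: (n_tokens, d) matrices of queries, keys, and values."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                       # (n, n): every position scored against every other
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # softmax over each row
    return weights @ V                                  # each output is a weighted mix of the values
```
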
  • Context
    • Information provided to an AI in a single request to guide its response (AI’s short-term memory).
    • Typically consists of the current Prompt, any conversation history, and any System Prompt(s); a request sketch follows this entry.
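
A sketch of how those pieces of Context are commonly packaged in a chat-style API request. The role names follow the widespread system/user/assistant convention; the field and model names are illustrative, not any specific vendor's API.

```python
# Illustrative request payload: the model sees the System Prompt, the conversation
# history, and the current Prompt together as one Context.
request = {
    "model": "example-llm",                                                        # placeholder model name
    "messages": [
        {"role": "system", "content": "You are a concise technical assistant."},  # System Prompt
        {"role": "user", "content": "What is a token?"},                           # earlier history
        {"role": "assistant", "content": "A token is a small unit of text..."},    # earlier history
        {"role": "user", "content": "How big is your context window?"},            # current Prompt
    ],
}
```
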
  • Context Window
  • Generative AI
  • Trainable Parameters (Weights and Biases)
    • Numerical values stored during training to encode the model’s knowledge, patterns, and ability to generate language (a single-neuron sketch follows this entry).
    • Automatically and iteratively adjusted during training to constitute a model’s stored knowledge. A high weight means that input is very influential in the final output. Biases allow neurons to activate even if all inputs are zero, giving the model a flexible baseline for making predictions.
    • https://en.wikipedia.org/wiki/Weight_initialization
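
A tiny sketch of weights and biases at the level of a single artificial neuron. The numbers are made up; training is what adjusts them.

```python
import numpy as np

# One artificial neuron: output = activation(weights . inputs + bias)
weights = np.array([0.8, -0.3, 0.5])    # trainable: how influential each input is
bias = 0.1                              # trainable: lets the neuron activate even for all-zero input
inputs = np.array([1.0, 0.0, 2.0])

output = max(0.0, float(weights @ inputs + bias))   # ReLU activation
print(output)                           # 1.9 with these made-up numbers
# Training nudges `weights` and `bias` so outputs across the whole network better match the data.
```
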
  • Hyperparameters
    • External settings chosen by model developers, before or during the training process, that control how the model learns and how it generates output.
      • Architecture Hyperparameters define the model’s structure, such as the number of layers in the network and the size of its internal dimensions.
      • Training Hyperparameters control the learning process, such as the learning rate (how much the trainable parameters are adjusted in each step) and the batch size (how many samples the model processes before updating its parameters).
      • Inference/Generation Hyperparameters control the model’s output behavior when a user prompts it (after training is complete); a sampling sketch follows this entry:
        • Temperature: A higher temperature means more unpredictable, random, creative, and diverse responses.
        • Max Tokens: Limits the length of the generated response.
        • Top-p / Top-k: Sampling methods that refine the pool of possible next words from which to choose.
    • https://en.wikipedia.org/wiki/Hyperparameter_(machine_learning)
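
A sketch of how temperature and top-k can shape next-token sampling, assuming the model has already produced a vector of raw scores (`logits`) for every token in its vocabulary. Exact implementations vary between vendors.

```python
import numpy as np

def sample_next_token(logits, temperature=0.7, top_k=50):
    """Pick a next-token id from raw scores, shaped by temperature and top-k."""
    logits = np.asarray(logits, dtype=float) / max(temperature, 1e-6)   # higher temperature -> flatter, more random
    candidates = np.argsort(logits)[-top_k:]            # keep only the k highest-scoring tokens
    probs = np.exp(logits[candidates] - logits[candidates].max())
    probs /= probs.sum()                                # softmax over the surviving candidates
    return int(np.random.choice(candidates, p=probs))   # sample one of them
```
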
  • Neural Network (NN)
    • Computational model inspired by the structure and functions of biological neural networks such as brains and nervous systems.
    • Artificial neurons (nodes):
      • Loosely model the neurons in the brain connected by edges, which model synapses.
      • Receive, perform some calculation, and transmit signals (numbers) between connected neurons.
      • Are often aggregated into layers, where each layer performs a different transformation (a two-layer sketch follows this entry).
    • Can learn from experience and derive conclusions from complex and seemingly unrelated sets of information.
    • https://en.wikipedia.org/wiki/Neural_network_(machine_learning)
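
A minimal sketch of a small feed-forward network with two layers of made-up weights, matching the "neurons aggregated into layers" description above.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)   # layer 1: 3 inputs -> 4 hidden neurons
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)   # layer 2: 4 hidden neurons -> 1 output

def forward(x):
    hidden = np.maximum(0.0, W1 @ x + b1)       # each hidden neuron: weighted sum + bias, then ReLU
    return W2 @ hidden + b2                     # output layer: another weighted sum + bias

print(forward(np.array([0.5, -1.0, 2.0])))
```
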
  • Recurrent Neural Networks (RNN)
  • Feedback Neural Network
  • Recursive Neural Network
    • Applies the same set of weights recursively over a structured input (such as a tree), traversing the structure in topological order to produce a structured or scalar prediction for variable-size inputs (a tree sketch follows this entry).
    • https://en.wikipedia.org/wiki/Recursive_neural_network
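
A rough sketch of the idea: one shared weight matrix is applied at every node of a tree-shaped input, combining child representations into a parent representation. The tree encoding and random weights here are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
W = rng.normal(size=(d, 2 * d))   # the single weight matrix reused at every tree node
b = np.zeros(d)

def encode(node):
    """node is either a leaf vector (np.ndarray) or a (left, right) pair of sub-trees."""
    if isinstance(node, np.ndarray):
        return node
    left, right = node
    children = np.concatenate([encode(left), encode(right)])
    return np.tanh(W @ children + b)            # same W applied recursively, bottom-up

leaf = lambda: rng.normal(size=d)
print(encode((leaf(), (leaf(), leaf()))))       # a small tree with three leaves
```
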
  • Transformer
    • Unlike Recurrent Neural Networks (RNNs), which process a sequence one element at a time, Transformers process multiple elements of an input sequence simultaneously.
    • Input flows through a series of Encoder Blocks that refine the representation of the input sequence (a block sketch follows this entry):
      • Self-Attention Layer: Evaluates relative importance of elements in an input sequence
      • Feed-Forward Layer: Provides non-linear transformations and introduces the model’s complexity, allowing it to learn intricate patterns beyond simple attention calculations.
      • Residual Connections and Layer Normalization: These connections allow gradients (error signals) to flow directly through the network during training, preventing them from fading away (the vanishing gradient problem) and speeding up convergence. Layer normalization stabilizes training.
      • Encoder-Decoder: The classic Transformer architecture (used for translation) with separate blocks for processing input (Encoder) and generating output (Decoder). Modern LLMs like GPT are often Decoder-only models.
    • https://en.wikipedia.org/wiki/Transformer_(deep_learning)
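
A compact sketch of one encoder block as described above: single-head self-attention, a feed-forward layer, residual connections, and layer normalization. Learned projections, multiple heads, and many other details are omitted for brevity.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize each token's representation to stabilize training."""
    return (x - x.mean(axis=-1, keepdims=True)) / (x.std(axis=-1, keepdims=True) + eps)

def encoder_block(x, W1, b1, W2, b2):
    """x: (n_tokens, d) token representations; W1, b1, W2, b2: feed-forward parameters."""
    # Self-attention sub-layer (single head, no learned projections, for brevity).
    scores = x @ x.T / np.sqrt(x.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    x = layer_norm(x + weights @ x)                       # residual connection + layer normalization
    # Position-wise feed-forward sub-layer, applied to each token independently.
    ff = np.maximum(0.0, x @ W1 + b1) @ W2 + b2
    return layer_norm(x + ff)                             # second residual + layer norm
```
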
  • Perceptron
  • Generative Pre-trained Transformer (GPT)
  • Training
    • Multi-stage process of iteratively adjusting the model’s Parameters by exposing it to data.
    • Consists of pre-training, fine-tuning (instruction tuning), and alignment (RLHF); a toy gradient-descent sketch follows this entry.
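
A toy sketch of the iterative adjustment described above: one weight, a squared-error loss, and plain gradient descent. Real pre-training performs the same kind of update over billions of Parameters and Tokens.

```python
# Fit y = w * x to the points (1, 2) and (2, 4) by gradient descent; the true answer is w = 2.
data = [(1.0, 2.0), (2.0, 4.0)]
w = 0.0
learning_rate = 0.05                 # a Training Hyperparameter

for step in range(200):
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)   # d(loss)/dw for mean squared error
    w -= learning_rate * grad        # nudge the parameter to reduce the loss

print(w)                             # approaches 2.0
```
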
  • Reinforcement Learning
  • Reinforcement Learning from Human Feedback (RLHF)
  • Retrieval Augmented Generation (RAG)
  • Token
  • Reasoning Model, Reasoning Language Model (RLM)
    • LLM trained to solve complex tasks that require multiple steps of logical reasoning.
    • Demonstrate superior performance on logic, mathematics, and programming tasks.
    • Able to revisit and revise earlier reasoning steps, and to use additional computation at inference time as a way to scale performance.
    • https://en.wikipedia.org/wiki/Reasoning_model
  • Model Context Protocol (MCP)
  • Machine Learning (ML)
  • Deep Learning
  • Data Mining
    • This term is a misnomer; it means something more like sifting existing data than retrieving new data.
    • Identification of patterns in and transformation of existing data.
    • https://en.wikipedia.org/wiki/Data_mining
  • Agent
    • Systems that perform tasks autonomously on behalf of users.
    • Ideally, by simulating reasoning and planning, with memory and Context (a loop sketch follows this entry).
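
A rough sketch of an agent loop. The `llm` and `tools` callables, the reply format, and the stopping rule are all hypothetical; the point is the repeated plan, act, observe cycle with accumulated memory.

```python
def run_agent(llm, tools, goal, max_steps=10):
    """Minimal plan -> act -> observe loop; `llm` and `tools` are hypothetical callables."""
    memory = [f"Goal: {goal}"]
    for _ in range(max_steps):
        plan = llm("\n".join(memory) + "\nReply 'DONE: <answer>' or '<tool>: <input>'.")
        tool_name, _, arg = plan.partition(":")                # assumed reply format
        if tool_name.strip() == "DONE":
            return arg.strip()                                 # the model decides the goal is met
        observation = tools[tool_name.strip()](arg.strip())    # call the chosen tool
        memory.append(f"Action: {plan!r} -> Observation: {observation!r}")
    return "Stopped after max_steps"
```
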
  • Traceability
    • The ability to follow the process from a generated output back through the AI (an illustrative trace record follows this list):
      • What input training data it used.
      • The model and model version used.
      • The Prompts used.
      • Any Agents used.
      • Any additional steps including transformations.
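
One way to picture Traceability is as a structured record attached to each generated output; the field names below are made up, simply mirroring the list above.

```python
# Illustrative trace record for one generated output (all names are placeholders).
trace = {
    "output_id": "resp-0001",
    "model": {"name": "example-llm", "version": "2025-06-01"},
    "training_data_sources": ["corpus-a", "corpus-b"],        # what input training data was used
    "prompts": {"system": "You are a concise assistant.", "user": "Summarize this report."},
    "agents": ["summarizer-agent"],                           # any Agents involved
    "post_processing": ["redact-pii", "format-markdown"],     # additional steps and transformations
}
```
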
  • Alignment
  • Anthropomorphization
  • Deification, Apotheosis
  • AI Psychosis, Chatbot Psychosis
  • Confabulation (often anthropomorphized as Hallucination)
  • Generative Image Model, Text-to-Image Model, Diffusion Model, Large Vision Model
    • Image generation models that I didn’t research.
  • Text-to-Video Model, Large Video Model
    • Video generation models that I didn’t research.
  • Text-to-Audio Model, Text-to-Music Model, Audio Language Model
    • Audio generation models that I didn’t research.
  • Multimodal Model

See also:

