AI Glossary
Terms and concepts in AI, machine learning, and generative AI
Key terms used when working with autonomous systems and AI agents, grouped by category.
Fundamentals of AI
-
Pattern Recognition
—
Techniques for finding and classifying patterns in data. Applied to image, speech, and text recognition, and one of the major applications of machine learning.
-
Singularity (Technological)
—
A hypothetical point where AI creates smarter AI and intelligence grows rapidly. Often discussed in ethics and impact on society.
-
Expert System
—
Systems that encode expert knowledge in rules and knowledge bases and answer via inference. A classic form of AI before machine learning became dominant.
-
Big Data
—
Data that is too large, diverse, or fast-changing to handle with traditional methods. Used as training material for AI and is a precondition for higher performance.
-
Chinese Room
—
A thought experiment: manipulating symbols alone does not amount to “understanding.” Invoked in philosophical debates about whether AI truly understands.
-
Deep Learning
—
Learning with neural networks of many layers. Powers high performance in vision, speech, and language and underlies large language models and image generation.
-
Strong AI vs. Weak AI
—
“Weak AI” handles specific tasks such as image recognition or translation. “Strong AI” refers to human-level general intelligence, which does not exist yet.
-
Frame Problem
—
The difficulty of cleanly separating what is relevant to an action from what is not. A recurring topic in designing robots and agents.
-
Data Mining
—
The process of discovering useful patterns and rules from large datasets, combining machine learning and statistics in both business and research.
-
Turing Test
—
The idea that if a judge cannot tell human from machine in text-only conversation, the machine can be considered intelligent. Often cited in defining AI.
-
Generative AI
—
AI that “creates” text, images, or audio from learned patterns. Used for dialogue, summarization, creative work, and code. The shift from “predicting” to “generating” is key.
-
Inference / Recognition / Decision
—
Inference = drawing conclusions from knowledge. Recognition = interpreting inputs such as images or speech. Decision = choosing actions or outputs. Core roles often ascribed to AI.
-
Multimodal AI
—
AI that handles different input types (text, images, audio) together. Used when a task requires understanding inputs or producing content across multiple modalities.
-
Machine Learning (ML)
—
Systems that learn patterns or rules from data so they can perform tasks without being explicitly programmed for every case. Often divided into supervised, unsupervised, and reinforcement learning.
-
Artificial Intelligence (AI)
—
Technologies that enable computers to reason, recognize, decide, and learn. Today this mostly refers to systems built on machine learning and deep learning.
Overview of Machine Learning
-
Confusion Matrix / Precision / Recall
—
Confusion matrix = table of predicted vs. actual. Precision = share of positive predictions that are correct. Recall = share of actual positives that were predicted. Important for imbalanced data.
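As a minimal sketch, precision and recall for one positive class can be computed directly from label lists; the toy imbalanced data below is illustrative:

```python
def precision_recall(actual, predicted, positive=1):
    """Compute precision and recall for one positive class from label lists."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if p == positive and a == positive)
    fp = sum(1 for a, p in pairs if p == positive and a != positive)
    fn = sum(1 for a, p in pairs if p != positive and a == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Imbalanced toy data: 8 negatives, 2 positives
actual    = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
predicted = [0, 0, 0, 0, 0, 1, 0, 0, 1, 0]
p, r = precision_recall(actual, predicted)  # precision 0.5, recall 0.5
```

Note that plain accuracy here would be 80% even though only half the positives were found, which is why these metrics matter for imbalanced data.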
-
Reinforcement Learning
—
The agent interacts with an environment and learns behavior that maximizes reward. Used in game AI, robotics, and recommendation.
-
Principal Component Analysis (PCA)
—
Reduces dimensions by projecting onto axes of highest variance. Used for visualization, preprocessing, and noise reduction.
-
Overfitting / Underfitting / Generalization
—
Overfitting = fitting training data too closely so performance degrades elsewhere. Underfitting = model too simple to learn well. Generalization = how well the model performs on unseen data.
-
Support Vector Machine (SVM)
—
Classification and regression based on maximizing the margin between classes. Kernels allow non-linear boundaries; SVMs have long been a workhorse method.
-
Unsupervised Learning
—
Learning from unlabeled data by discovering structure or patterns. Used for clustering, dimensionality reduction, and anomaly detection.
-
Feature / Feature Engineering
—
How data is represented for the model. Traditionally hand-designed; in deep learning, layers often learn useful features automatically.
-
Q-Learning / Markov Decision Process
—
Q-learning = learning the value (Q) of taking an action in a state; a classic reinforcement learning algorithm. MDP is the mathematical framework of states, actions, rewards, and transitions.
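The Q-learning update rule, Q(s, a) ← Q(s, a) + α(r + γ·maxₐ′ Q(s′, a′) − Q(s, a)), can be sketched for a toy table (states, actions, and values below are illustrative):

```python
def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))."""
    best_next = max(q[next_state].values()) if q[next_state] else 0.0
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]

# Toy table: two states, two actions
q = {"s0": {"left": 0.0, "right": 0.0}, "s1": {"left": 1.0, "right": 0.0}}
new_value = q_update(q, "s0", "right", reward=1.0, next_state="s1")
# 0.0 + 0.1 * (1.0 + 0.9 * 1.0 - 0.0) = 0.19
```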
-
Cross-Validation / Model Evaluation
—
Splitting data and repeatedly training and evaluating to estimate performance. Measured with accuracy, precision, recall, F1, AUC, etc.
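A minimal k-fold split can be sketched as below; the strided fold assignment is one simple choice, and real libraries add shuffling and stratification:

```python
def k_fold_indices(n, k):
    """Split indices 0..n-1 into k folds; yield (train, test) index lists."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]                                        # held-out fold
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test
```

Each of the k rounds trains on k−1 folds and evaluates on the remaining one, so every example is used for evaluation exactly once.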
-
Clustering
—
Grouping unlabeled data by similarity. Methods include k-means and hierarchical clustering.
-
Ensemble Learning
—
Combining multiple models and aggregating their outputs. Improves accuracy and generalization by compensating for individual models' weaknesses.
-
Classification and Regression
—
Classification = predicting discrete labels. Regression = predicting continuous values. Both are standard supervised learning tasks.
-
Semi-supervised / Self-supervised Learning
—
Semi-supervised = using both labeled and unlabeled data. Self-supervised = creating labels from the data itself. LLM pre-training falls into the latter.
-
Decision Tree / Random Forest
—
Decision tree = model that branches on conditions; easy to interpret. Random forest = many trees combined by voting or averaging, reducing overfitting at some cost to interpretability.
-
Logistic Regression / Linear Regression
—
Linear models: linear regression for numeric prediction, logistic regression for binary classification. Simple and interpretable, often used as a baseline.
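As a toy sketch, a one-feature logistic regression can be fit with stochastic gradient descent on the log loss (the data and hyperparameters below are illustrative):

```python
import math

def sigmoid(z):
    """Map any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.5, epochs=500):
    """Fit y ~ sigmoid(w * x + b) by stochastic gradient descent on log loss."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = sigmoid(w * x + b)
            # (p - y) is the gradient of the log loss w.r.t. the linear output
            w -= lr * (p - y) * x
            b -= lr * (p - y)
    return w, b

# Toy separable data: negatives below zero, positives above
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]
w, b = fit_logistic(xs, ys)
```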
-
Supervised Learning
—
Learning from input–output pairs. Used for classification and regression—e.g., training image recognition from labeled photos.
Deep Learning Building Blocks
-
ResNet / Skip Connection
—
Adding inputs directly to later layers (skip connection) improves gradient flow and enables very deep networks. ResNet is a leading example.
-
Regularization / Dropout
—
Ways to reduce overfitting. L1/L2 penalize large weights; dropout randomly disables units during training.
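Dropout can be sketched in a few lines; this is the common "inverted" variant, which rescales surviving units so the expected activation is unchanged:

```python
import random

def dropout(values, p=0.5, training=True, seed=None):
    """Inverted dropout: zero each unit with probability p during training,
    and rescale survivors by 1 / (1 - p) so the expected value is unchanged."""
    if not training:
        return list(values)  # at inference time, pass activations through
    rng = random.Random(seed)
    keep = 1.0 - p
    return [v / keep if rng.random() < keep else 0.0 for v in values]
```

Because each forward pass sees a different random subnetwork, units cannot co-adapt too tightly, which reduces overfitting.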
-
Transfer Learning / Pre-training
—
Reusing a model trained on another task or dataset for a target task or smaller data. Reduces compute and data requirements.
-
Vanishing / Exploding Gradient
—
In deep networks, gradients can shrink to near zero or blow up during backprop. Addressed with ReLU, skip connections, and normalization.
-
Activation Function
—
Non-linear functions (ReLU, sigmoid, tanh, softmax) applied at neurons. They provide the expressiveness of stacked layers.
-
Autoencoder / VAE
—
Models that compress input then reconstruct it. VAE models the compressed space probabilistically, enabling sampling of new examples.
-
Perceptron
—
The simplest unit: weighted sum of inputs and a threshold. Stacking many gives a multilayer perceptron (MLP).
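The unit can be sketched directly; the weights below are hand-picked to compute logical AND, purely for illustration:

```python
def perceptron(inputs, weights, bias):
    """Fire (output 1) if the weighted sum of inputs plus bias is positive."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total > 0 else 0

# Hand-picked weights make this unit compute logical AND
out = perceptron([1, 1], [1.0, 1.0], -1.5)  # 1, since 1 + 1 - 1.5 > 0
```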
-
Batch Normalization / Layer Normalization
—
Normalizing layer outputs to stabilize training. BatchNorm normalizes per batch; LayerNorm is common in Transformers.
-
Backpropagation
—
Propagating error backward through the network to compute how to update each weight. Essential for training deep networks.
-
Convolutional Neural Network (CNN)
—
Networks using convolution and pooling to capture local structure. Standard for image recognition, object detection, and segmentation.
-
Hyperparameter
—
Values set before training (learning rate, depth, batch size, etc.). Tuned via grid search, random search, or Bayesian optimization.
-
Gradient Descent / Learning Rate / Epoch
—
Updating parameters along the gradient. Learning rate = step size; epoch = one pass over the training data. SGD and Adam are widely used.
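The update rule x ← x − η·∇f(x) can be sketched in one dimension; the function and step count below are illustrative:

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Repeatedly step against the gradient: x <- x - lr * grad(x)."""
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)
    return x

# Minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3)
minimum = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)  # converges toward 3
```

Too large a learning rate makes the iterates overshoot and diverge; too small a rate converges slowly, which is why the learning rate is among the most important hyperparameters.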
-
Pooling Layer
—
Summarizing convolution outputs per region (e.g., max or average). Adds invariance and reduces computation.
-
Generative Adversarial Network (GAN)
—
A generator and a discriminator compete during training. Pioneered image generation; diffusion models now dominate in many applications.
-
Input / Hidden / Output Layer
—
Input layer receives data; hidden layers transform it; output layer produces the final prediction. Deeper networks can represent more complex functions.
-
RNN / LSTM
—
Networks for sequential data. RNNs maintain state but suffer from vanishing gradients; LSTM uses gating to capture long-range dependencies.
-
Adam / SGD (Optimizers)
—
SGD updates parameters using gradients per minibatch. Adam adds momentum and adaptive learning rates and is a default in deep learning.
-
Neural Network
—
Models inspired by biological neurons. Layers of nodes with learned weights and activations transform input through training.
Generative AI & LLM Technologies
-
Guardrail
—
Mechanisms to monitor and constrain generative AI inputs and outputs—blocking harmful or sensitive content via prompts, filters, and policies.
-
Retrieval-Augmented Generation (RAG)
—
Retrieving external knowledge or documents and feeding them into the prompt so the LLM can answer with fewer hallucinations and with up-to-date or internal data.
-
Chain-of-Thought (CoT)
—
Prompting the model to “think step by step.” Can improve quality on complex reasoning and math tasks.
-
Transformer
—
Architecture centered on attention, processing long-range dependencies in parallel. The basis for most current LLMs.
-
Benchmark / Leaderboard
—
Standard tasks and datasets for comparing models (e.g., GLUE, MMLU). Leaderboards rank results.
-
Fine-tuning
—
Further training a pre-trained model on a target task or domain. Instruction tuning trains the model to follow instructions.
-
Hallucination
—
Output that is factually wrong but stated confidently. Mitigated with RAG, guardrails, and human review where accuracy matters.
-
Embedding
—
Representing text or images as vectors so that similar meaning is close in vector space. Used for search, similarity, and RAG indexing.
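Similarity between embeddings is usually measured with cosine similarity; the 3-dimensional toy vectors below stand in for real embeddings, which typically have hundreds or thousands of dimensions:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings": cat and kitten point roughly the same way
cat, kitten, car = [1.0, 0.2, 0.0], [0.9, 0.3, 0.1], [0.0, 0.1, 1.0]
```

A RAG index works on this principle: the query embedding is compared against stored document embeddings, and the nearest ones are retrieved.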
-
Prompt / Prompt Engineering
—
The text (instructions and input) sent to an LLM is the prompt. Shaping and refining it for a task is prompt engineering—often where practical gains come from.
-
Large Language Model (LLM)
—
Large models trained on huge text corpora. They can write, summarize, translate, and answer questions—e.g., GPT, Claude, Gemini.
-
Scaling Laws
—
Empirical observation that performance improves predictably with more model size, data, and compute. One rationale for scaling up LLMs.
-
Attention Mechanism
—
Mechanism that weights “where to look” in the input. Query, Key, and Value are used to select and combine relevant information. Core of the Transformer.
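Scaled dot-product attention for a single query can be sketched as below; this simplification omits the learned Q/K/V projections and multiple heads of a real Transformer:

```python
import math

def softmax(xs):
    """Turn scores into probabilities that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for one query over key/value vectors."""
    d = len(query)
    # Score each key against the query, scaled by sqrt(dimension)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    # Weighted sum of the value vectors
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(len(values[0]))]

context = attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [[10.0, 0.0], [0.0, 10.0]])
# the query matches the first key, so the output leans toward the first value
```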
-
Diffusion Model
—
Models that iteratively remove noise to form images (or other data). Dominant in image generation—e.g., Stable Diffusion.
-
RLHF / Alignment
—
Reinforcement learning from human feedback uses preferences as reward to align outputs with human intent. Alignment = making AI safe and useful.
-
Prompt Injection
—
Injecting malicious instructions into prompts to change behavior or extract information. Defended by input validation and guardrails.
-
Zero-shot / Few-shot
—
Zero-shot = task described without examples. Few-shot = a few examples in the prompt to steer behavior. Tied to in-context learning in LLMs.
-
Foundation Model
—
Large models pre-trained on broad data and adaptable to many tasks via fine-tuning or prompting. LLMs and multimodal models are examples.
-
Token / Tokenizer
—
Tokens are the units the model consumes; the tokenizer splits text into them. Context length is measured in tokens.
-
Instruction Tuning
—
Additional training on instruction–response pairs so the model follows instructions. A foundation for chat-style LLMs.
-
GPT / BERT
—
GPT = decoder-only, predicting next tokens; basis of ChatGPT. BERT = encoder-only, using bidirectional context; used for classification and QA.
-
Sampling / Temperature / Top-p
—
How the next token is chosen from the distribution. Temperature controls randomness; top-p narrows the candidate set. Used to tune diversity vs. determinism.
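Temperature scaling can be sketched as below (top-p would additionally truncate the sorted distribution before sampling, which is not shown; the logits are illustrative):

```python
import math
import random

def sample_token(logits, temperature=1.0, seed=None):
    """Sample an index from logits after temperature scaling.
    Lower temperature sharpens the distribution (more deterministic)."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Inverse-CDF sampling from the categorical distribution
    r = random.Random(seed).random()
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r <= cum:
            return i
    return len(probs) - 1
```

At a very low temperature the highest-logit token is chosen almost every time; at high temperatures the distribution flattens and output becomes more diverse.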
-
In-Context Learning
—
Solving tasks from examples or information in the prompt without updating parameters. Often used together with few-shot prompting.
Natural Language, Image & Speech
-
Optical Character Recognition (OCR)
—
Extracting text from images or PDFs. Used for digitizing documents, forms, and as preprocessing for LLMs.
-
Speech Recognition / Speech Synthesis
—
ASR turns speech into text; TTS turns text into speech. Both have improved greatly with deep learning.
-
Natural Language Processing (NLP)
—
Technologies for handling language with computers: translation, summarization, sentiment, QA, and LLMs.
-
Text-to-Image
—
Generating images from text descriptions. Examples: DALL·E, Stable Diffusion, Midjourney.
-
Morphological Analysis
—
Breaking text into morphemes (smallest meaning units). Foundation for tokenization and POS tagging in languages like Japanese (e.g., MeCab).
-
Image Recognition / Object Detection / Segmentation
—
Recognition = classifying content. Detection = locating objects. Segmentation = pixel-level regions. Implemented with CNNs and Vision Transformers.
-
Word2Vec / Distributed Representation
—
Representing words as vectors so similar words are close. Contrasts with one-hot encoding; foundational for many NLP methods.
Agents & Integration
-
Tool Use / Function Calling
—
Letting the LLM call external APIs or functions. Used when agents need to search, compute, or access databases.
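The application-side dispatch pattern can be sketched as below. The tool names and JSON shape are hypothetical; real provider APIs define their own schemas for tool calls, but the look-up-and-execute loop is the same idea:

```python
import json

# Hypothetical tool registry: the model (not shown) would return a tool name
# and JSON arguments; the application looks the tool up and executes it.
TOOLS = {
    "get_weather": lambda city: f"Sunny in {city}",  # stand-in for a real API call
    "add": lambda a, b: a + b,
}

def dispatch(tool_call_json):
    """Parse a model-produced tool call and run the matching function."""
    call = json.loads(tool_call_json)
    func = TOOLS[call["name"]]
    return func(**call["arguments"])

result = dispatch('{"name": "add", "arguments": {"a": 2, "b": 3}}')  # 5
```

The tool result is typically appended to the conversation so the model can use it in its next response.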
-
Context / Context Length
—
The information the model sees when generating (conversation history, retrieved docs, system instructions). Context length = maximum tokens that can be passed at once.
-
AI Agent
—
AI that works toward a goal by making decisions and taking actions, such as using tools, searching, or executing code. Often LLM-based and interacting with the outside world.
-
ReAct
—
Agent pattern that alternates “reasoning” and “acting”: think → act → observe, in a loop to solve tasks incrementally.
-
API
—
Application Programming Interface—how programs exchange data and call services. Used to integrate LLMs and external services into applications.
-
MCP (Model Context Protocol)
—
A protocol for connecting AI to external tools, data, and APIs safely. Used as a base for file operations and API integration.
AI Ethics, Governance & Deployment
-
Traceability / Reproducibility
—
Traceability = being able to trace data sources and training conditions. Reproducibility = obtaining the same results under the same conditions. Important for audit and quality.
-
Adversarial Attack
—
Deliberately adding small perturbations to cause misclassification. A concern for security and reliability of vision and generative models.
-
Privacy / Personal Data Protection
—
AI uses large amounts of data; compliance with privacy laws and GDPR is required. Privacy-by-design is increasingly important.
-
CRISP-DM
—
A framework for data and AI projects: business understanding, data understanding, modeling, evaluation, deployment.
-
Bias / Fairness
—
Data bias can lead to unfair outcomes for certain groups. Ensuring fairness is a central theme in AI ethics.
-
Transparency Report
—
Public reporting on training data, capabilities, limitations, and usage. Often requested for explainability and governance.
-
AI Ethics / AI Governance
—
Values for building and using AI (fairness, transparency, explainability, privacy) and how organizations govern AI. Guidelines and regulation are evolving globally.
-
Deepfake
—
Synthetic or altered face/voice content created with deep learning. Detection and prevention of misuse are active societal and technical topics.
-
MLOps
—
Practices to run the full cycle: development, deployment, monitoring, retraining. CI/CD, reproducibility, and monitoring are key.
-
Explainable AI (XAI)
—
Making AI decisions interpretable. Deep learning is opaque; methods like Grad-CAM and SHAP provide explanations.
-
Copyright & Generative AI
—
Issues around using data for training and who owns AI-generated output. Legislation and guidelines are developing in many countries.
-
Risk-based Approach
—
Adapting regulation and safeguards to the level of risk per use case. Reflected in regulations such as the EU AI Act.
Business & Infrastructure
-
Annotation
—
Adding labels or boundaries to training data—e.g., object locations in images, transcriptions, tags. Essential for supervised learning.
-
Cloud / GPU
—
Cloud = using compute and storage over the network. GPUs accelerate training and inference; many LLM and ML services are offered in the cloud.
-
Project "CRAai"
—
An autonomous system centered on file processing that integrates external APIs via MCP and similar protocols to run operations. The core project of CRAai.
-
Data Scientist
—
Role covering data collection, analysis, modeling, and turning results into business value, combining statistics, ML, and domain knowledge.
-
IoT (Internet of Things)
—
Connecting sensors and devices to the network to collect data and control systems. Combined with AI for predictive maintenance, smart factories, etc.
-
Autonomous Operation
—
Operations in which AI and systems make decisions and execute tasks without constant human oversight, continuing to deliver value. Core to CRAai's vision.
-
RPA (Robotic Process Automation)
—
Automating rule-based tasks with software robots. Evolving toward intelligent process automation with AI.
-
Proof of Concept (PoC)
—
Testing whether a technology or idea works at small scale. In AI adoption, used to check data, accuracy, and fit with the business.