Guides: Generative AI: Glossary

Agentic AI - AI systems that exhibit some degree of autonomy and decision-making capability. These systems can operate independently, process data and react in real time, and adjust their behavior based on new information without requiring human intervention.

AI Distillation - The process of transferring knowledge from a large, complex model (the "teacher model") to a smaller, more efficient model (the "student model"), AKA "distilling" the teacher's knowledge into the student. The student model uses significantly less computational power and memory, making it more cost-effective, while achieving performance comparable to the teacher model.
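
As a rough illustration of distillation, the sketch below scores how closely a student's outputs match a teacher's. It assumes one common formulation (cross-entropy between temperature-softened output probabilities), which this glossary does not specify; the logits and function names are made up for the example.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy between the teacher's softened outputs and the student's."""
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))

# Hypothetical logits for a 3-class problem.
teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 1.1, 0.4]
far_student = [0.5, 4.0, 1.0]

print(distillation_loss(teacher, close_student))  # lower: the student mimics the teacher
print(distillation_loss(teacher, far_student))    # higher: the student disagrees
```

Training the student would mean adjusting its parameters to push this loss down; that step is omitted here.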

AI Washing - Falsely advertising to consumers that a product or service uses AI technology when it does not, or exaggerating the capabilities of the AI it does use.

Artificial General Intelligence (AGI) - AKA "Strong AI". It is a hypothetical machine intelligence that has the ability to learn, understand, and apply intelligence to perform any intellectual task that a human being can do.

Artificial Intelligence (AI) - Systems or machines that have the ability to simulate human intelligence by learning from data inputs. They can perform tasks such as learning, reasoning, problem-solving, perception, and language understanding.

Black Box - An AI system whose internal workings and decision-making processes are opaque or difficult for humans to trace.

Chatbot - A software application designed to simulate human conversation. Chatbots use AI technologies such as natural language processing to interpret and respond to user queries.

Conversational AI - A more advanced form of AI that allows machines to converse in a human-like manner. It serves as the synthetic "brain" behind some chatbots.

Deep Learning - A branch of artificial intelligence and machine learning that uses multi-layered neural networks to analyze data and make predictions. Each layer processes the output of the previous layer and passes its own output to the next layer. It is used for image recognition, speech recognition, natural language processing, and predictive analytics.

Deepfake - AI-generated media in which a person's likeness and/or voice is manipulated or replaced. Deepfakes have become very difficult to differentiate from reality.

Foundational Model - An AI model trained on a massive amount of data, usually through self-supervised learning, that is capable of accurately performing a wide range of tasks with minimal fine-tuning. GPT, BERT, and DALL-E are examples of foundational models that are pre-trained on large datasets and adaptable for a variety of specific tasks.

Generative Adversarial Network (GAN) - A method of training neural networks that pits two networks (a generator and a discriminator) against each other in a competitive setting to generate new data. The generator creates outputs that could be mistaken for real data, while the discriminator must determine whether that data is real or was created by the generator. The goal is for the generator to produce realistic outputs that are indistinguishable from real data and for the discriminator to strengthen its ability to distinguish real data from fake data.

Generative AI (GenAI or GAI) - AI systems that can independently generate new content, such as text, images, music, or videos, based on learned patterns from the data they have been trained on. However, some GenAI systems are not limited to their training datasets and can learn to respond to queries containing information that they were not previously trained on.

Generative Pre-trained Transformer (GPT) - A type of language model developed by OpenAI. These models are pre-trained on large datasets to understand and generate human-like text.

Hallucination - An instance where an AI system provides an output with incorrect, misleading, or fabricated information. A hallucination can sound plausible even though it is not grounded in any factual or verifiable data.

Large Language Model (LLM) - A deep learning model trained on massive amounts of textual data to learn, understand, generate, and manipulate language. These models can perform tasks such as generating content, summarizing documents, translating text, and answering questions.

Machine Learning - A broad branch of AI that involves training AI systems to recognize patterns, understand concepts, make predictions or decisions, and solve problems in a way that imitates intelligent human behavior. These models improve their performance by learning from data.

Model - An AI tool, algorithm, or system trained on data to recognize patterns and make predictions that are human-like but require no human intervention. It consists of a specific architecture with learned parameters that enable it to process new inputs and generate outputs based on what it learned during training.

Multimodal AI - AI systems that can process and integrate multiple types of data, such as text, images, audio, and video, to understand and generate responses.

Natural Language Processing (NLP) - A branch of AI that focuses on the interaction between computers and human language. NLP allows machines to read, understand, and generate text or speech. Chatbots and transcription services use NLP.

Neural Networks (Neural Nets) - A type of machine learning model that parallels the structure of the human brain. Neural nets contain layers of interconnected nodes (or neurons) that process data and pass it along to make predictions or decisions.
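
To make the idea of layers of interconnected nodes concrete, here is a minimal sketch of data passing through a two-layer network. The weights, biases, and function names are hand-picked assumptions for illustration; a real network learns its weights from data.

```python
def relu(x):
    """A common activation function: negative values become zero."""
    return max(0.0, x)

def layer(inputs, weights, biases, activation):
    """One layer: each node takes a weighted sum of all inputs, adds a bias, then applies the activation."""
    return [
        activation(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]

def forward(inputs):
    # Hypothetical, hand-picked weights; not taken from any trained model.
    hidden = layer(inputs, [[0.5, -1.0], [1.0, 1.0]], [0.0, 0.0], relu)
    output = layer(hidden, [[1.0, 0.5]], [0.5], lambda x: x)
    return output

print(forward([1.0, 2.0]))  # [2.0]
```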

Prompt Engineering - The practice of crafting and refining input prompts or queries to effectively guide AI models to generate desired outputs.

Reinforcement Learning - A type of machine learning where agents learn by interacting with an environment and receiving rewards or penalties based on their actions.
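
The following sketch shows one way an agent can learn from rewards. It assumes a made-up four-state corridor environment and uses tabular Q-learning, one of several reinforcement learning algorithms; the environment, names, and settings are illustrative assumptions only.

```python
import random

N_STATES = 4          # states 0..3; state 3 is the goal
LEFT, RIGHT = 0, 1

def step(state, action):
    """Environment: move along a corridor; reward 1 only on reaching the goal."""
    next_state = max(0, state - 1) if action == LEFT else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def train(episodes=300, alpha=0.5, gamma=0.9, seed=0):
    """Learn a table q[state][action] estimating the long-term reward of each action."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice([LEFT, RIGHT])  # explore by acting at random
            next_state, reward, done = step(state, action)
            # Move the estimate toward the reward plus the discounted best future value.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q = train()
policy = ["left" if row[LEFT] > row[RIGHT] else "right" for row in q[:-1]]
print(policy)
```

After training, the learned policy prefers moving right, toward the reward, in every non-goal state.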

Retrieval Augmented Generation (RAG) - A method that combines traditional information retrieval techniques with generative AI models. RAG models can search a large database of documents to find relevant information, and then use a generative model to compose a response from that information.
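
The retrieve-then-generate flow can be sketched as follows. This is a toy under stated assumptions: the documents are invented, retrieval is simple word overlap rather than a real search index, and the final call to a generative model is left out, so the sketch stops at building the prompt.

```python
import re

# Made-up documents standing in for a real document database.
DOCUMENTS = [
    "The library opens at 8 am on weekdays.",
    "Interlibrary loan requests take about five business days.",
    "Study rooms can be reserved online for two hours.",
]

def words(text):
    """Lowercased set of words in a text."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query, documents, k=1):
    """Return the k documents that share the most words with the query."""
    q = words(query)
    ranked = sorted(documents, key=lambda d: len(q & words(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, documents):
    """Combine retrieved context with the question; a generative model would receive this text."""
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("When does the library open?", DOCUMENTS))
```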

Self-Supervised Learning - A type of machine learning where the system learns from the structure or patterns in the data itself, rather than relying on labeled datasets. The model generates its own labels or predictions and uses these to refine its learning. This approach is often used in pre-training large models.
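
One way to picture a model generating its own labels is next-word prediction, where raw text supplies both the input and the answer. The helper below is a hypothetical sketch of building such training pairs; it is not tied to any particular model.

```python
def next_word_pairs(text, context_size=2):
    """Turn unlabeled text into (context, next word) training pairs."""
    tokens = text.split()
    return [
        (tuple(tokens[i:i + context_size]), tokens[i + context_size])
        for i in range(len(tokens) - context_size)
    ]

for context, label in next_word_pairs("the cat sat on the mat"):
    print(context, "->", label)
```

No person labeled this data: each "label" is simply the word that follows the context in the original text.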

Semi-Supervised Learning - A type of machine learning where the model is trained using a combination of a small amount of labeled data and a large amount of unlabeled data. This approach can significantly reduce the need for labeled data, making it useful for tasks where labeling is expensive or time-consuming.

Supervised Learning - A type of machine learning where the model is trained on labeled data, meaning each training example is paired with a correct output. The model learns to map inputs to outputs, and the accuracy of the model is measured by how well it predicts the correct labels for new, unseen data.
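
A minimal sketch of this train-then-evaluate cycle, assuming a made-up one-feature dataset and a very simple nearest-centroid classifier (chosen for brevity, not because it is the standard method):

```python
def train(examples):
    """Learn one centroid (mean feature value) per label from labeled examples."""
    sums, counts = {}, {}
    for x, label in examples:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(model, x):
    """Predict the label whose centroid is closest to x."""
    return min(model, key=lambda label: abs(x - model[label]))

def accuracy(model, examples):
    """Fraction of examples whose predicted label matches the correct one."""
    return sum(predict(model, x) == y for x, y in examples) / len(examples)

# Hypothetical labeled data: (document length in pages, label).
TRAIN = [(1.0, "short"), (2.0, "short"), (9.0, "long"), (10.0, "long")]
TEST = [(1.5, "short"), (8.0, "long"), (3.0, "short")]  # unseen during training

model = train(TRAIN)
print(accuracy(model, TEST))  # 1.0
```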

Token - A unit of text, which can be a word, part of a word, or even a punctuation mark. Tokens are the basic elements that AI models process when analyzing or generating text.
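
For illustration, the sketch below splits text into word and punctuation tokens with a regular expression. This is a simplification: production language models typically use learned subword tokenizers, which this toy function does not reproduce.

```python
import re

def simple_tokenize(text):
    """Split text into word and punctuation tokens (illustrative only)."""
    # \w+ matches a run of letters or digits; [^\w\s] matches one punctuation mark.
    return re.findall(r"\w+|[^\w\s]", text)

print(simple_tokenize("Hello, world!"))  # ['Hello', ',', 'world', '!']
```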

Transformers - A type of deep learning model designed to handle data that comes in sequences, like text. They are especially good at tasks involving language because they use a system called "attention" that helps the model focus on different parts of the input. Well-known models built on transformers include GPT, BERT, and T5.
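
The "attention" idea can be sketched for a single query as below. This assumes the scaled dot-product form of attention with tiny hand-written vectors; real transformers use learned vectors and many attention heads in parallel.

```python
import math

def softmax(scores):
    """Convert scores into weights that are positive and sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Weight each value by how well its key matches the query."""
    d = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        for key in keys
    ]
    weights = softmax(scores)
    output = [
        sum(w * v[i] for w, v in zip(weights, values))
        for i in range(len(values[0]))
    ]
    return weights, output

# Hypothetical vectors: the query matches the first key better than the second.
weights, output = attention(
    query=[1.0, 0.0],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    values=[[10.0, 0.0], [0.0, 10.0]],
)
print(weights, output)
```

The first input receives the larger weight, so the output leans toward the first value vector. That is the sense in which the model "focuses" on part of the input.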

Unsupervised Learning - A type of machine learning where the model is not provided with labeled data. It tries to identify patterns and structures within the data on its own. Common techniques include clustering and anomaly detection.
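
As an example of clustering, here is a minimal sketch of k-means on one-dimensional data. The data and the simple spread-out starting centroids are assumptions made for illustration (the function expects k of at least 2).

```python
def kmeans_1d(points, k=2, iterations=10):
    """Group numbers into k clusters by repeatedly assigning each point to its nearest centroid."""
    # Simple deterministic start: spread the initial centroids across the data range.
    lo, hi = min(points), max(points)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: abs(p - centroids[c]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Unlabeled, made-up measurements that happen to form two groups.
centroids, clusters = kmeans_1d([1.0, 1.2, 0.8, 8.0, 8.2, 7.8])
print(centroids)
print(clusters)
```

No labels were supplied; the two groups emerge from the distances between the points alone.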

Wrapper - A method in machine learning used to choose the best features (or pieces of data) for a model. It works by testing different combinations of features and picking the ones that make the model perform better, helping improve its accuracy.
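
The sketch below shows the wrapper idea on a tiny scale: it tries every combination of features and keeps the one that scores best. The data is invented, and the scoring model (a 1-nearest-neighbor classifier checked with leave-one-out accuracy) is an assumption picked to keep the example short.

```python
from itertools import combinations

# Made-up toy data: each row is (features, label).
# Feature 0 separates the labels; feature 1 is noise.
DATA = [
    ((1.0, 5.0), 0),
    ((1.2, 1.0), 0),
    ((0.9, 9.0), 0),
    ((3.0, 5.2), 1),
    ((3.1, 0.8), 1),
    ((2.9, 9.1), 1),
]

def loo_accuracy(data, feature_idx):
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier using only feature_idx."""
    correct = 0
    for i, (x, y) in enumerate(data):
        best_dist, best_label = None, None
        for j, (x2, y2) in enumerate(data):
            if i == j:
                continue  # do not compare a point with itself
            dist = sum((x[f] - x2[f]) ** 2 for f in feature_idx)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, y2
        correct += best_label == y
    return correct / len(data)

def wrapper_select(data, n_features):
    """Test every feature subset and return the best one with its score."""
    best_subset, best_score = None, -1.0
    for size in range(1, n_features + 1):
        for subset in combinations(range(n_features), size):
            score = loo_accuracy(data, subset)
            if score > best_score:
                best_subset, best_score = subset, score
    return best_subset, best_score

print(wrapper_select(DATA, 2))  # ((0,), 1.0)
```

Testing every subset becomes expensive as the number of features grows, so practical wrapper methods usually search greedily instead of exhaustively.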
