1 Large language models: The foundation of generative AI
Large language models have rapidly moved from research labs into everyday life, catalyzed by the public debut of ChatGPT and a wave of generative AI tools. This chapter builds intuition for what LLMs are, how they work, and why they matter—covering their breakthroughs, core mechanics, and the spectrum of applications they enable—while also laying out the limitations and societal risks that demand responsible use. It frames LLMs as general-purpose systems poised to transform how we learn, create, and work, and argues that understanding their strengths and pitfalls is essential for anyone planning to use or build with them.
Tracing NLP’s evolution from brittle rule-based systems to statistical methods and then to deep neural networks, the chapter spotlights the attention mechanism and the transformer architecture as the turning point that unlocked scale, speed, and capability. Pretraining on vast unlabeled corpora and fine-tuning for specific tasks produced versatile models like GPT and BERT that excel at language modeling, question answering, summarization, translation, coding assistance, content generation, and even some forms of step-by-step reasoning. Their self-supervised training objective—predicting tokens in context—endows them with flexible, emergent behaviors that power chatbots, enterprise tools, and multimodal assistants now embedded across consumer and professional workflows.
Alongside this promise, the chapter examines core challenges: biases inherited from web-scale training data, hallucinations and limited controllability, and the financial, environmental, and competitive pressures of training and deploying trillion-parameter systems. It surveys the ecosystem shaping the field—OpenAI’s rapid, multimodal releases; Google’s foundational research and platform integrations; Meta’s open-weight strategy; Microsoft’s product-wide Copilot push; Anthropic’s safety-first approach; and rising players like DeepSeek, Cohere, Perplexity, Mistral, and xAI, plus leaders in image and video generation. The takeaway is balanced: LLM capabilities are advancing at an unprecedented pace, but realizing their benefits responsibly requires attention to safety, data privacy, and accountability frameworks that keep people at the center of the technology.
Figures in this chapter (captions only): the reinforcement learning cycle; the distribution of attention for the word “it” in different contexts; a timeline of breakthrough events in NLP; representation of word embeddings in the vector space.
Summary
- The history of NLP is as old as computers themselves. The first application to spark interest in NLP was machine translation in the 1950s; it also became Google's first commercial NLP application when Google Translate launched in 2006.
- Transformer models and the debut of the attention mechanism were the biggest NLP breakthroughs of the 2010s. The attention mechanism attempts to mimic attention in the human brain by placing more weight, or “importance,” on the most relevant information.
- The boom in NLP from the late 2010s to the early 2020s was driven by the increasing availability of text data from across the internet and the development of powerful computational resources. This marked the beginning of the LLM era.
- Today’s LLMs are trained primarily with self-supervised learning on large volumes of text from the web and are then fine-tuned with reinforcement learning.
- GPT, released by OpenAI, was one of the first general-purpose LLMs designed for use with any natural language task. These models can be fine-tuned for specific tasks and are especially well-suited for text-generation applications, such as chatbots.
- LLMs are versatile and can be applied to many applications and use cases, including text generation, question answering, coding, logical reasoning, content generation, and more. Of course, there are also inherent risks, such as encoded bias, hallucinations, and a sizable carbon footprint.
- In January 2023, OpenAI’s ChatGPT set the record for the fastest-growing user base in history, setting off an AI arms race in the tech industry to develop and release LLM-based conversational agents. As of 2025, the most significant LLMs have come from OpenAI, Google, Meta, Microsoft, and Anthropic.
FAQ
What is a large language model (LLM), in simple terms?
An LLM is a neural-network model—today, typically a transformer—that learns from massive amounts of text to predict the next token (word or subword) given prior context. Because they learn general patterns of language, LLMs can be adapted to many tasks (conversation, summarization, coding, translation) and then fine-tuned for specific use cases.
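To make the next-token objective concrete, here is a minimal sketch, assuming the Hugging Face transformers library and the small public "gpt2" checkpoint (both are illustrative choices, not requirements):

```python
# Minimal next-token prediction sketch (assumes: pip install torch transformers).
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Large language models learn to predict the next"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits      # one score per vocabulary token, per position

next_id = logits[0, -1].argmax().item()  # most likely continuation of the prompt
print(tokenizer.decode(next_id))         # e.g. " word" (exact output depends on the model)
```

Sampling from the model's probability distribution, rather than always taking the argmax, is what makes generated text varied instead of deterministic.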
How did NLP evolve from rules to today’s LLMs?
NLP progressed from brittle, rule-based systems (like ELIZA) to statistical methods, then to neural networks and deep learning. The pivotal shift came with transformers and self-attention, which enabled parallel processing, long-range context handling, and training on far larger datasets—ushering in modern LLMs.
What’s the intuition behind “attention” and transformers?
Attention lets a model focus on the most relevant parts of an input sequence when generating or interpreting a token, much like a reader emphasizing key words. Transformers use self-attention across the whole sequence to capture long-term dependencies while being highly parallelizable, which made them faster and more effective than prior architectures.
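For readers who want to see the mechanism itself, the following is a minimal sketch of scaled dot-product self-attention in NumPy; the tiny dimensions and random matrices stand in for learned weights and real embeddings:

```python
# Scaled dot-product self-attention over a toy sequence (assumes: pip install numpy).
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

seq_len, d_model = 4, 8                   # 4 tokens, 8-dimensional representations
rng = np.random.default_rng(0)
X = rng.normal(size=(seq_len, d_model))   # stand-in for token embeddings

# Learned projections (random here) map each token to queries, keys, and values.
W_q, W_k, W_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))
Q, K, V = X @ W_q, X @ W_k, X @ W_v

scores = Q @ K.T / np.sqrt(d_model)       # how strongly each token relates to every other
weights = softmax(scores, axis=-1)        # each row is an attention distribution (sums to 1)
output = weights @ V                      # each token becomes a weighted mix of the values

print(weights.round(2))
```

Each row of weights is an attention distribution like the one visualized for the word "it" in the chapter's figure: the token spreads its focus over whichever other tokens are most relevant.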
How are LLMs trained?
They are primarily pretrained with self-supervised objectives (for example, predicting masked or next tokens) over vast text corpora, requiring no human labeling. After pretraining, models are commonly fine-tuned for downstream tasks and may incorporate reinforcement learning-based alignment to improve usefulness and safety.
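As a rough sketch of the self-supervised next-token objective (assuming PyTorch; the token ids and logits below are toy stand-ins for real text and a real model):

```python
# Self-supervised next-token loss: the text supplies its own labels.
import torch
import torch.nn.functional as F

vocab_size = 100
tokens = torch.tensor([[5, 17, 42, 8, 99]])           # a toy "sentence" of token ids
logits = torch.randn(1, tokens.size(1), vocab_size)   # stand-in for a model's outputs

# Predict token t+1 from positions up to t: drop the last logit and the first label.
loss = F.cross_entropy(
    logits[:, :-1, :].reshape(-1, vocab_size),
    tokens[:, 1:].reshape(-1),
)
print(loss)   # the quantity minimized during pretraining
```

Because every label is simply the next token of the text itself, no human annotation is needed at this stage; labeled data and human feedback enter later, during fine-tuning and alignment.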
What can LLMs do today?
- Language modeling and text generation (chat, drafting, style transfer)
- Question answering (extractive, open-book generative, closed-book)
- Coding assistance (code completion, explanation, tests)
- Content generation (articles, marketing copy, emails)
- Reasoning tasks (math, science, logic—still imperfect)
- Translation, summarization, grammar correction, and more (a short prompting sketch follows this list)
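As one illustration of that versatility, a single text-generation model can be steered toward different tasks purely by prompting. The sketch below assumes the Hugging Face transformers library and uses the small public "gpt2" checkpoint for convenience; larger, instruction-tuned models follow such prompts far more reliably:

```python
# One model, many tasks via prompting (assumes: pip install torch transformers).
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompts = [
    "Summarize in one sentence: The transformer replaced recurrent networks "
    "because self-attention processes whole sequences in parallel.\nSummary:",
    "Translate to French: Good morning, how are you?\nFrench:",
    "Q: What is the capital of France?\nA:",
]

for prompt in prompts:
    out = generator(prompt, max_new_tokens=25, do_sample=False)[0]["generated_text"]
    print(out, "\n---")
```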
What are the main limitations and risks of LLMs?
- Hallucinations: fluent but incorrect or fabricated outputs
- Bias: models can reproduce societal stereotypes present in training data
- Safety and misuse: harmful content, adversarial prompting, privacy/copyright issues
- Sustainability: high compute, energy use, and industry concentration in a few large players
What is fine-tuning and why use it?
Fine-tuning adapts a pretrained model to a narrower task or domain by further training on targeted data. It leverages the general language knowledge learned during pretraining to reach strong performance with less labeled data, faster development, and lower cost than training from scratch.
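A minimal fine-tuning sketch, assuming the Hugging Face transformers and datasets libraries; the IMDB sentiment dataset, the distilbert-base-uncased checkpoint, and the small training subset are illustrative choices to keep the example quick:

```python
# Fine-tune a pretrained model for sentiment classification
# (assumes: pip install torch transformers datasets).
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          TrainingArguments, Trainer)

dataset = load_dataset("imdb")                          # labeled movie reviews
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=256)

tokenized = dataset.map(tokenize, batched=True)

# Reuse the pretrained weights; only the small classification head starts from scratch.
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2)

args = TrainingArguments(output_dir="finetune-out", num_train_epochs=1,
                         per_device_train_batch_size=8)

trainer = Trainer(model=model, args=args,
                  train_dataset=tokenized["train"].shuffle(seed=42).select(range(2000)),
                  eval_dataset=tokenized["test"].select(range(500)))
trainer.train()
print(trainer.evaluate())
```

Because the pretrained weights already encode general language knowledge, even this small labeled subset and a single epoch typically produce a usable classifier, which is the practical payoff of fine-tuning over training from scratch.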
How do LLMs represent and process text?
Text is tokenized into units (words or subwords), which are mapped to embeddings—numeric vectors capturing semantic relationships. Through layers of self-attention and transformations, the model updates these representations to predict the next token, with learned weights (“parameters”) determining its capabilities.
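To see the first two steps, tokenization and embedding lookup, concretely (again assuming the Hugging Face transformers library and the "gpt2" checkpoint):

```python
# Inspect subword tokens and their embedding vectors
# (assumes: pip install torch transformers).
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2")

text = "Large language models predict tokens"
ids = tokenizer(text, return_tensors="pt")["input_ids"]
print(tokenizer.convert_ids_to_tokens(ids[0]))    # subword tokens, e.g. ['Large', 'Ġlanguage', ...]

embeddings = model.get_input_embeddings()(ids)    # one vector per token (768 dims for gpt2)
print(embeddings.shape)                           # torch.Size([1, num_tokens, 768])
```

Everything after this point, the stacked self-attention layers and the final next-token prediction, operates on these vectors rather than on raw text.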
What’s the difference between extractive, open-book generative, and closed-book QA?
- Extractive QA: select the answer span directly from provided context
- Open-book generative QA: generate an answer using supplied context, in the model’s own words
- Closed-book generative QA: generate answers without provided context, relying solely on internal knowledge learned during training (the sketch below contrasts all three styles)
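A hedged sketch contrasting the three styles, assuming the Hugging Face transformers library; the checkpoints the pipelines download are illustrative defaults, not prescriptions:

```python
# Extractive vs. open-book vs. closed-book question answering
# (assumes: pip install torch transformers).
from transformers import pipeline

context = "The transformer architecture was introduced by Google researchers in 2017."
question = "Who introduced the transformer architecture?"

# Extractive QA: copy the answer span straight out of the supplied context.
extractive = pipeline("question-answering")
print(extractive(question=question, context=context))

# Open-book generative QA: give the model the context and let it phrase an answer.
generator = pipeline("text-generation", model="gpt2")
open_book = f"Context: {context}\nQuestion: {question}\nAnswer:"
print(generator(open_book, max_new_tokens=20)[0]["generated_text"])

# Closed-book generative QA: no context at all; the model relies only on what it
# absorbed during pretraining (small models often get this wrong, i.e. hallucinate).
closed_book = f"Question: {question}\nAnswer:"
print(generator(closed_book, max_new_tokens=20)[0]["generated_text"])
```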
Who are the major players in generative AI, and how do they differ?
- OpenAI: rapid, multimodal releases (GPT-4/4o, Sora, o1), deep Microsoft partnership
- Google: foundational transformer research; Gemini and Project Astra; emphasis on AI Principles
- Meta: open-access Llama family; push for efficient, widely usable models
- Microsoft: integrates AI as “Copilot” across products; early Bing chatbot lessons; invests via OpenAI partnership
- Anthropic: safety-first “Constitutional AI”; Claude models; significant backing from Amazon and Google
- Others: DeepSeek (efficient MoE), Cohere (enterprise-first), Perplexity (AI search with citations), Mistral (efficient open models), xAI’s Grok; plus image/video leaders like Midjourney, Stability AI, and Runway