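Content Overview
As a Java enterprise developer or architect, you know that embracing AI is no longer optional; it is key to staying competitive. The question is how to weave these breakthrough AI technologies into your applications without getting mired in complexity.
This clear, practical guide shows you how to integrate generative AI into the Java enterprise ecosystem. Drawing on the insights of authors Alex Soto Bueno, Markus Eisele, and Natale Vinto, you will learn how to combine the robustness of the Java enterprise ecosystem with the dynamic capabilities of AI. More than a how-to manual, it offers an approach to raising the quality of enterprise software through thoughtful AI integration, keeping your skills and applications at the forefront of technology. In this book, you will learn to:
Understand the role and impact of generative AI in contemporary software development
Use Java's rich ecosystem of open source frameworks to build practical AI-driven applications
Implement battle-tested AI patterns designed for production-ready, enterprise-grade applications
Access and integrate leading open source AI models in Java
Navigate the Java framework ecosystem flexibly and confidently, with AI at the core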
Table of Contents
Preface
1. The Enterprise AI Conundrum
The AI Landscape: A Technical Perspective All the Way to GenAI
Machine Learning: The Foundation of Today's AI
Deep Learning: A Powerful Tool in the AI Arsenal
Generative AI: The Future of Content Generation
Open Source Models and Training Data
Why Open Source Is an Important Driver for GenAI
The Hidden Cost of Bad Data: Understanding Model Behavior Through Training Inputs
Adding Company-Specific Data to LLMs
Explainable and Transparent AI Decisions
Ethical and Sustainability Considerations
The Lifecycle of LLMs and Ways to Influence Their Behavior
MLOps Versus DevOps (and the Rise of AIOps and GenAIOps)
Conclusion
2. The New Types of Applications
Understanding Large Language Models
Key Elements of a Large Language Model
Deployment of Models
Choosing the Right LLM for Your Application
Model Type
Model Size and Efficiency
Deployment Approaches
Supported Precision and Hardware Optimization
Ethical Considerations and Bias
Community and Documentation Support
Closed Versus Open Source
Example Categorization
Foundation Models or Expert Models: Where Are We Headed?
Using Supporting Technologies
Embedding Models and Vector Databases
Caching and Performance Optimization
AI Agent Frameworks
Model Context Protocol
API Integration
Model Security, Compliance, and Access Control
Conclusion
3. Prompts for Developers: Why Prompts Matter in AI-Infused Applications
Types of Prompts
User Prompts: Direct Input from the User
System Prompts: Instructions That Guide Model Behavior
Contextual Prompts: Prepopulated or Dynamically Generated Inputs
Principles of Writing Effective Prompts
Prompting Techniques
Zero-Shot Prompting: Asking Without Context
Few-Shot Prompting: Providing Examples to Guide Responses
Chain-of-Thought Prompting: Encouraging Step-by-Step Reasoning
Self-Consistency: Improving Accuracy by Generating Multiple Responses
Instruction Prompting: Directing the Model Explicitly
Retrieval-Augmented Generation: Enhancing Prompts with External Data
Advanced Strategies
Constructing Dynamic Prompts: Combining Static and Generated Inputs
Using Prompt Chaining to Maintain Context
Using Guardrails and Validations for Safer Outputs
Leveraging APIs for Prompt Customization
Optimizing for Performance Versus Cost
Debugging Prompts: Troubleshooting Poor Responses
Tool Use and Function Calling
Context Engineering as the New Prompt Engineering
Designing Memory and Storage for Context
Fast Access with In-Memory Caches
Hot Memory for Short-Term Context
Vector Databases for Long-Term Semantic Memory
Cold Storage for Archival Data and Large Repositories
Combining Storage Tiers for Effective Context Delivery
Conclusion
4. AI Architectures for Applications
Beyond Traditional Architectures: Why AI-Infused Systems Require a New Approach
Overview of Core Architectural Pillars: A Roadmap for the Chapter
Application Components
Queries and Data: Managing Application Inputs
The AI Gateway: Managing Inputs and Outputs
Context and Memory
Interaction and Transport: Using Tools and Agents
Discovery and Access Control
Model Serving
The Data Preparation Pipeline
Observability and Monitoring: The End-to-End AI Stack
Conclusion
5. Embedding Vectors, Vector Stores, and Running Models Locally
Embedding Vectors and Their Role
Why Are Embeddings Needed?
Structure of an Embedding Vector
Measuring Similarity: Cosine Similarity and Distance
Common Embedding Models
How Are Embeddings Used in AI Applications?
Other Similarity Methods
Uncommon Uses of Embedding Vectors
Vector Stores and Querying Mechanisms
How Vector Databases Store and Retrieve Embeddings
Examples of Common Vector Stores
Retrieval-Augmented Generation
Indexing or Generating Vector Embeddings at Scale
Why Run Models Locally?
Ollama: Local Inferencing with a Simple Interface
Podman Desktop: Using Containerized Environments for AI Workloads
Jlama: Java-Native Model Inferencing for JVM-Based Applications
Comparing Local Inferencing Methods
Using OpenAI's REST API
Overview of OpenAI's Models and Endpoints
Generating Embeddings with OpenAI's API
Conclusion
6. Inference APIs
What Is an Inference API?
Benefits of an Inference API
Examples of Inference APIs
Deploying Inference Models in Java
Inferencing Models with DJL
Looking Under the Hood
Inferencing Models with gRPC
Conclusion
7. Accessing the Inference Model with Java
Connecting to an Inference API with Quarkus
The Architecture
The Fraud Inference API
The Quarkus Project
The REST Client Interface
The REST Resource
Testing the Example
Connecting to an Inference API with Spring Boot WebClient
Adding WebClient Dependency
Using the WebClient
Connecting to the Inference API with the Quarkus gRPC Client
Adding gRPC Dependencies
Implementing the gRPC Client
Conclusion
8. LangChain4j
What Is LangChain4j?
Unified APIs
Prompt Templates
Structured Outputs
Memory
Data Augmentation
Tools
High-Level API
LangChain4j with Plain Java
Extracting Information from Unstructured Text
Performing Text Classification
Generating Images and Descriptions
Spring Boot Integration
Adding Spring Boot Dependencies
Defining the AI Service
Creating a REST Controller
Quarkus Integration
Quarkus Dependencies
Frontend
The AI Service
WebSocket
Optical Character Recognition
Tools
Dependencies
Rides Persistence
Waiting Times Service
AI Service
REST Endpoint
Dynamic Tooling
Final Notes About Tooling
Memory
Dependencies
Changes to Code
Conclusion
9. Vector Embeddings and Stores
Calculating Vector Embeddings
Vector Embeddings Using DJL
Vector Embeddings Using In-Process LangChain4j
Vector Embeddings Using Remote Models with LangChain4j
Text Classifier
Embedding Text-Classification Dependencies
Providing Examples and Categorizing Inputs
Text Clustering
Adding Text Clustering Dependencies
Reading Headline News
Calculating the Vector Embedding
Clustering News
Summarizing News Headlines
Semantic Search
Adding Semantic Search Dependencies
Importing Movies
Querying for Similarities
Semantic Cache
RAG
Ingestion
Retrieval
Reranking
Query Router
Ingestion Splitting Window
Filtering Results
Conclusion
10. LangGraph4j
Understanding Graphs in LangGraph4j
Nodes
Edges
State
Using LangGraph4j
Defining a State
Defining a Node
Defining a Graph
Adding Conditional Edges
Appending Values
Using LangChain4j with LangGraph4j
Routing Agents
Human Interaction with LangGraph4j
Advanced RAG Schema with Self-Reflection
Exploring Additional Features
Subgraphs
Parallel Execution
Time Travel
Conclusion
11. Image Processing
OpenCV
Initializing the Library
Loading and Saving Images
Performing Basic Transformations
Overlaying Elements
Image Processing
Reading Barcodes and QR Codes
Stream Processing
Processing Videos
Processing Webcam Images
OpenCV and Java
OCR
Conclusion
12. Advanced Topics in AI Java Development
Streaming
Streaming with a Low-Level API
Streaming with AI Services
Using LangChain4j and Streaming Integrations
Guardrails
Input Guardrail
Output Guardrail
Guardrail Use Cases
Model Context Protocol
MCP Architecture
MCP Client with Java
MCP Client with Quarkus
MCP Server with Quarkus
Key Benefits of MCP
Next Steps
Index