Overview
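Training data matters as much to the success of a data project as the algorithms themselves, because most failures in AI systems trace back to training data. Yet although training data is the foundation of successful AI and machine learning, few comprehensive resources exist to help you master the process.
In this hands-on guide, author Anthony Sarkis (lead engineer for the Diffgram AI training data software) shows technical professionals, managers, and subject matter experts how to work with and scale training data, while illuminating the human side of supervising machines. Engineering leaders, data engineers, and data science professionals will all gain a solid understanding of the concepts, tools, and processes they need to succeed with training data.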
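With this book, you will learn how to:
Work effectively with training data, including schemas, raw data, and annotations;
Transform your work, team, or organization to be more AI/ML data-centric;
Clearly explain training data concepts to other staff, team members, and stakeholders;
Design, deploy, and ship training data for production-grade AI applications;
Recognize and correct new training-data-based failure modes, such as data bias;
Confidently use automation to create training data more effectively;
Successfully maintain, operate, and improve training data systems of record.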
Table of Contents
Preface
1. Training Data Introduction
Training Data Intents
What Can You Do With Training Data?
What Is Training Data Most Concerned With?
Training Data Opportunities
Business Transformation
Training Data Efficiency
Tooling Proficiency
Process Improvement Opportunities
Why Training Data Matters
ML Applications Are Becoming Mainstream
The Foundation of Successful AI
Training Data Is Here to Stay
Training Data Controls the ML Program
New Types of Users
Training Data in the Wild
What Makes Training Data Difficult?
The Art of Supervising Machines
A New Thing for Data Science
ML Program Ecosystem
Data-Centric Machine Learning
Failures
History of Development Affects Training Data Too
What Training Data Is Not
Generative AI
Human Alignment Is Human Supervision
Summary
2. Getting Up and Running
Introduction
Getting Up and Running
Installation
Tasks Setup
Annotator Setup
Data Setup
Workflow Setup
Data Catalog Setup
Initial Usage
Optimization
Tools Overview
Training Data for Machine Learning
Growing Selection of Tools
People, Process, and Data
Embedded Supervision
Human Computer Supervision
Separation of End Concerns
Standards
Many Personas
A Paradigm to Deliver Machine Learning Software
Trade-Offs
Costs
Installed Versus Software as a Service
Development System
Scale
Installation Options
Annotation Interfaces
Modeling Integration
Multi-User versus Single-User Systems
Integrations
Scope
Hidden Assumptions
Security
Open Source and Closed Source
History
Open Source Standards
……
3. Schema
4. Data Engineering
5. Workflow
6. Theories, Concepts, and Maintenance
7. AI Transformation and Use Cases
8. Automation
9. Case Studies and Stories