
Designing Machine Learning Systems (Reprint Edition) (English Edition)

  • Author: Chip Huyen (Vietnam) | Editor: Zhang Ye (張燁)
  • Publisher: Southeast University (東南大學)
  • ISBN: 9787576602241
  • Publication date: 2022/10/01
  • Binding: Paperback
  • Pages: 367
Price: RMB 138

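Synopsis
    Machine learning systems are both complex and unique. They are complex because they consist of many components and involve many different stakeholders; they are unique because they depend on data, and data varies widely from one use case to the next. In this book, you will learn a holistic approach to designing ML systems that are reliable, scalable, maintainable, and adaptable to changing environments and business requirements.
    Author Chip Huyen, co-founder of Claypot AI, considers each design decision (such as how to process and create training data, which features to use, how often to retrain models, and what to monitor) in the context of how it helps the system as a whole achieve its objectives. The book's iterative framework uses real case studies, backed by ample references.
    This book will help you tackle scenarios such as:
    Engineering data and choosing the right metrics to solve a business problem
    Automating the process for continually developing, evaluating, deploying, and updating models
    Developing a monitoring system to quickly detect and address issues your models might encounter in production
    Building an ML platform that serves across use cases
    Developing reliable ML systems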

About the Author
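    Chip Huyen is a co-founder of Claypot AI, a real-time machine learning platform. Through her work at NVIDIA, Netflix, and Snorkel AI, she has helped several large organizations develop and deploy machine learning systems. This book is based on the Machine Learning Systems Design course (CS 329S) that she teaches at Stanford University.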

Table of Contents
Preface
1. Overview of Machine Learning Systems
  When to Use Machine Learning
    Machine Learning Use Cases
  Understanding Machine Learning Systems
    Machine Learning in Research Versus in Production
    Machine Learning Systems Versus Traditional Software
  Summary
2. Introduction to Machine Learning Systems Design
  Business and ML Objectives
  Requirements for ML Systems
    Reliability
    Scalability
    Maintainability
    Adaptability
  Iterative Process
  Framing ML Problems
    Types of ML Tasks
    Objective Functions
  Mind Versus Data
  Summary
3. Data Engineering Fundamentals
  Data Sources
  Data Formats
    JSON
    Row-Major Versus Column-Major Format
    Text Versus Binary Format
  Data Models
    Relational Model
    NoSQL
    Structured Versus Unstructured Data
  Data Storage Engines and Processing
    Transactional and Analytical Processing
    ETL: Extract, Transform, and Load
  Modes of Dataflow
    Data Passing Through Databases
    Data Passing Through Services
    Data Passing Through Real-Time Transport
  Batch Processing Versus Stream Processing
  Summary
4. Training Data
  Sampling
    Nonprobability Sampling
    Simple Random Sampling
    Stratified Sampling
    Weighted Sampling
    Reservoir Sampling
    Importance Sampling
  Labeling
    Hand Labels
    Natural Labels
    Handling the Lack of Labels
  Class Imbalance
    Challenges of Class Imbalance
    Handling Class Imbalance
  Data Augmentation
    Simple Label-Preserving Transformations
    Perturbation
    Data Synthesis
  Summary
5. Feature Engineering
  Learned Features Versus Engineered Features
  Common Feature Engineering Operations
    Handling Missing Values
    Scaling
    Discretization
    Encoding Categorical Features
    Feature Crossing
    Discrete and Continuous Positional Embeddings
  Data Leakage
    Common Causes for Data Leakage
    Detecting Data Leakage
  Engineering Good Features
    Feature Importance
    Feature Generalization
  Summary
6. Model Development and Offline Evaluation
  Model Development and Training
    Evaluating ML Models
    Ensembles
    Experiment Tracking and Versioning
    Distributed Training
    AutoML
  Model Offline Evaluation
    Baselines
    Evaluation Methods
  Summary
7. Model Deployment and Prediction Service
  Machine Learning Deployment Myths
    Myth 1: You Only Deploy One or Two ML Models at a Time
    Myth 2: If We Don't Do Anything, Model Performance Remains the Same
    Myth 3: You Won't Need to Update Your Models as Much
    Myth 4: Most ML Engineers Don't Need to Worry About Scale
  Batch Prediction Versus Online Prediction
    From Batch Prediction to Online Prediction
    Unifying Batch Pipeline and Streaming Pipeline
  Model Compression
    Low-Rank Factorization
    Knowledge Distillation
    Pruning
    Quantization
  ML on the Cloud and on the Edge
    Compiling and Optimizing Models for Edge Devices
    ML in Browsers
  Summary
8. Data Distribution Shifts and Monitoring
  Causes of ML System Failures
    Software System Failures
    ML-Specific Failures
  Data Distribution Shifts
    Types of Data Distribution Shifts
    General Data Distribution Shifts
    Detecting Data Distribution Shifts
    Addressing Data Distribution Shifts
  Monitoring and Observability
    ML-Specific Metrics
    Monitoring Toolbox
    Observability
  Summary
9. Continual Learning and Test in Production
  Continual Learning
    Stateless Retraining Versus Stateful Training
    Why Continual Learning?
    Continual Learning Challenges
    Four Stages of Continual Learning
    How Often to Update Your Models
  Test in Production
    Shadow Deployment
    A/B Testing
    Canary Release
    Interleaving Experiments
    Bandits
  Summary
10. Infrastructure and Tooling for MLOps
  Storage and Compute
    Public Cloud Versus Private Data Centers
  Development Environment
    Dev Environment Setup
    Standardizing Dev Environments
    From Dev to Prod: Containers
  Resource Management
    Cron, Schedulers, and Orchestrators
    Data Science Workflow Management
  ML Platform
    Model Deployment
    Model Store
    Feature Store
  Build Versus Buy
  Summary
11. The Human Side of Machine Learning
  User Experience
    Ensuring User Experience Consistency
    Combatting "Mostly Correct" Predictions
    Smooth Failing
  Team Structure
    Cross-functional Teams Collaboration
    End-to-End Data Scientists
  Responsible AI
    Irresponsible AI: Case Studies
    A Framework for Responsible AI
  Summary
Epilogue
Index
