Software Engineering for Data Scientists (Reprint Edition) (English Edition)

  • Author: Catherine Nelson (UK) | Editor: Zhang Ye
  • Publisher: Southeast University Press
  • ISBN: 9787576617672
  • Publication date: 2025/02/01
  • Binding: Paperback
  • Pages: 238
List price: RMB 99

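Summary
    Data science happens in code. The ability to write reproducible, robust, scalable code is key to the success of a data science project, and it is essential for anyone working with production code. This practical book bridges the gap between data science and software engineering, and clearly explains how to apply software engineering best practices to data science.
    The examples are written in Python and drawn from popular packages such as NumPy and pandas. If you want to write better data science code, this guide covers important topics that are often missing from introductory data science or coding courses, including how to:
    Understand data structures and object-oriented programming
    Document your code clearly and skillfully
    Package and share your code
    Integrate data science code into a larger codebase
    Learn to write APIs
    Create secure code
    Apply best practices to common tasks such as testing, error handling, and logging
    Work more effectively with software engineers
    Write more efficient, maintainable, and robust Python code
    Put your data science projects into production
    And more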

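About the Author
    Catherine Nelson is a freelance data scientist and writer. Previously, she was a principal data scientist at SAP Concur, where she developed production machine learning applications and created new business travel features. She is also a coauthor of Building Machine Learning Pipelines, published by O'Reilly.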

Table of Contents
Preface
1. What Is Good Code?
  Why Good Code Matters
  Adapting to Changing Requirements
  Simplicity
    Don't Repeat Yourself (DRY)
    Avoid Verbose Code
  Modularity
  Readability
    Standards and Conventions
    Names
    Cleaning Up
    Documentation
  Performance
  Robustness
    Errors and Logging
    Testing
  Key Takeaways
2. Analyzing Code Performance
  Methods to Improve Performance
  Timing Your Code
  Profiling Your Code
    cProfile
    line_profiler
    Memory Profiling with Memray
  Time Complexity
    How to Estimate Time Complexity
    Big O Notation
  Key Takeaways
3. Using Data Structures Effectively
  Native Python Data Structures
    Lists
    Tuples
    Dictionaries
    Sets
  NumPy Arrays
    NumPy Array Functionality
    NumPy Array Performance Considerations
    Array Operations Using Dask
    Arrays in Machine Learning
  pandas DataFrames
    DataFrame Functionality
    DataFrame Performance Considerations
  Key Takeaways
4. Object-Oriented Programming and Functional Programming
  Object-Oriented Programming
    Classes, Methods, and Attributes
    Defining Your Own Classes
    OOP Principles
  Functional Programming
    Lambda Functions and map()
    Applying Functions to DataFrames
  Which Paradigm Should I Use?
  Key Takeaways
5. Errors, Logging, and Debugging
  Errors in Python
    Reading Python Error Messages
    Handling Errors
    Raising Errors
  Logging
    What to Log
    Logging Configuration
    How to Log
  Debugging
    Strategies for Debugging
    Tools for Debugging
  Key Takeaways
6. Code Formatting, Linting, and Type Checking
  Code Formatting and Style Guides
    PEP8
    Import Formatting
    Automatic Code Formatting with Black
  Linting
    Linting Tools
    Linting in Your IDE
  Type Checking
    Type Annotations
    Type Checking with mypy
  Key Takeaways
7. Testing Your Code
  Why You Should Write Tests
  When to Test
  How to Write and Run Tests
    A Basic Test
    Testing Unexpected Inputs
    Running Automated Tests with Pytest
  Types of Tests
    Unit Tests
    Integration Tests
  Data Validation
    Data Validation Examples
    Using Pandera for Data Validation
    Data Validation with Pydantic
  Testing for Machine Learning
    Testing Model Training
    Testing Model Inference
  Key Takeaways
8. Design and Refactoring
  Project Design and Structure
    Project Design Considerations
    An Example Machine Learning Project
  Code Design
    Modular Code
    A Code Design Framework
    Interfaces and Contracts
    Coupling
  From Notebooks to Scalable Scripts
    Why Use Scripts Instead of Notebooks?
    Creating Scripts from Notebooks
  Refactoring
    Strategies for Refactoring
    An Example Refactoring Workflow
  Key Takeaways
9. Documentation
  Documentation Within the Codebase
    Names
    Comments
    Docstrings
    Readmes, Tutorials, and Other Longer Documents
  Documentation in Jupyter Notebooks
  Documenting Machine Learning Experiments
  Key Takeaways
10. Sharing Your Code: Version Control, Dependencies, and Packaging
  Version Control Using Git
    How Does Git Work?
    Tracking Changes and Committing
    Remote and Local
    Branches and Pull Requests
  Dependencies and Virtual Environments
    Virtual Environments
    Managing Dependencies with pip
    Managing Dependencies with Poetry
  Python Packaging
    Packaging Basics
    pyproject.toml
    Building and Uploading Packages
  Key Takeaways
11. APIs
  Calling an API
    HTTP Methods and Status Codes
    Getting Data from the SDG API
  Creating Your Own API Using FastAPI
    Setting Up the API
    Adding Functionality to Your API
    Making Requests to Your API
  Key Takeaways
12. Automation and Deployment
  Deploying Code
  Automation Examples
    Pre-Commit Hooks
    GitHub Actions
  Cloud Deployments
    Containers and Docker
    Building a Docker Container
    Deploying an API on Google Cloud
    Deploying an API on Other Cloud Providers
  Key Takeaways
13. Security
  What Is Security?
  Security Risks
    Credentials, Physical Security, and Social Engineering
    Third-Party Packages
    The Python Pickle Module
    Version Control Risks
    API Security Risks
  Security Practices
    Security Reviews and Policies
    Secure Coding Tools
    Simple Code Scanning
  Security for Machine Learning
    Attacks on ML Systems
    Security Practices for ML Systems
  Key Takeaways
14. Working in Software
  Development Principles and Practices
    The Software Development Lifecycle
    Waterfall Software Development
    Agile Software Development
    Agile Data Science
  Roles in the Software Industry
    Software Engineer
    QA or Test Engineer
    Data Engineer
    Data Analyst
    Product Manager
    UX Researcher
    Designer
  Community
    Open Source
    Speaking at Events
    The Python Community
  Key Takeaways
15. Next Steps
  The Future of Code
  Your Future in Code
  Thank You
Index
