
AWS Certified ML Specialty Guide
Arun Arunachalam
Location:
United States
Networks:
Arun Arunachalam
Digital Voice Madison G
BPB Publications
English Audiobooks
INAudio Audiobooks
Description:
This audiobook is narrated by a digital voice.

DESCRIPTION
Amazon Web Services is the world's most comprehensive and broadly adopted cloud computing platform, providing on-demand access to IT resources, such as computing power, database storage, and other essential services, over the internet with pay-as-you-go pricing. With its vast array of services and tools, AWS provides a scalable and flexible environment for developing, deploying, and managing ML models.

The purpose of the book is to empower individuals with basic AWS Cloud knowledge to leverage this advanced technology and obtain the coveted AWS Certified Machine Learning - Specialty certification. By mastering the intricacies of AWS ML services, readers can unlock new career opportunities and contribute to the ever-evolving field of ML. The book guides readers through the domains of data engineering, exploratory data analysis, modeling, and ML implementation and operations. Covering key concepts and practices, this guide equips individuals with fundamental AWS Cloud knowledge.

By the end of this book, readers will learn to create efficient data repositories, perform data transformation, sanitize and prepare data, engineer features, select and train ML models, optimize performance, build scalable solutions, leverage AWS ML services, apply security practices, and deploy operational ML solutions.

WHAT YOU WILL LEARN
● Design secure S3, EFS, and EBS repositories, implement data ingestion solutions, and perform data transformation.
● Frame business problems; select supervised, unsupervised, or ensemble models.
● Sanitize and prepare data for modeling, perform feature engineering, and analyze data for ML.
● Solve ML problems by selecting and training appropriate ML models.
● Perform hyperparameter optimization, evaluate ML models, and build performant ML solutions.

Duration - 13h 16m.
Author - Arun Arunachalam.
Narrator - Digital Voice Madison G.
Published Date - Thursday, 02 January 2025.
Copyright - © 2026 BPB.
Language:
English
Title Page
Duration:00:00:18
Copyright Page
Duration:00:01:21
Dedication
Duration:00:00:11
About the Author
Duration:00:01:09
About the Reviewers
Duration:00:03:43
Acknowledgement
Duration:00:01:04
Preface
Duration:00:13:50
Table of Contents
Duration:00:22:53
1. Creating Data Repositories for Machine Learning
Duration:00:00:06
Introduction
Duration:00:01:12
Structure
Duration:00:00:18
Objectives
Duration:00:00:48
Introduction to data in ML
Duration:00:04:22
Identifying data sources
Duration:00:00:30
Identifying location of data
Duration:00:00:58
Collecting data
Duration:00:01:43
File formats for ML
Duration:00:03:57
Types of data involved
Duration:00:00:57
Analyzing data characteristics
Duration:00:02:12
Determining storage mediums
Duration:00:10:58
Conclusion
Duration:00:02:26
Multiple choice questions
Duration:00:03:29
Answer key
Duration:00:00:42
2. Implementing Data Ingestion Solutions
Duration:00:00:05
Introduction to data ingestion on AWS
Duration:00:01:24
Understanding data ingestion
Duration:00:01:08
Data ingestion in ML workflows
Duration:00:00:59
Overview of AWS services for data ingestion
Duration:00:02:42
Data processing type
Duration:00:01:20
Batch load vs. streaming
Duration:00:00:26
Batch load
Duration:00:02:32
Streaming
Duration:00:02:44
Choosing between batch load and streaming
Duration:00:01:10
Use cases and implications for ML
Duration:00:00:27
Services for batch data ingestion
Duration:00:01:03
Services for real-time data ingestion
Duration:00:02:29
Orchestrating data ingestion pipelines
Duration:00:00:48
Principles of data pipeline orchestration
Duration:00:01:21
Batch-based ML workloads
Duration:00:02:39
Streaming-based ML workloads
Duration:00:02:37
Understanding AWS services for data ingestion
Duration:00:00:53
Real-time data streaming
Duration:00:00:42
Concepts of Kinesis data streams
Duration:00:01:33
Creating and using a data stream
Duration:00:01:25
Scaling your stream
Duration:00:02:52
Simplifying data loading
Duration:00:00:36
Concepts of Kinesis Data Firehose
Duration:00:01:47
Automating data loading
Duration:00:01:37
Processing large datasets
Duration:00:00:40
Concepts of Amazon EMR
Duration:00:01:54
Scaling and optimization
Duration:00:05:39
Serverless data integration
Duration:00:00:28
Concepts of AWS Glue
Duration:00:01:44
Using AWS Glue for data integration
Duration:00:01:30
Leveraging AWS Glue for scalable data integration
Duration:00:05:05
Advanced stream processing
Duration:00:00:34
Concepts of Apache Flink
Duration:00:01:28
Building a stream processing application
Duration:00:01:16
Scaling and monitoring your application
Duration:00:05:04
Scheduling jobs
Duration:00:00:44
Strategies for job scheduling
Duration:00:01:21
Tools for job scheduling in AWS
Duration:00:02:29
Best practices for job management
Duration:00:01:52
3. Transforming Data into Insights
Duration:00:00:05
Understanding data transformation needs
Duration:00:01:48
Data transformation techniques
Duration:00:00:55
Different data transformation techniques
Duration:00:08:15
AWS Glue and its role in data transformation
Duration:00:06:21
Functioning of AWS Glue Data Catalog
Duration:00:02:07
Practical example of using AWS Glue Data Catalog for a data lake
Duration:00:02:33
AWS Glue Data Catalog crawlers
Duration:00:04:16
AWS Glue best practices
Duration:00:04:53
Handling ML-specific data
Duration:00:02:38
Data structures for ML
Duration:00:02:04
Big data processing frameworks overview
Duration:00:02:46
Handling large datasets using SageMaker and EMR
Duration:00:02:46
Optimizing data for ML algorithms
Duration:00:00:32
Techniques to optimize data
Duration:00:01:04
Best practices in data transformation for ML
Duration:00:00:37
Impact of data quality on ML model performance
Duration:00:03:41
Data transformation in action
Duration:00:03:59
4. Data Sanitization and Preparation
Duration:00:00:05
Introduction to data understanding
Duration:00:03:06
Handling unstructured data on AWS
Duration:00:02:47
Descriptive statistics and data exploration
Duration:00:05:11
Identifying and handling missing or corrupt data
Duration:00:00:23
Identifying missing data
Duration:00:00:55
Handling missing data
Duration:00:09:17
Identifying corrupt data
Duration:00:00:40
Handling corrupt data
Duration:00:01:08
Data preprocessing steps
Duration:00:00:21
Data formatting
Duration:00:04:47
Data normalization
Duration:00:02:36
Data augmentation
Duration:00:11:02
Data scaling
Duration:00:01:52
File formats for ML workflows
Duration:00:01:29
Data encryption and security services
Duration:00:02:52
Navigating labeled data challenges
Duration:00:04:25
5. Feature Engineering
Duration:00:00:04
Definition and importance of feature engineering
Duration:00:01:35
ML pipeline
Duration:00:02:43
Identifying and extracting features from text data
Duration:00:00:36
Tokenization
Duration:00:00:18
Bag of Words
Duration:00:02:32
Word embeddings
Duration:00:01:24
N-grams
Duration:00:00:20
Part-of-speech tagging
Duration:00:00:19
Named entity recognition
Duration:00:00:23
Sentiment analysis
Duration:00:00:17
Tools and libraries
Duration:00:00:54
Identifying and extracting features from speech data
Duration:00:00:32
Techniques for feature extraction
Duration:00:00:13
Mel-frequency cepstral coefficients
Duration:00:02:02
Spectrogram
Duration:00:02:44
Pitch and fundamental frequency
Duration:00:02:22
Identifying and extracting features from an image
Duration:00:03:23
Identifying and extracting features from numerical data
Duration:00:06:08
Comparing feature engineering techniques
Duration:00:00:15
6. Data Analysis and Visualization
Duration:00:00:05
Creating graphs
Duration:00:00:28
Scatter plots
Duration:00:01:55
Time series plots
Duration:00:02:19
Histograms
Duration:00:01:39
Box plots
Duration:00:02:44
Interpreting descriptive statistics
Duration:00:00:29
Correlation
Duration:00:02:26
Summary statistics
Duration:00:03:06
Calculating the correlation coefficient
Duration:00:01:53
P-value
Duration:00:04:55
Performing cluster analysis
Duration:00:01:27
Hierarchical clustering
Duration:00:04:43
Diagnosis of clusters
Duration:00:03:26
Elbow plot
Duration:00:02:08
Determining cluster size
Duration:00:04:14
7. Framing Business Problems as ML Problems
Duration:00:00:05
Identifying ML applicability in business scenarios
Duration:00:04:43
Supervised vs. unsupervised learning
Duration:00:00:30
Supervised learning
Duration:00:00:25
Working of supervised learning
Duration:00:01:10
Types of supervised learning models
Duration:00:06:44
Unsupervised learning
Duration:00:00:21
Working of unsupervised learning
Duration:00:00:54
Techniques used in unsupervised learning
Duration:00:16:37
Hybrid learning
Duration:00:06:32
Comparison of supervised and unsupervised learning
Duration:00:00:17
8. Selecting Appropriate ML Models
Duration:00:00:05
Overview of common ML models
Duration:00:00:11
XGBoost
Duration:00:00:24
Working of XGBoost
Duration:00:01:35
Key features and advantages
Duration:00:02:02
Best use cases and practical examples
Duration:00:02:06
Disadvantages of XGBoost
Duration:00:03:44
Logistic regression
Duration:00:02:01
Working of logistic regression
Duration:00:01:58
Advantages of logistic regression
Duration:00:01:22
Log odds interpretation
Duration:00:00:44
Limitations of logistic regression
Duration:00:01:10
Suitable applications and examples
Duration:00:02:04
Use cases not suitable for logistic regression
Duration:00:04:13
Decision trees
Duration:00:00:29
Working of decision trees
Duration:00:01:15
Disadvantages of decision trees
Duration:00:02:04
Random forests
Duration:00:00:31
Working of random forests
Duration:00:01:05
Disadvantages of random forests
Duration:00:01:08
Understanding neural networks
Duration:00:00:36
Recurrent neural networks
Duration:00:01:31
Disadvantages of RNNs
Duration:00:01:40
Convolutional neural networks
Duration:00:02:11
Disadvantages of CNNs
Duration:00:03:45
Insights into ensemble and transfer learning techniques
Duration:00:00:56
Ensemble methods
Duration:00:02:57
Disadvantages of ensemble methods
Duration:00:01:55
Transfer learning
Duration:00:02:13
Disadvantages of transfer learning
Duration:00:03:19
Model selection criteria based on data and problem type
Duration:00:01:43
AWS tools and services for model implementation
Duration:00:01:09
AWS SageMaker
Duration:00:00:23
Key features of AWS SageMaker
Duration:00:01:28
Best use cases
Duration:00:00:54
AWS Deep Learning AMIs
Duration:00:00:21
Key features of AWS Deep Learning AMIs
Duration:00:01:17
AWS Lambda and other services
Duration:00:00:20
Key features of AWS Lambda
Duration:00:01:22
Other AWS services for model implementation
Duration:00:01:48
9. Training ML Models
Duration:00:00:04
Data splitting
Duration:00:00:38
Importance of data splitting
Duration:00:00:38
Basic approach to training and validation sets
Duration:00:00:51
Real-world scenario
Duration:00:00:56
Advanced considerations in cross-validation
Duration:00:00:54
Implementing k-fold cross-validation
Duration:00:00:44
Pitfalls to avoid
Duration:00:01:18
Best practices for data splitting
Duration:00:01:52
Optimization techniques for ML training
Duration:00:00:34
Role of optimization in ML training
Duration:00:00:52
Understanding gradient descent as foundation of optimization
Duration:00:02:05
Practical application of mini-batch gradient descent
Duration:00:00:40
Advanced optimization techniques
Duration:00:00:17
Momentum
Duration:00:00:41