ml-engineer
0.1.0
Welcome to the MLE documentation!
MLE is a personal knowledge base for machine learning development.
Contents
Python
Docstrings
Decorators
Hooks
Type hints
Logging
Virtual environments
Debugging
Testing
Packaging
Makefile
Manage ML experiments
PyTorch
Model visualization
torch.compile
PyTorch profiling
Use PyTorch Profiler
Use NVIDIA Nsight Systems
Use PyTorch Lightning
CUDA
GPU architecture
Common CUDA optimizations
Hardware-aware algorithms
CUDA extension for PyTorch
References
OpenAI Triton
Design rationales
Tutorials
Examples
Kubernetes
Docker
Install the latest Docker
Install nvidia-docker2
Setup proxy for docker
Kind
Install kind
Use kind to create a k8s cluster
Load local images to kind
Troubleshooting kind
Known issues
Docker configuration not working
Timeout when pulling images in kind
Inference engine
TorchScript
Microsoft ONNX
NVIDIA TensorRT
LLM serving
Milestone models
Main challenges
Popular optimizations
Inference engines
vLLM
SGLang
Best practices
Diffusion serving
Milestone models
Main challenges
Popular optimizations
Inference engines
LLM applications
LLM and its limitations
RAG
Fine-tuning vs. RAG
Measurements of RAG
Challenges for RAG
Recent advancements
Agentic AI
Recipe for AI assistant
Recipe for AI scientist
References
AIGC model
Modeling techniques for AIGC
Foundation model development
References