DevOps

Learning about Distributed Inference with DeepSpeed ZeRO-3 and Docker Compose

Today, we’re going to test out DeepSpeed ZeRO-3 in Docker Compose. In a future blog post, I may cover DeepSpeed-FastGen or how to deploy this on a real multi-node, multi-GPU cluster. I also aim to compare this method against Multi-Node Inference with vLLM. If you’re setting up a local cluster, consider checking out high-bandwidth networking with InfiniBand. It’s surprisingly affordable.


How to Set Up a Serverless Home Lab with AWS CDK, Lambda, and LocalStack

Are you looking to develop and test your cloud application locally without incurring AWS costs? In this tutorial, we’ll guide you through setting up a local serverless environment using AWS CDK, LocalStack, and a simple Python Lambda function. We’ll use the PythonFunction construct from the aws-cdk.aws-lambda-python-alpha module to streamline the process.
