Effortless ML jobs in the cloud 🚀
Optumi is a cloud service for model training, monitoring and insights
Problem we solve
How do you run and monitor jobs in the cloud?
Data scientists and ML engineers often have to SSH into instances to run scripts or stand up Kubernetes clusters. They have to periodically check CloudWatch or hunt down logs (from multiple locations) to see why jobs failed. They leave expensive GPU instances running or have to manually shut them down.
We believe this experience could and should be 10x better. Rather than asking you to buy and learn a full-lifecycle ML platform, Optumi offers a lightweight alternative that plays nicely with your favorite workflow tools.
A simple Python library and cloud-based UI
Resource automation ⚙️
Select from a wide variety of GPU and CPU instances. Optumi automatically shuts them off when jobs finish.
Optumi makes it easy to connect to cloud data sources like S3, Redshift, BigQuery & more.
Job execution ⚡️
Run Jupyter notebooks or Python scripts. You can optionally store output files & logs for easy download.
Status notifications 📲
Get notified via SMS, email or Slack channel when jobs start, succeed or fail. No more babysitting scripts.
Job insights 📊
Optumi generates reports that automatically diagnose & summarize job failures (e.g. a job that ran out of GPU memory).
Machine suggestions 🔮
Optumi gives insight into resource usage and suggests better-fitting instances, optimized for performance or speed.