Show HN: Tune Llama 3.1 on Google Cloud TPUs

Hey HN, we wanted to share our repo where we fine-tuned Llama 3.1 on Google TPUs. We're building AI infra to fine-tune and serve LLMs on non-NVIDIA accelerators (TPUs, Trainium, AMD GPUs).

The problem: right now, roughly 90% of LLM workloads run on NVIDIA GPUs, but equally capable and more cost-effective alternatives exist. For example, training and serving Llama 3.1 on Google TPUs is about 30% cheaper than on NVIDIA GPUs.

But developer tooling for non-NVIDIA chipsets is lacking. We felt thi…

6 months ago | Hacker News
Show HN: Clace – Application Server with support for scaling down to zero

I have been building the open source project https://github.com/claceio/clace. Clace is an application server that builds and deploys containers, allowing it to manage web apps in any language or framework.

Compared to application servers like Nginx Unit, Clace has the advantage of working with any application, without requiring dependency or packaging changes. Clace provides a blue-green staged deployment model for apps. Not just c…
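The blue-green staged deployment model mentioned above can be sketched as a tiny router that stages a new app version alongside the live one and atomically flips traffic to it. This is a hypothetical illustration of the general pattern, not Clace's actual API; the class and method names are made up for this sketch.

```python
class BlueGreenRouter:
    """Minimal blue-green deployment sketch (hypothetical, not Clace's API).

    Two slots hold app versions: the active slot serves traffic, the
    inactive slot holds a staged version. promote() flips which slot is
    live; the previous version stays around for rollback.
    """

    def __init__(self, live):
        self.slots = {"blue": live, "green": None}
        self.active = "blue"

    def route(self, request):
        # Send the request to whichever slot is currently live.
        return self.slots[self.active](request)

    def stage(self, app):
        # Install a new version in the inactive slot for testing.
        inactive = "green" if self.active == "blue" else "blue"
        self.slots[inactive] = app

    def promote(self):
        # Atomically flip staged -> live.
        self.active = "green" if self.active == "blue" else "blue"


# Usage: v1 serves traffic while v2 is staged; promote() cuts over.
router = BlueGreenRouter(live=lambda req: "v1:" + req)
router.stage(lambda req: "v2:" + req)
before = router.route("/home")  # served by v1
router.promote()
after = router.route("/home")   # served by v2
print(before, after)
```

The point of the pattern is that the staged version can be exercised in isolation before any live traffic reaches it, and cutover (or rollback) is a single pointer swap rather than a redeploy.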

6 months ago | Hacker News
