Josh Starmer and Luis Serrano Live Q/A from Uphill at Bern!
Live with Jay Alammar, Josh Starmer, and Luis Serrano
Direct Preference Optimization (DPO) - How to fine-tune LLMs directly without reinforcement learning

GRPO - Group Relative Policy Optimization - How DeepSeek trains reasoning models 4h | Louis Serano
Josh Starmer and Luis Serrano Live Q/A from Uphill at Bern! 14d | Louis Serano
Discrete Dynamical Systems - Eigenvalues and Eigenvectors 1mo | Louis Serano
Mean, Variance, Skewness, and Kurtosis - Math for ML with Deeplearning.ai 2mo | Louis Serano
The three steps to make a reliable chatbot: Preamble, Fine-tuning, and RAG 2mo | Louis Serano
Newton's method for approximating zeros of polynomials - Math for ML with Deeplearning.ai 2mo | Louis Serano
The Stone-Weierstrass Theorem - How to approximate functions 2mo | Louis Serano
Keys, Queries, and Values: The celestial mechanics of attention 3mo | Louis Serano
Why is ChatGPT so bad at telling jokes (yet so good at writing poems?) 3mo | Louis Serano
Why is DeepSeek so good? 3mo | Louis Serano