Serving 70B-Scale LLMs Efficiently on Low-Resource Edge Devices [pdf]