
Scaling AI Models Like You Mean It

A solution is to self-host an open-source or custom fine-tuned LLM. Opting for a self-hosted model can reduce costs dramatically, but it comes with additional development time, maintenance overhead, and possible performance implications. Considering a self-hosted solution means weighing these trade-offs carefully.
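As a rough illustration of what self-hosting involves, here is a minimal sketch that serves an open-weight model locally with Hugging Face `transformers`. The checkpoint name and generation settings are illustrative assumptions, not recommendations from the article:

```python
# Minimal self-hosting sketch (assumes the `transformers` and `torch` packages,
# and a transformers version recent enough to support the Phi architecture).
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "microsoft/phi-1_5"  # placeholder; swap in your own fine-tuned model

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Run a single generation on the locally hosted model."""
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

print(generate("Summarize the trade-offs of self-hosting an LLM:"))
```

Even this toy version hints at the hidden costs the excerpt mentions: you now own model downloads, hardware sizing, and request batching yourself.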
Developing Rapidly with Generative AI
Several engineers also maintained fallback models to revert to: either older or simpler versions (Lg2, Lg3, Md6, Lg5, Lg6). Lg5 mentioned that it was important to always keep some model up and running, even if they “switched to a less economic model and had to just cut the losses.” Similarly, when doing data science work, both Passi and Jackson...
Shreya Shankar • "We Have No Idea How Models will Behave in Production until Production": How Engineers Operationalize Machine Learning.
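The fallback pattern Shankar's interviewees describe can be sketched as a simple model chain: try the primary model, and on an error or latency overrun revert to an older or cheaper one. The function names and budget below are hypothetical stand-ins, not details from the paper:

```python
import time

# Illustrative fallback chain: always keep *some* model answering,
# even if it is a less capable ("less economic") one.

def call_primary_model(prompt: str) -> str:
    raise TimeoutError("primary model unavailable")  # simulate an outage

def call_simple_model(prompt: str) -> str:
    return f"[fallback answer] {prompt[:40]}..."

def predict_with_fallback(prompt: str, budget_s: float = 2.0) -> str:
    """Try the primary model; on failure or overrun, revert to the simpler one."""
    start = time.monotonic()
    try:
        answer = call_primary_model(prompt)
        if time.monotonic() - start > budget_s:
            raise TimeoutError("primary model exceeded latency budget")
        return answer
    except Exception:
        # "Cut the losses": serve the older/cheaper model rather than nothing.
        return call_simple_model(prompt)

print(predict_with_fallback("Classify this support ticket: my invoice is wrong"))
```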
- Microsoft introduced Phi 1.5, a compact AI model with multimodal capabilities, meaning it can process images as well as text. Despite being significantly smaller than OpenAI's GPT-4, at only 1.3 billion parameters, it demonstrates advanced features like those found in larger models. Phi 1.5 is open source, emphasizing the trend towards efficient, smaller models.
FOD#27: "Now And Then"
When we deliver a model, we make sure we don't exceed X seconds of latency in our API. Before even getting into the performance of LLMs for classification, I can tell you that with the currently available tech they are just infeasible.
LinuxSpinach • ^ this. And especially classification as a task, because businesses don’t want to pay llm buck...
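As a back-of-the-envelope illustration of the point both comments make, here is a sketch of a latency gate around a lightweight classifier. The TF-IDF/logistic-regression pipeline, toy data, and threshold are assumptions for illustration; neither commenter specified them:

```python
import time
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training data; in practice this would be a real labeled dataset.
texts = ["refund my order", "love this product", "cancel subscription", "great support"]
labels = ["billing", "praise", "billing", "praise"]

# A small classical pipeline: far cheaper per request than an LLM call.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, labels)

LATENCY_BUDGET_S = 0.05  # hypothetical stand-in for the "X seconds" budget above

start = time.perf_counter()
prediction = clf.predict(["please fix my invoice"])[0]
elapsed = time.perf_counter() - start

assert elapsed < LATENCY_BUDGET_S, f"latency budget exceeded: {elapsed:.4f}s"
print(prediction, f"({elapsed * 1000:.2f} ms)")
```

A model like this answers in fractions of a millisecond, which is the gap the commenters are pointing at: for plain classification, an LLM's per-request latency and cost are hard to justify.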