GitHub - mit-han-lab/streaming-llm: Efficient Streaming Language Models with Attention Sinks

GitHub - langfuse/langfuse: Open source observability and analytics for LLM applications

GitHub - okuvshynov/slowllama: Finetune llama2-70b and codellama on MacBook Air without quantization

[1hr Talk] Intro to Large Language Models (youtube.com)