
anybodyz on Feb 21, 2024 \| on: HuggingChat: Chat with Open Source Models

Together serves models optimized for inference speed. They're not Groq, but Together (and Perplexity Labs) have the lowest latencies and the highest tokens per second of any commercial services available right now. Also the lowest prices, afaik.