Hacker News
Where to Buy or Rent GPUs for LLM Inference: The 2026 GPU Procurement Guide (bentoml.com)
3 points by sherlockxu 10 days ago | past | discuss
ChatGPT Usage Limits: What They Are and How to Get Rid of Them (bentoml.com)
1 point by bbzjk7 18 days ago | past
LLM Benchmark and Optimization Explorer (bentoml.com)
1 point by tanelpoder 60 days ago | past
AMD Data Center GPUs Explained: MI250X, MI300X, MI350X and Beyond (bentoml.com)
1 point by djhu9 67 days ago | past
Nvidia Data Center GPUs Explained: From A100 to B200 and Beyond (bentoml.com)
4 points by bbzjk7 74 days ago | past
Benchmarks Show Speculative Decoding Needs the Right Draft Model for 3× Gains (bentoml.com)
1 point by bbzjk7 3 months ago | past
LLM Inference Handbook (bentoml.com)
366 points by djhu9 4 months ago | past | 26 comments
What Is InferenceOps (bentoml.com)
2 points by sherlockxu 4 months ago | past
The Shift to Distributed LLM Inference (bentoml.com)
4 points by djhu9 5 months ago | past
How to Beat the GPU CAP theorem in AI Inference (bentoml.com)
3 points by sherlockxu 6 months ago | past
Cold-Starting LLMs on Kubernetes in Under 30 Seconds (bentoml.com)
2 points by djhu9 7 months ago | past
Six Infrastructure Pitfalls Slowing Down Your AI Progress (bentoml.com)
2 points by djhu9 7 months ago | past
The Complete Guide to DeepSeek Models: From V3 to R1 and Beyond (bentoml.com)
2 points by bbzjk7 8 months ago | past
Survey shows 65% of organizations are still establishing their AI foundations (bentoml.com)
1 point by sherlockxu 8 months ago | past | 1 comment
2024 State of AI Inference Infrastructure Survey Results (bentoml.com)
2 points by bbzjk7 8 months ago | past
Secure and Private DeepSeek Deployment (bentoml.com)
1 point by djhu9 8 months ago | past
A Guide to ComfyUI Custom Nodes (bentoml.com)
1 point by bbzjk7 10 months ago | past
A List of Top Open-Source Embedding Models (bentoml.com)
5 points by bbzjk7 on Oct 30, 2024 | past | 1 comment
Top Open-Source Vision Language Models (bentoml.com)
1 point by sherlockxu on Oct 11, 2024 | past
Exploring the World of Open-Source Text-to-Speech Models (bentoml.com)
2 points by sherlockxu on Sept 20, 2024 | past
Tuning TensorRT-LLM for Optimal Serving (bentoml.com)
1 point by djhu9 on Sept 20, 2024 | past
Compound AI Systems (bentoml.com)
1 point by AnhTho_FR on Aug 24, 2024 | past
Why should you care about compound AI? (bentoml.com)
1 point by bbzjk7 on Aug 16, 2024 | past
From Ollama to OpenLLM: Running LLMs in the Cloud (bentoml.com)
3 points by sherlockxu on July 18, 2024 | past
Benchmarking LLM Inference Back Ends: VLLM, LMDeploy, MLC-LLM, TensorRT-LLM, TGI (bentoml.com)
15 points by chaoyu on July 5, 2024 | past | 1 comment
Is LMDeploy the Ultimate Solution? Why It Outshines VLLM, TRT-LLM, TGI, and MLC (bentoml.com)
16 points by helloericsf on June 20, 2024 | past | 8 comments
Stable Diffusion 3: Text Master, Prone Problems? (bentoml.com)
3 points by sherlockxu on June 18, 2024 | past
Building a RAG App with BentoCloud and Milvus Lite (bentoml.com)
1 point by fzliu on June 14, 2024 | past
Comparing LLM Optimization Tools: VLLM, LMDeploy, MLC-LLM, TensorRT-LLM, and TGI (bentoml.com)
2 points by bbzjk7 on June 8, 2024 | past
Benchmarking LLM Inference Back Ends: VLLM, LMDeploy, MLC-LLM, TRT-LLM, and TGI (bentoml.com)
12 points by sherlockxu on June 6, 2024 | past | 2 comments
