No, they switched to per-hour billing for their dedicated servers a while ago. I also tested this last week when I cancelled a dedicated server we had with them and they did indeed only charge for part of the month.
There's a difference between being charged a partial amount for the month and being charged an hourly amount for something you need for just 2 hours.
I have a dedicated machine with Hetzner, and AFAIK they charge me the full amount regardless of whether it's on or off. If I cancel on day 3 of the month, it makes sense to be charged for only those days' hours. However, that's different from "turn it on for 6 hours" kind of hourly pricing.
Either way... it's a mess. They added hourly billing for dedicated servers six months ago; there's not much excuse for still having contradictory information hanging around.
Yes, the GEX130 has a *one-time* setup fee. And yes, it is possible to cancel it on an hourly basis in most situations. For example, if you only had it for 3 days (72 hours) of a billing period, you would pay 72 hours at the hourly rate rather than the full month. We are currently in the process of updating our billing systems. There are a few situations where the 30-days-to-the-end-of-the-month policy still applies. In most situations the current cancellation process is more generous and flexible for the customer, and in the rest it has stayed the same. If customers want to stop paying for a server, they need to cancel it.
We have a number of products that do not include setup fees, including our Server Auction dedicated servers and our cloud servers.
To be frank, after reading your reply I'm still not 100% sure what the policy is. In which situations do the 30 days of notice apply? How does that interact with hourly billing? If I cancel a week into the second month, does that mean I get charged for 2 months and 1 week, plus the setup fee, in total?
I'm a big fan of Hetzner and have used it off and on for ten years now, including auction, dedicated and cloud servers. I've had very few complaints over the years. But I stand by my post above - the information on your site is unclear and contradictory, and really needs to be cleaned up.
We are currently in the process of updating our billing system. A big part of that process is already done, but there are still some changes to come. That is why we have not yet updated our terms and conditions. The new policies are more generous to customers and allow them to cancel most products on an hourly basis, including dedicated servers like the GEX130. --Katie
Even when it does, GPUs with more VRAM than a flagship gaming model have always come, and will always come, at a massive price premium. That ceiling is currently 24 GB; if you need more than that, it's going to cost you.
I don't see why. AMD could buy a lot of free community work just by putting out a 32 GB version of their next-gen flagship at around $1k. Not chasing the margin for a generation would buy them a decent amount of community support - they need it if they want to compete with Nvidia at any point.
Ahh, I just saw OP was saying sub-$100 - yeah, that's never going to happen.
Thing is, AMD can't command Nvidia margins because nobody supports them. Take a hit on pro sales for a generation, gain market share and mind share, and then you have a chance to play the same game.
Have you seen GPU prices lately (the last 5 years)? An RTX 4080 Super with merely 16 GB of VRAM is at least $1k; there's no way a 32 GB next-gen flagship would be released in this price range.
What’s the most cost-effective option for hosting an LLM these days? I don’t need to train; I just want to use one of the Llama models for inference to reduce my reliance on third parties.
If you don't need a big model and are fine with hosting locally, an RTX 3060 with 12 GB of VRAM is going to do just fine. It can be bought for about 200-300 USD.
I've been pleasantly surprised by what such a mediocre GPU and Llama3 8B can do for certain (simple) use cases. Ollama makes it all pretty easy.
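For reference, querying a local Ollama instance is just an HTTP call. A minimal sketch in Python, assuming Ollama is running on its default port (11434) and you've already pulled the model with "ollama pull llama3:8b":

    # Minimal sketch: ask a local Ollama server for a completion.
    # Assumes Ollama is running on its default port and llama3:8b is pulled.
    import requests

    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "llama3:8b",
            "prompt": "Summarize: the quick brown fox jumps over the lazy dog.",
            "stream": False,  # one JSON response instead of a token stream
        },
    )
    print(resp.json()["response"])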
Depends on the specific model and your perf requirements, but lots of them will run on a single box with a middle-of-the-road GPU. If your invocation rate is low, hosted options like AWS Bedrock or other hosted APIs might be cheaper.
Consider also an online Llama-as-a-service like DeepInfra. I have a local 3090 for playing around with the smaller models, but it's nice having the option of calling the 405B.
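The other nice part is that these services tend to expose an OpenAI-compatible API, so switching from local to hosted is mostly a base-URL change. A rough sketch in Python (the endpoint and model name here are from memory, so treat them as assumptions and check DeepInfra's docs):

    # Rough sketch: call a hosted Llama through an OpenAI-compatible API.
    # base_url and model name are assumptions -- verify against DeepInfra's docs.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://api.deepinfra.com/v1/openai",
        api_key="YOUR_DEEPINFRA_API_KEY",  # placeholder
    )
    chat = client.chat.completions.create(
        model="meta-llama/Meta-Llama-3.1-405B-Instruct",
        messages=[{"role": "user", "content": "Hello from a 3090 owner."}],
    )
    print(chat.choices[0].message.content)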
Ooh, I like that. I can see using them as a stepping stone where I'm using an open-source model but without the hassle of setting up my own machine (though I could do that later).
CoCalc offers on-demand GPU servers with H100s starting at $2.01 per hour (metered per second) through its integration with Hyperstack. It also has more budget-friendly options, like RTX A4000s at $0.18 per hour.
In case you are not familiar, CoCalc is a real-time collaborative environment for education and research that you can access via your web browser at https://cocalc.com/
What's currently the cheapest/easiest way to deal with relatively lightweight GPU tasks, that are not lightweight enough for my PC?
Consider this use case: I want to upload 50 GB of audio somewhere and run Whisper (the biggest model) on it. I imagine the processing should be doable in minutes on a powerful GPU and must be very cheap. The script itself will be like 20 LOC, but I'll spend some time setting stuff up, uploading the data and so on (which, for example, makes Colab a no-go for this). Any recommendations?
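To be concrete, the whole script is roughly this (a sketch using the openai-whisper Python package; I haven't benchmarked it at this scale):

    # Sketch: batch-transcribe a directory of audio with Whisper.
    # Uses the openai-whisper package; the "large" model wants ~10 GB of VRAM.
    import pathlib
    import whisper

    model = whisper.load_model("large")

    for audio in sorted(pathlib.Path("audio").glob("*.mp3")):
        result = model.transcribe(str(audio))
        audio.with_suffix(".txt").write_text(result["text"])
        print(f"done: {audio.name}")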
Also, when they say it's "per hour", do they mean an hour of GPU time, or an hour of me "renting the equipment", so to speak?
The pricing is surprising: typically Hetzner has extremely low prices, yet here they are 50-70% more expensive than the competition, and you also pay a one-time setup cost.
Do any of these offer training data as a service? Seems like they could charge a premium for a continuous multicast of a large dataset on, say, a 10G or higher connection. A one-to-many relay, charging customers to sit under the firehose.
I use RunPod or Vast for training my (small) models (a few million parameters), mostly using RTX 4090s, up to 4 GPUs. Training is a sporadic task, so it's not worth it for me to book monthly (at these prices).
That's the second time I've read this comment and I still don't believe it: it's listed as a "dedicated root server" (usually billed by the month) with no mention of typical cloud offerings.
This is a monthly reservation for a single RTX 6000 Ada at $940. You can get the same on RunPod for $670, about 29% less.
And to actually train stuff, you'd likely want nodes with more of them, like 8, or different GPUs altogether (like A100s/H100s/etc.).