
It is possible to fine-tune CodeGen using Hugging Face Transformers! That would let you fine-tune it on your own code and use the resulting model. However, training is more demanding than inference -- you'd need an A6000 or better to train the 6B model. Something like the following should work:

    deepspeed --num_gpus 1 --num_nodes 1 run_clm.py \
        --model_name_or_path=Salesforce/codegen-6B-multi \
        --per_device_train_batch_size=1 \
        --learning_rate 2e-5 \
        --num_train_epochs 1 \
        --output_dir=./codegen-6B-finetuned \
        --dataset_name your_dataset \
        --tokenizer_name Salesforce/codegen-6B-multi \
        --block_size 2048 \
        --gradient_accumulation_steps 32 \
        --do_train --fp16 --overwrite_output_dir \
        --deepspeed ds_config.json
Where run_clm.py is this script: https://github.com/huggingface/transformers/blob/main/exampl...
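
Since that command points at a ds_config.json, here's a rough sketch of what the DeepSpeed config could look like -- untested, ZeRO stage 2 with optimizer offload to CPU, and the "auto" values filled in from the Trainer arguments:

    {
        "fp16": { "enabled": "auto" },
        "optimizer": {
            "type": "AdamW",
            "params": { "lr": "auto", "betas": "auto", "eps": "auto", "weight_decay": "auto" }
        },
        "zero_optimization": {
            "stage": 2,
            "offload_optimizer": { "device": "cpu", "pin_memory": true },
            "overlap_comm": true,
            "contiguous_gradients": true,
            "reduce_bucket_size": 2e8,
            "allgather_bucket_size": 2e8
        },
        "gradient_accumulation_steps": "auto",
        "train_micro_batch_size_per_gpu": "auto",
        "gradient_clipping": "auto"
    }
If memory is still tight, bumping the ZeRO stage to 3 and adding parameter offload to CPU should help, at the cost of speed.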

It might be doable to set this up on an AWS machine with a beefy GPU or two. I haven't tried it yet though.

Once you have a model trained in Hugging Face Transformers, you'd be able to convert it using this script:

https://github.com/moyix/fauxpilot/blob/main/converter/huggi...
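
As a quick sanity check before converting, you can load the fine-tuned checkpoint back in Transformers and generate from it (rough sketch -- adjust the paths and prompt to your setup):

    # load the fine-tuned checkpoint and generate a short completion
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("./codegen-6B-finetuned")
    model = AutoModelForCausalLM.from_pretrained("./codegen-6B-finetuned").half().cuda()

    prompt = "def hello_world():"
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
    out = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.2)
    print(tokenizer.decode(out[0], skip_special_tokens=True))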



I train models 24/7 right now and PLEASE do not use AWS for it. You're going to pay out of your backside for it.

Better alternatives: Google Colab, Paperspace Gradient, Lambdalabs Cloud, Vultr GPU instances

- Colab will give you a T4, K80, V100 or P100 (or their own TPUs) for free, or $50 for 24h uninterrupted background jobs.
- Gradient will give you free A6000s and sometimes even free A100s on a $40 subscription, in 6-hour sessions (repeatable ad infinitum).
- Lambdalabs gives you an RTX 6000 for $0.50/hour and an A6000 for $0.80/hour.
- Vultr GPU will give you 1/7th of an A100 for $0.37/hour.


Thank you for sharing the fine-tuning command! Would you be able to share your ds_config.json? I tried to fine-tune the 2B model on an A100 (40 GB) using your command, but got a CUDA out-of-memory error. The ds_config I used was the one from Hugging Face (https://github.com/huggingface/transformers/blob/main/tests/...).


A friend of mine runs Sushi Cloud (https://www.sushi.cloud/), which could work out cheaper than AWS for training.


I can't see how this is relevant to the discussion. There is no mention of GPU instances in the first place.


How do I create a dataset?


Have a look at the datasets library [1], but as a shortcut you can just create a file named "my_code.json" in JSON Lines format, with one line per source file, like:

    {"text": "contents_of_source_file_1"}
    {"text": "contents_of_source_file_2"}
    ...
And then pass that my_code.json to run_clm.py -- for a local file like this, use --train_file my_code.json rather than --dataset_name.
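
If you want to build that file from an existing codebase, a few lines of Python will do it (sketch -- assumes Python source files, adjust the glob for your language):

    # walk a repo and dump each source file as one JSON Lines record
    import json
    from pathlib import Path

    with open("my_code.json", "w") as out:
        for path in Path("path/to/your/repo").rglob("*.py"):
            try:
                text = path.read_text(encoding="utf-8")
            except UnicodeDecodeError:
                continue  # skip files that aren't valid UTF-8
            out.write(json.dumps({"text": text}) + "\n")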

[1] https://github.com/huggingface/datasets



