Earlier this year I was building BLOOM-based models in my spare time and needed a script to run tests, so I wrote one that works in Command Prompt. Here are some of the script's functions:
- Text Generation – Run as the base command, it asks you to enter a prompt and generates text from it.
- Bulk Text Generation – If you add a CSV file to the command, it generates text for each prompt in the list and saves the output to the CSV file.
- Save Model – You can also add an argument that saves the model you’re using locally.
- Help File – If you add -h to the base command, it prints a help file that describes the various arguments.
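To give a feel for how a command-line interface like this can be wired up, here is a minimal sketch using Python's standard argparse module. The program name and the flag names (`--csv`, `--save-model`) are assumptions for illustration only and may not match the real script. The one part that is certain is that argparse adds the -h help option automatically.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one optional CSV argument and one save flag."""
    parser = argparse.ArgumentParser(
        prog="bulk_generate_text",  # assumed name, taken from the repo title
        description="Generate text from one prompt or from every prompt in a CSV file.",
    )
    # Hypothetical flag: path to a CSV file of prompts for bulk generation.
    parser.add_argument("--csv", metavar="PATH", default=None,
                        help="CSV file containing prompts for bulk generation")
    # Hypothetical flag: save the loaded model to a local folder.
    parser.add_argument("--save-model", action="store_true",
                        help="save the model locally after loading it")
    return parser


def choose_mode(args: argparse.Namespace) -> str:
    """Return 'bulk' when a CSV file was given, otherwise 'interactive'."""
    return "bulk" if args.csv else "interactive"


# Parse an explicit argument list so the example does not depend on sys.argv.
example_args = build_parser().parse_args(["--csv", "prompts.csv"])
print(choose_mode(example_args))
```

With no arguments the sketch falls back to the interactive mode, which is where a script like this would ask for a prompt.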
Command Line Script
You can download the script here: telavir/bulk_generate_text
Libraries Used
I used the following libraries in the CLI script:
- transformers – To handle the model, the tokenizer, and the functions related to them.
- huggingface_hub – For downloading models from Hugging Face and saving them.
- datasets – To wrangle the CSV file.
- torch – To handle devices (CPU and CUDA).