
Locally running Transformers

I watched a video called Rust Artificial Intelligence (The Simple Way) about rust-bert and decided to follow along because I'd like to understand how to explore and play with the models on Hugging Face.

The video is a few months old, so there were some minor tweaks I needed to make along the way.

I needed libtorch 2.0.0, but a download link for that version was not available on the website, so I copied the download link from the pytorch.org getting started page.

I changed 2.0.1 in the download URL to 2.0.0 before downloading the archive, then unpacked it and set my environment variables in .zshrc:

export LIBTORCH='/home/john/Downloads/libtorch/libtorch'
export LD_LIBRARY_PATH=$LIBTORCH/lib:$LD_LIBRARY_PATH

I didn't notice at first that LD_LIBRARY_PATH needed to point at the lib folder inside libtorch, not at libtorch itself.
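Getting these variables wrong fails in confusing ways at build or load time, so a quick sanity check helps before pulling in all of rust-bert. This is a hypothetical check of my own, not something from the video; it only needs the tch crate (the libtorch binding rust-bert builds on), with a tch version matching the installed libtorch:

fn main() {
    // If LIBTORCH or LD_LIBRARY_PATH is wrong, this fails at link time or at
    // startup rather than deep inside a larger program.
    println!("CUDA available: {:?}", tch::Device::cuda_if_available());
}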

The shape of the TextGenerationConfig struct had changed since the time of the video, so I needed to change it slightly to:

    let generate_config = TextGenerationConfig {
        model_type: rust_bert::pipelines::common::ModelType::GPTNeo,
        // model_resource is now wrapped in a ModelResource enum
        model_resource: rust_bert::pipelines::common::ModelResource::Torch(model_resource),
        config_resource,
        vocab_resource,
        // merges_resource is now optional
        merges_resource: Some(merges_resource),
        num_beams: 5,             // beam search width
        no_repeat_ngram_size: 2,  // forbid repeating any 2-gram
        max_length: Some(100),    // max_length is now an Option
        ..Default::default()
    };

This allowed me to get the rest of the code working without issues, and to start playing around with the models on Hugging Face.

Whole code


use rust_bert::{
    gpt_neo::{
        GptNeoConfigResources, GptNeoMergesResources, GptNeoModelResources, GptNeoVocabResources,
    },
    pipelines::text_generation::{TextGenerationConfig, TextGenerationModel},
    resources::RemoteResource,
};

fn main() {
    // Each RemoteResource is downloaded from the Hugging Face hub and cached
    // locally the first time it's used.
    let model_resource = Box::new(RemoteResource::from_pretrained(
        GptNeoModelResources::GPT_NEO_2_7B,
    ));
    let config_resource = Box::new(RemoteResource::from_pretrained(
        GptNeoConfigResources::GPT_NEO_2_7B,
    ));
    let vocab_resource = Box::new(RemoteResource::from_pretrained(
        GptNeoVocabResources::GPT_NEO_2_7B,
    ));
    let merges_resource = Box::new(RemoteResource::from_pretrained(
        GptNeoMergesResources::GPT_NEO_2_7B,
    ));

    let generate_config = TextGenerationConfig {
        model_type: rust_bert::pipelines::common::ModelType::GPTNeo,
        model_resource: rust_bert::pipelines::common::ModelResource::Torch(model_resource),
        config_resource,
        vocab_resource,
        merges_resource: Some(merges_resource),
        num_beams: 5,
        no_repeat_ngram_size: 2,
        max_length: Some(100),
        ..Default::default()
    };

    // Building the pipeline downloads the weights (if not cached) and loads
    // them into memory, which takes a while for the 2.7B model.
    let model = TextGenerationModel::new(generate_config).unwrap();

    loop {
        // Input format: "prefix/prompt" (or "prefix/prompt1/prompt2" for
        // several prompts). Everything before the first '/' is passed to
        // generate() as the prefix; the remaining pieces are prompt texts.
        let mut line = String::new();
        std::io::stdin().read_line(&mut line).unwrap();
        let line = line.trim(); // drop the trailing newline before splitting
        let split = line.split('/').collect::<Vec<&str>>();
        let slc = split.as_slice();
        let output = model.generate(&slc[1..], Some(slc[0]));
        for sentence in output {
            println!("{}", sentence);
        }
    }
}
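For example, an input line like

    A short story about a robot/Once upon a time

asks the model to continue "Once upon a time", with "A short story about a robot" supplied as the prefix.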

What now?

Now that I have it up and running, I'll be playing with some of the models I find to see what they're good at. So far, generating text has been very slow (1-2 minutes per response), and I don't love waiting around that long. I don't know enough about transformers or models to guess whether that can be meaningfully improved. Are Bard, ChatGPT, and GitHub Copilot running on supercomputers, or are they just tuned to reply faster?
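One thing I can try without understanding the internals is turning the obvious knobs in the config. Here's a minimal sketch of a cheaper setup, assuming (I haven't verified the trade-offs) that the smaller GPT-Neo checkpoints rust-bert ships resource constants for, like GPT_NEO_125M, are a drop-in swap, and that greedy decoding with a shorter max_length is meaningfully faster than 5-beam search:

use rust_bert::gpt_neo::{
    GptNeoConfigResources, GptNeoMergesResources, GptNeoModelResources, GptNeoVocabResources,
};
use rust_bert::pipelines::common::{ModelResource, ModelType};
use rust_bert::pipelines::text_generation::TextGenerationConfig;
use rust_bert::resources::RemoteResource;

fn faster_config() -> TextGenerationConfig {
    // Swap the 2.7B checkpoint for the much smaller 125M one.
    let model_resource = Box::new(RemoteResource::from_pretrained(
        GptNeoModelResources::GPT_NEO_125M,
    ));
    let config_resource = Box::new(RemoteResource::from_pretrained(
        GptNeoConfigResources::GPT_NEO_125M,
    ));
    let vocab_resource = Box::new(RemoteResource::from_pretrained(
        GptNeoVocabResources::GPT_NEO_125M,
    ));
    let merges_resource = Box::new(RemoteResource::from_pretrained(
        GptNeoMergesResources::GPT_NEO_125M,
    ));

    TextGenerationConfig {
        model_type: ModelType::GPTNeo,
        model_resource: ModelResource::Torch(model_resource),
        config_resource,
        vocab_resource,
        merges_resource: Some(merges_resource),
        num_beams: 1,         // greedy decoding instead of 5-beam search
        no_repeat_ngram_size: 2,
        max_length: Some(50), // generate fewer tokens per response
        ..Default::default()
    }
}

TextGenerationConfig also has a device field that defaults to CUDA when available, so a GPU build of libtorch would probably help more than any of these knobs.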

Maybe at some point I'll find out.