I am following the instructions in ./inference/README.MD to build and then launch the inference stack.
I changed the model in docker-compose.yaml like so:
inference-worker:
  build:
    dockerfile: docker/inference/Dockerfile.worker-full
    context: .
  image: oasst-inference-worker:dev
  environment:
    API_KEY: "0000"
    #MODEL_CONFIG_NAME: ${MODEL_CONFIG_NAME:-distilgpt2}
    MODEL_CONFIG_NAME: "OA_SFT_Pythia_12Bq"
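As a side note, the commented-out line shows the compose file interpolates MODEL_CONFIG_NAME from the environment with a distilgpt2 default, so a sketch of an alternative (assuming the stock compose setup, without editing the file) is to override it at launch:

```shell
# Override the model without editing docker-compose.yaml;
# ${MODEL_CONFIG_NAME:-distilgpt2} in the compose file picks this up.
MODEL_CONFIG_NAME=OA_SFT_Pythia_12Bq docker compose up --build inference-worker
```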
When I run python __main__.py in the text-***** folder and enter my question, it seems to loop forever reporting that the message is pending.
What am I missing?
I have around 330 MiB on the host.