Querying finetuned models is simple: replace the model name in your existing OpenAI code with the ID of the finetuned model. Here is a simple example:

  1. Get the ID of the finetuned model from the dashboard.

[clip]

  2. Use it in your existing code:
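As a sketch, assuming an OpenAI-compatible chat completions endpoint, the only change from querying the base model is the `model` field. The model ID below is a placeholder; use the ID you copied from the dashboard:

```python
import json

# Placeholder ID; replace with the finetuned model ID from the dashboard.
FINETUNED_MODEL = "my-org/my-finetuned-model"

def build_chat_request(prompt: str, model: str = FINETUNED_MODEL) -> str:
    """Build an OpenAI-compatible /chat/completions request body.

    The rest of your request code stays exactly as it was for the
    base model; only the `model` field changes.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return json.dumps(payload)
```

If you use an OpenAI client library instead of raw requests, the same principle applies: pass the finetuned model ID wherever you previously passed the base model name.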

Note that querying a LoRA model incurs no cold-start delay, while querying a full-finetune model incurs a cold-start delay of 10-15 seconds.