GPT-4o, OpenAI’s latest and most powerful artificial intelligence (AI) model, which was released in May, is getting a new upgrade. On Tuesday, the company released a fine-tuning feature for the AI model that allows developers and organisations to train it on custom datasets. This lets users add more relevant, focused data pertaining to their use case, making the generated responses more accurate. The AI firm has also announced that, for the next month, it will provide organisations with free training tokens to enhance the GPT-4o models.
GPT-4o Gets Fine-Tuning Feature
In a post, OpenAI announced the launch of the new feature and highlighted that it will allow developers and organisations to get higher performance at lower cost for specific use cases. Calling it “one of the most requested features from developers”, the AI firm explained that fine-tuning lets developers customise the structure and tone of the model’s responses, and also enables GPT-4o to follow complex domain-specific instructions.
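The custom dataset for fine-tuning is supplied as a JSONL file of example conversations that demonstrate the desired tone, structure and domain knowledge. As a rough sketch in Python (the file name and the example content about a billing API are hypothetical, purely for illustration):

```python
import json

# Hypothetical domain-specific examples showing the tone and structure
# the fine-tuned model should learn to reproduce.
examples = [
    {
        "messages": [
            {"role": "system", "content": "You are a concise support assistant for Acme's billing API."},
            {"role": "user", "content": "How do I retry a failed invoice?"},
            {"role": "assistant", "content": "Call POST /invoices/{id}/retry. Retries are limited to three per day."},
        ]
    },
    # ...more examples covering the niche workflow...
]

# Write one JSON object per line (JSONL), the format used for chat fine-tuning data.
with open("training_data.jsonl", "w", encoding="utf-8") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```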
Additionally, the company announced that it will provide organisations with free training tokens for the AI models through September 23. Enterprises using GPT-4o will get one million training tokens per day, while those using GPT-4o mini will get two million training tokens per day.
Beyond this, training the models via fine-tuning will cost $25 (roughly Rs. 2,000) per million tokens. Further, inference will cost $3.75 (roughly Rs. 314) per million input tokens and $15 (roughly Rs. 1,250) per million output tokens, OpenAI said.
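To put those prices in perspective, here is a back-of-the-envelope estimate in Python; the dataset size and daily traffic figures are hypothetical, and only the US-dollar prices quoted above are used:

```python
# GPT-4o fine-tuning prices quoted by OpenAI (USD per one million tokens).
TRAINING_PRICE = 25.00
INPUT_PRICE = 3.75
OUTPUT_PRICE = 15.00

# Hypothetical workload: a 2-million-token training set, then
# 600k input and 100k output tokens of inference per day.
training_tokens = 2_000_000
daily_input_tokens = 600_000
daily_output_tokens = 100_000

training_cost = training_tokens / 1_000_000 * TRAINING_PRICE
daily_inference_cost = (daily_input_tokens / 1_000_000 * INPUT_PRICE
                        + daily_output_tokens / 1_000_000 * OUTPUT_PRICE)

print(f"One-off training cost: ${training_cost:.2f}")         # $50.00
print(f"Daily inference cost:  ${daily_inference_cost:.2f}")  # $3.75
```

During the promotional period described above, the one-off training cost in this example would be covered by the free daily training-token allowance.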
To fine-tune GPT-4o, users can go to the fine-tuning dashboard, click on Create, and select “gpt-4o-2024-08-06” from the base model drop-down menu. To do the same for the mini model, users will have to select the “gpt-4o-mini-2024-07-18” base model. Fine-tuning for these AI models is only available to developers on OpenAI's paid usage tiers.
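The same steps can also be carried out programmatically. The sketch below uses the official OpenAI Python SDK and assumes an OPENAI_API_KEY environment variable is set and that a training_data.jsonl file like the one above exists; the file name is illustrative:

```python
from openai import OpenAI

client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable

# Upload the JSONL training data prepared earlier.
training_file = client.files.create(
    file=open("training_data.jsonl", "rb"),
    purpose="fine-tune",
)

# Start a fine-tuning job on the GPT-4o snapshot mentioned above
# (use "gpt-4o-mini-2024-07-18" for the mini model instead).
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-2024-08-06",
)
print(job.id, job.status)
```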
Fine-tuning, in this context, is essentially a way to retain the full processing capabilities of the large language model (LLM) while training it on curated datasets so that it handles niche workflows better. It serves a similar purpose to an AI agent or custom GPTs, but without the same limits on processing power, resulting in faster and generally more accurate responses.
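Once a fine-tuning job completes, the resulting model is called like any other chat model, just under its own identifier. A minimal sketch, where the “ft:gpt-4o-…” model name below is a placeholder for the identifier returned by the completed job:

```python
from openai import OpenAI

client = OpenAI()

# The completed job's "fine_tuned_model" field holds the new model name;
# the identifier below is a placeholder.
response = client.chat.completions.create(
    model="ft:gpt-4o-2024-08-06:my-org::abc123",
    messages=[{"role": "user", "content": "How do I retry a failed invoice?"}],
)
print(response.choices[0].message.content)
```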