OpenAI introduces new tools to fast-track building of AI voice assistants

OpenAI unveiled a host of new tools on Tuesday that would make it easier for developers to build applications based on its artificial intelligence technology, as the ChatGPT maker wrestles with tech giants to keep up in the generative AI race.

The Microsoft-backed (MSFT.O) startup said a new real-time tool, rolling out immediately for testing, would allow developers to create AI voice applications using a single set of instructions.

The process earlier required developers to go through at least three steps: first transcribing the audio, then running the transcribed text through a text-generation model to produce an answer to the query, and finally using a separate text-to-speech model to voice that answer.
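The older three-step flow can be sketched roughly as below. This is an illustrative outline, not OpenAI's actual API: the function bodies are stand-in stubs, and the point is simply that three separate model calls were chained together, each adding latency.

```python
# Illustrative sketch of the older three-step voice pipeline.
# Function names and bodies are hypothetical stand-ins, not real APIs.

def transcribe_audio(audio_bytes: bytes) -> str:
    """Step 1: speech-to-text (e.g. a Whisper-style transcription model)."""
    return "what is the weather today"  # stubbed transcript

def generate_reply(prompt: str) -> str:
    """Step 2: a text-generation model answers the transcribed query."""
    return f"Here is an answer to: {prompt}"  # stubbed completion

def synthesize_speech(text: str) -> bytes:
    """Step 3: a separate text-to-speech model voices the answer."""
    return text.encode("utf-8")  # stubbed audio output

def voice_assistant(audio_in: bytes) -> bytes:
    # Three sequential model calls: each hop adds latency and cost,
    # which is what the new single real-time tool is meant to remove.
    transcript = transcribe_audio(audio_in)
    reply = generate_reply(transcript)
    return synthesize_speech(reply)
```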

A large chunk of OpenAI’s revenue comes from businesses that use its services to build their own AI applications, making the rollout of advanced capabilities a key selling point.

Competition has also been heating up as technology giants, including Google-parent Alphabet (GOOGL.O), integrate AI models capable of crunching different forms of information such as video, audio and text across their businesses.

OpenAI expects its revenue to jump to $11.6 billion next year from an estimated $3.7 billion in 2024, Reuters reported last month. The company is also in the middle of a $6.5 billion fundraise that could value it at $150 billion.

As part of Tuesday’s rollout, OpenAI introduced a fine-tuning tool that lets developers improve a model’s responses after training, using both images and text.

This fine-tuning process can incorporate human feedback, with reviewers supplying the model with examples of good and bad answers based on its responses.

Using images to fine-tune models would give them stronger image understanding capabilities, enabling applications such as enhanced visual search and improved object detection for autonomous vehicles, OpenAI said.
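A fine-tuning dataset mixing text and images might be assembled along these lines. The JSONL shape below loosely follows chat-style fine-tuning data, but the field names and URL are illustrative assumptions, not a confirmed schema:

```python
import json

# Hypothetical sketch of a fine-tuning example combining an image
# with text. Field names and the image URL are illustrative
# assumptions, not a confirmed OpenAI schema.
good_example = {
    "messages": [
        {"role": "user", "content": [
            {"type": "text", "text": "What traffic sign is shown?"},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/sign.jpg"}},
        ]},
        # A human-approved "good" answer the model should imitate.
        {"role": "assistant", "content": "A stop sign."},
    ]
}

def to_jsonl(examples: list[dict]) -> str:
    """Serialize examples as JSONL, one training record per line."""
    return "\n".join(json.dumps(e) for e in examples)

training_data = to_jsonl([good_example])
```

Records with known-bad answers could be handled analogously, giving the model contrasting examples to learn from.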

The startup also unveiled a tool that would allow smaller models to learn from larger ones, along with “Prompt Caching,” which cuts some development costs in half by reusing pieces of text the AI has previously processed.
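The savings from caching can be illustrated with back-of-the-envelope arithmetic: if a long shared prompt prefix is billed at half price on repeat requests, later calls cost noticeably less than the first. The per-token price below is a made-up figure for illustration, not OpenAI’s actual pricing.

```python
# Sketch of how prompt caching halves part of the bill: cached prefix
# tokens are assumed billed at a 50% discount. The per-token price is
# a hypothetical illustration, not real pricing.

PRICE_PER_INPUT_TOKEN = 0.000005  # hypothetical dollars per token

def request_cost(prefix_tokens: int, new_tokens: int, cached: bool) -> float:
    """Cost of one request whose shared prefix may hit the cache."""
    prefix_rate = 0.5 if cached else 1.0  # cached prefix costs half
    return (prefix_tokens * prefix_rate + new_tokens) * PRICE_PER_INPUT_TOKEN

# First call pays full price for a 10,000-token system prompt;
# later calls reuse that prefix from the cache.
first_call = request_cost(10_000, 200, cached=False)
later_call = request_cost(10_000, 200, cached=True)
```

With these illustrative numbers, the prefix portion of each later call is billed at half the first call’s rate, which is where the advertised savings come from.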