NExT-GPT: The Evolution of Generative AI Beyond Text


As 2023 progresses, artificial intelligence (AI) continues to gain new capabilities and reshape our understanding of the technology. While text has been the mainstay output of leading language models like ChatGPT and Google Bard, a new contender aims to go further: NExT-GPT. By producing text, image, audio, and video outputs, NExT-GPT is pushing the boundaries of what generative AI can achieve.

“The future of generative AI lies in the ability to interpret and generate various formats – a capability that NExT-GPT is pioneering.”

Introducing NExT-GPT: A Multi-Format Generative AI

Developed by researchers from the National University of Singapore and Tsinghua University, NExT-GPT is an ‘any-to-any’ system capable of accepting inputs in different formats and delivering responses in the desired format. This versatility means that users can, for example, enter a text prompt and have NExT-GPT generate a corresponding video, or supply an image and have the AI convert it into audio.
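To make the ‘any-to-any’ idea concrete, here is a minimal Python sketch of how requests across the four formats could be described. It is purely illustrative: the names Modality, Request, plan, and supported_pairs are hypothetical, and the three-stage encode, reason, decode outline is an assumed simplification, not NExT-GPT’s actual code or API.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class Modality(Enum):
    """The four formats the article says NExT-GPT handles."""
    TEXT = "text"
    IMAGE = "image"
    AUDIO = "audio"
    VIDEO = "video"


@dataclass(frozen=True)
class Request:
    """Hypothetical request: what the user supplies and what they want back."""
    input_modality: Modality
    output_modality: Modality
    payload: str  # stand-in for real data (a prompt, a file path, ...)


def plan(request: Request) -> list[str]:
    """Return the assumed processing stages for one request (illustration only)."""
    return [
        f"encode {request.input_modality.value} input",
        "reason over the encoded input with a language model",
        f"decode the response as {request.output_modality.value}",
    ]


def supported_pairs() -> list[tuple[Modality, Modality]]:
    """Every (input, output) combination an any-to-any system would accept."""
    return list(product(Modality, repeat=2))


if __name__ == "__main__":
    # The text-to-video case mentioned above.
    for stage in plan(Request(Modality.TEXT, Modality.VIDEO, "a cat in a library")):
        print(stage)
    print(len(supported_pairs()), "input/output combinations")
```

With four formats on each side, ‘any-to-any’ covers sixteen input/output combinations, including same-format cases such as text-to-text.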

While ChatGPT recently gained the capability to ‘see, hear, and speak,’ it does not yet offer video output. NExT-GPT, by contrast, brings this capability to the table from the get-go, setting it apart as one of the few language models that can pair ChatGPT-style text responses with creations beyond text.

First Impressions of NExT-GPT

In hands-on testing on the demo site, NExT-GPT proved impressive but showed room for improvement. The image generation feature, for instance, convincingly transformed a photo of a cat into an image of the cat as a librarian. Although not on par with dedicated image generators like Midjourney or Stable Diffusion, the result was nonetheless charming.

However, the video and audio generation features did not fare as well. The videos generated had the unmistakable ‘made by AI’ look that characterizes many AI-generated images and videos, with slightly distorted and skewed elements. It was an uncanny experience.

The Future of NExT-GPT and Generative AI

Despite its shortcomings, NExT-GPT has the potential to fill the gap in audio and video generation left by the products of major AI companies like OpenAI and Google. As NExT-GPT continues to evolve, the hope is that it will produce higher-quality outputs, bringing us closer to the day when we can seamlessly create top-notch home movies of our pets, or any other content, using AI.

It will be fascinating to watch how NExT-GPT and similar technologies develop and reshape our interaction with AI. Language models that once could only generate text are giving way to AI models that can interpret and generate multimedia content, a sign of how quickly the field is advancing.


