
With probably the most recent models in its Qwen series of open-source AI models, Alibaba Cloud is pushing the boundaries of AI technology even further. Alibaba has expanded its AI solutions with the discharge of Qwen-1.8B and Qwen-72B, in addition to specialized chat and audio models. Alibaba’s dedication to developing AI capabilities is demonstrated by these models, which offer improved performance and flexibility in language and audio processing.
With the discharge of the Qwen-1.8B and its larger equivalent, the Qwen-72B, the Qwen series—which already comprises the Qwen-7B and Qwen-14B—has been significantly enhanced. Pretrained on an enormous corpus of greater than 2.2 trillion tokens, Qwen-1.8B is a transformer-based model with 1.8 billion parameters. This model outperforms many similar-sized and even larger models in various language tasks in each Chinese and English. It also supports a protracted context with 8192 tokens.
Notably, Qwen-1.8B, with its quantized variants int4 and int8, provides an inexpensive deployment solution. These characteristics make it a wise option for various applications by drastically lowering memory needs. Its extensive vocabulary of greater than 150K tokens further improves its linguistic ability.
The larger model, Qwen-72B, has been trained on 3 trillion tokens. This model outperforms GPT-3.5 in most tasks and outperforms LLaMA2-70B in all tested tasks. Alibaba has designed the models to enable low-cost deployment despite their large parameters; quantized versions allow minimum memory use of around 3GB. This breakthrough significantly reduces the obstacles to working with massive models that used to cost thousands and thousands of dollars on cloud computing.
Alibaba introduced Qwen-Chat, optimized versions designed for AI support and conversational capabilities, along with Qwen base models. Along with generating material and facilitating natural conversation, Qwen-Chat can execute code interpretation and summarization tasks.
With its ability to handle various audio inputs along with text to generate text outputs, Alibaba’s Qwen-Audio represents a noteworthy advancement in multimodal AI. Remarkably, Qwen-Audio achieves state-of-the-art performance in speech recognition and a wide range of audio understanding standards without the necessity for fine-tuning.
Within the audio arena, Qwen-Audio establishes a brand new benchmark as a foundation audio-language model. It uses a multi-task learning framework to handle many audio formats. It achieves impressive results on multiple benchmarks, including state-of-the-art scores on tasks similar to AISHELL-1 and VocalSound.
Wen-Audio’s adaptability includes operating several chat sessions from text and audio inputs, with features starting from speech editing tools to music appreciation and sound interpretation.
Take a look at the Paper, Github, and Model. All credit for this research goes to the researchers of this project. Also, don’t forget to affix our 33k+ ML SubReddit, 41k+ Facebook Community, Discord Channel, and Email Newsletter, where we share the most recent AI research news, cool AI projects, and more.
In case you like our work, you’ll love our newsletter..
Dhanshree
” data-medium-file=”https://www.marktechpost.com/wp-content/uploads/2022/11/20221028_101632-Dhanshree-Shenwai-169×300.jpg” data-large-file=”https://www.marktechpost.com/wp-content/uploads/2022/11/20221028_101632-Dhanshree-Shenwai-576×1024.jpg”>
Dhanshree Shenwai is a Computer Science Engineer and has an excellent experience in FinTech firms covering Financial, Cards & Payments and Banking domain with keen interest in applications of AI. She is smitten by exploring recent technologies and advancements in today’s evolving world making everyone’s life easy.