TinyLlama: the mini AI model trained on three trillion tokens. Credit: SUTD
It is called TinyLlama, and it has taken the research world by storm because of the power it packs for its size.
Developed by Associate Professor Lu Wei of the Singapore University of Technology and Design (SUTD), Research Assistant Mr. Zhang Peiyuan and Ph.D. students Mr. Zeng Guangtao and Mr. Wang Tianduo, TinyLlama is a 1.1 billion-parameter open source small language model that has outperformed other open source models of comparable size on various benchmarks. TinyLlama was pre-trained on a total of three trillion tokens in just four months.
Today’s large language models (LLMs), such as ChatGPT or Google Bard, developed by large technology companies such as OpenAI or Google, run on thousands or even tens of thousands of graphics processing units (GPUs) and require users to connect online to their massive servers. TinyLlama, by contrast, was trained on only 16 GPUs and takes up only 550 MB of random access memory (RAM). In other words, TinyLlama can easily be deployed on mobile devices, allowing everyone to carry a “mini ChatGPT” in their pocket wherever they go.
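The 550 MB figure is plausible as a back-of-the-envelope check if the model's 1.1 billion weights are stored at roughly 4 bits each (a common quantization level for on-device deployment; the article does not state the precision used, so this is an assumption):

```python
# Rough memory-footprint check for a 1.1B-parameter model.
# Assumption (not from the article): weights quantized to 4 bits each.
params = 1.1e9          # TinyLlama's parameter count
bits_per_param = 4      # assumed 4-bit quantization
total_bytes = params * bits_per_param / 8
megabytes = total_bytes / 1e6
print(f"{megabytes:.0f} MB")  # 550 MB
```

At full 16-bit precision the same model would need around 2.2 GB, which is why quantization matters for fitting such models on phones.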
According to Marktechpost, a California-based AI news platform with a community of over 1.5 million AI professionals and developers, TinyLlama’s performance on common-sense reasoning and problem-solving tasks highlights the potential of smaller models to achieve high performance when trained with a substantial amount of data. It also opens new possibilities for research and application in natural language processing, especially in scenarios where computational resources are limited.
Said Professor Lu, also director of the StatNLP research group, which focuses on natural language processing research: “The importance of small language models cannot be underestimated. The reason TinyLlama was created specifically to be open source is that it will democratize language models, allowing smaller technology companies and research labs to build and develop their own models for a variety of applications. As researchers, our plan is to lay the foundation for small language models, with the aim of achieving significant scientific advances in the field.
“Smaller technology companies, as well as individual researchers and developers, increasingly demand small language models that require fewer resources to run. Models such as TinyLlama are therefore more feasible to build and better suited to edge devices such as mobile phones. Their compactness also lets them serve applications that require real-time machine translation without an Internet connection. This means users can access the language model offline; they do not need to send their personal information to a server when they use it, and through the technique called ‘fine-tuning’ we can improve it further,” Professor Lu added.
The team behind TinyLlama—from left to right: SUTD Ph.D. students, Zeng Guangtao and Wang Tianduo, associate professor Lu Wei and research assistant, Zhang Peiyuan. Credit: SUTD
TinyLlama’s innovation lies in its construction. It is based on the Llama 2 architecture and tokenizer and incorporates several cutting-edge technologies, such as FlashAttention, which improves computational efficiency. Despite being smaller than many of its predecessors, TinyLlama exhibits exceptional performance on various downstream tasks. It has successfully challenged the notion that larger models are always better, demonstrating that models with fewer parameters can still achieve high levels of effectiveness when trained on large and diverse data sets.
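Taking the three-trillion-token figure from the article's citation at face value, the scale of TinyLlama's training data relative to its size is easy to quantify. A widely cited rule of thumb from the Chinchilla scaling study (Hoffmann et al., 2022) suggests roughly 20 training tokens per parameter for compute-optimal training; TinyLlama goes far beyond that ratio, which is the sense in which a small model is "trained on a substantial amount of data":

```python
# Tokens-per-parameter ratio for TinyLlama, compared with the
# ~20 tokens/parameter "Chinchilla-optimal" rule of thumb.
tokens = 3e12        # 3 trillion pre-training tokens
params = 1.1e9       # 1.1 billion parameters
ratio = tokens / params
print(f"{ratio:.0f} tokens per parameter")   # ~2727
print(f"{ratio / 20:.0f}x the Chinchilla rule of thumb")
```

Heavily "overtraining" a small model this way trades extra training compute for a model that is cheap to run at inference time.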
With its compact architecture and exceptional performance, TinyLlama can enable end-user applications on mobile devices and serve as a lightweight platform for language model research.
Companies such as Sea Limited, the world’s leading consumer Internet company, and DSO National Laboratories, a national defense research and development organization, have downloaded TinyLlama’s source code from GitHub for research purposes.
Dr. Liu Qian, research scientist and team leader of the Natural Language Processing Group at Sea AI Lab, said: “In our language model research projects, we have used the TinyLlama project as an agile and efficient testbed. Its code base follows a compact, well-organized structure, allowing easy modifications for various purposes. With access to several checkpoints of the 1B model, we can quickly validate hypotheses, obtaining faster feedback compared to Llama-7B models.
“In particular, TinyLlama’s optimization improvements significantly increase GPU utilization, outperforming the Hugging Face transformers library. This combination of rapid prototyping and efficient training positions TinyLlama as a valuable tool, facilitating accelerated iterations in the research community.”
TinyLlama is currently available on GitHub, a cloud-based platform and service for developers to store and manage their code. It trended as the number one model on Hugging Face, a platform for hosting AI-related projects, among more than 460,000 models for about a week starting January 3, 2024. Plans are underway to improve TinyLlama even further.
Citation: Research team launches first-of-its-kind mini AI model trained on three trillion tokens (2024, January 31), retrieved January 31, 2024 from https://techxplore.com/news/2024-01-team-kind-mini-ai-trillion.html
This document is subject to copyright. Apart from any fair dealing for private study or research purposes, no part may be reproduced without written permission. The content is provided for informational purposes only.