Stability AI, the company behind the Stable Diffusion image generation model, has unveiled two new large language models (LLMs): FreeWilly1 and FreeWilly2. FreeWilly1 builds on Meta's open-source LLaMA foundation model, and FreeWilly2 on its successor, LLaMA 2; both were fine-tuned on a comparatively small dataset that includes synthetic data. Despite the smaller training set, the FreeWillys excel at intricate reasoning, linguistic subtleties, and complex questions in specialized domains such as law and mathematics.
CarperAI, a subsidiary of Stability AI, has released the FreeWillys under a non-commercial license to promote open access and advance research in the AI community. This means the models cannot be used for commercial purposes and are intended for academic and research work.
The names FreeWilly1 and FreeWilly2 are a playful nod to the "Orca" training methodology developed by Microsoft researchers, in which a smaller model is trained to imitate the step-by-step reasoning of a larger teacher model, letting it approach the performance of much larger models trained on extensive datasets. The FreeWillys were trained on a dataset roughly one-tenth the size of the one used in the original Orca work, resulting in significant cost and energy savings compared to other leading LLMs.
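To make the idea concrete, here is a minimal sketch of Orca-style synthetic data generation, in which a large teacher model is prompted to explain its reasoning so that a smaller student can be fine-tuned on the explanations. The `query_teacher` stub, the system prompt, and the JSONL format are illustrative assumptions, not Stability AI's actual pipeline.

```python
import json

# Placeholder for a call to a large "teacher" LLM; in practice this
# would be an API client for a GPT-4-class model. Hypothetical stub.
def query_teacher(system_prompt: str, instruction: str) -> str:
    return f"Step-by-step reasoning and answer for: {instruction}"

# Orca-style system prompt: the teacher is asked to show its reasoning,
# not just the final answer, so the student learns the reasoning process.
SYSTEM_PROMPT = (
    "You are a helpful assistant. Think step by step and explain your "
    "reasoning before giving the final answer."
)

def build_dataset(instructions: list[str], out_path: str) -> None:
    """Write (instruction, explained answer) pairs as JSONL for
    supervised fine-tuning of a smaller student model."""
    with open(out_path, "w", encoding="utf-8") as f:
        for instruction in instructions:
            output = query_teacher(SYSTEM_PROMPT, instruction)
            f.write(json.dumps({"instruction": instruction,
                                "output": output}) + "\n")

if __name__ == "__main__":
    build_dataset(["What is 17 * 24?"], "orca_style_data.jsonl")
```

The key design choice is that the training targets are explanation traces rather than bare answers, which is what lets a small model trained on relatively little data punch above its weight.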
Despite their smaller training budget, the FreeWillys have demonstrated outstanding performance, in some cases surpassing the GPT-3.5 version of ChatGPT. Stability AI trained the models on instructions drawn from four datasets created by Enrico Shippole, leveraging synthetic data. This approach reduced costs and also cut the training's energy use and carbon footprint.
The Challenge of Model Collapse
As LLMs become more widely used, concerns have arisen about the flood of content generated with them. Specifically, there is a worry that future updates to these models, as well as entirely new models, will be trained on AI-generated content and data.
Researchers have described a phenomenon called "model collapse," in which LLMs trained on increasing amounts of AI-generated data perform worse than models trained on human-generated data. Stability AI confronted this issue directly: during training of the FreeWillys, it used two other LLMs to generate the synthetic examples, producing 500,000 with a simpler model and 100,000 with a more capable one. The FreeWillys nonetheless performed well, suggesting that carefully generated synthetic data could sidestep model collapse while also avoiding copyright and proprietary-data concerns.
Stability AI envisions the FreeWilly models as pioneers in the field of open-access LLMs, advancing natural language understanding and enabling complex tasks. The team expresses excitement about the possibilities these models will bring to the AI community and the potential for new applications inspired by their capabilities. They also extend their gratitude to the dedicated researchers, engineers, and collaborators whose efforts made this milestone possible.
Researchers and developers can download FreeWilly2's weights directly, while FreeWilly1's weights have been released as deltas that must be applied over the original LLaMA weights. This allows for further experimentation and fine-tuning by the AI community, fostering collaborative progress in the field of language models.
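For FreeWilly1, that means reconstructing the full weights locally. The following is a minimal sketch of applying a weight delta, assuming the delta checkpoint stores elementwise differences from the base LLaMA weights and that both checkpoints load with Hugging Face transformers; the paths are placeholders, and the model card's instructions should take precedence.

```python
import torch
from transformers import AutoModelForCausalLM

# Placeholder paths; see the FreeWilly1 model card for the actual
# repositories and the exact merging procedure.
BASE_PATH = "path/to/llama-base"
DELTA_PATH = "path/to/freewilly1-delta"
OUT_PATH = "path/to/freewilly1-merged"

base = AutoModelForCausalLM.from_pretrained(BASE_PATH, torch_dtype=torch.float16)
delta = AutoModelForCausalLM.from_pretrained(DELTA_PATH, torch_dtype=torch.float16)

# Assumed convention: the delta checkpoint stores (fine-tuned - base)
# for every parameter, so elementwise addition recovers the
# fine-tuned weights.
delta_state = delta.state_dict()
with torch.no_grad():
    for name, tensor in base.state_dict().items():
        tensor += delta_state[name]

base.save_pretrained(OUT_PATH)
```

A common reason for releasing deltas rather than full weights is that the base model's license terms stay attached: only someone with legitimate access to the original LLaMA weights can reconstruct FreeWilly1.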