The open source AI movement took another step forward with the release of the Dolly Large Language Model (DLL), created by the enterprise software company Databricks. This ChatGPT clone, named after the famous cloned sheep, is the latest product of a growing effort to broaden access to the technology so that it is not monopolized and controlled by large corporations.
The Inspiration Behind Dolly
Dolly was created from an open source model developed by the non-profit research institute EleutherAI, fine-tuned with data from Stanford University’s Alpaca project. The Alpaca model was itself created from the 7 billion parameter version of LLaMA, a family of models developed by Meta that ranges up to 65 billion parameters. LLaMA, which stands for Large Language Model Meta AI, is trained on publicly available data.
According to Weights & Biases, LLaMA can outperform many of the top language models (GPT-3 by OpenAI, and Gopher and Chinchilla by DeepMind) despite being smaller. Another inspiration came from the academic research paper “SELF-INSTRUCT: Aligning Language Model with Self Generated Instructions,” which outlined a way to create high-quality, autogenerated question-and-answer training data that improves on the limited publicly available data.
How Dolly Works
Dolly works by taking an existing open source 6 billion parameter model from EleutherAI and modifying it slightly, using data from Alpaca, to elicit instruction-following capabilities such as brainstorming and text generation that are not present in the original model. Databricks observes that anyone can take a dated off-the-shelf open source large language model (LLM) and give it “magical” ChatGPT-like instruction-following ability by training it in 30 minutes on one machine using high-quality training data.
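As a rough illustration of what instruction-following training data looks like, the sketch below turns Alpaca-style records (an instruction, an optional input, and an expected output) into single training strings. The field names and prompt wording are modeled on the publicly released Alpaca data format and should be treated as assumptions; the function name `format_example` and the sample records are hypothetical and are not taken from Databricks’ code.

```python
# Minimal sketch: turn instruction records into text for fine-tuning.
# Assumption: records use Alpaca-style "instruction" / "input" / "output" keys.

PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)


def format_example(record: dict) -> str:
    """Return the prompt followed by the target answer as one string."""
    extra_input = record.get("input", "")
    # Use the longer template only when the record carries extra context.
    template = PROMPT_WITH_INPUT if extra_input else PROMPT_NO_INPUT
    prompt = template.format(instruction=record["instruction"], input=extra_input)
    return prompt + record["output"]


# Hypothetical sample records, for illustration only.
records = [
    {"instruction": "Brainstorm two names for a bakery.",
     "input": "",
     "output": "Rise and Shine; The Daily Loaf"},
    {"instruction": "Summarize the text in one sentence.",
     "input": "Dolly is a 6 billion parameter model released by Databricks.",
     "output": "Databricks released Dolly, a 6 billion parameter model."},
]

training_texts = [format_example(r) for r in records]
```

Strings like those in `training_texts` would then be tokenized and passed to an ordinary causal language model fine-tuning loop; that step is omitted here because the source does not describe the training setup in detail.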
Surprisingly, instruction-following does not seem to require the latest or largest models. Dolly’s model has only 6 billion parameters, compared to 175 billion for GPT-3.
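To put that size gap in perspective, here is a back-of-the-envelope calculation. It assumes 2 bytes per parameter (16-bit weights), which is a common choice but is not stated in the source, and it counts only the weights, ignoring activations and optimizer state. The helper name `weight_gigabytes` is hypothetical.

```python
# Back-of-the-envelope comparison of model sizes.
# Assumption: weights are stored as 16-bit floats (2 bytes per parameter).
BYTES_PER_PARAM = 2


def weight_gigabytes(num_params: int, bytes_per_param: int = BYTES_PER_PARAM) -> float:
    """Approximate memory needed just to hold the weights, in GB (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


dolly_params = 6_000_000_000    # parameter count quoted in the article
gpt3_params = 175_000_000_000   # parameter count quoted in the article

ratio = gpt3_params / dolly_params
print(f"GPT-3 has about {ratio:.1f}x as many parameters as Dolly")
print(f"Dolly weights: ~{weight_gigabytes(dolly_params):.0f} GB")
print(f"GPT-3 weights: ~{weight_gigabytes(gpt3_params):.0f} GB")
```

Under these assumptions the weights alone come to roughly 12 GB for Dolly versus 350 GB for GPT-3, which helps explain why the smaller model can be trained on a single machine.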
The Importance of Dolly
The importance of Dolly is that it demonstrates that a useful large language model can be created with a smaller but high-quality dataset. Because Dolly is open source, anyone can use it without relying on a third party that controls the AI technology. This matters to businesses in particular, which may be hesitant to hand sensitive data over to such a third party.
The Impact of Dolly on AI Democratization
Databricks describes Dolly as democratizing AI. It is part of a growing movement that was recently joined by Mozilla, the non-profit publisher of the Firefox browser and other open source software, with the founding of Mozilla.ai. With Dolly, businesses and individuals now have access to a powerful language model that can be used for a variety of tasks, from brainstorming and text generation to language translation and sentiment analysis.
The Dolly Large Language Model is a game-changer in the open-source AI movement. It demonstrates that useful language models can be created with smaller datasets, and it provides businesses and individuals with access to powerful language models without having to rely on third-party control of AI technology.
What is the Dolly Large Language Model?
The Dolly Large Language Model is an open source chatbot created by Databricks using an existing 6 billion parameter model from EleutherAI.
What inspired the creation of the Dolly Large Language Model?
The creation of the Dolly Large Language Model was inspired by the Self-Instruct research paper, which outlines a way to create high-quality, autogenerated question-and-answer training data.
How does the Dolly Large Language Model work?
The Dolly Large Language Model works by taking an existing open source 6 billion parameter model from EleutherAI and modifying it slightly, using data from Alpaca, to elicit instruction-following capabilities such as brainstorming and text generation.
What is the importance of the Dolly Large Language Model?
The importance of the Dolly Large Language Model is that it demonstrates that a useful large language model can be created with a smaller but high-quality dataset. It also supports the open source AI movement, which seeks to offer greater access to the technology.
Who created the Dolly Large Language Model?
The Dolly Large Language Model was created by Databricks, an enterprise software company, using an existing open source model from EleutherAI and data from Alpaca.