Why Internet of LLMs
Recently, Large Language Models (LLMs) have demonstrated powerful semantic understanding of text and multimodal information. They have shown strong problem-solving abilities across various domains, becoming foundational building blocks for the development of general-purpose AI agents and Artificial General Intelligence (AGI).
However, we have noticed that several important issues still need to be addressed by the AI community.
Issue 1. Most LLMs tend to focus on specific domains, and no single model consistently outperforms all others across tasks. Although many studies have explored cooperation among LLMs, these frameworks can only accommodate a limited number of models, much like a Local Area Network (LAN) in computer networking. Can we create an Internet of Large Language Models that allows knowledge to flow freely among arbitrary LLMs?
Issue 2. Given the large parameter counts and file sizes of LLMs, and the complexity of configuring their development environments, can we provide developers with a convenient solution for model sharing and rapid environment configuration?
In the AI community, several initiatives such as Hugging Face, Ollama, AutoGen, and Langchain have been undertaken to address the aforementioned challenges. Hugging Face serves as a major model-sharing platform, hosting a vast array of open-source models and datasets for machine learning. Ollama facilitates the local configuration and running of large models through its streamlined environment design. AutoGen, an open-source framework from Microsoft, helps developers build, manage, and optimize multi-agent systems. Langchain is a framework for developing applications driven by language models, providing capabilities for building workflows as well as for combining agents.
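As a small, non-authoritative illustration of the kind of model sharing Hugging Face enables, the sketch below pulls an open-source model from the Hub and runs it locally with the `transformers` pipeline API; the model name "gpt2" is only an example, and any text-generation model hosted on the Hub could be substituted.

```python
# Minimal sketch: downloading an open-source model from the Hugging Face Hub
# and running it locally via the `transformers` pipeline API.
# "gpt2" is an illustrative choice, not a recommendation.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator("Large Language Models are", max_new_tokens=20)
print(result[0]["generated_text"])
```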
Despite the powerful capabilities of these tools, mastering and configuring the entire toolkit requires not only extensive knowledge and a strong hardware setup, but also significant patience. The expertise required and the complexity of operating these tools hinder their broader adoption among the general public. According to Cognitive Load Theory, users can only process a limited amount of information at one time; therefore, when designing tools, it is essential to provide a simple user interface, automation of repetitive tasks, and clear visual feedback. Based on this, two further issues arise:
Issue 3. Can we develop a tool that achieves one-click operations, including environment deployment, model downloads, development and debugging, and publishing and sharing?
Issue 4. After optimizing on a baseline, could such a tool automatically explore and combine LLMs to form the optimal Agent path?
With these issues in mind, and to understand the real needs of front-line developers and researchers in identifying workable solutions, we held in-depth discussions with 63 experts in fields such as Large Language Models, Reinforcement Learning, Robotics, and Computer Graphics, including employees of leading industry companies and graduate students. Through these interactions, we identified another pressing issue:
Issue 5. The computational costs for training models are excessively high. Can we significantly reduce the cost of training by designing a new distributed training framework?
These issues led us to realize that the AI-driven industrial revolution of this century still has a considerable distance to travel. From the development of tools and the replication of baselines to the implementation of upper-layer applications, each step encounters significant obstacles, whether in technical requirements, time investment, or financial cost. These are challenges that the AI community must address and overcome collectively.