General vs. vertical: large AI models approach their first match point
Source: Shenmou Finance | Author: Zhang Wei
The battleground for large AI models is splitting in two.
ChatGPT was the fuse that opened the door to the AI 2.0 era. AI 2.0 is characterized by "industrial intelligence and digitalization": AI that can efficiently replace labor and be applied across industries. Compared with the Metaverse, whose hype cycle has already run its course, the implementation of large AI models is far more realistic.
The clearest sign is how widely large AI models have broken into the mainstream, well beyond the B side. Even though ChatGPT has been out for more than half a year, the author can still hear office workers discussing it in a coffee shop below Shanghai's CBD; and according to media reports, some companies are already using AIGC as a productivity tool.
As Zhang Yong, chairman and CEO of Alibaba Group and CEO of Alibaba Cloud Intelligence Group, put it: facing the AI era, every product is worth redoing with a large model.
Major companies, research institutions, and entrepreneurs have all entered the arena.
Large models from major manufacturers have appeared one after another: Baidu's Wenxin Yiyan, Huawei's Pangu, 360's Zhinao, SenseTime's Rixin, Alibaba's Tongyi Qianwen, JD's Lingxi, and Kunlun Wanwei's Tiangong, with Tencent's Hunyuan and iFLYTEK's Xinghuo waiting in line to go online.
Among entrepreneurs there are famous names as well: Wang Xiaochuan, founder of Sogou; Wang Huiwen, co-founder of Meituan; and Kai-Fu Lee, chairman of Sinovation Ventures, have all made high-profile entries into large AI models.
The large-model craze, now several months old, has spawned two paths.
The AI arms race, and large-model divergence
Large AI models have entered the competition stage, and their paths are gradually diverging.
As large models heat up, the numbers reflect it: according to media statistics, Oriental Fortune's "ChatGPT" sector held only 29 stocks at the beginning of February but now counts 61, and the number is still rising. By incomplete statistics, more than 40 companies and institutions in China have so far released large-model products or announced large-model plans.
Players in this "arms race" have split into two development directions: vertical large models and general large models are becoming the two main directions in the field of artificial intelligence.
Vertical large models are models optimized for a specific domain or task, such as speech recognition, natural language processing, or image classification.
More and more companies are joining the vertical track. Xueersi announced it is developing MathGPT, a self-developed large mathematical model for mathematics enthusiasts and research institutions worldwide; on May 6, Taoyun Technology announced a cognitive large model for children, the Alpha Egg children's cognitive large model, which gives children a new interactive experience in practicing expression, cultivating EQ, inspiring creativity, and assisting learning.
General large models are models that can handle multiple tasks and domains, such as BERT and GPT.
With their advantages in capital and talent, major manufacturers mainly target the general-purpose track.
On the one hand, aiming at general-purpose models lets them combine AI capabilities with their own products; representative examples include Internet and technology giants such as Alibaba, Huawei, and Baidu.
For example, following Microsoft's integration of GPT-4 into the Office suite, Alibaba's Tongyi Qianwen has begun integrating with DingTalk: users can generate content in documents, and in video conferences it can generate summaries of each participant's views and content.
Likewise, Baidu's large model can be combined with its own business: Wenxin Yiyan can qualitatively transform the iteration of its search engine, while NetEase's Yuyan and JD.com's ChatJD can be deployed first within their own industries.
On the other hand, general-purpose large models have broad applicability, and whoever pulls ahead first can establish a first-mover advantage and become the leader of the AI 2.0 era. After all, everyone knows that "those who run fast get the meat, and those who run slow eat the leftovers."
Vertical large models run against this current. Because they fit the needs of vertical scenarios more closely and can deliver higher quality there than general models, many companies have spotted an opportunity, among them Shenlan (DeepBlue), Mobvoi, and Youdao, which focus on specific AI tracks.
Progress in vertical large models shows up mainly as steadily improving performance within each field: speech-recognition error rates have fallen year by year, and semantic understanding in natural language processing keeps improving. General large models, meanwhile, have made remarkable progress in multi-task learning and transfer learning, becoming an important research direction in natural language processing.
For example, large biological models can improve the efficiency of AI-driven pharmaceuticals. Foreign research reports indicate that AI can raise the success rate of new drug R&D by 16.7%, that AI-assisted drug development can save US$54 billion in R&D costs every year, and that it can save 40% to 60% of the time and cost in the main stages of R&D. According to NVIDIA's public information, AI can shorten early drug discovery to one-third of the usual time and cut costs to one two-hundredth.
From an industry perspective, the general model is an "encyclopedia": it can answer any question and take root in different industrial soils. The vertical model is more like a single-field expert: professional, but destined for a smaller audience.
Data is the choke point
The advantage of the vertical large model is precisely that it does not need to be so "big": the computing power required is smaller and the algorithmic difficulty is lower.
Since entering the large-model track, Wang Xiaochuan has consistently emphasized that his future direction is not to pursue AGI (artificial general intelligence) like OpenAI, but to build vertical large models in specific fields and achieve practical applications.
A large model in the broad sense usually means a general-purpose large model. It is "large" because of its enormous parameter count and data volume, which place heavy demands on algorithms, computing power, and data storage. Meeting those demands takes not just people but a great deal of money: OpenAI's success, remember, was built on billions of dollars from Microsoft. The huge capital requirement is itself a test of how determined major manufacturers are about R&D.
Over the past five years, the parameter count of large AI models has grown by an order of magnitude every year. GPT-4's parameter count, for example, is reportedly 16 times that of GPT-3, reaching 1.6 trillion; and with the introduction of multimodal data such as images, audio, and video, the data volume of large models is also expanding rapidly. Anyone who wants to play with a large model, in short, must have large computing power.
Compared with the major manufacturers, companies building vertical large models have relatively scarce funds, computing power, and data, so in practice they are not on the same starting line as the general-purpose players.
Just as new energy vehicles are inseparable from the three core components of motor, battery, and electronic control, large AI models cannot do without the support of computing power, algorithms, and data.
Of these three, data is the hard part for vertical large models.
The algorithm is the least difficult of the three to develop: companies each have their own path for implementing large models, and there are many open-source projects to draw on.
Chips determine computing power. A large AI model needs higher-performance chips to train and build the whole model's neural network, yet self-developed chips remain scarce and most are still sourced externally. The chips best suited to ChatGPT, for instance, come from NVIDIA: the flagship H100 and the sub-flagship A100.
The real difficulty lies in data. High-quality data is the key to AI training and tuning, and sufficient, rich data is the foundation of generative large models.
According to OpenAI's earlier disclosures, GPT-3 alone has 175 billion parameters, and its training data reached 45 TB.
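To get a feel for what 175 billion parameters means in hardware terms, a back-of-envelope calculation helps (the byte sizes below are standard floating-point widths, not figures from the article):

```python
# Back-of-envelope: memory footprint of a 175-billion-parameter model.
PARAMS = 175e9  # GPT-3 parameter count, per OpenAI's disclosure cited above

def weight_memory_gb(params: float, bytes_per_param: int) -> float:
    """Memory needed just to hold the weights, in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9

fp16 = weight_memory_gb(PARAMS, 2)  # half precision: 2 bytes per parameter
fp32 = weight_memory_gb(PARAMS, 4)  # single precision: 4 bytes per parameter

print(f"fp16 weights: {fp16:.0f} GB")  # 350 GB
print(f"fp32 weights: {fp32:.0f} GB")  # 700 GB
```

Even at half precision, the weights alone far exceed the memory of a single A100 (80 GB at most), so serving such a model, let alone training it, must be spread across many GPUs.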
And because China's mobile Internet developed within relatively mature, separate platforms, a large share of Chinese-language data resources sits inside individual enterprises and institutions, making it difficult to share.
"A lot of an enterprise's business data, logistics data, financial data, and so on are core private-domain data. It is hard to imagine China Star Optoelectronics or PetroChina handing that data over for others to train on," Xu Hui, CEO of Chuangxinqizhi, said bluntly in a recent interview with Securities Times.
Take the AI pharmaceutical industry as an example: large biological models face a data squeeze of their own. High-precision experimental data for drug R&D is expensive to obtain, while public databases contain large amounts of unlabeled data. A model must make good use of both the abundant unlabeled data and the small amount of high-precision data, which raises the bar for model construction.
Who will earn the first pot of gold?
Whichever path a model takes, commercialization is the core issue, and judging from today's large-model players, all of them are pushing hard on empowerment and commercialization.
Although general and vertical large models take different paths, they are essentially "family" on the same track, so competition between them is unavoidable.
If vertical models land first, the path for general models narrows; conversely, once general models quickly seize the market, it becomes harder for vertical models with their narrow business lines to make money.
In the ideal case, whether measured by economic model or by universal value, the general model beats the vertical one. But real life is no utopia: which of the two runs faster will be decided by competition among enterprises.
Judging from last year's AIGC boom: compared with letting C-end users generate content at a low threshold, some market participants believe the B-end will be AIGC's more important business model.
Huawei, too, pays more attention to its ToB business. At its press conference, Huawei said the Pangu large model mainly uses AI to empower industries and is applied in many sectors such as electric power, finance, and agriculture; its CV large model is used in mines, and its NLP large model in intelligent document retrieval.
Baidu, for its part, plays to its search specialty: Wenxin Yiyan carries search attributes in the mold of GPT-3.
In fact, even before this gust of large-model hype, there were landed scenarios beyond ChatGPT, and those "large" models were mainly vertical ones.
Making money is more important than landing.
According to the Guosheng Securities report "How Much Computing Power Does ChatGPT Need", GPT-3's training cost is estimated at about US$1.4 million, and for some larger LLMs (large language models) the training cost ranges from US$2 million to US$12 million. Based on ChatGPT's average of 13 million unique daily visitors in January, the corresponding chip demand is more than 30,000 NVIDIA A100 GPUs, an initial investment of about US$800 million, and a daily electricity cost of around US$50,000.
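The electricity figure is easy to sanity-check. The per-GPU power draw and electricity price below are this sketch's own assumptions, not numbers from the Guosheng report:

```python
# Back-of-envelope check of the ~$50,000/day electricity figure.
GPUS = 30_000          # A100 count cited by the report
WATTS_PER_GPU = 500    # assumed: ~400 W board power plus server/cooling overhead
USD_PER_KWH = 0.14     # assumed industrial electricity price

def daily_electricity_cost(gpus: int, watts: float, usd_per_kwh: float) -> float:
    """Daily electricity cost in USD for a fleet of GPUs running 24 hours."""
    kwh_per_day = gpus * watts / 1000 * 24  # kW times hours
    return kwh_per_day * usd_per_kwh

cost = daily_electricity_cost(GPUS, WATTS_PER_GPU, USD_PER_KWH)
print(f"~${cost:,.0f} per day")  # on the order of the $50,000/day the report cites
```

Under these assumptions the fleet draws about 15 MW continuously, which lands in the same ballpark as the report's figure; the exact number depends heavily on the electricity price and cooling overhead assumed.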
There is no doubt that general large models fit more landing scenarios, and for players confident in them, commercialization is a secondary concern. Vertical large models, by contrast, need faster commercialization to cover their costs, so they are more likely to reach adoption sooner.
There is no definite answer yet as to who will build an absolute advantage first. This large-model "arms race" resembles the metamorphosis from Web1 to Web2: enterprises are racing against time, and whoever seizes the opportunity first will seize the market.