Google AI Chronicles: The Search Giant's Innovator's Dilemma

2025-11-04 16:01

This article is from the WeChat public account Silicon Star GenAI. Compiled by Zhou Huaxiang. Original title: "Google AI Chronicles: 25 Years from Search Giant to the Innovator's Dilemma." Header image: Visual China (Google's Nasdaq listing, August 19, 2004).


Today I finished listening to the complete audio of the Acquired.fm podcast episode "Google: The AI Company": a full four hours, extremely information-dense, and genuinely stunning.

Spanning 25 years, the episode reconstructs how Google gathered the world's top AI talent and invented the Transformer, the technology that changed the world, only to watch the talent it cultivated found OpenAI and Anthropic, and ultimately fall into the most classic innovator's dilemma in history.

After listening, I compiled these detailed notes, hoping they help you understand the most fascinating case in the history of technology.



The most classic innovator's dilemma in history


Imagine a scenario like this:

You own an extremely profitable company that holds a 90 percent share of one of the world's largest markets and has been deemed a monopoly by the U.S. government.

Then your research lab invents a revolutionary technology, one that is far better than your existing products in most applications.

Out of "pure goodwill," your scientists publish the research. Soon, startups begin building products on top of this technology.

What would you do? Pivot fully to the new technology, of course, right?

But here's the problem: you haven't yet found a way to make the new technology earn money the way the old business does.


This is Google today.

In 2017, the Google Brain team published the Transformer paper, the paper that gave rise to OpenAI's ChatGPT, to Anthropic, and to the explosion in NVIDIA's market value. The entire AI revolution is built on this one Google invention.

Even more surprising: ten years earlier, almost all of the field's top AI talent worked at Google. Ilya Sutskever (chief scientist of OpenAI), Dario Amodei (founder of Anthropic), Andrej Karpathy (former head of Tesla AI), Andrew Ng, every founder of DeepMind...

It's as if, at the dawn of the computer age, IBM had employed everyone in the world who could program.

Today, Google still holds the best portfolio of AI assets: the top-tier model Gemini, a cloud business with $50 billion in annual revenue, the only chip that can compete with NVIDIA GPUs (the TPU), and the world's largest search traffic entry point.

But the question remains: how should Google choose? Take the risk and go all in on AI, or protect the search-advertising money tree?

Let's go back to the beginning of the story and see how Google got to where it is today.

Key Timeline

Chapter 1: Origins (2000-2007)

A conversation in a micro kitchen changed everything

The story begins on a day in 2000 or 2001.

In one of Google's micro kitchens, three engineers were having lunch: George Herrick, one of Google's first ten employees; Ben Gomes, the celebrated engineer; and the newly hired Noam Shazeer.

George casually said something that would change history:

"I have a theory: compressing data is technically equivalent to understanding data."

His logic: if you can compress a piece of information into a smaller form and then restore it to its original form, the system performing that process must "understand" the information. It's like a student studying a textbook, storing the knowledge in their brain, and then proving they understood it by passing the exam.

The young Noam Shazeer stopped what he was doing: "Wow, if that's true, that's profound."

This idea foreshadowed today's large language models: compressing the world's knowledge into a few terabytes of parameters, then "decompressing" the knowledge back out.

The birth of PHIL: The first language model

Over the next few months, Noam and George did one of the most "Google" things imaginable: they dropped all their other work and threw themselves into the idea.

This happened to be the period, in 2001, when Larry Page fired all the engineering managers; everyone was doing whatever they wanted to do.


A lot of people thought they were wasting time. But Sanjay Ghemawat (Jeff Dean's legendary partner) said: "I think it's cool."

George's response was memorable: "Sanjay thinks it's a good idea, and no one in the world is smarter than Sanjay, so why should I accept your view that it's a bad idea?"


They dug deep into probabilistic models of natural language: given any sequence of words appearing on the Internet, what is the probability of the next word sequence? (Sound familiar? This is the basic principle behind today's LLMs.)
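To make that concrete, here is a minimal sketch of the idea (a toy bigram model in Python; my illustration, not anything from PHIL): estimate sequence probabilities from counts, which is the same statistical question today's LLMs answer at vastly larger scale.

```python
from collections import Counter, defaultdict

# Toy corpus standing in for "the Internet".
corpus = "the cat sat on the mat . the dog sat on the log .".split()

# Count bigrams: how often each word follows each previous word.
bigrams = defaultdict(Counter)
for prev, word in zip(corpus, corpus[1:]):
    bigrams[prev][word] += 1

def sequence_probability(words):
    """P(sequence) as the product of conditional next-word probabilities."""
    p = 1.0
    for prev, word in zip(words, words[1:]):
        counts = bigrams[prev]
        p *= counts[word] / sum(counts.values()) if counts else 0.0
    return p

print(sequence_probability("the cat sat".split()))  # plausible: 0.25
print(sequence_probability("cat the sat".split()))  # implausible: 0.0
```

A spelling corrector falls out of the same machinery: when a typed query has far lower probability than a near-neighbor, suggest the neighbor.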

Their first result was Google search's "Did you mean" spelling-correction feature.

They then built a (for its time) relatively "large" language model, affectionately named PHIL (Probabilistic Hierarchical Inferential Learner).


Jeff Dean's Weekend Project


In 2003, Susan Wojcicki and Jeff Dean were preparing to launch AdSense. They needed to understand the content of third-party web pages in order to serve relevant ads.

Jeff Dean borrowed PHIL and wrote an AdSense implementation in a week (because he is Jeff Dean).

Boom: AdSense was born. It brought Google billions of dollars in new revenue overnight, because placing the existing AdWords ads on third-party pages instantly expanded the inventory.


Jeff Dean's legendary moments


In the era of "Chuck Norris Facts," "Jeff Dean Facts" became popular inside Google:



The speed of light in a vacuum used to be 35 miles per hour, until Jeff Dean spent a weekend optimizing physics.




Jeff Dean's PIN code is the last four digits of pi.




For Jeff Dean, NP means "No Problemo."



By the mid-2000s, PHIL occupied 15% of Google's data center infrastructure, serving AdSense ad placement, spelling correction, and other applications.


Chapter 2: The Golden Decade (2007-2012)


The miracle from 12 hours to 100 milliseconds


In 2007, Google launched its translation product. Chief architect Franz Och had entered DARPA's machine translation challenge.

Franz built a much larger language model, trained it on the two-trillion-word Google search index, and achieved astronomically high scores.

When Jeff Dean heard about it, he asked: "Great! When are you launching?"

Franz replied: "Jeff, you don't understand. This is a research project, not a product. This model takes 12 hours to translate one sentence."

The DARPA challenge rules were: you receive a set of sentences on Monday and submit the translations on Friday. So they had plenty of time to let the servers grind.

Jeff Dean's response: "Let me see your code."


A few months later, Jeff had re-architected the algorithm so it could process sentences and words in parallel, because translation doesn't have to proceed in order: the problem can be broken into independent parts.

And Google's infrastructure (which Jeff and Sanjay had largely built) is extremely good at parallelization: it can break a workload into small pieces, fan them out across data centers, and reassemble the results for the user.
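The shape of that rewrite is easy to sketch (hypothetical code; translate_sentence is a stand-in for the expensive model call, not Franz's system): because sentences are independent, a document can be split, translated on many workers at once, and reassembled in order.

```python
from concurrent.futures import ProcessPoolExecutor

def translate_sentence(sentence: str) -> str:
    # Placeholder for the slow model call that Jeff parallelized.
    return sentence.upper()

def translate_document(text: str) -> str:
    sentences = text.split(". ")
    with ProcessPoolExecutor() as pool:
        # map() fans the pieces out to worker processes and preserves
        # input order, so reassembly is trivial.
        translated = list(pool.map(translate_sentence, sentences))
    return ". ".join(translated)

if __name__ == "__main__":
    print(translate_document("the cat sat on the mat. the dog sat on the log"))
```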


The result: average translation time dropped from 12 hours to 100 milliseconds.

They then shipped the model in Google Translate, and it worked astonishingly well.

This was the first "large" language model Google used in a product.


The secret of the Stanford AI Lab


In April 2007, Larry Page recruited Sebastian Thrun, head of the Stanford Artificial Intelligence Laboratory (SAIL), away from Stanford.


Amusingly, Sebastian was practically "acquired" himself: he and a few graduate students were starting a company and already had term sheets from Benchmark and Sequoia. Larry simply said:

"Why don't we acquire your not-yet-founded company in the form of a signing bonus?"


SAIL gathered not only the world's best AI professors but also a group of Stanford undergraduates working there as research assistants.

One of them was Chris Cox, now Meta's chief product officer.


Another, a freshman or sophomore, later dropped out, joined Y Combinator's first batch in the summer of 2005, and founded a failed local mobile social network.

That was Sam Altman. The company was called Loopt.

Yes, Sam Altman did research at SAIL, and he once stood on stage with Steve Jobs at WWDC wearing a layered-collar shirt. It was a different era of technology.


The birth of Google X and Google Brain


After joining Google, Sebastian's first project was Ground Truth: recreating all of Google's map data to break the dependence on Tele Atlas and Navteq.

The project was a huge success. Sebastian began lobbying Larry and Sergey:

"We should do this at scale: bring AI professors into Google part-time. They can keep their academic positions and come here to work on projects. They'll love it: seeing their work used by millions of people, making money, getting stock, and still being professors."

Win-win-win.


In December 2007, Sebastian invited a relatively unknown machine learning professor from the University of Toronto, Geoff Hinton, to give a talk at Google about his and his students' new work on neural networks.

Geoff Hinton, now known as the "godfather of neural networks," was a fringe academic at the time. Neural networks were not respected; the hype of 30 to 40 years earlier had collapsed.

Fun fact: Geoff Hinton is the great-great-grandson of George Boole, inventor of Boolean algebra and Boolean logic. The irony: Boolean logic is the symbolic, deterministic foundation of computer science, while neural networks are exactly the opposite, non-deterministic.

Geoff and his former postdoc Yann LeCun had been spreading the message: if we can build multi-layer deep neural networks (deep learning), we can fulfill the field's promise.

By 2007, the progress of Moore's law had made it possible to test these theories.

Geoff's talk caused a sensation at Google: this could make their language models work much better. Sebastian brought Geoff into Google, first as a consultant; then, around 2011-2012, Geoff became a Google summer intern, at age 60.


By the end of 2009, Sebastian, Larry, and Sergey decided to create a new division: Google X, the moonshot factory.

The first project was led by Sebastian himself (more on that later).

The second project would change the world: Google Brain.


Google Brain: DistBelief and the Cat Paper


After Andrew Ng took over as head of SAIL, he too was recruited by Sebastian to work part-time.

One day in 2010 or 2011, Andrew ran into Jeff Dean on the Google campus. They discussed language models and Geoff Hinton's deep learning work.

They decided it was time to try building a truly large deep learning model on Google's highly parallel infrastructure.

In 2011, Andrew Ng, Jeff Dean, and the neuroscience PhD Greg Corrado launched the second official Google X project: Google Brain.

Jeff built a system for it and named it DistBelief: a pun on both "distributed" and "disbelief," because most people thought it wouldn't work.


Technology breakthrough: asynchronous distributed learning


At the time, all research assumed training had to be synchronous: all the computation had to happen densely on a single machine, the way a GPU works.

But Jeff Dean went the other way: DistBelief ran distributed across large numbers of CPU cores, potentially spanning an entire data center, or even multiple data centers.

In theory this is bad: every machine has to wait for the others to synchronize parameters.

But DistBelief used asynchronous updates: instead of waiting for the latest parameters, it updated based on stale ones.

In theory that shouldn't work. But it did.
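A toy sketch of the asynchronous idea (my illustration, not DistBelief's code): several workers share one parameter vector, each computes gradients from a snapshot that may already be stale by the time it is applied, and the model converges anyway.

```python
import threading
import numpy as np

# Shared "parameter server" state; a linear model keeps the toy simple.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -3.0, 0.5])
X = rng.normal(size=(8000, 3))
y = X @ true_w + 0.01 * rng.normal(size=8000)
w = np.zeros(3)  # the shared parameters

def worker(seed: int, shard: np.ndarray) -> None:
    global w
    local_rng = np.random.default_rng(seed)
    for _ in range(300):
        i = shard[local_rng.integers(len(shard), size=32)]
        w_stale = w.copy()  # snapshot; other workers may update w meanwhile
        grad = X[i].T @ (X[i] @ w_stale - y[i]) / 32
        w = w - 0.05 * grad  # apply a gradient computed from stale parameters

shards = np.array_split(np.arange(len(X)), 4)
threads = [threading.Thread(target=worker, args=(k, s))
           for k, s in enumerate(shards)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("learned:", w.round(2), "true:", true_w)  # ends up close to true_w
```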


The cat paper that changed the world


At the end of 2011 they submitted a paper, "Building High-level Features Using Large Scale Unsupervised Learning," but everyone calls it the "Cat Paper."

They trained a nine-layer neural network on 16,000 CPU cores (across 1,000 machines) to recognize cats in unlabeled frames of YouTube videos.

Sundar Pichai later recalled that seeing the Cat Paper was one of the most pivotal moments of his Google career.

When the result was shown at a later TGIF (the company all-hands), every Googler realized: "God, everything has changed."


The commercial impact of the Cat Paper


Jeff Dean's description:


"The system we built unsupervised learning on 10 million random YouTube frames. After a period of training,

the model constructed a representation at the top level-with a neuron excited about the cat's image.

It has never been told what a cat is, but it has seen enough frontal photos of cats, and that neuron will light up for the cat,

not for anything else."



This proved that large neural networks can learn meaningful patterns without supervision and without labeled data.

And that they can run on the distributed systems Google had built itself.
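At toy scale, the principle is easy to demonstrate (a minimal two-layer autoencoder; my illustration, not the paper's nine-layer network): reconstructing unlabeled inputs forces a small hidden layer to discover the structure in the data, with no labels anywhere.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "unlabeled frames": 256-dim inputs that secretly depend on
# only 4 latent factors, plus noise.
latent = rng.normal(size=(1000, 4))
X = np.tanh(latent @ rng.normal(size=(4, 256))) + 0.1 * rng.normal(size=(1000, 256))

# Encoder/decoder weights; the 4-unit bottleneck must learn the factors.
W1 = rng.normal(scale=0.1, size=(256, 4))
W2 = rng.normal(scale=0.1, size=(4, 256))

for step in range(500):
    H = np.tanh(X @ W1)        # encode
    X_hat = H @ W2             # decode
    err = X_hat - X            # reconstruction error is the only signal
    gW2 = H.T @ err / len(X)   # backprop through the decoder
    gH = (err @ W2.T) * (1 - H ** 2)
    gW1 = X.T @ gH / len(X)    # backprop through the encoder
    W1 -= 0.01 * gW1
    W2 -= 0.01 * gW2

print("reconstruction MSE:", float((err ** 2).mean()))  # falls as it trains
```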


YouTube's problem was that people upload videos but are bad at describing what's in them, so the recommendation system could only work from titles and descriptions.

The Cat Paper proved you could use deep neural networks to "see" what's inside a video, then use that data to decide which videos to recommend.

This led to everything at YouTube. It put YouTube on the path to becoming the world's largest internet asset and media company.


From 2012 until ChatGPT's release in 2022, AI was already shaping all of our lives, driving hundreds of billions of dollars in revenue.

It was in the YouTube feed; then Facebook borrowed it (hiring Yann LeCun to found Facebook AI Research); then Instagram; then TikTok and ByteDance adopted it; then it came back around in Facebook's and YouTube's Reels and Shorts.

This is the main way humans on Earth will spend their leisure time for the next 10 years.


Key point: The AI era began in 2012


Everyone says the AI era started after 2022. But for any company that could fully exploit recommendation and classification systems, the AI era began in 2012.


Chapter 3: The Big Bang Moment (2012-2017)


AlexNet: deep learning's "Big Bang"


In 2012, alongside the Cat Paper, came what Jensen Huang (NVIDIA's CEO) calls the "AI Big Bang moment": AlexNet.

Back at the University of Toronto, Geoff Hinton had two graduate students: Alex Krizhevsky and Ilya Sutskever (future co-founder and chief scientist of OpenAI).

The trio used Geoff's deep neural network algorithms to enter the prestigious ImageNet competition, the annual machine-vision contest organized by Stanford's Fei-Fei Li.

Fei-Fei Li had built a database of 14 million hand-labeled images (annotated via Amazon Mechanical Turk).

The contest: which team can write an algorithm that most accurately predicts the labels from the images alone, without seeing the labels?


The key role of the GPU


The Toronto team went to the local Best Buy and bought two NVIDIA GeForce GTX 580 graphics cards, NVIDIA's top gaming cards at the time.

They rewrote the neural network algorithm in NVIDIA's CUDA programming language and trained on those two off-the-shelf GTX 580s.

The result: they beat every other contestant by 40%.

That was AlexNet: the moment that ignited the deep learning revolution.


The first AI auction


The three did the natural thing: they founded a company, DNN Research (Deep Neural Network Research).

The company had no products, only AI researchers.

Predictably, it was acquired almost immediately. But there's a fun story here:

The first bid actually came from Baidu. Geoff Hinton did what any academic would do to determine the company's market value:

"Thank you very much. I'm now going to hold an auction."

He contacted Baidu, Google, Microsoft, and DeepMind. The auction ended at $44 million; Google won. The team joined Google Brain directly.

A few years later, Astro Teller, who ran Google X, was quoted in The New York Times:

"Google Brain has brought in more revenue to Google's core businesses (search, ads, YouTube) than the sum of every other investment Google X and the whole company have made over the years."


DeepMind: The AI world's YouTube acquisition


But there's another important part of Google's AI story: an external acquisition that is to Google's AI what the YouTube deal was to Google overall: DeepMind.

In January 2014, Google spent $550 million to acquire a little-known AI company in London. People were confused: Google bought some AI thing in London that I've never heard of?

It turned out this acquisition was a butterfly-flaps-its-wings moment, leading directly to OpenAI, ChatGPT, Anthropic, basically everything.


The Origin of DeepMind


DeepMind was founded in 2010 by Demis Hassabis, a neuroscience PhD who had previously founded a video game company; Shane Legg of University College London; and a third co-founder, Mustafa Suleyman.

Company slogan: "Solve intelligence, then use it to solve everything else."

Founders Fund led a seed round of roughly $2 million. Elon Musk later became an investor (after a conversation about AI risk and Mars).


The acquisition war


At the end of 2013, Mark Zuckerberg called wanting to buy the company. But Demis insisted on independence and a specific governance structure, and Facebook wouldn't agree.

Elon Musk offered to acquire them with Tesla stock, but wanted them to work for Tesla, which didn't fit DeepMind's vision.

Larry Page (reportedly on a plane with Elon) learned of this and got in touch with Demis. Demis felt Larry understood their mission.

After negotiations, Google offered $550 million and the deal closed, with an independent ethics board established (including Reid Hoffman of the PayPal mafia). The DeepMind team stayed independent and focused on AGI research.

Things went well after the acquisition, including major savings in operating costs (a 40% cut in the energy used for data center cooling); later, AlphaGo shocked the world.


Chapter 4: Transformer Revolution (2017)


The team of eight that changed everything


In 2017, eight researchers on the Google Brain team published a paper.

Google's own reaction was: "Cool. This is the next iteration of our language model work. Very good."

But this paper, and its publication, effectively handed OpenAI the chance to take the ball and run with it, building the next Google.

Because this was the Transformer paper.


The evolution from RNNs to the Transformer


Before the Transformer paper, Google had already rewritten its translation system with neural networks, based on recurrent neural networks (RNNs) and LSTMs, the state of the art at the time and a huge step forward.

But continued research revealed limitations, and one big problem in particular: these systems "forget" context too quickly.

Teams inside Google Brain began searching for a better architecture: one that would not forget context so quickly, and that would also parallelize and scale better than LSTMs.

Researcher Jakob Uszkoreit had been exploring the idea of widening the scope of "attention" in language processing.

What if, instead of attending only to adjacent words, you told the model: pay attention across the entire text?

Jakob found collaborators, and they decided to call the new technique the Transformer.
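The mechanism at the heart of it is compact enough to sketch (toy dimensions and random weights; my illustration, not the paper's full architecture): scaled dot-product self-attention, where every position scores its affinity to every other position and blends their values accordingly.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over one sequence."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # (seq, seq) pairwise affinities
    scores -= scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over positions
    return weights @ V  # each position becomes a mixture of the whole sequence

rng = np.random.default_rng(0)
seq_len, d_model = 6, 8
X = rng.normal(size=(seq_len, d_model))  # token embeddings
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))

out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (6, 8): every token now carries whole-sequence context
```

Unlike an RNN, nothing here is sequential: the whole score matrix is one matrix multiply, which is exactly what let the architecture parallelize and scale.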


Noam Shazeer's Magic


Remember Noam Shazeer? Creator of the early language model PHIL, and the key figure behind AdSense.

When Noam heard about the project, he said: "Hey, I have experience here. That sounds cool. LSTMs really do have problems. This could have a future. I want in."

Before Noam joined, they had a working Transformer implementation, but it didn't actually produce better results than LSTMs.

Noam joined the team and basically "pulled a Jeff Dean": he rewrote the entire codebase from scratch.

Once he was done, the Transformer crushed the LSTM-based Google Translate solution.

And they discovered: the bigger the model, the better the results.

They published the paper: "Attention Is All You Need," a clear nod to the Beatles classic.

The Transformer produced state-of-the-art results, was extremely efficient, and became the foundation of GPT and everything after it.

As of 2025, the paper has been cited more than 173,000 times, making it the 7th most cited paper of the 21st century.


The beginning of the brain drain


Of course, within a few years, all eight authors of the Transformer paper had left Google, either founding or joining AI startups, including OpenAI.

Brutal.


Chapter 5: The Rise of OpenAI (2018-2022)


The birth of the GPT series


In June 2018, OpenAI published a paper describing how they took the Transformer and developed a new approach:

pre-train on vast amounts of general text from the Internet;

then fine-tune that general pre-training for a specific use case.

They also announced they had trained and run the first proof-of-concept model of the approach: GPT-1 (Generatively Pre-trained Transformer, version 1).


In 2019, after the first Microsoft partnership and a $1 billion investment, OpenAI released GPT-2: still early, but very promising.

In June 2020 came GPT-3. Still no user-facing front end, but it was very good. The hype began in earnest.

After that, Microsoft invested another $2 billion.

In the summer of 2021, Microsoft shipped GitHub Copilot, built on GPT-3: the first product anywhere, not just at Microsoft, to incorporate GPT.


ChatGPT: A game-changing moment


By the end of 2022, OpenAI had GPT-3.5. But there was still a problem: how does anyone actually use it?

Sam Altman said: we should make a chatbot. That seemed like the natural interface.

Within a week, it was built internally. They simply turned calls to the GPT-3.5 API into a product: you chat with it, and every message you send triggers a GPT-3.5 API call.
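The product shape really is that thin. A minimal sketch (hypothetical; complete() is a placeholder function, not OpenAI's API or code): keep the transcript, append each user message, and call the model once per turn.

```python
def complete(prompt: str) -> str:
    # Placeholder for the hosted model call.
    return "(model reply)"

history: list[str] = []
while True:
    user = input("you> ")
    history.append(f"User: {user}")
    # The whole conversation so far is the prompt for the next reply.
    reply = complete("\n".join(history) + "\nAssistant:")
    history.append(f"Assistant: {reply}")
    print("bot>", reply)
```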


It turned out to be a magical product.

On November 30, 2022, OpenAI launched a research preview of the new GPT-3.5 interface: ChatGPT.

That morning, Sam Altman tweeted: "Today we launched ChatGPT. Try talking to it: chat.openai.com"

In under a week: 1 million users;

One month later (December 31): 30 million users;

Two months in (end of January): 100 million registered users, the fastest any product had ever reached that milestone.

Absolutely insane.


Chapter 6: Google's Code Red (2023-2025)


The missed opportunity


The irony is that Google had chatbots before ChatGPT.

Noam Shazeer, the incredible engineer who re-architected the Transformer, one of the paper's main authors, with a legendary career at Google, began advocating to Google's leadership immediately after the Transformer paper was published:

"I think the Transformer is going to be so big that we should consider abandoning the search index and the ten-blue-links model, and going all in on turning the whole of Google into one giant Transformer model."

Noam actually built a chatbot interface on a large Transformer model.

Google kept working on the Meena project, which evolved into LaMDA: also a chatbot, also internal-only.

In May 2022, they released AI Test Kitchen, a public testing ground where people could try Google's internal AI products, including the LaMDA chat interface.

But there was a catch: Google limited LaMDA chats to five turns. After five turns, it's over. Thank you, goodbye.

The reason: safety concerns.


The existential-threat moment


ChatGPT launched and became the fastest product in history to reach 100 million users.

To Sundar, Larry, Sergey, to all of Google's leadership, it was obvious: this is an existential threat to Google.

ChatGPT does the same job as Google search, but with a better user experience.

In December 2022, even before the massive scale-up but right after the ChatGPT moment, Sundar issued a Code Red inside the company.


Bard's catastrophic launch


First thing: they took the LaMDA model and chatbot interface and rebranded it as Bard.

In February 2023 it launched, immediately, open to everyone.

Maybe that was the right move, but my God, it was a bad product.

It obviously lacked some of ChatGPT's magic: reinforcement learning from human feedback (RLHF) to really tune the appropriateness, tone, voice, and correctness of responses.

Worse: in Bard's launch video, a carefully produced pre-recorded video, Bard gave a factually inaccurate answer to one of the queries.

Google's stock fell 8% in a single day, erasing $100 billion in market value.

In May, they swapped LaMDA out for PaLM, the Brain team's new model. A bit better, but still clearly behind GPT-3.5.

And in March 2023, OpenAI launched GPT-4, which was better still.


Chapter 7: Gemini Era (2023 to present)


Sundar's two major decisions


At this point, Sundar made two very, very consequential decisions:


Decision 1: Unify the AI teams


"We can no longer have two AI teams inside Google. We are merging Brain and DeepMind into one entity: Google DeepMind."

Demis Hassabis became CEO; Jeff Dean continued as chief scientist.


Decision 2: One model to rule them all


"I want you to build a new model, and we will have only one model. It will be the model behind everything Google uses internally and every external AI product. It will be called Gemini. No more different models, no more different teams. One model for everything."

That, too, was a huge decision.


Gemini's rapid development


Jeff Dean and Brain's Oriol Vinyals teamed up with the DeepMind team and started work on Gemini.

Later, when the Character.AI deal brought Noam back, he joined the Gemini team. Today Jeff and Noam are Gemini's two co-technical leads.

Key property: Gemini would be multimodal. Text, image, video, audio: one model.

Timeline:


May 2023: the Gemini project announced in the Google I/O keynote;

December 2023: early public access;

February 2024: Gemini 1.5, with a 1-million-token context window;

February 2025: Gemini 2.0 released;

March 2025: one month later, the Gemini 2.5 Pro experimental version launched;

June 2025: GA (general availability).


Built, trained, and shipped in six months. Crazy.

They announced that Gemini now has 450 million monthly active users.


AI integrated across everything


AI Overviews: first launched as a Labs product, later standard for everyone using Google search;

AI Mode: a deep AI search mode;

Multimodal generation tools: Veo (video), Genie (games), and more;

Enterprise: Google Workspace fully AI-enabled.


Chapter 8: The Innovator's Dilemma


Bull Case: Google's Advantages


1. Unmatched distribution channels


Still the world's "entry point to the Internet";

can steer traffic at will (AI Overviews, AI Mode);

Google search traffic is still at an all-time high.



2. Full-stack AI capability (unique)


Top-tier model: Gemini;

homegrown silicon: the TPU (the only at-scale AI chip that can compete with NVIDIA GPUs);

cloud infrastructure: Google Cloud ($50 billion in annual revenue);

self-funding: no reliance on outside financing.


As someone put it to me: if you don't have a frontier base model or an AI chip, you may be just a commodity player in the AI market. Google is the only company that has both.
Google is the only company that has both.


3. Infrastructure advantages


Private fiber networks between its data centers;

custom hardware architecture;

scale no one else can reach.



4. Data and personalization potential


Massive personal and corporate data;

the potential for deeply personalized AI;

Google One has 150 million paying subscribers and is growing fast.


5. New market opportunities


Waymo self-driving;

video AI;

enterprise AI solutions;

application frontiers far beyond traditional search.


6. The only self-funding model maker


The cloud vendors fund themselves, and so does NVIDIA, but every model maker depends on outside capital, except Google.


Bear Case: The big challenges


1. The monetization problem


The AI product form factor doesn't suit advertising;

lots of value creation, little value capture;

Google earns roughly $400 per user per year in the U.S. (search ads);

who will pay $400 a year for AI? Only a small fraction of people.



2. Shrinking market share


Google has 90% of the search market;

how much of the AI market? Maybe only 25%, at most 50%;

no longer the dominant player.


3. Losing the high-value scenarios


AI is eating the most valuable search scenarios;

trip planning? People use AI for that now;

no more clicks on Expedia's ads.

4. No clear product advantage


When Google launched in 1998, it was immediately and obviously the best product;

today that is definitely not the case;

there are four or five equally excellent AI products;

the first version of Bard was clearly inferior, and today Google has merely caught up.


5. Losing the ecosystem's support


Google is the incumbent now, not the challenger;

people and the ecosystem no longer root for Google the way they did in its startup days;

nor the way they did during the mobile transition.


6. Brain drain


All eight Transformer authors have left;

top AI talent keeps flowing to OpenAI, Anthropic, and the rest;

startups are simply more attractive.

The nature of the strategic dilemma


The podcast's core take:

"This is the most fascinating innovator's-dilemma case of all time."

Larry and Sergey control the company. They have been quoted many times saying they would rather go bankrupt than lose at AI.

But would they really? If AI turns out not to be as good a business as search (though it certainly feels like it will be, like it has to be, given the enormous value it creates), then they must choose between two outcomes:

achieving the mission: organizing the world's information and making it universally accessible and useful;

or owning the world's most profitable technology company.

Which one wins?

Because if it were only about the mission, they would be pushing AI Mode far more aggressively than they are now and shifting entirely to Gemini.

This is a very difficult needle to thread.


Chapter 9: Future prospects


Google is doing the right thing


"If you look at all the large technology companies,Google-despite how unlikely

it may seem at the beginning-may be the best attempt to thread the needle in AI at present."


"This is incredibly commendable for Sundar and their leadership."

They're making tough decisions:


Unify DeepMind and Brain;


Integrate and standardize into a model;


Quickly release products;


And don't make reckless decisions.

"Rapid but not rash."


Strategic recommendations


1. Keep integrating boldly


Stick with the unified Gemini strategy;

maintain the fast iteration pace;

don't be shaken by short-term pressure.


2. Explore new monetization models


New forms of AI advertising;

paid personalized services;

deeper enterprise solutions.


3. Reactivate the culture of innovation


Preserve the engineers' innovative DNA;

stay true to "rather sacrifice profits than lose at AI";

encourage internal experimentation and risk-taking.


4. Leverage the full-stack advantage


Close the loop of hardware + models + data + distribution;

build the platform endgame of the AI era;

turn infrastructure leadership into product advantage.


5. Manage expectations realistically


Stop chasing an "exclusive market";

scale advantages can still win in the long run;

embrace the new normal of multipolar competition.


6. Anticipate risks proactively


Beware the slowly boiling frog;

continuously monitor how fast AI is replacing search;

innovate strategically rather than react passively.


Summary: An era comes full circle


Twenty-five years ago, Larry Page said:

"Artificial intelligence would be the ultimate version of Google. If we had the ultimate search engine, it would understand everything on the web, understand exactly what you want, and give you the right thing. That's obviously artificial intelligence. We're far from doing that today. But we can get incrementally closer, and that is basically what we work on here."

That was in 2000.

Today, Google has one of the best AI models in the world, the most powerful AI chips, the largest-scale cloud infrastructure, and distribution channels reaching billions of users.

But it also faces the most classic innovator's dilemma of all time:

it invented the technology that changed the world (the Transformer);

watched the talent it cultivated build its competitors (OpenAI, Anthropic);

holds the best resources, yet is bound by its own success;

and must choose between protecting the cash cow and embracing the future.

This will be one of the most fascinating case studies in business history.

Can Google thread this needle?

Can it win the AI era without sacrificing the search business?

Can it prove that the incumbent can also dominate the next era?

The answers will unfold over the next few years.

And whatever the outcome, the Google AI chronicles have already taught us:

sometimes, inventing the future and owning the future are two very different things.

Key Timeline

Key people

(Chart: lovart)

Technology milestones explained

(Chart: lovart)

This article is from the WeChat public account Silicon Star GenAI. Compiled by Zhou Huaxiang.




