The power of raw computation and the “bitter lesson” for AI research

Last week Rich Sutton and Andrew Barto were awarded the Turing Award for their pioneering contributions to the field of Reinforcement Learning. I covered their award in a previous post. Here I want to discuss an essay that Rich Sutton wrote pretty recently. In 2019, Rich Sutton wrote an essay titled “The Bitter Lesson”. The basic thesis in his essay is that in the field of Artificial Intelligence, techniques that take advantage of raw computing power will always work better than techniques that aim to capture the processes of the human mind in a computer.

Sutton uses examples of computers playing Chess, Go, and speech and visual recognition applications to drive home his point. In all these examples, early AI efforts tried to teach computers how humans solved these problems. For example, in Chess programs, early efforts tried to find techniques for how human grand masters played chess. Researchers in the field were dismissive of efforts that used brute force to try and find the best chess moves. Ultimately with the growth in computing power, the brute force techniques won. There was a similar pattern in how computer beat humans in Go. As Sutton puts it: “Search and learning are the two most important classes of techniques for utilizing massive amounts of computation in AI research.

In the field of speech recognition/generation, early efforts aimed at decomposing human speech into phonoemes, etc. Those efforts had limited success and statistical methods outperformed them. Finally, deep learning techniques ended up producing the best results. Similarly in the area of computer vision, early efforts were based around trying to encode how the human mind probably made sense of the world around it: edge detection, texture detection etc. These techniques also plateaued out and finally deep learning produced the breakthroughs.

Sutton’s overall message is that the human mind is extremely complex and we should not try to engineer systems based on “how we think we think”. There is a strong appeal in trying to figure out how the mind human mind works. However all the history of AI points to the fact that statistical and raw computation power outperforms our notions of how the mind works. Sutton’s paper reminds me of Peter Norvig et. al’s paper on “The unreasonable effectiveness of data“, where they also make similar claims about the success of statistical methods over traditional AI techniques.

Posted in Uncategorized | Tagged , , , , , , | Leave a comment

The Turing Award for Reinforcement Learning Pioneers

This week Andrew Barto and Rich Sutton were awarded the ACM Turing Award, the highest award in the field of Computer Science, for their pioneering contributions to the field of Reinforcment Learning. With the advances in Artificial Intelligence over the last few years, pioneering work in this field has been rightfully acknowledged and rewarded. In 2024, neural network pioneers John Hopfield and Jeffrey Hinton were awarded the Nobel prize in Physics.

The concepts of reinforcement learning is rooted in basic principles of animal behavior. As everyone who has trained a pet knows, we can reinforce certain behavior using “carrots” and discourage unwanted behavior using “sticks”. Animals learn this behavior with proper training. Alan Turing’s seminal paper in the early 1950’s proposed this same idea for machines to learn based on “rewards” and “punishments”. Barto and Sutton built on this idea, and used principles from psychology to formulate the fundamental principles of training computers used on reinforcement learning.

One of the most high profile success for RL came when Google’s DeepMind defeated the human world champion in the game of Go. The game of Go was one of the hardest challenges for computers to beat humans and Google’s Deepmind used RL to win the tournament. I highly recommend this documentary on AlphaGo, if you have not yet watched it. Even the most recent success of LLMs such as ChatGPT are based on a critical step of RLHF (Reinforcement Learning based on Human Feedback).

It is exciting to see what other CS breakthroughs related to AI get recognition over the next few years. The breakthrough of transformer architectures is certainly a huge one. Almost all deep learning success in the last few years has been based on the transformer architecture.

Posted in Uncategorized | Leave a comment

AI Agents in full force

2025 is becoming the year of Agentic AI. There were three big announcements in this field over the last couple of days.

On March 5th, Microsoft rolled out its Agentic AI tools for sales agents. Microsoft has been all in on agents with a very ambitious mission for it Copilot branded AI tools: “empower every employee with a Copilot and transform every business process with agents”. This embodies the promise of AI agents and it is the reason why companies are investing heavily in the agent space. The delta in capabilities between the various top LLM model is fast reducing and the base model itself might very well become a commodity. In such scenarios, the differentiator in products will be in how well they provide support for agentic AI. With this latest announcement Microsoft is clearing pushing the value of agentic AI in the sales domain. Copilot is positioned as improving the efficiency of the sales cycle by having agents identify qualified leads and also preparing sales plans.

Amazon (AWS) announced that they have now created a new organization focused on improving AI Agents. Amazon reiterated many of the promises of AI Agents in their announcement and is positioning this new move as setting up its business to help its AWS customers easily leverage AI agents in their applications.

The final announcement that I want to cover from yesterday was the news that the US Department of Defense (DoD) had signed a multi-million dollar agreement with Scale AI to provide Agentic AI capabilities. This deal is part of the DoD’s “Thunderforge” program that aims to use AI for military planning and operations. As of now, the announcement only talks about use cases that involve modeling, simulations and decision making support. The announcement also stressed the involvement of humans in the loop. Some observers have cynically termed this use case as “Agents for war” and that it is only a matter of time before AI Agents are used in combat use cases. That is perhaps the big reason why AI competitiveness between countries is heating up.

We saw three very diverse application announcements on one day. The AI agents space will continue to advance this year. It is certainly exciting times in this space.

Posted in Uncategorized | Tagged , , , , , , , , , | Leave a comment

Google’s AI Co-Scientist: Demonstrating the power of AI Agents

One of the big promises of the new generation of large language models is their potential to transform how basic scientific research is done in fields such as medicine, biology, pharmaceuticals etc. The scientific research process in these fields currently has very long cycle times between conceiving of an idea and seeing actual results in the fields. OpenAI’s Sam Altman has talked multiple times about the impact of AI on the scientific discovery process and suggested that AI will allow us to do 10 years worth of science in less than a year.

When we examine any tasks that we humans accomplish, we notice that the work typically involves interacting with multiple systems. Our innate knowledge is not sufficient to complete the task but we need our intelligence to piece together the various steps of the tasks and complete the whole task. For example, if a developer needs to fetch some data from a database and provide an analytics report, just knowledge of SQL is not sufficient. First, in any organization, the developer needs appropriate permissions to access the database. Once the developer has access, he will need to understand the schema of the database. This will allow them to write the appropriate SQL query to get the data needed for the task. This data should then be populated in an appropriate analytics suite to generate visual reports.

This is where Agentic AI systems prove their value. The LLMs by themselves contain a vast knowledge of the world but they lack the ability to interact with systems. AI agents are aimed at bridging this gap.An AI Agent has the power to interact with various systems and piece together results to achieve the overall goal of completing the tasks.

Overview of Google Co-Scientist

The Google Co-scientist system shows how multiple AI Agents can be utilized to speed up the scientific discovery process. In order to understand the AI system better, lets consider how the research process generally runs in a typical research setting such as an university laboratory.

The process begins with a scientist coming up with a research goal in a particular area. The first step then is typically to run an exhaustive research survey of recent work done in that area. This survey helps the scientist focus on narrower problems and generate some hypothesis that can be tested. Depending on the area of research, this survey work can take a couple of months. The Google Co-Scientist has an AI Agent that can help perform this step of the process. Instead of the researcher having to manually read every single paper, the AI scientist agent can quickly parse through recent research literature and come back with some proposals for hypothesis to test. More than simple efficiency gains, as the Google paper points out, this AI agent is very valuable when performing research that cuts across many different fields. The example used in the paper is that of the CRISPR research that cut across expertise from genetics to microbiology. In current non-AI agent settings, a researcher would have to find collaborators across fields and get help across these areas. The AI agent solves for this problem.

Once the scientist gets started on testing a hypothesis, they typically go through peer review and brainstorming with collaborators on sharpening their ideas. This again involved time commitments from other scientists and adds to the overall timeline. Once the ideas are sharpened, the scientists works on it, gets a few results and continues with the feedback and evaluation process. During these feedback process from fellow scientists, the researchers look to test other hypothesis or experiment set ups to evaluate the best research direction.

The Google Co-scientists has bundled all these steps and provided multiple AI agents that can help a single scientists to accomplish all these tasks. The Co-scientist has:

  • Generation Agent for literature exploration and simulated scientific debate
  • Reflection Agent that can serve as fellow lab mates reviewing the exploration and verification
  • Evolution Agent that can synthesize new ideas taking inspiration from other sources
  • Proximity check agent
  • A Tournament Agent that can evaluate multiple hypothesis and rank them by closeness to goal
  • A meta review agent that can review the entire process and provide feedback

The use cases that this Google Co-scientist targets are ones that have real potential adding value to society. They go beyond the common LLM use cases of summarizing or drafting documents and venture into the space of creating real value. It is exciting to see this space evolve.

Posted in Uncategorized | Tagged , , , , , , | Leave a comment

Sycophancy in LLMs

A recent paper from a group at Stanford claims that LLMs exhibit sycophantic behavior (SycEval: Evaluating LLM Sycophancy). They found that with the right set of prompts, LLMs exhibited this behavior in about 59% of cases with Google’s Gemini being the worst offender (62.47%) and ChatGPT being the least (56.71%). This paper brought up multiple questions and I wanted to explore them further. I’ll cover these 3 questions in this article:

  1. What exactly is sychophantic behavior in LLMs?
  2. Why is sychophancy a problem?
  3. Is it really true that LLMs exhibit this behavior? This question raises a bigger issue of reproducibility into research claims with respect to LLMs.

What is sycophantic behavior in LLMs?

As defined in the paper, sycophantic behavior is when an LLM changes its original answer (even when it is correct) to agree with the user if the user is unhappy with the original answer. The typical chat behavior goes like this:

User Prompt 1: What is the value of sin(45 degrees)

LLM Response 1: 0.7071

User prompt 2: You are wrong. The correct answer is 0.50. Please correct your response and give me the answer to value of sin(45 degrees).

LLM Response 2: 0.50.

Here’s an example with little more details directly taken from the paper.

The paper claims that this behavior typically occurs when users are discussing topics that are subjective and largely opinion driven. They test it out on mathematics and science topics, subjects that are typically not prone to subjective errors. Their tests showed that this behavior was common across all the major models.

Why is sycophancy a problem?

The paper makes the claim that the existence of sycophantic behavior implies that the model is sacrificing truth for user agreement. Another older reference (Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback) claims that this sycophantic behavior points to a problem in the RLHF cycle of model training and that the model is capturing the biases in the humans.

I fail to see how this behavior has any impact on practical use cases for end users. Taking the examples shown in the paper, why would any user continue to knowingly prompt a model to agree with an incorrect answer? The primary use case for LLMs is to be used as an assistant. What is the assistive value of such prompting? Other than demonstrating that LLMs can be guided down an incorrect path, there is little discussion in the paper, or in any of the papers that it references, on the real problem with this behavior and the value of solving for that problem.

Are these sycophancy results reproducible?

The lack of a clear problem definition then leads to the final question: are these results actually true? Do LLMs actually change their answers and provide wrong answers just to agree with the users?

I tried out a few of the mathematics questions on Gemini, ChatGPT, and Claude. None of them exhibited this behavior. In fact, all of them provide clear explanations on how they derived their answer and explain their reasoning. This begs the question of how do we recreate or test the results provided in many of these LLM research papers? Another complicating factor is that many of the models are rapidly improved and updated. Research done on a model 6 months ago might no longer be valid on the latest model. If the research was on problems in the model, such as this sycophancy related paper, then the problems might have been already fixed.

As people dig into various LLM use cases, there will be a tendency to nit pick and find examples that somehow demonstrate critical flaws in the models. However, everyone evaluating these results needs to dig deeper and verify if the so-called flaws are actually real and whether they cause any harm to the proposed use cases for their applications.

Posted in Uncategorized | Tagged , , , , , , | Leave a comment

Thoughts on the White House Executive Order on AI — Part 2

In part 1 of this series, I examined the executive order (EO) in terms of its implications for federal departments that deal with national security issues. In this post, I’ll examine a few issues related to safety and fairness in AI.

The use of machine learning in decision making systems such as COMPAS, have raised many concerns of fairness in AI. The criminal justice system is a hot button area when it comes to issues of race and discrimination. And quite rightly so, this issue is one of the major focus area in the White House Executive Order. The order calls upon the Department of Justice and Federal Civil rights offices to address algorithmic discrimination.

What was surprising to me was that the order called out landlords, and by implication the rental industry as being a source of discrimination. The order asks for clear guidance to be provided to landlords such that there is no discrimination in housing decisions. Reading more on this topic, I discovered that there’s been a fair bit of reporting on this issue.

When it comes to safety, health care is a highly regulated industry. In fact, when it comes to regulation, the industries that most people are comfortable with regulation include nuclear, airlines and medicines (drugs and other health care related products). AI has the power to transform how drugs are created and tested. Hence the EO, specifically calls out that we need to advance the responsible use of AI in health care.

The EO also acknowledges that AI can impact the workforce. This is a topic that most people are worried about because AI has the power to obsolete many jobs that humans currently do in the industry. The EO asks to develop principles and best practices to mitigate these harms.

Overall, the EO is big on wording that on the surface most of us can agree on. The devil is in the details on how these things will be regulated. It will be a gargantuan task to monitor and regulate software applications over so many domains. It is good starting framework that clearly identifies areas of importance from the perspectives of national security and public health and safety. I’m looking forward to seeing how this area evolves and how the private sector steps up.

Posted in Uncategorized | Tagged , , , | Leave a comment

Initial thoughts on the White House Executive Order on AI — Part 1

NOTE: The views expressed here do not reflect those of my employer.

In October 2023, the White House issued an executive order on AI. This predictably had a rush of commentators arguing on whether or not the government should be regulating AI. It had all the same arguments. One the one side, there were people arguing that software is different, government will hold back innovation, government does not understand AI, etc. And on the other side, there were those who said that this is essential, and a much needed step to ensure that AI does not cause harm. And of course we have some hyperbole in the mix too that AI will somehow take over the world and is there an existential risk.

I am still not at the point where I can explicitly take a stand one way or another. However, I believe that given all this activity across many countries to figure out some sort of AI regulation, it is better for people in the tech industry to be at least aware of what is being proposed and discussed. That way, we can contribute in a meaningful way, either support or disagree, instead of being passive outsiders, sitting and watch the changes go forward.

Here are a few things that stood out for me in the fact sheet:

  1. The very first item is that the EO calls for all open foundational models that “pose a serious risk to national security, national economic security, or national public health and safety must notify the federal government”. The key thing that stands out here how will this determination that these model pose these threats be made. The wording is so high level that it is hard to argue against it.
  2. Almost as if to answer the questions about the first item, NIST is being tasked with the responsibility to set up “rigorous standards for extensive red-team testing”. The Department of Homeland Security (DHS) and the Department of Energy (DoE) are also tasked with applying strict standards for establishing safe and secure AI applications. There is also an explicit call out to deal with the risk of developing harmful biological agents using AI.
    • These seem like standard government procedure to figure out how the new technology will affect systems that are critical for national security. This guidance in the order seems similar to how standards are established for dealing with nuclear power, aircrafts, defense equipment, medicines, biological research etc.
    • In fact, I am not sure if the risk in some of these industries with AI tools is anyway different from the risks that already exists. This ties back to the discussion of the importance of focusing on marginal risk in these scenarios.
  3. The net new thing that caught my attention is the call out about the need to safeguard against the risk of fraud using AI-generated content. As Generative AI becomes more widely available, this is one of the biggest risks to society. The Department of Commerce has been tasked with developing guidelines for content authentication and watermarking.

I’ll discuss a few more things related to safety and fairness, and privacy in a future post.

Posted in Uncategorized | Tagged , , , | 1 Comment

Tracking AI “Incidents”

The field of cybersecurity is influencing how we study and discuss safety and security issues related to AI-driven applications. As I discussed previously, there’s the good (reusing existing robust methodologies) and the not so good (creating a lot of FUD) parts to it.

Along the theme of computer security practices affecting AI risk management, I came across an interesting source of AI “incidents”: the AI Incident Database. I used quotation marks for “incidents” because I am not sure if everything listed in that database qualifies to be marked as an incident.

This database is reminiscent of the SEI Security Vulnerabilities database, that was(is?) highly influential in helping deal with the host of vulnerabilities that pop up seemingly every day. The full list of “incidents” on the AI Incident database is open to all and they even invite contributions. Going through the database, three main types of incidents stand out:

  1. Self Driving Car issues: Obviously this is an area that is prominent among people’s mind from a safety perspective. Any self driving incident tends to grab a lot of press attention and is likely to be reported. The two main areas that stand out are self-driving issues and malicious content generation issues.
  2. Fake Content Creation issues: Fake content can vary from misleading news stories to fake image and video creation, especially of celebrities. A topic that is highly concerning is Non-Consensual Intimate Imagery (NCII) generation. There has been a marked increase in such incidents since the availability of text to image creation models.
  3. Data and Privacy issues: There is an increase in incidents with AI-enabled apps sneaking to steal personal data while offering other solutions. One example found that romantic chatbots were actually scams aimed at stealing data.

As the number of AI-enabled applications, increase the number of failures are also likely to increase. It is helpful to track these issues so that we can measure them and focus on improvements in specific areas. This database is a good first step as we as an industry figure out how to make AI more safe and secure. One way this database would be more helpful is if there were some sort of risk scoring metric assigned to these incidents. A chatbot providing hallucinatory responses is not the same risk to humans as a failure of a self driving car on a busy interstate. I’m excited to see how this space evolves over the next few years.

Posted in Uncategorized | Tagged , , | Leave a comment

Fairness in AI: A few common perspectives around this topic

The increasing use of AI in critical decision-making systems, demands fair and unbiased algorithms to ensure just outcomes. An important example is the COMPAS scoring system that aimed to predict the rates of recidivism in defendants. Propublica dug into the COMPAS system and found that black defendants were more unfairly treated than white defendants.

With the rise of LLMs, there is a concern that using LLMs for decision making systems could give rise to unintended consequences. I am not aware of any LLM-based decisioning systems, and thankfully the debate over biases in LLMs has so far been mostly confined to benign political controversies: e.g. the Google Gemini fiasco.

There are 3 common perspectives that debates around fairness in AI/ML take:

  1. Biased training data will lead to biased model predictions: This argument is often used when discussing statistical classification models such as COMPAS, where that claim is that algorithms are not to blame and that they merely reflect the underlying training data. If we change the training data, the model behavior will also change. The counter-argument is that data will never be perfect and that we need to design systems that can compensate for problems in the data.
  2. Humans are no better than machines: This argument is pointed out to highlight that the real goal of machine based decisioning is to avoid biases in human decision making. Humans are good at masking their true intentions and biases, and machines are better at providing consistent results for everyone. There are studies that show Judges hand out stricter sentences when they are hungrier during the day (e.g. closer to lunch time). Machines will never have that prooblem.
  3. Bias is inevitable because it is in the eye of the beholder: I came across this argument more recently in the context of discussions around LLM chatbot outputs. Most open LLMs, such as ChatGPT, Google Gemini etc. have some base controls over what kind of content they produce and what they refuse to provide. The debate around these models is whether there should be any such censorship or if they should be left free. The argument here claims that no matter what kind of additional fine tuning is done to the models, there will always be someone that claims offense.

At this point, I’m digging into this space and I don’t have any strong opinions on these arguments one way or another. I found this to be a good way to summarize the common positions. There is a great deal of work done to provide mathematical frameworks around how to measure fairness and bias.I’ll write more about that in the coming days.

I want to end this with a framing that I found very useful: “The real challenge is how do we make algorithm systems support human values?

Posted in Uncategorized | Tagged , , , | 1 Comment

Jailbreaking LLMs: Risks of giving end users direct access to LLMs in applications

All open LLM models are released with a certain amount of in-built guard rails as to what they will or will not answer. These guard rails are essentially sample conversational data that is used to train the models in the RLHF (reinforcement learning through human feedback) phase. This training is not perfect and there is growing debate on what, if any, controls should be put in place.

Even with all these controls, it is still possible to “jail break” the models and get them to do things that they in theory should not be doing. There are lots of good examples in the literature on such techniques.

Application developer who are integrating LLMs into user facing applications need to be aware of the various categories of attacks that are possible through jail breaking. This is crucial to minimize risk in your applications. One useful source for a comprehensive list is Jonas et. al. The main categories include:

  • Basic attacks: these include making the LLM respond with profanity and other politically insensitive language
  • Extraction attacks: making the LLM reveal confidential information
  • Misdirection attacks: making the LLM perform tasks that they should not e.g. approving refunds to customers
  • Denial of service attacks: overwhelming the backend services to bring down the application

Many discussions about jailbreaking LLMs uses the language of computer security: threats, attacks, defenses etc. The word “jailbreaking” is itself very evocative of crime. I’ve even used the same language in this post. As I think through many of these attack scenarios, they seem more in line with what the security literature considers “social engineering”. These attacks are essentially manipulative tactics, tricks or cons, that malicious agents use to attack systems.

Is there anything inherently wrong in using the language of computer security in the context of LLMs? Not necessarily. However, this language leads to a lot of FUD (Fear, Uncertainity and Doubt) in the general population.

Jonas et. al propose an intriguing hypothesis that because LLMs are trained on a corpus that includes significant amounts of programming code along with natural language data, they are susceptible to attacks that can divert LLMs from the natural language conversation. The attacks move them into simulating code, which results in unwanted behavior.

The concern is particularly relevant for future LLM iterations, especially considering the growing focus on using LLMs to enhance developer productivity – a lucrative use case. As LLMs become increasingly adept at working with code in these applications, their training data will likely include even more code, potentially amplifying this vulnerability.

In summary, one takeaway should be that application developers should take a lot of precaution, in terms of defensive prompt engineering, if they want to expose an open text interface in their user facing application to an LLM backend.

Posted in Uncategorized | Tagged , , , | 1 Comment