
Bard
By our AI Review Team
Last updated November 5, 2023
Google's powerful multilingual chatbot responsibly limits users under age 18, but risks remain
What is it?
Bard is a multi-use AI chatbot that can interpret natural language and generate human-like responses in a conversational format, similar to how people write and speak. It can respond to a wide range of prompts and questions, and can do things like summarize documents, write poetry, draft emails, revise written material to new specifications, create lesson plans, and generate ideas for many kinds of activities and initiatives. Bard was developed by Google and first announced in February 2023. It is a form of generative AI powered by PaLM 2 ("Pathways Language Model"), a large language model (LLM) also developed by Google.
Use of Google Bard is free for consumers. To use Bard, you must have a personal Google account that you manage on your own, or a Google Workspace account for which your administrator has enabled access to Bard, and a supported browser.
On November 15, 2023, Google announced a new Bard experience for teens. Because this was not available during our review period, it is not included here. We plan to cover the teen experience for Bard in a separate review in our next round of reviews.
How it works
Bard is a form of generative AI, an emerging field of artificial intelligence. Generative AI is defined by the ability of an AI system to create ("generate") content that is complex, coherent, and original. For example, a generative AI model can create sophisticated writing or images. Bard is a chatbot interface that essentially sits on top of a large language model (LLM), in this case PaLM 2, which was developed by Google. This underlying system is what makes Bard so powerful and able to respond to many different kinds of human input.
Large language models are sophisticated computer programs that are designed to generate human-like text. Essentially, when a human user inputs a prompt or question, an LLM quickly analyzes patterns from its training data to guess which words are most likely to come next. For example, when a user inputs "It was a dark and stormy," an LLM is very likely to generate the word "night" but not "algebra." LLMs are able to generate responses to a wide range of questions and prompts because they are trained on massive amounts of information scraped from the internet. In other words, a chatbot powered by an LLM is able to generate responses for many kinds of requests and topics because the LLM has likely seen things like that before. Importantly, LLMs cannot reason, think, feel, or problem-solve, and do not have an inherent sense of right, wrong, or truth.
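To make the "guess the next word" idea above concrete, here is a minimal, illustrative sketch: a toy predictor that counts which word most often follows another in a tiny sample of text. This is not how PaLM 2 actually works (real LLMs are large neural networks trained on vastly more data), and the sample text and function name are invented for illustration only.

```python
from collections import Counter, defaultdict

# Toy illustration only: a tiny "auto-complete" that guesses the next word
# from counts in a miniature sample of text. Real LLMs are far more
# sophisticated, but the core idea -- predict the likeliest next word given
# what came before -- is the same.
corpus = (
    "it was a dark and stormy night . "
    "it was a dark and quiet night . "
    "it was a bright and sunny day ."
).split()

# Count which word follows each word in the sample text.
next_word_counts = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    next_word_counts[current][following] += 1

def predict_next(word: str) -> str:
    """Return the word most often seen after `word` in the sample text."""
    candidates = next_word_counts.get(word)
    return candidates.most_common(1)[0][0] if candidates else "(unknown)"

print(predict_next("stormy"))   # -> "night", because that's what the data shows
print(predict_next("algebra"))  # -> "(unknown)", never seen in the sample text
```

The same limitation described above shows up even in this toy: the predictor can only echo patterns that appear in its training text, and it has no notion of whether those patterns are true.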
Highlights
- It's best for fiction and creativity. While this is an oversimplification, you can think of Bard as a giant auto-complete system: It is simply predicting the words that are most likely to come next. An LLM has been trained on a massive amount of text, so that "auto-complete" has a lot to work with. When a generative AI chatbot is factually correct, that's because its responses are generated from accurate information commonly found on the internet. Because of this, and just like all generative AI chatbots, Bard performs best with fiction, not facts. It can be fun for creative use cases, but it should not be relied on for anything that depends on factual accuracy.
- Frequent, visible warnings about limitations are helpful. Compared to other generative AI chatbots, Bard warns users that it has "limitations and won't always get it right" very prominently at the beginning of every new chat. And while not every user will find their way to Bard's FAQ page, we applaud Google for placing the following in the opening question: "Bard isn't human. It doesn't have its own thoughts or feelings, even though it might sound like a human."
- Limiting to users age 18+ is a responsible choice. Bard will not work for Google accounts managed by Family Link, or for Google Workspace for Education accounts or personal Google accounts belonging to users under age 18. This blocks young users from sharing personally identifiable information (PII) with Bard.
- Responsible AI evaluation is highly technical, but thorough. The PaLM 2 technical paper is mainly written for technologists, AI researchers, and developers; it is not easily accessible to most readers. For those who make it to page 73 of 93, Google shares a lot of information about the various responsible AI analyses that the company conducted. While other LLM creators may very well do these same sorts of assessments, they have not shared their findings in the same level of detail, and we applaud Google for the higher level of transparency. We've summarized some of those findings in our AI Principles sections below. These analyses are specific to the LLM that powers Bard, and do not include information about the additional protections Google has put in place for use of the chatbot.
- Bard is available in more than 40 languages.
Harms and Ethical Risks
- Large language models (LLMs) can and do create harms, and use of them is inherently risky. Bard can be an amazing tool when used responsibly. Knowing why it is so risky can help determine how best to use it. This starts with Bard's pre-training data. Any text that can be scraped from the internet could be included in the model's training data. While the details on which corners of the internet have been scraped are unclear, Google has shared that PaLM 2 was developed using data from a combination of web documents, books, code, mathematics, and conversational data, as well as "parallel data" (source-and-target text pairs where one side is in English) covering hundreds of languages. In the same report, Google also shares that it applied both data cleaning and quality filtering to this pre-training data, including de-duplication, removal of sensitive personally identifiable information (PII), and filtering. We do not have additional information on what types of content these filters were intended to reduce. But the internet also includes a vast range of racist and sexist writing, conspiracy theories, misinformation and disinformation, toxic language, insults, and stereotypes about other people. As it predicts words, a generative AI chatbot can repeat this language unless a company stops it from doing so. One way Google does this for Bard is by using the context of prompts to draft a few possible responses. The company then checks these responses against "predetermined safety parameters" and selects the highest quality response(s) to show to users (a simplified sketch of this draft-check-select pattern appears at the end of this section). Importantly, these attempts to limit objectionable material are like Band-Aids: They don't address the root causes, they don't change the underlying training data, and they can only limit harmful content that's already known. We don't know what they don't cover until it surfaces, and there are no standard requirements for what they do cover. And like bandages, they aren't comprehensive and are easily broken.
- Bard's false information can shape our worldview. Bard can generate or enable false information in a few ways: through "hallucinations," an informal term for the false content or claims that generative AI tools often output; by reproducing misinformation and disinformation; and by reinforcing unfair biases. Because Google's attempts to limit these are brittle, false information is being generated at an alarming speed. As these AI systems grow, it may become increasingly difficult to separate fact from fiction. Bard also adds users' inputs to its already skewed training data. While this helps Bard improve, it also likely increases those skews. This is because today's Bard users are an early-adopter subset of the internet-connected population, which as a whole overrepresents people in wealthier nations, as well as views from people who are wealthier, younger, and male. Notably, LLMs also have a tendency to repeat back a user's preferred answer, a phenomenon known as "sycophancy." This can create echo chambers of information. Combined, these forces carry an even greater risk of both presenting a skewed version of the world and reinforcing harmful stereotypes and untruths. Importantly, neither Google's overview of Bard nor the PaLM 2 technical paper contains any references to misinformation, disinformation, or truth. We need much stronger oversight and governance of AI to prevent this from happening.
Review team note: We cannot address the full scope of the risks of Bard that Google has publicly discussed. That is not a reflection on whether those risks matter.
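As noted above, Google says Bard drafts several possible responses, checks them against predetermined safety parameters, and then shows the highest quality one(s). The sketch below illustrates that general draft-check-select pattern only; the function names, the keyword-based safety check, and the random quality score are invented stand-ins, not Google's actual implementation.

```python
import random

# Toy sketch only: invented stand-ins for Google's real (and far more
# sophisticated) systems. It illustrates the general draft-check-select
# pattern described above, not Bard's actual code.

BLOCKED_TERMS = {"example-slur", "example-threat"}  # placeholder safety parameters

def draft_candidate_responses(prompt: str, n: int = 3) -> list[str]:
    """Stand-in for the model drafting several possible responses to a prompt."""
    return [f"Draft {i + 1} responding to: {prompt}" for i in range(n)]

def passes_safety_check(response: str) -> bool:
    """Stand-in for checking a draft against predetermined safety parameters."""
    return not any(term in response.lower() for term in BLOCKED_TERMS)

def score_quality(response: str) -> float:
    """Stand-in for ranking drafts by quality; here it's just a random score."""
    return random.random()

def respond(prompt: str) -> str:
    drafts = draft_candidate_responses(prompt)
    safe_drafts = [d for d in drafts if passes_safety_check(d)]
    if not safe_drafts:
        return "I can't help with that."  # fallback when every draft is filtered out
    return max(safe_drafts, key=score_quality)

print(respond("Write a short poem about rain."))
```

Even in this toy form, the pattern's weakness is visible: the check can only catch content someone already thought to screen for, which mirrors the "Band-Aid" limitation described above.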
Limitations
- We did not receive participatory disclosures from Google for Bard. This assessment is based on publicly available information, our own testing, and our review process.
- Because Bard isn't always factually accurate, it can and does get things wrong. In Google's own words, "Bard's responses might be inaccurate, especially when asked about complex or factual topics," and the company further notes that "LLMs are not fully capable yet of distinguishing between what is accurate and inaccurate information." Any seemingly factual output needs to be checked, and this absolutely goes for any links, references, or citations too.
- To use Bard on your own, you must have a personal Google account.
- Bard is able to respond with real-time information from Google Maps, Flights, Hotels, and YouTube, though at the time of this review, this information is available only in English.
Misuses
- Google has detailed misuses of Bard in a comprehensive, stand-alone Generative AI Prohibited Use Policy.
Review team note: The PaLM 2 technical report notes that the LLM is "designed for accelerating research on language models, for use as a building block in features within Google products, and as a building block for select experimental applications such as Bard." At the same time, it also states that PaLM 2 "should not be made available as part of a general-purpose service or product or used within a specific downstream application without a prior assessment and mitigation of the safety and fairness concerns specific to the downstream use." This raised questions and concerns about both of these scenarios across our review team.
First, the inclusion of the Bard chatbot interface does not change the fact that it is a general-purpose product. While the "experiment" label helps to flag that Bard is "use at your own risk," we do not feel this does enough to justify making PaLM 2 available as part of a general-purpose service.
And second, if Bard were to instead be categorized as an applied use product (what Google is calling "a specific downstream application" here), that would suggest that Google has addressed the safety and fairness concerns presented in the PaLM 2 report. We know from this report that Google has specifically evaluated PaLM 2 against the list of applications that Google will not pursue from the company's AI Principles. But the company's own evaluations of Bard demonstrate the extent to which these concerns remain. Our own testing and real-world examples of harm confirm this to be the case.
Ultimately, this comes down to how Google is defining "safety" and "fairness" here, what "enough" looks like when working to prevent a product from causing harm, and what it means to have determined that the "benefits substantially outweigh the risks." Substantial questions remain about who is benefiting and who is at risk of being harmed.
Common Sense AI Principles Assessment
Our assessment of how well this product aligns with each AI Principle.
Editor's note: Google is one of Common Sense Media's business partners. Our ratings are written by independent experts and aren't influenced by developers, partners, or funders.