AI Can Make Mistakes
Large language models generate responses by predicting likely continuations of text based on patterns learned during training. This process is powerful but imperfect. Even the most capable models make errors.

Common types of errors:

- Hallucinations — The model confidently states something factually incorrect. Most common for obscure topics, precise numerical data (statistics, dates, dollar amounts), specific quotes, and niche subjects.
- Outdated information — The model’s training data has a cutoff date. Facts that changed after that cutoff will not be reflected unless you enable Web Search.
- Invented citations — Models sometimes generate plausible-looking but non-existent citation titles, paper names, author names, or URLs. Always verify citations before using them in serious work.
- Logical errors — Even reasoning-capable models can make mistakes in multi-step logic, especially in long chains of reasoning or unusual problem structures.
- Ambiguity misinterpretation — If your question is ambiguous, the model may interpret it differently than you intended and give a technically accurate answer to the wrong question.
Knowledge Cutoffs
Each AI model has a training cutoff date — the point at which its training data ends. The model has no knowledge of events, publications, software releases, or developments that occurred after that date.

| Model / Provider | Approximate Knowledge Cutoff |
|---|---|
| OpenAI GPT-5 | Early 2025 |
| Anthropic Claude Sonnet 4.6 | Early 2025 |
| Google Gemini 2.5 Pro | 2025 |
| DeepSeek Chat / Reasoner | 2025 |
| xAI Grok-4 | Near real-time (Grok has live X/Twitter data access) |
| Most other models | Within 6–18 months of current date |
What to Do When a Response Seems Wrong
Verify with web search
Enable Web Search in the prompt bar and ask the same question. ZeroTwo pulls live sources and cites them inline so you can check the original references directly.
Try a different model
Different models have different strengths, training data, and weaknesses. If one model gives a suspect answer, try the same question with another. Reasoning models (o3, o4-mini, DeepSeek Reasoner, Claude with extended thinking) are more reliable for complex factual or logical questions.
Rephrase your question
Ambiguity is a frequent cause of poor answers. Try asking your question more specifically, breaking it into smaller sub-questions, or providing more context about what you already know and what you specifically need.
Ask for step-by-step reasoning
Add “Think through this step by step” or “Explain your reasoning before giving your final answer” to your prompt. This often surfaces errors in the model’s logic and makes it easier to identify where things went wrong.
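As a minimal sketch, this kind of instruction can be added with a small prompt wrapper. The function name and exact phrasing here are illustrative, not a ZeroTwo API:

```python
def with_reasoning(prompt: str) -> str:
    """Append an instruction asking the model to show its work.

    The wording below is one common phrasing; adjust to taste.
    """
    return (
        f"{prompt}\n\n"
        "Think through this step by step and explain your reasoning "
        "before giving your final answer."
    )

wrapped = with_reasoning("Is 2027 a prime number?")
print(wrapped)
```

Keeping the instruction in a helper like this makes it easy to apply consistently across prompts instead of retyping it each time.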
Code Quality
ZeroTwo’s models generate high-quality code in dozens of languages, but generated code should always be reviewed and tested before production use.

Common issues with AI-generated code:

- Security vulnerabilities — Models may generate code with subtle security issues (SQL injection risks, insecure credential handling, improper input validation). Always review for security implications, especially for user-facing or data-handling code.
- Logic errors — Code that is syntactically correct and looks reasonable may have incorrect logic in edge cases the model did not anticipate.
- Outdated APIs — If a library or framework changed its API after the model’s training cutoff, generated code may reference deprecated or removed functions.
- Hallucinated package names — Models occasionally suggest npm packages, PyPI libraries, or other dependencies that do not exist. Always verify package names before installing.
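To make the SQL-injection class of issue concrete, here is a sketch (using Python's standard-library sqlite3 and an in-memory database) of the risky pattern generated code sometimes contains, alongside the parameterized fix to look for in review:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

user_input = "alice' OR '1'='1"  # a classic injection payload

# Risky pattern AI-generated code sometimes produces:
#   conn.execute(f"SELECT role FROM users WHERE name = '{user_input}'")
# String formatting lets the payload rewrite the WHERE clause
# into a tautology that matches every row.

# Safer: let the driver bind the value as a parameter, so the
# whole input is treated as a literal name.
row = conn.execute(
    "SELECT role FROM users WHERE name = ?", (user_input,)
).fetchone()
print(row)  # None — the payload matches no user
```

The same review habit applies in any language: flag any query built by string concatenation or interpolation and check whether the driver's parameter binding could be used instead.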
Consistency and Variation
The same prompt can produce different outputs on different runs. AI responses are probabilistic — there is inherent variation in every generation. This means:

- Running the same prompt twice may produce different outputs
- A prompt that worked well yesterday may produce a different result today if the model was updated
- High-stakes tasks benefit from multiple runs and comparison
To reduce unwanted variation:

- Write structured prompts with explicit format requirements (“always output as JSON with these fields…”)
- Provide examples of the output you want (few-shot prompting)
- Specify length, format, tone, and style explicitly
- Use reasoning models for tasks requiring logical consistency across steps
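The "multiple runs and comparison" idea can be sketched as a simple majority vote. The model call here is a deterministic stand-in (canned answers), not a real API, so the mechanics are visible without network access:

```python
from collections import Counter

CANNED = ["42", "41", "42", "42", "41"]  # pretend model outputs

def fake_model(prompt: str, run: int) -> str:
    # Stand-in for a real model call; replays canned answers so
    # the example is deterministic and self-contained.
    return CANNED[run % len(CANNED)]

def majority_answer(prompt: str, runs: int = 5) -> str:
    """Sample several times and keep the most common answer."""
    answers = [fake_model(prompt, i) for i in range(runs)]
    answer, count = Counter(answers).most_common(1)[0]
    if count <= runs // 2:
        raise ValueError(f"no clear majority: {answers}")
    return answer

print(majority_answer("What is 6 * 7?"))  # "42": 3 of 5 runs agree
```

This only helps when answers can be compared for equality (short factual answers, classifications, extracted values); for long free-form text, comparing runs by hand is more practical.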
Limitations by Task Type
Factual questions about the real world
Models can answer many factual questions accurately, but hallucination risk increases for obscure topics, precise statistics, specific names, and recent events. For factual questions that matter, always use Web Search or verify with authoritative sources. Do not cite AI-generated facts in formal work without independent verification.
Mathematical calculations
Models can handle arithmetic and many math problems, but they can make calculation errors, especially with large numbers, multi-step computations, or unusual problem structures. For reliable arithmetic, use a calculator or code interpreter. For conceptual math, proofs, and logic, reasoning models (o3, o4-mini, DeepSeek Reasoner) are significantly more reliable.
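The "use code instead of mental arithmetic" advice looks like this in practice: ask the model to emit a short script and run it, so the arithmetic is exact. A compound-interest example using Python's standard decimal module (the figures are illustrative):

```python
from decimal import Decimal, ROUND_HALF_UP

# Compound interest: exact decimal arithmetic avoids the rounding
# drift that mental math (or binary floats) can introduce.
principal = Decimal("10000.00")
rate = Decimal("0.05")  # 5% annual
years = 10

amount = principal * (Decimal(1) + rate) ** years
print(amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP))  # 16288.95
```

A model asked to compute this in its head may be off by a few cents or a few dollars; the script is right every time, which is why code-interpreter tools exist.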
Legal, medical, and financial advice
Models can provide general information about legal, medical, and financial topics, but they are not licensed professionals and cannot provide advice specific to your situation. Information may be incomplete, outdated, or inapplicable to your jurisdiction or circumstances. Always consult a qualified human professional for decisions in these domains.
Real-time and live data
Without Web Search enabled, the model has no access to current prices, live sports scores, breaking news, API status, stock prices, or any data that changes in real time. Enable Web Search or use Perplexity Sonar for any query requiring live or current data.
Personal or private information
Models have no access to your personal information, private documents, or data from external services unless you explicitly provide it — either by pasting it into the conversation or by connecting an integration via Connectors. The model cannot access your email, calendar, files, or accounts without a configured and authorized connector.
Long documents and large codebases
Models with larger context windows (Claude Sonnet/Opus 4.6 at 200k tokens, Gemini 2.5 Pro at 1M tokens) handle long documents much better than models with smaller windows. Even large-context models may lose track of details buried deep in very long inputs. For extremely large documents, use specific section references (“analyze only Section 3”) rather than “analyze everything.” For large codebases, consider file-by-file analysis rather than pasting everything at once.
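The section-by-section approach can be automated with a simple chunker. This sketch splits on paragraph boundaries and uses a character budget as a rough stand-in for a token budget (roughly 4 characters per token for English text); the numbers are illustrative:

```python
def chunk_text(text: str, max_chars: int = 4000) -> list[str]:
    """Split text on paragraph boundaries so each chunk fits a budget.

    Paragraphs longer than max_chars become their own oversized
    chunk; a real pipeline might split those further.
    """
    chunks, current = [], ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks

# Ten ~132-character paragraphs with a 300-character budget
# pack two paragraphs per chunk.
doc = "\n\n".join(f"Paragraph {i} " + "x" * 120 for i in range(10))
pieces = chunk_text(doc, max_chars=300)
print(len(pieces), max(len(p) for p in pieces))  # 5 266
```

Each chunk can then be sent as its own request ("summarize this section"), with the per-section results combined in a final pass.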
Related Pages
- Models Overview — choosing the right model for each task type
- Shared Context and Continuity — how context window size and conversation length affect response quality
- Prompts: Overview — strategies for writing better prompts
- Web Search — grounding responses in live data to address knowledge cutoffs
- Model Picker — how to find and select the best model for a given task

