ZeroTwo gives you access to some of the most capable AI models available, but all AI systems have inherent limitations. Understanding those limitations helps you use ZeroTwo more effectively and avoid putting undue trust in incorrect or outdated outputs.

AI Can Make Mistakes

Large language models generate responses by predicting likely continuations of text based on patterns learned during training. This process is powerful but imperfect. Even the most capable models make errors. Common types of errors:
  • Hallucinations — The model confidently states something factually incorrect. Most common for: obscure topics, precise numerical data (statistics, dates, dollar amounts), specific quotes, and niche subjects.
  • Outdated information — The model’s training data has a cutoff date. Facts that changed after that cutoff will not be reflected unless you enable Web Search.
  • Invented citations — Models sometimes generate plausible-looking but non-existent citations: paper titles, author names, journal names, or URLs. Always verify citations before using them in serious work.
  • Logical errors — Even reasoning-capable models can make mistakes in multi-step logic, especially in long chains of reasoning or unusual problem structures.
  • Ambiguity misinterpretation — If your question is ambiguous, the model may interpret it differently than you intended and give a technically accurate answer to the wrong question.
Never rely solely on AI-generated content for decisions with significant consequences — medical, legal, financial, safety-critical, or compliance-related. Always verify with authoritative sources and qualified professionals.

Knowledge Cutoffs

Each AI model has a training cutoff date — the point at which its training data ends. The model has no knowledge of events, publications, software releases, or developments that occurred after that date.
Model / Provider — Approximate Knowledge Cutoff
  • OpenAI GPT-5 — Early 2025
  • Anthropic Claude Sonnet 4.6 — Early 2025
  • Google Gemini 2.5 Pro — 2025
  • DeepSeek Chat / Reasoner — 2025
  • xAI Grok-4 — Near real-time (Grok has live X/Twitter data access)
  • Most other models — Within 6–18 months of the current date
For current events, recent research, or live data: Enable ZeroTwo’s Web Search tool to ground the model’s response in real-time information. Perplexity Sonar and Sonar Pro are specifically designed for web-grounded, citation-backed answers.

What to Do When a Response Seems Wrong

1. Verify with web search

Enable Web Search in the prompt bar and ask the same question. ZeroTwo pulls live sources and cites them inline so you can check the original references directly.
2. Try a different model

Different models have different strengths, training data, and weaknesses. If one model gives a suspect answer, try the same question with another. Reasoning models (o3, o4-mini, DeepSeek Reasoner, Claude with extended thinking) are more reliable for complex factual or logical questions.
3. Rephrase your question

Ambiguity is a frequent cause of poor answers. Try asking your question more specifically, breaking it into smaller sub-questions, or providing more context about what you already know and what you specifically need.
4. Ask for step-by-step reasoning

Add “Think through this step by step” or “Explain your reasoning before giving your final answer” to your prompt. This often surfaces errors in the model’s logic and makes it easier to identify where things went wrong.
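Mechanically, this technique is just appending a reasoning instruction to whatever prompt you already have. A minimal sketch (the function name and exact wording are illustrative, not part of ZeroTwo):

```python
def with_reasoning(prompt: str) -> str:
    """Append an instruction asking the model to show its work.

    Surfacing intermediate steps makes logic errors easier to spot
    than a bare final answer.
    """
    return (
        prompt.rstrip()
        + "\n\nThink through this step by step and explain your reasoning "
        "before giving your final answer."
    )

print(with_reasoning("Is 2027 a prime number?"))
```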
5. Cross-check with multiple models

For critical information, ask the same question to 2–3 different models and compare their answers. Agreement across models increases confidence; significant disagreement signals that external verification is needed.
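The comparison step can be sketched in code. Everything here is hypothetical — `ask_model` is a stand-in for however you actually query each model, with canned answers so the sketch runs on its own — but the agreement logic is the part that carries over:

```python
from collections import Counter

def ask_model(model: str, question: str) -> str:
    # Hypothetical stand-in for a real model call; replace with your
    # actual client. Canned answers for illustration only.
    canned = {"model-a": "Paris", "model-b": "Paris", "model-c": "Lyon"}
    return canned[model]

def cross_check(models: list[str], question: str) -> tuple[str, bool]:
    """Ask several models the same question and flag disagreement."""
    answers = [ask_model(m, question) for m in models]
    top, count = Counter(answers).most_common(1)[0]
    # Unanimity raises confidence; any split means verify externally.
    return top, count == len(answers)

answer, agreed = cross_check(
    ["model-a", "model-b", "model-c"], "What is the capital of France?"
)
```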

Code Quality

ZeroTwo’s models generate high-quality code in dozens of languages, but generated code should always be reviewed and tested before production use. Common issues with AI-generated code:
  • Security vulnerabilities — Models may generate code with subtle security issues (SQL injection risks, insecure credential handling, improper input validation). Always review for security implications, especially for user-facing or data-handling code.
  • Logic errors — Code that is syntactically correct and looks reasonable may have incorrect logic in edge cases the model did not anticipate.
  • Outdated APIs — If a library or framework changed its API after the model’s training cutoff, generated code may reference deprecated or removed functions.
  • Hallucinated package names — Models occasionally suggest npm packages, PyPI libraries, or other dependencies that do not exist. Always verify package names before installing.
Best practice: Test all generated code in a sandboxed environment before deploying. Use the Canvas Code editor in ZeroTwo for quick iteration and review. For security-sensitive code, always have a human review it.
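As a concrete illustration of the first bullet, the classic pattern to look for in reviews is string-built SQL; a parameterized query closes the hole. This uses Python's sqlite3 with an in-memory database purely as a self-contained example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin'), ('bob', 'user')")

user_input = "' OR '1'='1"  # a typical injection payload

# Vulnerable: user input is interpolated straight into the SQL string,
# so the payload rewrites the WHERE clause and matches every row.
vulnerable = conn.execute(
    f"SELECT name FROM users WHERE name = '{user_input}'"
).fetchall()

# Safe: a parameterized query treats the input as a literal value.
safe = conn.execute(
    "SELECT name FROM users WHERE name = ?", (user_input,)
).fetchall()

print(len(vulnerable))  # 2 — the injection matched both rows
print(len(safe))        # 0 — no user is literally named "' OR '1'='1"
```

The same review habit applies regardless of language: any query, shell command, or HTML built by string concatenation from untrusted input deserves a second look.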

Consistency and Variation

The same prompt can produce different outputs on different runs. AI responses are probabilistic — there is inherent variation in every generation. This means:
  • Running the same prompt twice may produce different outputs
  • A prompt that worked well yesterday may produce a different result today if the model was updated
  • High-stakes tasks benefit from multiple runs and comparison
For more consistent results:
  • Write structured prompts with explicit format requirements (“always output as JSON with these fields…”)
  • Provide examples of the output you want (few-shot prompting)
  • Specify length, format, tone, and style explicitly
  • Use reasoning models for tasks requiring logical consistency across steps
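The first two tips combine naturally into a single prompt template: an explicit output schema plus one worked example. The field names below are placeholders, not a ZeroTwo schema — the pattern is what carries over:

```python
import json

FORMAT_SPEC = (
    "Always respond with a single JSON object with exactly these fields: "
    '"sentiment" (one of "positive", "negative", "neutral") and '
    '"confidence" (a number between 0 and 1).'
)

# A few-shot example showing the exact output shape we expect.
EXAMPLE = (
    'Input: "The checkout flow is so much faster now."\n'
    'Output: {"sentiment": "positive", "confidence": 0.9}'
)

def build_prompt(text: str) -> str:
    return f'{FORMAT_SPEC}\n\n{EXAMPLE}\n\nInput: "{text}"\nOutput:'

# A structured prompt also makes responses machine-checkable:
sample_response = '{"sentiment": "negative", "confidence": 0.8}'
parsed = json.loads(sample_response)
assert set(parsed) == {"sentiment", "confidence"}
```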
See Prompts: Structured Techniques and Prompts: Few-Shot and Demos for detailed prompting strategies.

Limitations by Task Type

Factual questions. Models can answer many factual questions accurately, but hallucination risk increases for obscure topics, precise statistics, specific names, and recent events. For factual questions that matter, always use Web Search or verify with authoritative sources. Do not cite AI-generated facts in formal work without independent verification.
Math and arithmetic. Models can handle arithmetic and many math problems, but they can make calculation errors, especially with large numbers, multi-step computations, or unusual problem structures. For reliable arithmetic, use a calculator or code interpreter. For conceptual math, proofs, and logic, reasoning models (o3, o4-mini, DeepSeek Reasoner) are significantly more reliable.
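The "use a code interpreter" advice is easy to act on: delegate the arithmetic to code and keep the model for the reasoning around it. Python's fractions module, for instance, gives exact results where mental arithmetic — human or model — drifts:

```python
from fractions import Fraction

# Even floating point loses precision on a classic case:
print(0.1 + 0.2 == 0.3)  # False

# Exact rational arithmetic sidesteps it entirely.
total = Fraction(1, 10) + Fraction(2, 10)
print(total == Fraction(3, 10))  # True

# Multi-step computations on large numbers stay exact too.
result = (Fraction(123_456_789) * 987_654_321) % 1_000_003
```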
Real-time data. Without Web Search enabled, the model has no access to current prices, live sports scores, breaking news, API status, stock prices, or any data that changes in real time. Enable Web Search or use Perplexity Sonar for any query requiring live or current data.
Personal and private data. Models have no access to your personal information, private documents, or data from external services unless you explicitly provide it — either by pasting it into the conversation or by connecting an integration via Connectors. The model cannot access your email, calendar, files, or accounts without a configured and authorized connector.
Long documents. Models with larger context windows (Claude Sonnet/Opus 4.6 at 200k tokens, Gemini 2.5 Pro at 1M tokens) handle long documents much better than models with smaller windows. Even large-context models may lose track of details buried deep in very long inputs. For extremely large documents, use specific section references (“analyze only Section 3”) rather than “analyze everything.” For large codebases, consider file-by-file analysis rather than pasting everything at once.
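Section-by-section analysis can be mechanized with a simple splitter. The heading convention below (lines starting with "Section") is an assumption about the document being split — adapt the marker to your own headings:

```python
def split_by_heading(text: str, marker: str = "Section") -> dict[str, str]:
    """Split a document into chunks keyed by their heading lines,
    so each chunk can be sent to the model on its own."""
    sections: dict[str, list[str]] = {}
    current = "preamble"
    for line in text.splitlines():
        if line.startswith(marker):
            current = line.strip()
            sections[current] = []
        else:
            sections.setdefault(current, []).append(line)
    return {k: "\n".join(v).strip() for k, v in sections.items()}

doc = "Section 1\nIntro text.\nSection 2\nDetails here.\nSection 3\nSummary."
chunks = split_by_heading(doc)
# Send only the chunk you care about, e.g. chunks["Section 3"]
```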

Improving Output Quality

Be specific. Vague prompts produce vague answers. Instead of “Tell me about marketing,” try “What are the three most effective digital marketing channels for a B2B SaaS company with a $10,000/month budget targeting mid-market companies?”
Provide examples. If you have a specific output format or style in mind, show the model an example of what you want. Few-shot examples are one of the most effective prompting techniques available.
Specify output format. Explicitly stating the format you want — “respond as a JSON object,” “use markdown headers,” “give me a bulleted list,” “write three paragraphs” — dramatically improves consistency.
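An explicitly requested format also gives you something you can verify mechanically before using the response downstream. A minimal checker (the field names are examples, not a ZeroTwo schema):

```python
import json

REQUIRED_FIELDS = {"title", "summary", "tags"}

def check_format(response: str) -> list[str]:
    """Return a list of format problems; empty means the response conforms."""
    try:
        data = json.loads(response)
    except json.JSONDecodeError:
        return ["response is not valid JSON"]
    problems = []
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    if not isinstance(data.get("tags", []), list):
        problems.append("'tags' should be a list")
    return problems

good = '{"title": "Q3 report", "summary": "Revenue up.", "tags": ["finance"]}'
bad = '{"title": "Q3 report"}'
```

When a check fails, feed the problem list back to the model as a correction — the iterate-and-correct loop described below works well for format fixes.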
Break complex requests into steps. Instead of one large, multi-part prompt, break complex tasks into sequential steps in separate messages. Let the model complete each step before moving to the next.
Match model to task. Use reasoning models for complex logic and math. Use web-search-grounded models for current events. Use large-context models for long documents. Standard models are perfectly capable for simple, everyday tasks and preserve your premium quota.
Iterate and correct. AI conversations are designed to be iterative. If the first response is not quite right, follow up with corrections: “The tone is too formal — rewrite it more casually” or “You ignored the constraint about X — please try again with that in mind.”