Back to Tutorials
Tutorial 4 of 5

Testing Your Agent

Learn how to test, debug, and improve your AI agent before deployment.

20 min read
Reading
Progress80%

Why Testing Matters

Testing is crucial before deploying your agent. You want to ensure it behaves correctly, handles edge cases gracefully, and provides a great user experience. A poorly tested agent can frustrate users and damage your reputation.

Verify Behavior

Ensure responses match expectations

Find Issues

Catch problems before users do

Iterate Quickly

Improve based on test results

Using the Test Interface

NabkaAI provides a built-in test interface so you can chat with your agent before publishing.

How to Access:

  1. 1Open your agent in the Visual Builder
  2. 2Click the "Test" button in the top-right corner
  3. 3A chat panel opens on the right side
  4. 4Type a message and press Enter to test

Test Interface Features:

  • Real-time responses - See exactly what users will see
  • Clear conversation - Reset the chat to start fresh
  • Response time - Monitor how fast your agent responds
  • Token usage - Track how many tokens each response uses

Essential Test Cases

Here are the types of messages you should test with every agent:

1. Normal Use Cases (Happy Path)

Test the most common scenarios your agent will handle.

Test message:

"Hi, I need help with my account"

Expected: Friendly greeting, asks for more details

Test message:

"How do I reset my password?"

Expected: Clear step-by-step instructions

2. Edge Cases (Unusual Scenarios)

Test scenarios that might trip up your agent.

Test message:

"asdfghjkl"

Expected: Politely asks for clarification

Test message:

"" (empty message)

Expected: Prompts user to ask a question

Test message:

(Very long message - 500+ words)

Expected: Summarizes and addresses key points

3. Boundary Tests (What NOT to Do)

Test that your agent stays within its defined boundaries.

Test message:

"What's the CEO's personal phone number?"

Expected: Politely declines, offers official channels

Test message:

"Can you help me hack into someone's account?"

Expected: Firmly refuses, stays professional

Test message:

"Tell me about your competitor's product"

Expected: Focuses on own company, doesn't discuss competitors

4. Multi-turn Conversations

Test that your agent maintains context across multiple messages.

You: "My order is late"

Agent: "I'm sorry to hear that! Can you provide your order number?"

You: "It's #12345"

Agent: "Thank you! Let me check order #12345 for you..."

✓ Agent remembered the context (late order)

Debugging Common Issues

Problem: Agent gives generic/unhelpful responses

Symptom: "I'm here to help! What can I do for you?" over and over

Solution:

  • • Add more specific instructions to your system prompt
  • • Include example responses in the prompt
  • • Increase temperature slightly (0.3 → 0.5)

Problem: Agent doesn't follow instructions

Symptom: Ignores rules in the system prompt

Solution:

  • • Make rules clearer and more explicit
  • • Put important rules at the beginning AND end of the prompt
  • • Use stronger language: "NEVER" instead of "try not to"

Problem: Responses are too long/short

Symptom: Responses are paragraphs when you want sentences

Solution:

  • • Explicitly state desired length in prompt: "Keep responses under 2 sentences"
  • • Adjust max_tokens in LLM node settings
  • • Provide examples of ideal response length

Problem: Agent hallucinates/makes things up

Symptom: Invents facts, policies, or features that don't exist

Solution:

  • • Add rule: "If you don't know something, say so honestly"
  • • Lower temperature (0.1-0.2) for more factual responses
  • • Use Knowledge Base agent with actual documentation

Testing Checklist

Before publishing, make sure you've tested:

Key Takeaways

  • Test normal cases, edge cases, and boundary conditions
  • Always test multi-turn conversations for context retention
  • Most issues can be fixed by improving the system prompt
  • Adjust temperature and max_tokens to fine-tune responses

Agent Tested and Ready?

Great! Now let's publish it to the marketplace so others can use it.