AI Personality Alignment Framework

Evaluating AI personality across three lenses: self-report, conversation, and memory

Think about your favorite collaborator. It's not just about their skills—it's about how they work with you. As AI systems become teammates rather than tools, a critical question emerges: can AI collaborators express personality in ways that matter?

We examined how large language models—GPT-4o, Claude 3.7 Sonnet, Gemini 2.5 Pro, and Grok-3—embody and maintain personality when working as teammates. Across nearly 2,000 personality assessments and thousands of simulated team interactions, what we found challenges simple assumptions about AI behavior.

Three Lenses of AI Personality

We developed a framework to evaluate AI personality across three perspectives:

Self-Perception

How does the AI report its own traits on a standardized assessment?

Behavioral Expression

How does personality appear in conversation—word choice, participation, communication style?

Reflective Expression

How does personality manifest in memory—the way it summarizes and interprets experiences?

The Conversation-Memory Gap

Our most striking finding was a significant gap between how personality appears in conversation versus memory.

In conversation, personality signals were subtle. Only Extraversion produced clear behavioral markers. Traits like Agreeableness and Neuroticism were nearly invisible in real-time dialogue.

In memory, personality exploded. Neurotic personas showed anxiety markers nearly 19x stronger in memory than in conversation (d = 1.87 vs. d = 0.02). Memory acts as a personality amplifier.

Same Conversation, Different Memories

Different personality prompts shaped how AI teammates remembered the same discussion:

"I felt a mix of curiosity and anxiety during this conversation. The uncertainty can be overwhelming. It reflects my own fears about misinterpretation." — High Neuroticism AI
"I found this conversation engaging and dynamic. Everyone was eager to contribute. I felt energized by the cooperative process." — High Extraversion AI

Default Personalities Vary by Provider

Different AI providers exhibit distinct baseline personalities without explicit prompting:

  • Gemini — most "positive" profile: high extraversion, agreeableness, conscientiousness; low neuroticism
  • GPT-4o — most moderate, human-like distribution
  • Claude — refused assessments without collaborative context framing
Default personality profiles by AI provider

Figure 1: Default personality profiles across four AI providers.

These defaults shape who speaks, how disagreement is handled, and which ideas get taken up.

Simple Prompts Work

Elaborate personality prompts with detailed trait definitions and facet descriptions performed no better than simple instructions like "You have high extraversion." Zero-shot prompting was just as effective—the overall effect size was massive (d = 8.31) regardless of prompt complexity.

Effect of personality prompting on trait scores

Figure 2: Prompting for high vs. low personality traits produces dramatically different scores.

Controllability varied by trait:

  • Extraversion — most controllable
  • Neuroticism — hardest to control reliably
  • Openness — difficult to elicit robustly

Implications

Takeaways for Designers

  • Don't evaluate AI personality by chat alone. Memory and reflection may be where personality most affects behavior.
  • Treat personality as a system-level problem. Coordinate prompting, memory, and decision policies together.
  • Plan for non-neutral defaults. Provider-specific baselines shape participation dynamics.
  • Evaluate comprehensively. Test across self-report, interaction, and memory before deployment.

The Bottom Line

AI personality is real and measurable—but multi-layered and context-dependent. The AI that seems similar in conversation may diverge dramatically in how it interprets and remembers experiences.

For those designing AI teammates: look beyond the chat window. The real personality action might be happening in how your AI remembers.

Thanks for reading!