
Anyone else feel like Fin's analytics are a black box?

  • March 12, 2026
  • 3 replies
  • 77 views


We've been using Fin for about 6 months and genuinely love it — but the analytics drive me crazy.

I can see our resolution rate is 71%. Great. But when my manager asks "why isn't it higher?" I have nothing. Which topics is Fin struggling with? Did that KB article I wrote last week actually help? No idea.

We ended up exporting conversations manually and doing our own analysis in spreadsheets, which is... not ideal.
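(For anyone wondering what that export actually involves: roughly a script like the sketch below to pull conversations into a CSV before they go into the sheet. Treat it as a sketch only — the endpoint, pagination cursor, and field names are my reading of Intercom's public REST API docs rather than anything official, and the token is a placeholder.)

import csv
import requests

# Sketch only: endpoint, pagination cursor, and field names are assumptions
# based on Intercom's public REST API docs; the token is a placeholder.
TOKEN = "YOUR_ACCESS_TOKEN"
HEADERS = {"Authorization": f"Bearer {TOKEN}", "Accept": "application/json"}

rows, starting_after = [], None
while True:
    params = {"per_page": 150}
    if starting_after:
        params["starting_after"] = starting_after
    data = requests.get("https://api.intercom.io/conversations",
                        headers=HEADERS, params=params).json()
    for convo in data.get("conversations", []):
        rows.append({"id": convo.get("id"),
                     "state": convo.get("state"),
                     "created_at": convo.get("created_at")})
    # Follow the pagination cursor if one is returned, otherwise stop
    next_page = (data.get("pages") or {}).get("next")
    if not next_page or not next_page.get("starting_after"):
        break
    starting_after = next_page["starting_after"]

with open("conversations.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["id", "state", "created_at"])
    writer.writeheader()
    writer.writerows(rows)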

How are other teams handling this? Are you just accepting the top-line number, or has anyone found a good way to actually dig into where Fin is failing?

Would love to know if there's something I'm missing in the native reporting, or if everyone's just cobbling something together.

3 replies

  • New Participant
  • April 14, 2026

Hi!

I’d be keen to know if you find a way around this.

I’ve also been struggling with monitoring Fin’s performance, finding opportunities to improve it, and making sure it follows our existing guidance. I’ve been using a spreadsheet and doing manual reviews of conversations where Fin was involved, but that isn’t ideal.


Hello Andrea,

I sent you a DM.

Thanks



Hey 👋 The native analytics are helpful directionally, but they don’t really surface where things are breaking down in a meaningful way, so I’d love to share what we do right now to bridge the gap. 

What’s made the biggest difference for us hasn’t been the reporting; it’s been how we capture and use agent feedback. On each conversation, we require agents to note whether Fin interacted and, if it did, whether the response was accurate. If it wasn’t, they’re expected to explain what went wrong. That piece is where most of the value comes from: it turns a vague “this didn’t work” into something you can actually act on.
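To make that concrete, the per-conversation record ends up looking roughly like this (a sketch only — field names are illustrative, and in practice these live as conversation data attributes rather than code):

from dataclasses import dataclass
from typing import Optional

# Illustrative per-conversation feedback record; field names are hypothetical.
@dataclass
class FinFeedback:
    conversation_id: str
    created_at: str                # conversation timestamp, useful for before/after checks
    topic: Optional[str]           # what the question was about
    fin_interacted: bool           # did Fin respond in this conversation?
    fin_accurate: Optional[bool]   # was the response accurate? (None if Fin wasn't involved)
    failure_reason: Optional[str]  # agent's note on what went wrong, e.g. "content gap"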

From there, we review that feedback continuously and start to see patterns emerge. Things like content gaps, missed intent, or edge cases that weren’t accounted for come up pretty quickly. It gives us a much clearer answer when someone asks why performance isn’t higher, because we can point to specific areas instead of guessing.
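The review pass itself doesn’t need to be fancy — something along these lines (assuming the feedback is exported to a CSV with the illustrative columns above) is enough to see which topics and failure reasons keep coming up:

import pandas as pd

# Hypothetical review over exported feedback: count inaccurate Fin responses
# by topic, and by topic + failure reason, to spot recurring gaps.
df = pd.read_csv("fin_feedback.csv")

failures = df[df["fin_interacted"].eq(True) & df["fin_accurate"].eq(False)]

by_topic = failures.groupby("topic").size().sort_values(ascending=False)
by_reason = (
    failures.groupby(["topic", "failure_reason"]).size().sort_values(ascending=False)
)

print(by_topic.head(10))
print(by_reason.head(20))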

Measuring the impact of changes like new or updated articles is still a bit manual. We’re usually looking for directional signals: fewer incorrect responses tied to that topic, or a drop in escalations for the same type of question. It’s not perfect, but it’s enough to understand whether something helped.
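For the before/after check on a specific article, we’re really just comparing the incorrect-response rate for that topic on either side of the publish date — roughly like this (again assuming the illustrative columns above; the topic name and date are made up):

import pandas as pd

# Directional before/after check for one topic around a content update.
ARTICLE_PUBLISHED = pd.Timestamp("2026-03-01")  # hypothetical publish date

df = pd.read_csv("fin_feedback.csv", parse_dates=["created_at"])
topic = df[df["topic"] == "billing"]  # hypothetical topic

def incorrect_rate(frame):
    # Share of Fin-answered conversations flagged inaccurate by agents
    answered = frame[frame["fin_interacted"].eq(True)]
    return answered["fin_accurate"].eq(False).mean() if len(answered) else float("nan")

before = topic[topic["created_at"] < ARTICLE_PUBLISHED]
after = topic[topic["created_at"] >= ARTICLE_PUBLISHED]

print("incorrect rate before:", incorrect_rate(before))
print("incorrect rate after:", incorrect_rate(after))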

Also! If you haven’t explored Monitors yet, they’re worth a look. We are still getting set up with ours, but they can help surface failure patterns, especially for conversations that never get escalated and would otherwise go unnoticed. We’ve made a Fin Failure monitor that flags conversations where Fin didn’t successfully resolve something. 

Here’s the prompt I’ve set up: 

This monitor identifies conversations where Fin failed to accurately understand, respond to, or appropriately escalate user requests, resulting in incorrect, incomplete, or misleading support experiences.

The monitor scans for conversations where Fin:

  • Gave incorrect or hallucinated information (made up, outdated, or factually wrong)
  • Misunderstood the user’s intent and answered the wrong question
  • Provided incomplete answers that required follow-up or agent intervention
  • Failed to ask clarifying questions when the request was unclear
  • Lacked necessary knowledge due to gaps in docs, training, or data access
  • Failed to escalate appropriately, including:
      • Ignoring or delaying escalation when needed
      • Pushing back more than once after a user requests a human
      • Continuing to respond during user frustration or repeated failure

Key signals:

  • User expresses confusion or dissatisfaction (e.g., “that’s not right”)
  • User repeats or clarifies their question
  • Agent intervenes to correct, complete, or replace Fin’s response
  • Conversation escalates after poor or failed AI handling


And then we have a scorecard that assesses why Fin failed. 

It’s still early for us, but even just getting this in place has already made things feel a little more transparent. 

Curious how others are approaching this too, especially if you’ve found a cleaner way to tie these insights back to content updates or performance improvements over time!