Attachment Detection Strategies for Virtual Assistants

Updated May 23, 2024
Alyssa Hurtig
Former Conversation Designer

Virtual assistants or chatbots hosted on channels such as Apple Messages for Business (AMB), Google Business Messages (GBM) and WhatsApp for Business (WA) meet users in apps they have already adopted. Adding virtual assistants to these channels leverages interfaces that users are familiar with, but that familiarity can also lead users to make assumptions about how chatbots function in each channel.

For example, when using one of these channels for a human-to-human conversation, if a user sends an attachment such as a file, image, video or voice recording, their messaging partner will be able to view, play or otherwise open the attachment.

To further complicate user expectations, Apple Messages for Business, Google Business Messages and WhatsApp chatbots are all capable of sending images to a user as part of a bot flow.

From the user’s perspective, it’s easy to follow the thought process: if a virtual assistant can send an image, I should be able to send an image back.

However, depending on the conversational AI platform and its complexity, your virtual assistant may be unable to recognize attachments. This often happens when the bot is in its infancy and has limited initial use cases and feature sets. This can easily create user frustration, with a high frequency of utterances like:

[Image: Attachment recognition examples within the chatbot conversation]

The Challenge

Often, the user is reaching out to customer support due to a negative experience, and emotions like frustration or stress may influence user behavior. When a user has an issue, they may expect that providing an attachment in the chat is the fastest or most direct way to get connected to customer service. Users also send attachments so they don’t have to type detailed messages or select multiple menu options, since sharing a photo or video is often faster. Additionally, users may not know the correct words for specialized products or parts, and rely on the visual component instead. The user may also be anticipating the need to provide “proof”, which is common in replacement or refund customer support cases.

[Image: Industry examples of attachment recognition request to chatbot]

Common industries that experience this behavior include retail and food service. Other cross-industry use cases include sending attachments related to billing or product/technical support.

Did you see what I sent?

As a Conversational AI company, Master of Code has conducted its own analysis. We are seeing a growing number of users who expect the chatbot to parse an intent from the image or attachment they have sent. These users expect the virtual assistant to take action based on the attachment provided, such as detecting a damaged item and triggering a refund flow.

Additionally, even if your chatbot clearly introduces itself as a virtual assistant, sometimes the user gets confused and thinks they are already speaking to an agent: “Did you see what I sent?”

If a user sends an attachment and receives a normal fallback message, they often get frustrated and abandon the conversation. This lowers containment and drags down CSAT and NPS survey scores.

Your general fallback is therefore not a ‘one prompt fits all use cases’ message. Instead, Master of Code recommends adding in specialized fallback handling.

Our Solution

As a first step, we recommend investigating whether your bot platform can parse attachments and, if it can, adding that capability to your roadmap.

However, even if your virtual assistant is currently unable to parse attachments, that doesn’t mean there’s nothing you can do to improve the user experience. The bot can still recognize a user is attempting to send attachments. By capturing when and where users are attempting to send images, you will also be able to determine opportunities for improvement, including creating or optimizing self-service flows.
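One way to capture when and where users attempt to send attachments is to log each attempt together with the flow step the user was on, then count the combinations. The sketch below is illustrative only: the `AttachmentEvent` shape, the step names and the attachment type labels are assumptions, not part of any specific bot platform.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AttachmentEvent:
    flow_step: str        # hypothetical name of the step the user was on
    attachment_type: str  # e.g. "image", "video", "audio", "file"


def count_by_step(events):
    """Count attachment attempts per (flow step, attachment type) pair."""
    return Counter((e.flow_step, e.attachment_type) for e in events)


# Made-up sample data, for illustration only.
events = [
    AttachmentEvent("greeting", "image"),
    AttachmentEvent("order_issue", "image"),
    AttachmentEvent("order_issue", "image"),
    AttachmentEvent("order_issue", "video"),
]

hotspots = count_by_step(events)
# The most frequent pair points at the flow most worth optimizing.
print(hotspots.most_common(1))
```

Steps that attract many attachments are good candidates for a new or improved self-service flow.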

To alleviate user frustration and improve the user experience, your assistant should have a customized fallback response, so the user feels heard. Users are more forgiving when they learn that they are hitting the fallback because the bot cannot process the attachment.

If your chatbot cannot parse any type of attachment, work with your technical team to update the bot to detect when users are sending attachments. The bot should provide a contextual response of what it can and can’t do:

[Image: Voice recording recognition within the chatbot conversation]

This assures the user that once they are escalated, the agent will be able to view the attachment.

Further mitigate user friction by providing options. Allow the user to decide if they want to escalate immediately, or return to the flow and continue to follow the prompts: “Would you like to chat with an agent, or return to the previous step?”
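The detection and specialized fallback described above can be sketched as follows. This is a minimal sketch, assuming your platform hands the bot a normalized message with a list of attachments. `IncomingMessage`, `BotReply` and the exact wording of the attachment fallback are hypothetical and should be adapted to your platform and persona.

```python
from dataclasses import dataclass, field


@dataclass
class IncomingMessage:
    text: str = ""
    # Assumed normalized shape, e.g. [{"type": "image"}]
    attachments: list = field(default_factory=list)


@dataclass
class BotReply:
    text: str
    options: list = field(default_factory=list)


GENERIC_FALLBACK = "I'm sorry, I don't understand."

# Illustrative wording only; a conversation designer should tailor it.
ATTACHMENT_FALLBACK = (
    "Thanks for sending an attachment. I can't view it, but an agent will "
    "be able to once you're connected. Would you like to chat with an "
    "agent, or return to the previous step?"
)


def fallback_reply(message):
    """Pick a fallback once the bot has failed to match the user's input."""
    if message.attachments:
        # The user sent something the bot cannot parse: say so and offer choices.
        return BotReply(
            text=ATTACHMENT_FALLBACK,
            options=["Chat with an agent", "Return to the previous step"],
        )
    return BotReply(text=GENERIC_FALLBACK)


reply = fallback_reply(IncomingMessage(attachments=[{"type": "image"}]))
print(reply.text)
print(reply.options)
```

Keeping the attachment check separate from the generic fallback means the normal fallback copy stays untouched for every other unmatched input.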

An experienced conversation designer can help customize these fallback messages to best suit your persona and tone.

Additional conversation design considerations

Work with both your technical and operations teams to verify whether agents can view attachments from their agent dashboards, and whether that ability is the same across all channels where the bot is available.

If agents cannot view attachments, then we recommend modifying the message to: “Thanks for sending an image or attachment, but we are unable to view it. In a few words, how can I assist you today?” The bot can then compare the user’s input to your NLU model.

It’s also important to review at what points in the conversation flow users are sending attachments. Is it their first message? If so, we recommend presenting the customized fallback and then presenting the main menu. Once the user understands that they have to use menus, they are less likely to loop or abandon the conversation.

If your bot can parse some types of attachments but not others, then you should make that explicit: “Thanks for sending a video. I can’t view videos, but I can view images. If you’d like to resend as an image, we can try again. If not, we can continue and I’ll pass the video along to the agent.”
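The choice between these messages can be driven by two settings: which attachment types the bot can parse, and whether agents can view attachments. The sketch below assumes a bot that parses images only. `PARSEABLE_TYPES`, `LABELS` and `attachment_fallback` are hypothetical names, and the type labels are assumptions about how your platform classifies attachments.

```python
# Hypothetical capability setting; adjust to what your platform supports.
PARSEABLE_TYPES = {"image"}

# Display labels for types the bot cannot parse: (singular, plural).
LABELS = {
    "video": ("a video", "videos"),
    "audio": ("a voice recording", "voice recordings"),
    "file": ("a file", "files"),
}


def attachment_fallback(attachment_type, agents_can_view):
    """Return fallback text for an attachment, or None if the bot can parse it."""
    if attachment_type in PARSEABLE_TYPES:
        return None  # no fallback needed; the bot handles this type
    if not agents_can_view:
        # Nobody can view the attachment, so ask the user to describe the issue.
        return (
            "Thanks for sending an image or attachment, but we are unable "
            "to view it. In a few words, how can I assist you today?"
        )
    singular, plural = LABELS.get(attachment_type, ("an attachment", "attachments"))
    noun = singular.split(" ", 1)[1]  # "a video" -> "video"
    # The wording below assumes images are the only parseable type.
    return (
        f"Thanks for sending {singular}. I can't view {plural}, but I can "
        "view images. If you'd like to resend as an image, we can try again. "
        f"If not, we can continue and I'll pass the {noun} along to the agent."
    )


print(attachment_fallback("video", agents_can_view=True))
```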

One of the keys of the Conversation Design process is to be transparent about the virtual assistant’s limitations. Instead of apologizing with a generic “I’m sorry I don’t understand” style prompt, the bot continues to move the conversation forward.


Since Master of Code started applying this strategy to relevant projects, we’ve seen significant increases in CSAT and containment.

Recognizing that attachments are being sent and creating a transparent fallback about your virtual assistant’s capabilities is a quick win. It’s a small change but has a big impact on the user’s overall experience.

You will be able to measure the chatbot improvement via:

  • Higher CSAT / NPS scores
  • Increase in containment
  • Decrease in users hitting the normal fallback
  • Decrease in looping
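As a rough illustration, the containment, fallback and looping measures above can be computed from conversation logs. The sketch below assumes a simplified `Conversation` record and treats a conversation as contained when it was not escalated to an agent. Definitions vary between teams, so both the field names and the formulas are assumptions to adjust. CSAT and NPS come from surveys and are not covered here.

```python
from dataclasses import dataclass


@dataclass
class Conversation:
    escalated: bool         # handed over to a human agent
    generic_fallbacks: int  # times the normal fallback fired
    repeated_prompts: int   # times the same prompt was shown again (looping)


def summarize(conversations):
    """Return the share of conversations that were contained, hit the normal fallback, or looped."""
    total = len(conversations)
    if total == 0:
        return {"containment": 0.0, "fallback_rate": 0.0, "loop_rate": 0.0}
    contained = sum(not c.escalated for c in conversations)
    with_fallback = sum(c.generic_fallbacks > 0 for c in conversations)
    with_loop = sum(c.repeated_prompts > 0 for c in conversations)
    return {
        "containment": contained / total,
        "fallback_rate": with_fallback / total,
        "loop_rate": with_loop / total,
    }


# Made-up sample data, for illustration only.
sample = [
    Conversation(escalated=False, generic_fallbacks=0, repeated_prompts=0),
    Conversation(escalated=False, generic_fallbacks=1, repeated_prompts=0),
    Conversation(escalated=True, generic_fallbacks=0, repeated_prompts=2),
    Conversation(escalated=True, generic_fallbacks=0, repeated_prompts=0),
]
print(summarize(sample))
```

Comparing these figures before and after the specialized fallback goes live shows whether the change is having the intended effect.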

Remember, it’s always important to give the user a clear prescription for what to do next, even if it means sharing the technical limitations of a conversational solution.

Need assistance with designing Conversational AI strategy for your customers? We can help!
