Integrating Visual and Auditory Cues

Explore top LinkedIn content from expert professionals.

Summary

Integrating visual and auditory cues means combining what we see and hear to shape how we understand information or experiences. This approach helps the brain build a richer picture, whether online, in presentations, or through new technologies, making communication and sensory perception more engaging and memorable.

- Design for multisensory input: Pair visuals and sounds thoughtfully to help people form stronger associations and recognize details they might otherwise miss.
- Use audio to build trust: Adding voice notes or auditory elements can humanize digital content and make explanations feel more personal and transparent.
- Balance sensory attention: Be aware that even small interruptions, like blinking, can shift how the brain processes sights and sounds, so timing and pacing matter when engaging audiences.

-

Smelling with your EYES and EARS. Digital retail is visual and auditory by nature. But what about the senses we can’t physically activate through a screen?

In Full Fathom’s research collaboration with the University of Leeds, we explored how multisensory cues work in online environments, particularly for scent-led categories. The standout insight: 76% of participants were able to identify a fragrance using audiovisual cues alone.

This reinforces something powerful: the brain doesn’t need direct physical stimulation to construct sensory experience. When cues are designed intentionally, people can perceive scent without actually smelling it.

Key takeaways from the study:
• Non-olfactory cues can trigger clear scent perceptions
• Layered multisensory inputs improve identification accuracy
• Sensory congruency strengthens understanding of scent character
• Well-matched stimuli can positively influence purchase intent online

For sectors like beauty, wellness, and personal care, where scent plays a central role, this opens up important strategic opportunities. Digital environments don’t have to be sensory limitations. They can be carefully orchestrated sensory translations.

In increasingly saturated markets, multisensory thinking isn’t a creative extra - it’s a competitive differentiator.

How are you translating your physical brand and product experiences into digital ones?

#multisensory #branding #digitalexperience #retailinnovation #designstrategy #beauty #personalcare #wellness

-
A voice note on product pages boosted ARPU by 1.72%... for men's apparel.

We added a simple audio clip describing key features right under the ratings on men's PDPs. Think boxers and socks: a quick 30-second rundown of fit, fabric, and why it stands out.

Hypothesis? Auditory cues pair with visuals to amp up engagement (hello, Dual Coding Theory), cutting decision time and making the pitch more vivid.

Results across 66k users:
ARPU: +1.72% (from €9.46 to €9.63)
Conversion rate: +2.74%
AOV: -0.99% (slight dip)
All statistically significant.

Mobile users loved it most (+2.88% ARPU), while desktop was neutral. Extra revenue during the test? Over €8k. Scaled monthly? Mid-five figures.

Why it clicked: Shoppers skim text but tune into a confident voice. It feels personal, builds trust fast... especially for guys who want facts over fluff. Even non-listeners (93% didn't play) sensed the transparency, sparking that "this brand gets me" vibe.

Pro tip: For men's lines, voice notes beat walls of copy. They humanize the sell without overwhelming.

The lesson? In ecom, ears can out-earn eyes. Test audio on your PDPs... before competitors do.

Follow for more CRO wins from Drip.
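The post reports the lift as statistically significant without sharing the underlying counts. As a rough illustration of how such a claim can be checked, here is a minimal two-proportion z-test sketch in Python; the 50/50 traffic split and the 3% baseline conversion rate are placeholder assumptions for illustration, not figures from the test.

```python
# Minimal sketch: significance check for a conversion-rate lift in an A/B test.
# The even split and 3% base rate below are assumptions, not the post's data.
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for a difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)                 # pooled rate under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))   # standard error
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

n_a = n_b = 33_000          # assumed even split of the ~66k users
base_rate = 0.030           # assumed control conversion rate (placeholder)
lift = 1.0274               # +2.74% relative lift reported in the post
conv_a = round(n_a * base_rate)
conv_b = round(n_b * base_rate * lift)

z, p = two_proportion_z_test(conv_a, n_a, conv_b, n_b)
print(f"z = {z:.2f}, p = {p:.3f}")
```

With the real per-arm counts and control rate plugged in, the same function yields the p-value behind a claim like "all statistically significant"; the placeholder numbers above only demonstrate the mechanics.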
-
If you want people to pay attention to you, here is what makes a great speaker... It all comes down to how well you make people see, hear, and feel your message. The framework I teach leaders is called V.A.K. - Visual. Auditory. Kinaesthetic.

👉 Visual: Paint the picture for your audience.
Numbers and abstract statements wash over people. But when you show them where things were and where they are now, this becomes more visual. Use contrast. Use timelines. Give people a mental image they can hold onto. If your audience can't visualise what you're describing, you've probably lost them.

👉 Auditory: Your voice is an instrument. Play it.
The words you say matter far less than how you deliver them. A well-placed pause creates tension. A shift in pace signals importance. When you speak at the same speed and volume the entire time, your audience stops hearing you. They might be looking at you, but their brain checked out two minutes ago.

👉 Kinaesthetic: Get them to feel something.
This is the one that separates forgettable speakers from persuasive ones. Stop telling people something is important. Instead, show them what happens if it goes wrong. What's at risk? What do they stand to lose? Data alone has never changed anyone's mind, but emotion has.

See it. Hear it. Feel it.

I broke down the full framework in under 90 seconds in the video below 👇

💬 Which one do you struggle with most? Visual, Auditory, or Kinaesthetic? Tell me in the comments.
🔁 Share this with someone preparing for a big presentation this week.
🔔 Follow Will Bremridge for more on how to become one hell of a good communicator.
-
Blinking is usually thought of as a simple reflex that protects and moistens the eyes, but new neuroscience research shows it also affects how the brain handles sound.

Each blink briefly suppresses visual input, creating a short interruption in sensory flow. During this moment, the brain does not simply pause. Instead, it rapidly adjusts how attention is distributed across senses, influencing how auditory information is processed immediately afterward.

Studies examining brain activity and behavior found that sound perception changes in the moments following a blink. Neural networks involved in attention and sensory integration showed altered timing, suggesting the brain temporarily recalibrates how it prioritizes incoming sounds as vision resumes. Participants in these experiments displayed subtle shifts in reaction speed and auditory sensitivity right after blinking. These effects were small but consistent, demonstrating that even brief interruptions in vision can ripple across other sensory systems.

These findings suggest blinking plays a broader role in perception than previously assumed. Rather than being a passive maintenance function, blinking appears to help coordinate sensory processing in dynamic environments. By briefly shifting the balance between visual and auditory attention, the brain may optimize how it samples information from the world. This work adds to growing evidence that perception is highly interconnected, with even simple actions influencing how the brain integrates sights and sounds.

Research Paper 📄 DOI: 10.1177/23312165251371118
-
“Find every scene in a 3 hour film where a [lead character] shows [frustration] while mentioning [a specific phrase], then match similar visual style and soundtrack for an epic trailer creation.”

This used to take weeks. You needed:
📄 text transcription
🎬 video tagging
🎵 audio analysis
💱 multiple models, and a lot of glue code to align everything

Each modality lived in its own vector space, and you needed to stitch them together.

But a quiet shift is happening. Instead of forcing every piece of data (text, images, video clips, audio) into separate silos and then trying to glue them together, leading models now turn all of them into comparable vectors in a shared space. That’s the power of #multimodal #embedding 🔥

Models like Google Gemini Embedding 2 and TwelveLabs Marengo 3.0 are starting to represent text, images, video, and audio in a shared space. And when paired with a strong vision language model (VLM), you get:
* More expressive queries across time and modalities
* More coherent understanding of unstructured content
* Reduced hallucinations in retrieval-augmented setups

🚘 Another example in autonomous systems: you can now retrieve precise driving footage segments that combine specific visual conditions (rain + pedestrian gesture), audio cues (horn sounds), and telemetry data for better edge-case analysis and simulation.

The current leaders in this space:
Gemini Embedding 2
Qwen3-VL embeddings
TwelveLabs Marengo 3.0
Jina Embeddings v4
Voyage Multimodal

VAST Data is integrating with all of them as part of the AI OS! I’m expecting this to quietly become the new foundation for any system that needs to reason over real-world, multi-sensory data.

#multimodal #embedding #realworld #contextunderstanding
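To make the shared-space retrieval idea concrete, here is a minimal sketch of the ranking step. It assumes you already have clip embeddings and a text-query embedding from the same multimodal model (any of the models named above could supply them); the clip IDs, dimensionality, and random vectors below are placeholders so the logic runs end to end without calling a real API.

```python
# Minimal sketch: cross-modal retrieval by cosine similarity in a shared embedding space.
# The random vectors stand in for embeddings returned by a multimodal embedding model.
import numpy as np

rng = np.random.default_rng(0)
DIM = 512  # embedding dimensionality; depends on the model actually used

# Placeholder: one embedding per indexed video clip, all in the shared space.
clip_ids = ["scene_012", "scene_047", "scene_113", "scene_201"]
clip_embeddings = rng.normal(size=(len(clip_ids), DIM))

# Placeholder: embedding of a natural-language query such as
# "lead character sounds frustrated while mentioning the heist".
query_embedding = rng.normal(size=DIM)

def cosine_rank(query: np.ndarray, items: np.ndarray):
    """Rank item embeddings by cosine similarity to the query (best first)."""
    q = query / np.linalg.norm(query)
    m = items / np.linalg.norm(items, axis=1, keepdims=True)
    scores = m @ q
    return np.argsort(-scores), scores

order, scores = cosine_rank(query_embedding, clip_embeddings)
for idx in order:
    print(f"{clip_ids[idx]}: similarity {scores[idx]:+.3f}")
```

In a real pipeline, the only change is swapping the placeholder arrays for vectors returned by the embedding model and storing them in a vector index rather than an in-memory NumPy array; the ranking step stays the same because every modality lives in one comparable space.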