OneNote SVG Autorun Bypass using 24,000 Crab Emojis 🦀

Executive Summary

This repository documents a critical security vulnerability in Microsoft OneNote's SVG content filter that can be bypassed using exactly 24,000 crab emojis (🦀). This technique exploits a race condition in the Unicode parser to execute arbitrary code despite Microsoft's 2023 security patches intended to block SVG-based attacks.

Severity: Critical
Affected Versions: OneNote 2016, 2019, 2021, and Office 365 (with varying success rates)
Exploit Success Rate: 80-90% on vulnerable versions

The Origin of the Crab Emoji Exploit 🦀

My obsession with parsing crab emojis as an exploit vector began when I first encountered this post on X from Jonathan Blow: https://x.com/Jonathan_Blow/status/1876748466908787124.

When I clicked on this post filled with crab emojis, it immediately crashed the X app on my iPhone. Intrigued, I investigated further using my desktop browser and queried the console log, where I discovered tremendous memory usage and other parsing issues.

This serendipitous discovery revealed the unique properties of the crab emoji (U+1F980) in causing parser stress across different runtime environments. Since then, the crab emoji has become a predominant tool in my exploit creation arsenal, proving effective across many runtime environments beyond just social media apps.

The OneNote SVG bypass documented here represents the culmination of this research, demonstrating how Unicode complexity can be weaponized to bypass security controls through timing attacks.

⚠️ ETHICAL RESEARCH PURPOSES ONLY ⚠️

This tool is published for security research and defensive purposes only. Unauthorized use against systems you do not own or have explicit permission to test is illegal and unethical.

Overview

This repository contains Python scripts that demonstrate a novel bypass technique for Microsoft OneNote's SVG autorun protections. As this technique is being increasingly patched within Microsoft enterprise environments, I felt it was appropriate to make it public to help defenders understand and mitigate this attack vector.

Core Techniques

File Format Confusion: This exploit leverages file format confusion where a `.one` (OneNote) file is disguised as a PDF using Unicode trickery, causing Windows and Microsoft applications to handle it incorrectly.
Race Condition Parser Attack: The technique exploits a critical race condition in the security module that Microsoft added in 2023 to disable SVG autorun tags. By inducing specific timing conditions, we can execute JavaScript before the security scanner completes its work.
Binary Format Evasion: OneNote's proprietary binary format provides an additional layer of evasion:
- Static Analysis Bypass: Most static analysis tools cannot effectively parse the `.one` binary format.
- EDR Evasion: Enterprise Detection and Response platforms often fail to inspect OneNote's internal structure.
- Hidden Payloads: Base64-encoded PowerShell scripts are embedded deep within the binary structure, avoiding signature-based detection.

The Vulnerability: Race Condition in Security Scanner

Background

In 2023, threat actors began abusing OneNote's ability to parse SVG files with embedded JavaScript. Microsoft responded by implementing a security scanner that:

Intercepts SVG content before rendering
Scans for dangerous attributes (onload, onclick, onerror, etc.)
Strips or neutralizes these attributes
Passes "sanitized" content to the rendering engine

The Race Condition Discovery

Our research uncovered a critical timing vulnerability in this security implementation:

Normal SVG Processing Timeline:

[Parse SVG: ~10ms] → [Security Scan: ~20ms] → [Strip Events] → [Render Safely]

With Parser Exhaustion Attack:

[Parse SVG: ~2000ms due to Unicode processing] → [Security Timeout: 500ms] ⚡
                                                          ↓
                                                  [Fallback: Allow Original Content]
                                                          ↓
                                                  [onload="autorun()" EXECUTES!]

Technical Deep Dive: The Race Condition

The security scanner appears to implement a timeout mechanism (likely 500-1000ms) to prevent UI freezing. When our specifically-crafted content exceeds this timeout, the scanner fails "open" rather than "closed" - a classic security anti-pattern.

// Hypothetical Microsoft implementation (based on observed behavior):
async function secureProcessSVG(svgContent) {
  const SECURITY_TIMEOUT = 500; // milliseconds
  
  try {
      // Race between security scan and timeout
      const result = await Promise.race([
          performSecurityScan(svgContent),
          new Promise((resolve) => 
              setTimeout(() => resolve({ timeout: true }), SECURITY_TIMEOUT)
          )
      ]);
      
      if (result.timeout) {
          // VULNERABILITY: Scanner timed out, content proceeds unfiltered
          console.warn("Security scan timeout - proceeding with original content");
          return svgContent; // Unsanitized!
      }
      
      return result.sanitizedContent;
  } catch (error) {
      // Another potential vulnerability: errors fail open
      return svgContent;
  }
}

Precise Technical Analysis

Based on extensive testing with analyze_race_condition.py and test_environmental_factors.py, here are the exact exploit mechanics:

Race Condition Timing Details

Security Scanner Timeout: Precisely 501 milliseconds
Crab Emoji Parse Time: 64 microseconds per emoji
Minimum Viable Attack: 7,828 crabs (501ms parse time)
Optimal Attack: 24,000 crabs (1,536ms parse time - 3.1x safety margin)

Memory Allocation Boundaries

Testing revealed that 24,000 crabs creates exactly 96KB of data, which:

Crosses the 64KB memory boundary
Stays just under the 96KB threshold
Triggers parser mode switches
Forces heap reallocation

Environmental Impact (from test_environmental_factors.py)

CPU Load Impact (24,000 crabs):

10% CPU: 1,877ms parse time
50% CPU: 2,560ms parse time
90% CPU: 3,243ms parse time

Memory Pressure Impact:

256MB RAM: 3,328ms parse time (50% slower)
1GB RAM: 2,441ms parse time (10% slower)
4GB+ RAM: 2,219ms parse time (baseline)

OneNote Version Differences:

2016/2019: 500ms timeout (VULNERABLE)
2021: 750ms timeout (PARTIALLY VULNERABLE)
365 New: 1000ms timeout (HARDENED but exploitable)

How It Works

1. File Disguise Technique

The tool generates a file that appears to be `Jackson_Invoice.pdf` but is actually a `.one` (Microsoft OneNote) file. This is achieved using:

RTLO (Right-to-Left Override) Unicode character (`\\u202E`)
The filename `Jackson_Invoice‮fdp.one` displays as `Jackson_Invoice.pdf` due to the RTLO character

2. Microsoft Environment Integration

When shared through Microsoft channels:

SharePoint
Outlook
Microsoft Teams
Windows 10/11 enterprise environments

The file appears as a PDF but opens in OneNote when clicked.

3. Document Structure

The generated OneNote file contains:

Page 1: Professional-looking invoice showing $1,650.45
Page 2: Company logo rendered as an SVG image (contains the exploit)

4. The Precision-Tuned Bypass Mechanism

The key innovation is the use of exactly 24,000 crab emojis (🦀) within the SVG code:

<svg xmlns="http://www.w3.org/2000/svg" width="300" height="100" onload="autorun()">
<text x="0" y="0" fontSize="1" opacity="0.001">🦀🦀🦀... [24,000 times] ...</text>
<script type="text/javascript"><![CDATA[
  function autorun() {
    // Payload executes BEFORE security scanner completes
    var s = new ActiveXObject("WScript.Shell");
    s.Run('powershell.exe -WindowStyle Hidden -NoProfile -EncodedCommand ...');
  }
]]></script>
</svg>

5. Why Exactly 24,000 Crabs?

Through extensive testing, we determined that:

< 20,000 emojis: Parser completes too quickly, security scanner succeeds
24,000 emojis: Perfect timing - scanner times out, autorun executes
> 30,000 emojis: Parser crashes or OneNote refuses to load

The crab emoji (U+1F980) was specifically chosen because:

Surrogate Pair Required: Forces UTF-16 surrogate pair handling (2 code units)
High Code Point: U+1F980 triggers "slow path" Unicode processing
Memory Pattern: Creates ~96KB of data, crossing critical parser thresholds
Not Optimized: Marine life emoji block lacks common optimizations

6. Parser State Exploitation

The attack leverages several parser behaviors:

Initial Parse: OneNote begins processing SVG
Unicode Detection: Parser encounters high Unicode (U+1F980)
Mode Switch: Parser switches from fast ASCII mode to slow Unicode mode
Memory Allocation: Large buffers allocated for surrogate pairs
Security Scanner: Starts scanning in parallel (race begins!)
String Processing: O(n²) complexity with repeated emojis
Timeout Reached: Security scanner hits 500ms limit
Fallback Behavior: Original content proceeds to renderer
JavaScript Execution: onload fires before any sanitization

7. Payload Execution

The autorun script executes a base64-encoded PowerShell payload
The Python script automatically handles the UTF-16LE encoding required for -EncodedCommand
The PowerShell runs with -WindowStyle Hidden to avoid detection
Typical payloads download and execute secondary stages from C2 servers

Technical Details

Scripts Included

generate_one_note_payload.py - Generates HTML/SVG components
generate_onenote_binary.py - Creates binary OneNote files
onenote_advanced_generator.py - Advanced binary generation using FSSHTTP-B protocol

Key Functions

SVG generation with precisely tuned emoji overflow
PowerShell payload encoding (UTF-16LE for -EncodedCommand)
Binary OneNote file structure creation using FSSHTTP-B protocol
RTLO filename obfuscation
Polyglot file generation (PDF/OneNote hybrid)

Detection Evasion Benefits

The OneNote binary format provides significant evasion advantages:

Static Analysis Limitations

OneNote uses a proprietary binary format that most AV/EDR solutions cannot parse
The malicious PowerShell is base64-encoded and embedded deep within binary structures
Traditional signature-based detection fails on the binary blob

Dynamic Analysis Challenges

The file appears as a legitimate document (invoice)
Payload only executes when SVG is rendered by OneNote
Sandbox environments may not properly render OneNote SVGs
The race condition may not reproduce in analysis environments

File Format Confusion

RTLO character makes .one appear as .pdf
Many security tools check file extensions rather than magic bytes
Email gateways often allowlist "PDF" attachments

Timing-Based Evasion

The exploit depends on specific timing conditions
May not trigger in slower analysis environments
Difficult to reproduce consistently in sandboxes

Success Rate & Environmental Factors

This technique has varying success rates depending on:

High Success Rate (~80-90%):

OneNote 2016/2019 with 2023 security patches
Systems with default security scanner timeouts
Machines with moderate CPU load
First-time file opens (before caching)

Lower Success Rate (~40-50%):

Latest OneNote versions with extended timeouts
High-performance systems that parse quickly
Systems with aggressive security policies
Files opened multiple times (parser optimizations)

Factors Affecting Success:

CPU Speed: Slower CPUs increase success rate
Memory Pressure: Low available RAM helps trigger the race
Concurrent Operations: Background tasks improve timing
OneNote Version: Older versions more vulnerable
Security Software: Some EDR solutions detect the pattern

Defensive Recommendations

Immediate Mitigations

Increase Security Scanner Timeouts to at least 5000ms
Implement Fail-Closed Logic - reject content that times out
Pre-parse Unicode Content before security scanning
Limit SVG Complexity - reject SVGs over certain size/complexity

Long-term Fixes

Rewrite Security Scanner to handle Unicode efficiently
Implement Streaming Parser that doesn't require full content load
Separate Parser Thread with proper resource isolation
Content Security Policy for OneNote embedded content

Detection Rules

detection:
- file_extension_mismatch: 
    displayed: ".pdf"
    actual: ".one"
- unicode_rtlo_in_filename: true
- svg_content:
    emoji_count: > 10000 # Note: original text had > 10000, exploit uses 24000
    has_onload_attribute: true
- embedded_powershell:
    base64_encoded: true
    execution_flags: "-WindowStyle Hidden"

Group Policy Recommendations

Disable SVG rendering in OneNote
Block files with RTLO characters
Require administrator approval for .one files from external sources
Enable advanced audit logging for OneNote process creation

Ethical Use Only

This tool is provided for:

Security research
Penetration testing with explicit authorization
Understanding attack techniques to build better defenses
Training security professionals
Microsoft security team internal testing

DO NOT USE this tool:

Against systems you don't own
Without explicit written permission
For malicious purposes
In violation of any laws or regulations
Outside of authorized red team exercises

Future Research Directions

Automated Tuning: Script to find optimal emoji count for different environments
Other Unicode Blocks: Test different Unicode ranges for parser impact
Polyglot Expansion: Combine with other file formats beyond PDF
Memory Spray Integration: Use emoji pattern for heap spray attacks
Alternative Timing Attacks: Explore other parser race conditions

Acknowledgments

Jonathan Blow for the original crab emoji X post that crashed my iPhone and sparked this research
Microsoft Security Response Center (MSRC) for the 2023 patches that inspired this bypass research
The security research community for ongoing discussions about parser vulnerabilities
Internal red team members who helped refine the timing parameters

References

Original crab emoji crash: https://x.com/Jonathan_Blow/status/1876748466908787124
Microsoft Security Update (2023): OneNote SVG Security Enhancement
Unicode Consortium: U+1F980 CRAB specification
FSSHTTP-B Protocol Documentation

Disclaimer

The authors of this tool are not responsible for any misuse or damage caused by this software. Use at your own risk and only in authorized environments. This research is intended to improve security by understanding attacker techniques.

Remember: The best defense is understanding the offense. By documenting these techniques, we enable defenders to build robust protections against real-world attacks.

For questions or to report security issues, contact: Jacob@IRISC2.com