VPN Detection and True Geolocation Recovery
Detecting whether a visitor is masking their location via VPN, proxy, or Tor is foundational for non-local attendee classification. Techniques range from simple IP lookups to deeper network forensics, each with tradeoffs in accuracy, latency, and privacy.
IP Reputation and Data Centre Detection
The most reliable VPN detection method is querying continuously updated IP classification databases that categorise addresses as residential, business, hosting/data centre, VPN, proxy, or Tor exit node. Providers like IPQualityScore, MaxMind, Spur.us, IPinfo, and IP2Location maintain these datasets by enumerating commercial VPN endpoints, mapping cloud-provider ASN ranges (AWS AS16509, DigitalOcean AS14061, OVH AS16276), and ingesting abuse intelligence.
IP-to-ASN mapping extends this by classifying each IP's Autonomous System. Hosting-provider ASNs strongly indicate data-centre traffic where VPN endpoints usually run, and can be evaluated at the CDN edge with sub-millisecond latency using self-hosted MMDB data.
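As a minimal sketch of the ASN check, assuming the ASN has already been resolved from a self-hosted MMDB lookup (the lookup itself is omitted here), the classification step could look like the following. `HOSTING_ASNS` and `classify_asn` are illustrative names, and the table holds only the three ASNs cited above; a production list would be far larger.

```python
# Hypothetical lookup table of hosting-provider ASNs. Only the three
# examples named in the text are included.
HOSTING_ASNS = {
    16509: "AWS",
    14061: "DigitalOcean",
    16276: "OVH",
}


def classify_asn(asn: int) -> str:
    """Return 'hosting' for a known data-centre ASN, otherwise 'unknown'."""
    return "hosting" if asn in HOSTING_ASNS else "unknown"
```

A "hosting" result is best treated as one signal among several, since some legitimate corporate traffic also egresses through cloud providers.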
MTU/TTL Fingerprinting
VPN encapsulation leaves measurable artefacts at the network layer. VPN tunnelling reduces effective MTU below standard 1500-byte Ethernet MTU, and TCP MSS values expose this. Default TTL values also vary by OS, so a claimed Windows client with Linux-like TTL behaviour can indicate proxy or VPN mediation.
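The TTL comparison can be sketched as follows, assuming the observed TTL has already been captured from the incoming packet and the claimed OS parsed from the User-Agent. The function names and the `EXPECTED_INITIAL_TTL` table are illustrative; the defaults of 64 for Linux and macOS and 128 for Windows are widely documented but can be changed by administrators, so they are assumptions rather than guarantees.

```python
# Initial TTL values commonly seen on the wire.
COMMON_INITIAL_TTLS = (64, 128, 255)

# Assumed OS defaults; administrators can override these.
EXPECTED_INITIAL_TTL = {"windows": 128, "linux": 64, "macos": 64}


def infer_initial_ttl(observed_ttl: int) -> int:
    """Round an observed TTL up to the nearest common initial value."""
    for candidate in COMMON_INITIAL_TTLS:
        if observed_ttl <= candidate:
            return candidate
    raise ValueError("TTL must be 255 or lower")


def ttl_mismatch(claimed_os: str, observed_ttl: int) -> bool:
    """True when the TTL family contradicts the OS claimed by the client."""
    expected = EXPECTED_INITIAL_TTL.get(claimed_os.lower())
    if expected is None:
        return False  # unknown OS: no basis for a comparison
    return infer_initial_ttl(observed_ttl) != expected
```

Rounding up works because each hop decrements the TTL by one and real paths are far shorter than 64 hops, so an observed value of 52 most plausibly started at 64.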
Network-Layer Fingerprints
SNITCH research (NDSS MADWeb 2025) reported 89.1% accuracy in detecting VPN-tunnelled connections by comparing observed round-trip times against the RTTs expected for the claimed IP geolocation, and reported strong precision and recall at scale.
Typical MTU and MSS values for the encapsulation artefacts described under MTU/TTL Fingerprinting:
| VPN Protocol | Typical MTU | MSS Value | Detection Reliability |
|---|---|---|---|
| WireGuard (IPv4) | ~1440 | ~1400 | High |
| OpenVPN (UDP) | ~1400–1409 | ~1360–1369 | High |
| IPsec/IKEv2 | ~1380–1438 | ~1340–1398 | Medium |
| No VPN (Ethernet) | 1500 | 1460 | Baseline |
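The table lends itself to a simple lookup. The sketch below treats the approximate values in the table as exact ranges purely for illustration and returns every protocol whose MSS range contains an observed value; `MSS_PROFILES` and `candidate_protocols` are hypothetical names. The MSS column follows from the MTU column by subtracting 40 bytes of IPv4 and TCP headers.

```python
# (label, lowest MSS, highest MSS), taken from the table above.
MSS_PROFILES = [
    ("No VPN (Ethernet)", 1460, 1460),
    ("WireGuard (IPv4)", 1400, 1400),
    ("OpenVPN (UDP)", 1360, 1369),
    ("IPsec/IKEv2", 1340, 1398),
]


def candidate_protocols(mss: int) -> list[str]:
    """Return all table rows whose MSS range contains the observed MSS."""
    return [name for name, low, high in MSS_PROFILES if low <= mss <= high]
```

Because the IPsec range overlaps the OpenVPN range, the function returns a list of candidates rather than a single verdict. A reduced MSS can also come from non-VPN links such as PPPoE, so an empty or ambiguous result should not be read as proof either way.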
WebRTC and DNS Leaks
WebRTC Leak Detection
Creating an RTCPeerConnection with a STUN server and inspecting ICE candidates can reveal mismatches between connection IP and candidate addresses. Browser mitigations are reducing this signal over time, but it remains a useful secondary check.
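The server-side half of this check can be sketched as below, assuming the page has already gathered ICE candidate strings in the browser and posted them to the server alongside the request. The function names are illustrative. Private addresses and mDNS hostnames (which modern browsers substitute for local IPs) are ignored, and only globally routable candidates are compared with the connection IP.

```python
import ipaddress


def public_candidate_ips(candidates: list[str]) -> set[str]:
    """Extract globally routable addresses from ICE candidate strings."""
    ips = set()
    for line in candidates:
        fields = line.split()
        if len(fields) < 6:
            continue  # not a well-formed candidate line
        try:
            addr = ipaddress.ip_address(fields[4])  # fifth field is the address
        except ValueError:
            continue  # mDNS hostname such as "<uuid>.local"
        if addr.is_global:
            ips.add(str(addr))
    return ips


def webrtc_mismatch(connection_ip: str, candidates: list[str]) -> bool:
    """True when a public candidate exists and none matches the connection IP."""
    leaked = public_candidate_ips(candidates)
    return bool(leaked) and connection_ip not in leaked
```

When no public candidate is present the function returns False, which reflects the text's point that browser mitigations make this a secondary signal rather than a dependable one.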
DNS Leak Detection
Resolver-level mismatch analysis can indicate that DNS traffic is bypassing the VPN path. DNS-over-HTTPS adoption has reduced practical leak rates, so this is best used as enrichment rather than a primary classifier.
Timezone and Locale Mismatches
Comparing IP-derived timezone against browser-reported timezone via Intl.DateTimeFormat().resolvedOptions().timeZone is simple and universally supported, requiring no permission.
Combined with navigator.language and Accept-Language header analysis, this passive approach narrows likely location with high confidence and no permission prompts.
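A server-side comparison might look like the sketch below, assuming the browser-reported zone name has been sent to the server and the IP-derived zone name comes from a geolocation lookup. `timezone_mismatch` is an illustrative name. Zones are compared by UTC offset at a given instant, so two different names with the same offset (for example neighbouring countries) do not raise a flag; if a zone name cannot be resolved on the host, the sketch falls back to comparing the names directly.

```python
from datetime import datetime, timezone
from typing import Optional
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def timezone_mismatch(ip_tz: str, browser_tz: str,
                      at: Optional[datetime] = None) -> bool:
    """True when the two zones have different UTC offsets at time `at`."""
    at = at or datetime.now(timezone.utc)
    try:
        ip_offset = at.astimezone(ZoneInfo(ip_tz)).utcoffset()
        browser_offset = at.astimezone(ZoneInfo(browser_tz)).utcoffset()
    except (ZoneInfoNotFoundError, ValueError):
        # Zone data unavailable or name malformed: compare names instead.
        return ip_tz != browser_tz
    return ip_offset != browser_offset
```

Comparing offsets rather than names is a design choice that reduces false positives near borders, at the cost of missing a VPN exit located in the same offset band as the user.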
True Location Recovery Behind VPN
| Technique | Resolution | Accuracy | Permission | Spoofability |
|---|---|---|---|---|
| HTML5 Geolocation API | Street-level (1–100m) | Very High | Required | Low |
| WiFi positioning | 10–50m urban | Very High | Required | Low |
| Browser timezone | Country/region | High | No | Medium |
| Language/locale | Country | Good | No | Medium |
| Accept-Language header | Country | Good | No | Low |
| WebRTC IP leak | City | Declining | No | N/A |
Commercial VPN Detection Services
| Provider | VPN Detection | City Accuracy | Starting Price | Strength |
|---|---|---|---|---|
| IPQualityScore | 99.9% claimed | ~85% | Free (200/day) | Fraud scoring, residential proxy detection |
| Spur.us | 60M+ anonymous IPs | ~80% | Free (1M/month) | VPN provider attribution by name |
| MaxMind GeoIP2 | Good (not specialized) | 80–90% | ~$30/month | Industry standard, self-hosted MMDB |
| IPinfo | 5 boolean privacy flags | ~85% | $49/month | Daily updates via Probe Network |
| IP2Location | Good | ~75% | $99/year | Best value, broad coverage |
Pairing an industry-standard geolocation database with a specialised VPN provider attribution feed typically costs $100–500/month for moderate traffic volumes.
Cookieless User Identification
Beyond cookies, a rich set of identifiers exists to match repeat visitors. These range from client-side browser fingerprints to server-side protocol signals and first-party identity anchors from registration.
Browser Fingerprinting Techniques
| Technique | Entropy | Method |
|---|---|---|
| Canvas Fingerprinting | 8–10 bits | Draws text/shapes on invisible canvas, hashes pixel output via toDataURL(). GPU hardware and driver versions produce unique outputs. |
| WebGL Fingerprinting | ~99% unique | Reads UNMASKED_VENDOR_WEBGL and UNMASKED_RENDERER_WEBGL strings. |
| AudioContext Fingerprinting | 3–5 bits | Generates signals via OscillatorNode, processes through DynamicsCompressorNode. |
| Font Enumeration | 13–15+ bits | JavaScript-based measurement against fallback fonts to detect installed typefaces. |
Server-Side Fingerprinting
Server-side techniques are the most durable identification layer because browser extensions cannot directly spoof lower-layer protocol characteristics.
JA4 TLS Fingerprinting
JA4 TLS fingerprinting (developed by FoxIO in 2023) sorts cipher suites and extensions before hashing, making it resilient to TLS extension randomisation in Chrome 110+ and Firefox 114+.
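The effect of sorting before hashing can be illustrated with a deliberately simplified sketch. This is not the full JA4 algorithm, which has several additional components; it only shows why a sorted, canonical form yields the same digest regardless of the order in which the client lists its values. The function name and the 12-character truncation are illustrative choices.

```python
import hashlib


def order_insensitive_hash(code_points: list[int]) -> str:
    """Hash 16-bit TLS code points after sorting them into canonical order."""
    canonical = ",".join(f"{value:04x}" for value in sorted(code_points))
    digest = hashlib.sha256(canonical.encode("ascii")).hexdigest()
    return digest[:12]  # shortened for readability
```

Two ClientHello messages that carry the same extensions in a different order therefore hash identically, while adding or removing an extension changes the digest.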
HTTP/2 SETTINGS Frame Fingerprinting
HTTP/2 SETTINGS fingerprints use hardcoded browser values: Chrome uses INITIAL_WINDOW_SIZE: 6291456 and MAX_CONCURRENT_STREAMS: 1000, while Firefox uses INITIAL_WINDOW_SIZE: 131072 and omits MAX_CONCURRENT_STREAMS.
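A profile match against these values could be sketched as follows, assuming the SETTINGS frame has already been parsed into a dictionary by the HTTP/2 terminator. `KNOWN_PROFILES`, `TRACKED_KEYS`, and `match_h2_profile` are hypothetical names, and the two profiles contain only the values quoted above, which may differ between browser versions.

```python
from typing import Optional

# Values quoted in the text; real browsers send further settings.
KNOWN_PROFILES = {
    "chrome": {"INITIAL_WINDOW_SIZE": 6291456, "MAX_CONCURRENT_STREAMS": 1000},
    "firefox": {"INITIAL_WINDOW_SIZE": 131072},
}
TRACKED_KEYS = ("INITIAL_WINDOW_SIZE", "MAX_CONCURRENT_STREAMS")


def match_h2_profile(settings: dict[str, int]) -> Optional[str]:
    """Return the browser whose tracked SETTINGS values match, if any."""
    observed = {key: settings[key] for key in TRACKED_KEYS if key in settings}
    for name, profile in KNOWN_PROFILES.items():
        if observed == profile:
            return name
    return None
```

Comparing the presence of a key as well as its value matters here, because the Firefox profile is distinguished partly by what it omits.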
Cross-Layer Consistency Checking
The strongest pattern is cross-layer inconsistency: if User-Agent, TCP fingerprint, TLS fingerprint, and HTTP/2 behaviour disagree, interception or spoofing is likely.
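In its simplest form the consistency check reduces to asking whether every layer points to the same client family. The sketch below assumes each layer has already been mapped to a family label by its own classifier; the function name and the layer keys used in the example are illustrative.

```python
def inconsistent_layers(claims: dict[str, str]) -> bool:
    """True when the layers do not all point to the same client family.

    `claims` maps a layer name (for example "user_agent", "tls", "http2")
    to the family that layer's fingerprint suggests. Empty values are
    ignored so that a missing signal does not count as a contradiction.
    """
    families = {family.lower() for family in claims.values() if family}
    return len(families) > 1
```

For example, a request whose User-Agent claims Chrome while its TLS fingerprint resembles Firefox would be flagged, whereas a request with only one available signal would not.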
First-Party Data Approaches
Login-Based Identity
Login-based identity is the gold standard for deterministic matching. Event registration serves as the identity anchor with a persistent user ID.
Email Hashing
Email hashing via SHA-256 of lowercased, trimmed email enables cross-platform matching without sharing raw PII. This is the standard used by LiveRamp, The Trade Desk (UID2.0), and Facebook Custom Audiences.
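The normalisation and hashing step described above is short enough to show in full. This is a minimal sketch; `hash_email` is an illustrative name, and individual platforms may apply extra normalisation rules that are not covered here.

```python
import hashlib


def hash_email(email: str) -> str:
    """Return the hex SHA-256 digest of a trimmed, lowercased email address."""
    normalised = email.strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()
```

Normalising before hashing is what makes the digest usable as a join key: the same address typed with different casing or stray whitespace still produces the same 64-character value.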
Storage Mechanisms
localStorage, IndexedDB, and Service Worker state are useful first-party persistence layers, but are increasingly partitioned or time-limited by modern browser privacy controls.
CNAME Cloaking
CNAME cloaking maps first-party subdomains to tracker infrastructure. Defences vary by browser, and this approach is increasingly constrained by anti-tracking systems.
Fingerprint Uniqueness and Stability
EFF's Panopticlick found 83.6% of fingerprints unique among privacy enthusiasts (470K users), while “Hiding in the Crowd” found only 33.6% unique among 2M general-population users, and ~18.5% on mobile devices.
Fingerprint stability averages approximately 1.8 weeks per browser (PETS 2020 longitudinal study). Eckersley's fingerprint evolution heuristic linked evolved fingerprints correctly in 99.1% of its guesses, with a 0.86% false-positive rate.
Commercial Fingerprinting Solutions
| Solution | Accuracy | Scale | Starting Price | Best For |
|---|---|---|---|---|
| Fingerprint Pro | 99.5% (claimed) | 4B+ devices, 50M+ daily events | $99/month (20K calls) | Web fingerprinting, incognito detection |
| ThreatMetrix | Enterprise-grade | 78B+ data records | Enterprise pricing | Full fraud prevention, behavioral biometrics |
| Arkose Labs | 125+ risk signals | Enterprise | Enterprise pricing | Progressive fingerprinting, challenge-response |
Fingerprint Pro offers the best balance of accuracy, integration ease, and cost for HotelMap's scale, with best-in-class incognito detection and straightforward SDK integration.
The Browser Privacy Landscape in 2026
Chrome and Privacy Sandbox
Google reversed on third-party cookie deprecation in July 2024, offering a user-choice model instead. Users manage preferences through Chrome settings where third-party cookies remain enabled by default.
On 17 October 2025, Google retired most Privacy Sandbox advertising APIs, citing low adoption: Topics API, Protected Audience API (FLEDGE), Attribution Reporting API, IP Protection, and Private Aggregation. Three technologies survived: CHIPS, FedCM, and Private State Tokens.
Safari 26 (September 2025)
Advanced Fingerprinting Protection (AFP) blocks known fingerprinting scripts and Google Tag Manager in Private Browsing. Third-party cookies have been blocked since 2019. JavaScript-set first-party cookies and script-writable storage are capped at 7 days without user interaction.
iCloud Private Relay uses a two-hop relay: Apple sees the user's IP but not the destination; Cloudflare/Akamai sees the destination but not the IP. It covers only Safari traffic and DNS queries.
Firefox 145 (November 2025)
Canvas readback returns randomized data. Font enumeration is blocked in favour of standard OS fonts. Hardware details are normalized, and WebGL/Audio API outputs are randomized. These protections are active in ETP Strict mode and Private Browsing. The combined effect reduced uniquely identifiable users by nearly 50% (from 65% to ~35% in testing).
What Still Works Across All Browsers in 2026
| Technique | Chrome | Safari | Firefox | Brave |
|---|---|---|---|---|
| Canvas fingerprinting | Works | AFP blocks | Noise | Randomized |
| WebGL renderer string | Full detail | Masked | Grouped | Randomized |
| AudioContext | Works | Noise (Private) | Randomized (Strict) | Varies |
| Font enumeration | Works | Limited | Blocked (Strict) | Randomized |
| Third-party cookies | Default on | Blocked | Partitioned | Blocked |
| localStorage / IndexedDB | Partitioned | 7-day ITP | Partitioned | Partitioned |
| JA4 TLS fingerprint | Server-side | Server-side | Server-side | Server-side |
| HTTP/2 SETTINGS | Server-side | Server-side | Server-side | Server-side |
| Timezone / locale | Works | Works | Works | Works |
Event Attendee Tracking for HotelMap
Detecting Non-Local Registrants
The key question is whether a registrant lives far enough from the venue to require accommodation. The strongest implementation combines multiple weighted signals into a distance estimate, then maps that estimate to a band:
| Distance Band | Classification |
|---|---|
| 0–50 miles (0–80 km) | Local |
| 50–150 miles (80–240 km) | Likely non-local |
| 150+ miles (240+ km) | Definitely non-local |
Recommended thresholds vary by event type. For multi-day conferences, 50 miles / 80 km is often the practical cutoff, and the tiered model in the table above is the most actionable way to apply it.
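The banding can be sketched with a great-circle distance calculation, assuming both the registrant's estimated location and the venue are available as latitude and longitude. The function names are illustrative, and assigning a distance of exactly 50 or 150 miles to the higher band is an assumption, since the bands in the text share their boundary values.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1: float, lon1: float,
                    lat2: float, lon2: float) -> float:
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def classify_distance(miles: float) -> str:
    """Map a distance to the three bands; boundaries go to the higher band."""
    if miles < 50:
        return "local"
    if miles < 150:
        return "likely non-local"
    return "definitely non-local"
```

For instance, a registrant located in central London attending a venue in central Paris is a little over 200 miles away and would fall in the "definitely non-local" band.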
Cross-Session Identity Maintenance
Tracking registrants across weeks between registration and event requires a layered approach that does not rely on third-party cookies. A five-stage pipeline works best: registration anchor, URL handoff, email token re-identification, first-party cookie fallback, then probabilistic fingerprinting as last resort.
URL Parameter Passing
URL parameter passing is critical for the registration-to-hotel-booking handoff. When an attendee clicks from the registration confirmation to the hotel booking page, appending ?reg_id=XXX&email_hash=YYY&event_id=ZZZ deterministically links sessions. This is the most reliable cross-session bridge because it requires no cookie persistence.
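Building and reading the handoff link can be sketched with the standard library. The parameter names match those in the text; the function names and the base URL passed in by the caller are illustrative.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_handoff_url(base_url: str, reg_id: str,
                      email_hash: str, event_id: str) -> str:
    """Append the three linking parameters to the booking page URL."""
    query = urlencode(
        {"reg_id": reg_id, "email_hash": email_hash, "event_id": event_id}
    )
    return f"{base_url}?{query}"


def parse_handoff_url(url: str) -> dict[str, str]:
    """Recover the linking parameters on the booking side."""
    params = parse_qs(urlsplit(url).query)
    return {key: values[0] for key, values in params.items()}
```

Using `urlencode` rather than string concatenation ensures that identifiers containing spaces or reserved characters survive the round trip intact.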
Email-Based Re-Identification
Every post-registration email contains a personalised link with a unique token (hotelmap.com/book?token=abc123) that maps to the registrant's server-side profile. Each email click re-establishes identity with zero cookie dependency. This is also the most privacy-compliant cross-session method.
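Token issuance and lookup can be sketched as below. The in-memory dictionary stands in for whatever server-side store holds registrant profiles, and `TokenStore` with its methods is a hypothetical interface rather than a description of an existing system.

```python
import secrets
from typing import Optional


class TokenStore:
    """Maps unguessable link tokens to registrant IDs (in-memory stand-in)."""

    def __init__(self) -> None:
        self._registrant_by_token: dict[str, str] = {}

    def issue(self, registrant_id: str) -> str:
        """Create a fresh random token for one email link."""
        token = secrets.token_urlsafe(32)
        self._registrant_by_token[token] = registrant_id
        return token

    def resolve(self, token: str) -> Optional[str]:
        """Return the registrant ID for a clicked token, or None if unknown."""
        return self._registrant_by_token.get(token)
```

Because the token is random and carries no personal data itself, the link reveals nothing about the registrant if it is forwarded or logged; identity is only established by the server-side lookup.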
Server-Side Identity Graph
The server-side identity graph stitches all signals into one record: {registrant_id, email_hash, cookie_id, events[], sessions[]}. When any identifier matches on a subsequent visit, sessions merge. Redis (or equivalent) provides real-time lookups.
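The merge-on-match behaviour can be sketched with an in-memory index standing in for Redis. `IdentityGraph` and `observe` are hypothetical names, and the profile here tracks only identifiers and sessions; the events list from the record above is left out for brevity.

```python
class IdentityGraph:
    """Merges sessions whenever any identifier matches a known profile."""

    def __init__(self) -> None:
        # (identifier kind, value) -> shared profile dict
        self._profile_by_identifier: dict = {}

    def observe(self, session_id: str, **identifiers: str) -> dict:
        """Record a session and return the (possibly merged) profile."""
        keys = [(kind, value) for kind, value in identifiers.items() if value]
        matched = []
        for key in keys:
            profile = self._profile_by_identifier.get(key)
            if profile is not None and all(profile is not m for m in matched):
                matched.append(profile)
        # Reuse the first matching profile, or start a new one.
        profile = matched[0] if matched else {"identifiers": set(), "sessions": []}
        for other in matched[1:]:
            profile["identifiers"] |= other["identifiers"]
            profile["sessions"] += other["sessions"]
        profile["identifiers"].update(keys)
        profile["sessions"].append(session_id)
        # Re-point every identifier at the surviving profile.
        for key in profile["identifiers"]:
            self._profile_by_identifier[key] = profile
        return profile
```

A visit that carries both a known cookie_id and a known email_hash therefore joins two previously separate profiles into one, which is the behaviour the text describes as sessions merging.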
Capture Rate Benchmarks and Optimisation
Capture rate is the share of non-local attendees who book through the official housing system versus booking direct, via OTAs, or via other channels. Kalibri Labs/PCMA Foundation research analysed 2M+ records across major event markets:
| Metric | Value |
|---|---|
| Official system bookings | 48% of non-local attendees |
| In-block, wrong channel | 23% stayed at block hotels but booked elsewhere |
| Price perception gap | 39% believed official rates more expensive |
| Cart abandonment email open rate | 70% |
The headline constraint is perception: many attendees believe direct booking is cheaper even when event rates are stronger. Transparent comparisons and clear value framing are usually more impactful than adding tracking complexity.
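The capture rate defined above is a simple ratio, shown here as a small helper so the denominator is explicit. The function name is illustrative, and the inputs are assumed to be counts already restricted to non-local attendees.

```python
def capture_rate(official_bookings: int, non_local_attendees: int) -> float:
    """Share of non-local attendees who booked through the official system."""
    if non_local_attendees <= 0:
        raise ValueError("non_local_attendees must be positive")
    return official_bookings / non_local_attendees
```

With 480 official bookings among 1,000 non-local attendees the result is 0.48, matching the 48% benchmark in the table.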
Retargeting Sequence for Unbooked Attendees
Email retargeting follows a tested cadence that materially outperforms general campaign traffic. Hotel cart abandonment emails achieve 70% open rates, 10% click-through rates, and $3.65 revenue per recipient. Retargeted ads deliver 180% higher click-through rates and 300% higher conversion rates versus first-time visitors.
Full-Funnel Analytics Architecture
The tracking architecture implements server-side event stitching using reg_id as the primary key across all systems. Four events form the complete funnel from registration to booking, with derived metrics calculated at reporting time.
Legal Compliance Architecture
IP addresses are personal data under GDPR (confirmed by CJEU rulings). EDPB Guidelines 2/2023 (October 2024) clarified that ePrivacy Article 5(3) applies to device fingerprinting, tracking pixels, IP-only tracking, and URL tracking, meaning many fingerprinting patterns require consent unless strictly necessary.
Tier 1: No Additional Consent Needed
Legitimate interest for service delivery and fraud prevention covers the following techniques:
Tier 2: Consent Recommended or Required
Terminal equipment access requires consent in most EU/UK contexts:
Tier 3: Explicit Consent Always Required
Data Controller Relationships
| Role | Entity | Responsibility |
|---|---|---|
| Data Controller | Event organizer | Determines purpose and means of processing. Registration forms must include clear privacy notices with separate opt-in checkboxes. Pre-checked boxes are invalid under GDPR. |
| Data Processor | HotelMap | Processes under a Data Processing Agreement (DPA). |
| Separate Data Controller | Hotels | Act independently post-booking for their own processing purposes. |
Store the derived is_non_local flag; the raw IP need not be stored beyond initial processing. Email hashing (SHA-256) provides a cross-system identifier. Retention is limited to 6–12 months after the event, with automated DSAR handling and deletion cascades.
Email-tokenized links and URL parameter passing are more reliable than fingerprinting and require no consent beyond registration itself. Server-set first-party cookies provide a fallback; device fingerprinting serves as a probabilistic last resort with appropriate consent. This system operates within GDPR/ePrivacy compliance using a tiered consent model.
Further Reading
VPN Detection and IP Intelligence
SNITCH: Leveraging IP Geolocation for Active VPN Detection (NDSS MADWeb 2025)
JA4T: TCP Fingerprinting (FoxIO)
How MTU Fingerprinting Identifies VPNs and Mobile Users (PeakHour)
p0f-mtu: p0f with MTU-based VPN detection (GitHub)
IP Intelligence: A Guide to Recent Advances (Claremont Graduate University ICDC)
Browser Fingerprinting
Fingerprinting and Tracing Shadows: Browser Fingerprinting on Digital Privacy (arXiv, 2024)
Browser Fingerprinting: A Survey (arXiv)
JA4+ Network Fingerprinting (FoxIO Blog)
HTTP/2 Fingerprinting: A Relatively Unknown Method
Fingerprint Pro FAQ and Technical Documentation
Browser Privacy and Anti-Tracking
Update on Plans for Privacy Sandbox Technologies (Google, 2025)
Mozilla Firefox Gets New Anti-Fingerprinting Defences (Bleeping Computer)
Firefox 85 Cracks Down on Supercookies (Mozilla Security Blog)
Event Housing and Capture Rate
Real Reasons Attendees Are Not Booking in the Convention Room Block (PCMA)
5 Event Retargeting Methods That Fill Empty Seats (The Events Calendar)
Privacy and Compliance
IP Geolocation and GDPR: What Developers Need to Know (Kamero AI)