Section 01

VPN Detection and True Geolocation Recovery

Detecting whether a visitor is masking location via VPN, proxy, or Tor is foundational for non-local attendee classification. Techniques range from simple IP lookups to deeper network forensics, each with tradeoffs in accuracy, latency, and privacy.

Commercial VPN detection: 95–99% (accuracy for known VPN providers via IP reputation databases)
Tor exit nodes: ~100% (detection rate via public exit node lists)
Residential proxies: <60% (the critical blind spot for all detection services)
VPN usage growth: +41% (increase in global VPN adoption, 2023 to 2025)

IP Reputation and Data Centre Detection

The most reliable VPN detection method is querying continuously updated IP classification databases that categorise addresses as residential, business, hosting/data centre, VPN, proxy, or Tor exit node. Providers like IPQualityScore, MaxMind, Spur.us, IPinfo, and IP2Location maintain these datasets by enumerating commercial VPN endpoints, mapping cloud-provider ASN ranges (AWS AS16509, DigitalOcean AS14061, OVH AS16276), and ingesting abuse intelligence.

IP-to-ASN mapping extends this by classifying each IP's Autonomous System. Hosting-provider ASNs strongly indicate data-centre traffic where VPN endpoints usually run, and can be evaluated at the CDN edge with sub-millisecond latency using self-hosted MMDB data.
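
As a minimal sketch of the ASN check described above, the snippet below classifies an address by longest-prefix match against a prefix-to-ASN table. The three hosting ASNs come from the text; the prefix table, the function names, and the use of documentation address ranges are illustrative assumptions, and a real deployment would read prefixes from an MMDB or BGP-derived dataset instead.

```python
import ipaddress
from typing import Optional

# Hosting-provider ASNs named in the text.
HOSTING_ASNS = {16509: "AWS", 14061: "DigitalOcean", 16276: "OVH"}

# Hypothetical prefix table built from documentation ranges (RFC 5737).
# ASN 64500 is a documentation ASN standing in for a residential ISP.
PREFIX_TO_ASN = {
    ipaddress.ip_network("192.0.2.0/24"): 16509,
    ipaddress.ip_network("198.51.100.0/24"): 14061,
    ipaddress.ip_network("203.0.113.0/24"): 64500,
}


def lookup_asn(ip: str) -> Optional[int]:
    """Return the ASN of the most specific matching prefix, if any."""
    addr = ipaddress.ip_address(ip)
    matches = [net for net in PREFIX_TO_ASN if addr in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return PREFIX_TO_ASN[best]


def classify_ip(ip: str) -> str:
    """Label an IP as hosting, non-hosting, or unknown based on its ASN."""
    asn = lookup_asn(ip)
    if asn is None:
        return "unknown"
    if asn in HOSTING_ASNS:
        return f"hosting ({HOSTING_ASNS[asn]})"
    return "non-hosting"
```

A hosting result is a strong hint rather than proof of VPN use, so it is best treated as one input to a wider score.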

MTU/TTL Fingerprinting

VPN encapsulation leaves measurable artefacts at the network layer. Tunnelling reduces the effective MTU below the standard 1500-byte Ethernet MTU, and the TCP MSS advertised in the handshake exposes this. Default TTL values also vary by OS, so a claimed Windows client with Linux-like TTL behaviour can indicate proxy or VPN mediation.
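
The two checks can be sketched as follows, assuming IPv4 with no IP or TCP options (so MTU = MSS + 40) and the commonly cited default initial TTLs of 64 for Linux/macOS and 128 for Windows. The function names and the OS table are illustrative, not part of any specific product.

```python
from typing import Optional

ETHERNET_MTU = 1500
IPV4_TCP_HEADER_BYTES = 40  # 20-byte IPv4 header + 20-byte TCP header, no options

# Common initial TTL values; the observed TTL is the initial value minus hop count.
COMMON_INITIAL_TTLS = (64, 128, 255)
# Assumed OS defaults for illustration only.
EXPECTED_TTL_BY_OS = {"windows": 128, "linux": 64, "macos": 64}


def mtu_from_mss(mss: int) -> int:
    """Infer the path MTU implied by an advertised TCP MSS (IPv4, no options)."""
    return mss + IPV4_TCP_HEADER_BYTES


def looks_tunnelled(mss: int) -> bool:
    """True when the implied MTU is below Ethernet's 1500 bytes.

    This is only a hint: PPPoE (MTU 1492) and some mobile networks
    also reduce the MTU without any VPN being involved.
    """
    return mtu_from_mss(mss) < ETHERNET_MTU


def guess_initial_ttl(observed_ttl: int) -> Optional[int]:
    """Round the observed TTL up to the nearest common initial value."""
    for initial in COMMON_INITIAL_TTLS:
        if observed_ttl <= initial:
            return initial
    return None


def ttl_matches_claimed_os(observed_ttl: int, claimed_os: str) -> bool:
    """Compare the inferred initial TTL with the default for the claimed OS."""
    expected = EXPECTED_TTL_BY_OS.get(claimed_os.lower())
    if expected is None:
        return True  # unknown OS: no opinion
    return guess_initial_ttl(observed_ttl) == expected
```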

Network-Layer Fingerprints

SNITCH research (NDSS MADWeb 2025) demonstrated 89.1% accuracy in detecting VPN-tunnelled connections by comparing observed round-trip times against expected RTTs for claimed IP geolocation, with strong precision and recall at scale.

VPN Protocol | Typical MTU | MSS Value | Detection Reliability
WireGuard (IPv4) | ~1440 | ~1400 | High
OpenVPN (UDP) | ~1400–1409 | ~1360–1369 | High
IPsec/IKEv2 | ~1380–1438 | ~1340–1398 | Medium
No VPN (Ethernet) | 1500 | 1460 | Baseline

WebRTC and DNS Leaks

WebRTC Leak Detection

Creating an RTCPeerConnection with a STUN server and inspecting ICE candidates can reveal mismatches between connection IP and candidate addresses. Browser mitigations are reducing this signal over time, but it remains a useful secondary check.
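
A server-side sketch of the comparison step, assuming the ICE candidate strings have already been gathered in the browser and posted to the server (collection is not shown). It parses the standard candidate attribute layout and reports any routable address that differs from the connection IP; the function names are illustrative.

```python
import ipaddress
from typing import List, Optional, Tuple

# Address ranges that never identify a public egress point.
LOCAL_NETWORKS = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "fc00::/7")
]


def parse_candidate(line: str) -> Optional[Tuple[str, str]]:
    """Extract (address, candidate type) from an ICE candidate line.

    Layout: candidate:<foundation> <component> <transport> <priority>
            <address> <port> typ <type> ...
    """
    parts = line.split()
    if len(parts) < 8 or not parts[0].startswith("candidate:") or parts[6] != "typ":
        return None
    return parts[4], parts[7]


def is_local_address(ip) -> bool:
    return ip.is_loopback or ip.is_link_local or any(ip in net for net in LOCAL_NETWORKS)


def leaked_public_ips(candidates: List[str], connection_ip: str) -> List[str]:
    """Return routable candidate addresses that differ from the connection IP."""
    leaked: List[str] = []
    for line in candidates:
        parsed = parse_candidate(line)
        if parsed is None:
            continue
        address, _cand_type = parsed
        try:
            ip = ipaddress.ip_address(address)
        except ValueError:
            continue  # mDNS hostnames (*.local) used by modern browsers
        if is_local_address(ip):
            continue
        if address != connection_ip and address not in leaked:
            leaked.append(address)
    return leaked
```

An empty result does not prove the absence of a VPN, since browsers increasingly hide host candidates behind mDNS names.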

DNS Leak Detection

Resolver-level mismatch analysis can indicate that DNS traffic is bypassing the VPN path. DNS-over-HTTPS adoption has reduced practical leak rates, so this is best used as enrichment rather than a primary classifier.

Timezone and Locale Mismatches

Comparing IP-derived timezone against browser-reported timezone via Intl.DateTimeFormat().resolvedOptions().timeZone is simple and universally supported, requiring no permission.

Combined with navigator.language and Accept-Language header analysis, this passive approach narrows likely location with high confidence and no permission prompts.
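
A minimal sketch of the passive comparison, assuming the browser-reported timezone name has been sent to the server alongside the request and that the IP lookup supplies a timezone name and ISO country code. The timezone check here is a naive name comparison (neighbouring zones with the same offset will be flagged), and the function names are illustrative.

```python
from typing import Dict, List, Tuple


def parse_accept_language(header: str) -> List[Tuple[str, float]]:
    """Parse an Accept-Language header into (tag, quality) pairs, best first."""
    ranked: List[Tuple[str, float]] = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        quality = 1.0  # default when no q-value is given
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        ranked.append((tag.strip().lower(), quality))
    ranked.sort(key=lambda pair: pair[1], reverse=True)  # stable sort keeps header order on ties
    return ranked


def passive_mismatch_signals(
    ip_timezone: str, browser_timezone: str, ip_country: str, accept_language: str
) -> Dict[str, bool]:
    """Flag disagreement between IP-derived data and browser-reported data."""
    # Two-letter subtags after the language are treated as region codes.
    regions = {
        sub.upper()
        for tag, _ in parse_accept_language(accept_language)
        for sub in tag.split("-")[1:]
        if len(sub) == 2 and sub.isalpha()
    }
    return {
        "timezone_mismatch": ip_timezone != browser_timezone,
        "language_region_mismatch": bool(regions) and ip_country.upper() not in regions,
    }
```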

HotelMap Implementation Note
Timezone comparison is identified as the single most valuable passive signal for non-local detection. It requires no consent and works across all major browsers.

True Location Recovery Behind VPN

Technique | Resolution | Accuracy | Permission | Spoofability
HTML5 Geolocation API | Street-level (1–100m) | Very High | Required | Low
WiFi positioning | 10–50m urban | Very High | Required | Low
Browser timezone | Country/region | High | No | Medium
Language/locale | Country | Good | No | Medium
Accept-Language header | Country | Good | No | Low
WebRTC IP leak | City | Declining | No | N/A

Commercial VPN Detection Services

Provider | VPN Detection | City Accuracy | Starting Price | Strength
IPQualityScore | 99.9% claimed | ~85% | Free (200/day) | Fraud scoring, residential proxy detection
Spur.us | 60M+ anonymous IPs | ~80% | Free (1M/month) | VPN provider attribution by name
MaxMind GeoIP2 | Good (not specialized) | 80–90% | ~$30/month | Industry standard, self-hosted MMDB
IPinfo | 5 boolean privacy flags | ~85% | $49/month | Daily updates via Probe Network
IP2Location | Good | ~75% | $99/year | Best value, broad coverage

Recommended Stack
MaxMind GeoIP2 + Spur.us or IPQS

$100–500/month for moderate traffic volumes. Combines industry-standard geolocation accuracy with specialised VPN provider attribution.

Section 02

Cookieless User Identification

Beyond cookies, a rich set of identifiers exists to match repeat visitors. These range from client-side browser fingerprints to server-side protocol signals and first-party identity anchors from registration.

Browser Fingerprinting Techniques

Technique | Entropy | Method
Canvas Fingerprinting | 8–10 bits | Draws text/shapes on invisible canvas, hashes pixel output via toDataURL(). GPU hardware and driver versions produce unique outputs.
WebGL Fingerprinting | ~99% unique | Reads UNMASKED_VENDOR_WEBGL and UNMASKED_RENDERER_WEBGL strings.
AudioContext Fingerprinting | 3–5 bits | Generates signals via OscillatorNode, processes through DynamicsCompressorNode.
Font Enumeration | 13–15+ bits | JavaScript-based measurement against fallback fonts to detect installed typefaces.

Server-Side Fingerprinting

Server-side techniques are the most durable identification layer because browser extensions cannot directly spoof lower-layer protocol characteristics.

JA4 TLS Fingerprinting

JA4 TLS fingerprinting (developed by FoxIO in 2023) sorts cipher suites and extensions before hashing, making it resilient to TLS extension randomisation in Chrome 110+ and Firefox 114+.
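
The effect of sorting before hashing can be illustrated with a simplified sketch. This is not a complete JA4 implementation (the full format carries additional fields and rules); it only shows why a sorted, truncated SHA-256 digest stays stable when the client shuffles the order of its values. The function names are illustrative.

```python
import hashlib
from typing import Iterable


def _digest(hex_values: Iterable[str]) -> str:
    """SHA-256 over a comma-joined list, truncated to 12 hex characters."""
    joined = ",".join(hex_values)
    return hashlib.sha256(joined.encode("ascii")).hexdigest()[:12]


def order_insensitive_digest(values: Iterable[int]) -> str:
    """Hash the values after sorting, so permutations give the same digest."""
    return _digest(sorted(f"{v:04x}" for v in values))


def order_sensitive_digest(values: Iterable[int]) -> str:
    """Hash the values in wire order, for comparison."""
    return _digest(f"{v:04x}" for v in values)
```

Given the same set of extension IDs in two different orders, `order_insensitive_digest` returns the same value for both, while `order_sensitive_digest` does not.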

HTTP/2 SETTINGS Frame Fingerprinting

HTTP/2 SETTINGS fingerprints use hardcoded browser values: Chrome uses INITIAL_WINDOW_SIZE: 6291456 and MAX_CONCURRENT_STREAMS: 1000, while Firefox uses INITIAL_WINDOW_SIZE: 131072 and omits MAX_CONCURRENT_STREAMS.

Cross-Layer Consistency Checking

The strongest pattern is cross-layer inconsistency: if User-Agent, TCP fingerprint, TLS fingerprint, and HTTP/2 behaviour disagree, interception or spoofing is likely.
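
One slice of this check can be sketched by comparing the browser claimed in the User-Agent against the HTTP/2 SETTINGS values quoted in the previous section. The expected values are taken from the text and may change between browser versions; the User-Agent parsing is deliberately crude, and the function names are illustrative.

```python
from typing import Dict, List, Optional

# SETTINGS values as quoted in the text; None means the setting is omitted.
EXPECTED_H2_SETTINGS: Dict[str, Dict[str, Optional[int]]] = {
    "chrome": {"INITIAL_WINDOW_SIZE": 6291456, "MAX_CONCURRENT_STREAMS": 1000},
    "firefox": {"INITIAL_WINDOW_SIZE": 131072, "MAX_CONCURRENT_STREAMS": None},
}


def claimed_browser(user_agent: str) -> Optional[str]:
    """Very rough browser family detection from the User-Agent header."""
    ua = user_agent.lower()
    if "firefox/" in ua:
        return "firefox"
    if "chrome/" in ua:
        return "chrome"
    return None


def h2_inconsistencies(user_agent: str, observed: Dict[str, int]) -> List[str]:
    """List SETTINGS values that disagree with the claimed browser."""
    browser = claimed_browser(user_agent)
    if browser is None:
        return []  # no baseline to compare against
    problems = []
    for name, expected in EXPECTED_H2_SETTINGS[browser].items():
        actual = observed.get(name)  # None when the client omitted the setting
        if actual != expected:
            problems.append(f"{name}: expected {expected}, observed {actual}")
    return problems
```

A non-empty result means the User-Agent and the HTTP/2 layer disagree; the same pattern extends to TLS and TCP fingerprints.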

First-Party Data Approaches

Login-Based Identity

Login-based identity is the gold standard for deterministic matching. Event registration serves as the identity anchor with a persistent user ID.

Email Hashing

Email hashing via SHA-256 of lowercased, trimmed email enables cross-platform matching without sharing raw PII. This is the standard used by LiveRamp, The Trade Desk (UID2.0), and Facebook Custom Audiences.
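
The normalisation and hashing step is short enough to show in full. This sketch applies only the two normalisation rules named above (trim and lowercase); individual platforms may specify further rules, which are not covered here.

```python
import hashlib


def hash_email(email: str) -> str:
    """Return the SHA-256 hex digest of a trimmed, lowercased email address."""
    normalised = email.strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()
```

Because normalisation happens before hashing, `hash_email("  Alice@Example.com ")` and `hash_email("alice@example.com")` produce the same 64-character digest.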

Storage Mechanisms

localStorage, IndexedDB, and Service Worker state are useful first-party persistence layers, but are increasingly partitioned or time-limited by modern browser privacy controls.

CNAME Cloaking

CNAME cloaking maps first-party subdomains to tracker infrastructure. Defences vary by browser, and this approach is increasingly constrained by anti-tracking systems.

Fingerprint Uniqueness and Stability

EFF's Panopticlick found 83.6% of fingerprints unique among privacy enthusiasts (470K users), while “Hiding in the Crowd” found only 33.6% unique among 2M general-population users, and ~18.5% on mobile devices.

Fingerprint stability averages approximately 1.8 weeks per browser (PETS 2020 longitudinal study). In Eckersley's Panopticlick study, a simple heuristic for linking evolved fingerprints to their earlier versions was correct in 99.1% of its guesses, with a 0.86% false positive rate.

Commercial Fingerprinting Solutions

Solution | Accuracy | Scale | Starting Price | Best For
Fingerprint Pro | 99.5% (claimed) | 4B+ devices, 50M+ daily events | $99/month (20K calls) | Web fingerprinting, incognito detection
ThreatMetrix | Enterprise-grade | 78B+ data records | Enterprise pricing | Full fraud prevention, behavioral biometrics
Arkose Labs | 125+ risk signals | Enterprise | Enterprise pricing | Progressive fingerprinting, challenge-response

Recommended
Fingerprint Pro at $99/month

Best balance of accuracy, integration ease, and cost for HotelMap's scale. Offers best-in-class incognito detection and straightforward SDK integration.

Section 03

The Browser Privacy Landscape in 2026

Chrome and Privacy Sandbox

Google reversed on third-party cookie deprecation in July 2024, offering a user-choice model instead. Users manage preferences through Chrome settings where third-party cookies remain enabled by default.

On 17 October 2025, Google retired most Privacy Sandbox advertising APIs: Topics API, Protected Audience API (FLEDGE), Attribution Reporting API, IP Protection, and Private Aggregation — citing low adoption. Three technologies survived: CHIPS, FedCM, and Private State Tokens.

Safari 26 (September 2025)

Advanced Fingerprinting Protection (AFP) blocks known fingerprinting scripts and Google Tag Manager in Private Browsing. Third-party cookies have been blocked since 2019. JavaScript-set first-party cookies and script-writable storage are capped at 7 days without user interaction.

iCloud Private Relay uses a two-hop relay: Apple sees the user's IP but not the destination; Cloudflare/Akamai sees the destination but not the IP. It covers only Safari traffic and DNS queries.

Firefox 145 (November 2025)

Canvas readback returns randomized data. Font enumeration is blocked in favour of standard OS fonts. Hardware details are normalized, and WebGL/Audio API outputs are randomized. These protections are active in ETP Strict mode and Private Browsing. The combined effect reduced uniquely identifiable users by nearly 50% (from 65% to ~35% in testing).

What Still Works Across All Browsers in 2026

Technique | Chrome | Safari | Firefox | Brave
Canvas fingerprinting | Works | AFP blocks | Noise | Randomized
WebGL renderer string | Full detail | Masked | Grouped | Randomized
AudioContext | Works | Noise (Private) | Randomized (Strict) | Varies
Font enumeration | Works | Limited | Blocked (Strict) | Randomized
Third-party cookies | Default on | Blocked | Partitioned | Blocked
localStorage / IndexedDB | Partitioned | 7-day ITP | Partitioned | Partitioned
JA4 TLS fingerprint | Server-side | Server-side | Server-side | Server-side
HTTP/2 SETTINGS | Server-side | Server-side | Server-side | Server-side
Timezone / locale | Works | Works | Works | Works

Key Pattern
Server-side signals (JA4, HTTP/2 SETTINGS, TCP/IP fingerprinting) remain reliable across all browsers. Client-side fingerprinting is most effective on Chrome (66.8% global market share) and significantly degraded on Safari, Firefox, and Brave.

Section 04

Event Attendee Tracking for HotelMap

Detecting Non-Local Registrants

The key question is whether a registrant lives far enough from the venue to require accommodation. The strongest implementation combines multiple weighted signals:

1. Registration address and zip code (weight 0.5): Gold standard when available. Geocoding APIs convert address data to coordinates for Haversine distance scoring with very high reliability.
2. IP geolocation at registration (weight 0.3): Fast passive signal for initial classification. Country accuracy is high; city-level precision varies by provider, mobile network, and VPN usage.
3. Browser timezone comparison (weight 0.2): Strong tiebreaker signal for non-local inference using browser timezone versus venue timezone. Works without permissions and across major browsers.

Distance Band | Classification
0–50 miles (0–80 km) | Local
50–150 miles (80–240 km) | Likely non-local
150+ miles (240+ km) | Definitely non-local

Recommended thresholds vary by event type. For multi-day conferences, 50 miles / 80 km is often the practical cutoff. A tiered model is most actionable: 0–50 miles local, 50–150 likely non-local, 150+ definitely non-local.
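
The distance bands and signal weights above can be sketched as follows. The Haversine formula and the three weights come from the text; how the weights are combined is an assumption (here, a weighted vote renormalised over whichever signals are available), as is the choice to place exactly 50 and 150 miles in the higher band.

```python
import math
from typing import Dict, Optional

EARTH_RADIUS_MILES = 3958.8
WEIGHTS = {"address": 0.5, "ip": 0.3, "timezone": 0.2}


def haversine_miles(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def distance_band(miles: float) -> str:
    """Map a distance to the tiered classification (boundaries are an assumption)."""
    if miles < 50:
        return "local"
    if miles < 150:
        return "likely non-local"
    return "definitely non-local"


def non_local_score(signals: Dict[str, Optional[bool]]) -> Optional[float]:
    """Weighted share of available signals that say 'non-local'.

    Each signal is True (non-local), False (local), or None (unavailable).
    Returns None when no weighted signal is available.
    """
    available = {k: v for k, v in signals.items() if k in WEIGHTS and v is not None}
    total = sum(WEIGHTS[k] for k in available)
    if total == 0:
        return None
    return sum(WEIGHTS[k] for k, v in available.items() if v) / total
```

For example, a registrant whose address and timezone both indicate non-local while the IP indicates local scores 0.7 under this scheme.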

Cross-Session Identity Maintenance

Tracking registrants across weeks between registration and event requires a layered approach that does not rely on third-party cookies. A five-stage pipeline works best: registration anchor, URL handoff, email token re-identification, first-party cookie fallback, then probabilistic fingerprinting as last resort.

1. Registration: capture reg_id and email_hash
2. URL Handoff: pass reg_id and event_id
3. Email Re-ID: tokenized link match
4. Cookie Fallback: hm_visitor_id session merge
5. Fingerprint: probabilistic last resort

Server-side identity resolution: priority order from deterministic to probabilistic

URL Parameter Passing

Critical for the registration-to-hotel-booking handoff. When an attendee clicks from the registration confirmation to the hotel booking page, appending ?reg_id=XXX&email_hash=YYY&event_id=ZZZ deterministically links sessions. This is the most reliable cross-session bridge because it requires no cookie persistence.
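
A minimal sketch of building and reading the handoff URL with the three parameters named above. The base URL and function names are placeholders for illustration.

```python
from typing import Dict
from urllib.parse import parse_qs, urlencode, urlsplit

HANDOFF_KEYS = ("reg_id", "email_hash", "event_id")


def build_handoff_url(base_url: str, reg_id: str, email_hash: str, event_id: str) -> str:
    """Append the identity parameters to the booking page URL."""
    query = urlencode({"reg_id": reg_id, "email_hash": email_hash, "event_id": event_id})
    separator = "&" if urlsplit(base_url).query else "?"
    return f"{base_url}{separator}{query}"


def parse_handoff_url(url: str) -> Dict[str, str]:
    """Read the identity parameters back on the booking side."""
    params = parse_qs(urlsplit(url).query)
    return {key: params[key][0] for key in HANDOFF_KEYS if key in params}
```

On the booking page, the parsed `reg_id` becomes the lookup key for the registrant's server-side profile.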

Email-Based Re-Identification

Every post-registration email contains a personalised link with a unique token (hotelmap.com/book?token=abc123) that maps to the registrant's server-side profile. Each email click re-establishes identity with zero cookie dependency. This is also the most privacy-compliant cross-session method.
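
Token issuance and lookup can be sketched as below, assuming an in-memory mapping in place of a real datastore and omitting expiry and revocation. The class and function names are illustrative; the link format follows the example in the text.

```python
import secrets
from typing import Dict, Optional


class TokenStore:
    """Maps opaque email tokens to registrant IDs (in-memory stand-in)."""

    def __init__(self) -> None:
        self._token_to_reg: Dict[str, str] = {}

    def issue(self, reg_id: str) -> str:
        """Create an unguessable, URL-safe token for one registrant."""
        token = secrets.token_urlsafe(32)
        self._token_to_reg[token] = reg_id
        return token

    def resolve(self, token: str) -> Optional[str]:
        """Return the registrant ID for a token, or None if unknown."""
        return self._token_to_reg.get(token)


def booking_link(token: str) -> str:
    """Build the personalised link placed in each email."""
    return f"https://hotelmap.com/book?token={token}"
```

Because the token is random and carries no personal data, the link itself reveals nothing about the registrant if it is forwarded or logged.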

Server-Side Identity Graph

Stitches all signals: {registrant_id, email_hash, cookie_id, events[], sessions[]}. When any identifier matches on a subsequent visit, sessions merge. Redis (or equivalent) provides real-time lookups.
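
The merge-on-match behaviour can be sketched with an in-memory index standing in for Redis. The profile fields mirror the structure above in simplified form (the events list is omitted); the class names and the merge policy (first matched profile absorbs the others) are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class Profile:
    registrant_id: Optional[str] = None
    email_hash: Optional[str] = None
    cookie_ids: Set[str] = field(default_factory=set)
    sessions: List[str] = field(default_factory=list)


class IdentityGraph:
    def __init__(self) -> None:
        # (identifier kind, value) -> profile; a dict stands in for Redis.
        self._index: Dict[Tuple[str, str], Profile] = {}

    def observe(self, session_id: str, registrant_id: Optional[str] = None,
                email_hash: Optional[str] = None, cookie_id: Optional[str] = None) -> Profile:
        """Attach a session to the profile matching any supplied identifier."""
        pairs = (("reg", registrant_id), ("email", email_hash), ("cookie", cookie_id))
        keys = [(kind, value) for kind, value in pairs if value]
        matched: List[Profile] = []
        for key in keys:
            found = self._index.get(key)
            if found is not None and all(found is not m for m in matched):
                matched.append(found)
        profile = matched[0] if matched else Profile()
        for other in matched[1:]:  # merge previously separate profiles
            profile.registrant_id = profile.registrant_id or other.registrant_id
            profile.email_hash = profile.email_hash or other.email_hash
            profile.cookie_ids |= other.cookie_ids
            profile.sessions.extend(other.sessions)
            for key, value in list(self._index.items()):
                if value is other:
                    self._index[key] = profile
        profile.registrant_id = profile.registrant_id or registrant_id
        profile.email_hash = profile.email_hash or email_hash
        if cookie_id:
            profile.cookie_ids.add(cookie_id)
        profile.sessions.append(session_id)
        for key in keys:
            self._index[key] = profile
        return profile
```

A visit that carries both a known email hash and a known cookie links two previously separate profiles into one, which is the stitching behaviour described above.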

Capture Rate Benchmarks and Optimisation

Capture rate is the share of non-local attendees who book through the official housing system versus booking direct, via OTAs, or via other channels. Kalibri Labs/PCMA Foundation research analysed 2M+ records across major event markets:

Metric | Value
Official system bookings | 48% of non-local attendees
In-block, wrong channel | 23% stayed at block hotels but booked elsewhere
Price perception gap | 39% believed official rates more expensive
Cart abandonment email open rate | 70%

The headline constraint is perception: many attendees believe direct booking is cheaper even when event rates are stronger. Transparent comparisons and clear value framing are usually more impactful than adding tracking complexity.

Price Perception Problem
The biggest lever for improving capture rate is communication, not tracking. Presenting transparent rate comparisons addresses the perception that independent booking is cheaper. Attendees decide within the first few seconds on a booking page.

Retargeting Sequence for Unbooked Attendees

Email retargeting follows a tested cadence that materially outperforms general campaign traffic. Hotel cart abandonment emails achieve 70% open rates, 10% click-through rates, and $3.65 revenue per recipient. Retargeted ads deliver 180% higher click-through rates and 300% higher conversion rates versus first-time visitors.

1. Immediately post-registration: Confirmation + hotel CTA. Highest-conversion moment. Integrate booking as next step in the flow.
2. 24–48 hours later: 'Complete your stay' reminder. Feature top 3 hotels by proximity and value.
3. 1 week post-registration: Rate urgency. Real-time availability messaging: 'rooms filling up.'
4. 2 weeks before event: 'Last chance for group rate'. Countdown timer and social proof.
5. 1 week before event: Final reminder. Remaining inventory urgency messaging.
6. Day of cutoff: Urgency close. 'Book today or lose the group rate.'
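
The cadence can be turned into concrete send dates with simple date arithmetic. This sketch assumes the 24–48 hour step is sent at the 24-hour mark and that steps falling before the registration date (for late registrants) are dropped; both choices, and the function name, are illustrative.

```python
from datetime import date, timedelta
from typing import List, Tuple


def retargeting_schedule(
    registered_on: date, event_date: date, cutoff_date: date
) -> List[Tuple[date, str]]:
    """Return (send date, message) pairs for one registrant, in date order."""
    steps = [
        (registered_on, "Confirmation + hotel CTA"),
        (registered_on + timedelta(days=1), "'Complete your stay' reminder"),
        (registered_on + timedelta(weeks=1), "Rate urgency"),
        (event_date - timedelta(weeks=2), "'Last chance for group rate'"),
        (event_date - timedelta(weeks=1), "Final reminder"),
        (cutoff_date, "Urgency close"),
    ]
    # Event-relative steps can land before a late registration; skip those.
    valid = [step for step in steps if step[0] >= registered_on]
    return sorted(valid, key=lambda step: step[0])
```

A production scheduler would also stop the sequence as soon as the attendee books.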

Full-Funnel Analytics Architecture

The tracking architecture implements server-side event stitching using reg_id as the primary key across all systems. Four events form the complete funnel from registration to booking, with derived metrics calculated at reporting time.

GA4 event schema: reg_id stitches all events server-side

Section 06

Further Reading

VPN Detection and IP Intelligence

SNITCH: Leveraging IP Geolocation for Active VPN Detection (NDSS MADWeb 2025)

JA4T: TCP Fingerprinting (FoxIO)

How MTU Fingerprinting Identifies VPNs and Mobile Users (PeakHour)

p0f-mtu: p0f with MTU-based VPN detection (GitHub)

IP Intelligence: A Guide to Recent Advances (Claremont Graduate University ICDC)

Browser Fingerprinting

Fingerprinting and Tracing Shadows: Browser Fingerprinting on Digital Privacy (arXiv, 2024)

Browser Fingerprinting: A Survey (arXiv)

JA4+ Network Fingerprinting (FoxIO Blog)

HTTP/2 Fingerprinting: A Relatively Unknown Method

Fingerprint Pro FAQ and Technical Documentation

Browser Privacy and Anti-Tracking

Update on Plans for Privacy Sandbox Technologies (Google, 2025)

Mozilla Firefox Gets New Anti-Fingerprinting Defences (Bleeping Computer)

Firefox 85 Cracks Down on Supercookies (Mozilla Security Blog)

Event Housing and Capture Rate

Real Reasons Attendees Are Not Booking in the Convention Room Block (PCMA)

5 Event Retargeting Methods That Fill Empty Seats (The Events Calendar)

Privacy and Compliance

IP Geolocation and GDPR: What Developers Need to Know (Kamero AI)