IP Reputation Scoring: Detecting Datacenters, Proxies, and VPNs
How IP reputation scoring works under the hood: detecting datacenter IPs, identifying proxies and VPNs, classifying residential vs commercial traffic, and building reputation databases.
Why IP Addresses Matter for Fraud Detection
An email address tells you who someone claims to be. An IP address tells you where they actually are and what kind of network they are using. When someone signs up for your service from a residential ISP in Portland, that is very different from the same signup coming through a datacenter IP in Romania routed through three proxy layers.
IP reputation is one of the most powerful fraud signals available, but it is also one of the most misunderstood. This article breaks down how IP reputation scoring actually works, what signals matter most, and how to use them without accidentally blocking legitimate users.
The Four Categories of IP Classification
At a high level, every IP address falls into one of four categories for fraud detection purposes:
1. Residential IPs
These are assigned by consumer ISPs (Comcast, Verizon, BT, Deutsche Telekom, etc.) to home users. They are the lowest-risk category because they are hard to acquire at scale. Getting a residential IP means paying for a home internet connection, which is expensive and traceable.
Fraud score impact: +5 to +15 (positive, trust-building)
2. Commercial/Corporate IPs
Assigned to businesses with dedicated internet connections. These are slightly higher risk than residential because corporate networks can host many users behind a single NAT IP, but they are still generally trustworthy.
Fraud score impact: 0 to +10 (neutral to slightly positive)
3. Datacenter IPs
This is where things get interesting. Datacenter IPs belong to hosting providers (AWS, DigitalOcean, Hetzner, OVH, etc.). Legitimate users rarely sign up for SaaS products from datacenter IPs. Bots almost always do.
Fraud score impact: -15 to -30 (strong negative signal)
4. Proxy/VPN/Tor IPs
Traffic routed through anonymizing services. This category ranges from privacy-conscious consumers using a commercial VPN to criminals using chained proxies. The risk level depends heavily on the specific service.
Fraud score impact: -10 to -25 (negative, varies by provider)
How Datacenter Detection Works
Identifying datacenter IPs is conceptually simple but operationally complex. Here are the main techniques:
ASN (Autonomous System Number) Classification
Every IP range is owned by an organization identified by its ASN. Major cloud providers have well-known ASNs:
// Simplified datacenter ASN check
const DATACENTER_ASNS = new Set([
  'AS16509', // Amazon (AWS)
  'AS14618', // Amazon
  'AS15169', // Google Cloud
  'AS396982', // Google Cloud
  'AS8075', // Microsoft Azure
  'AS13335', // Cloudflare
  'AS14061', // DigitalOcean
  'AS63949', // Linode (Akamai)
  'AS24940', // Hetzner
  'AS16276', // OVH
  'AS46606', // Unified Layer
  'AS20473', // Vultr / Choopa
  // ... hundreds more
]);

interface ASNLookupResult {
  asn: string;
  organization: string;
  isDatacenter: boolean;
}

function checkASN(asnResult: ASNLookupResult): number {
  if (DATACENTER_ASNS.has(asnResult.asn)) {
    return -25; // Strong negative signal
  }

  // Some ASNs are hosting companies not in our list.
  // Check the organization name for hosting keywords.
  const hostingKeywords = [
    'hosting', 'server', 'cloud', 'vps',
    'datacenter', 'data center', 'colocation',
  ];
  const orgLower = asnResult.organization.toLowerCase();
  const matchesHosting = hostingKeywords.some(
    (kw) => orgLower.includes(kw)
  );
  if (matchesHosting) {
    return -20; // Likely datacenter
  }

  return 0; // Unknown, neutral
}

Reverse DNS Patterns
Datacenter IPs often have reverse DNS (PTR) records that reveal their nature:
// Common datacenter rDNS patterns.
// Dots are escaped so they match a literal "." rather than any character.
const DATACENTER_RDNS_PATTERNS = [
  /\.compute\.amazonaws\.com$/,
  /\.bc\.googleusercontent\.com$/,
  /\.cloudapp\.azure\.com$/,
  /\.vultrusercontent\.com$/,
  /\.linodeusercontent\.com$/,
  /\.your-server\.de$/, // Hetzner
  /\.contaboserver\.net$/,
  /\.ovh\.(net|com|ca)$/,
  /\.dedicated\./,
  /\.vps\./,
  /\.server\./,
];

function checkReverseDNS(hostname: string | null): number {
  if (!hostname) return -5; // No rDNS is mildly suspicious

  // PTR records are case-insensitive, so normalize before matching.
  const normalized = hostname.toLowerCase();
  const isDatacenter = DATACENTER_RDNS_PATTERNS.some(
    (pattern) => pattern.test(normalized)
  );
  return isDatacenter ? -20 : 0;
}

Proxy and VPN Detection
Detecting proxies and VPNs is harder than detecting datacenters because the entire point of these services is to disguise traffic origin. Here are the main approaches:
Known VPN Provider IP Ranges
Commercial VPN providers operate thousands of servers with known IP ranges. Maintaining a database of these ranges requires continuous scanning and monitoring:
- Active probing: Connecting to known VPN endpoints and recording the exit IPs
- DNS enumeration: VPN providers often use predictable DNS patterns for their servers
- BGP monitoring: Watching for IP range announcements from known VPN-associated ASNs
- User reports and crowdsourcing: Community databases and VPN detection APIs aggregate user reports
At BigShield, we track over 8,200 IP ranges associated with 140+ VPN providers. This database is updated daily.
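Once a range database exists, the lookup itself is a CIDR membership test. The following is a minimal sketch for IPv4 only, not a description of BigShield's implementation. The names `VpnRange`, `SAMPLE_VPN_RANGES`, `inCidr`, and `findVpnProvider` are hypothetical, and the sample ranges are documentation-reserved blocks rather than real VPN exit ranges.

```typescript
interface VpnRange {
  cidr: string;     // IPv4 block in CIDR notation
  provider: string; // label for the VPN service
}

// Hypothetical entries using documentation-reserved blocks (RFC 5737),
// not real VPN exit ranges.
const SAMPLE_VPN_RANGES: VpnRange[] = [
  { cidr: '198.51.100.0/24', provider: 'ExampleVPN' },
  { cidr: '203.0.113.64/26', provider: 'SampleTunnel' },
];

// Convert a dotted-quad IPv4 address to an unsigned 32-bit integer,
// or return null when the input is not a valid IPv4 address.
function ipv4ToInt(ip: string): number | null {
  const parts = ip.split('.');
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet;
  }
  return value;
}

// True when `ip` falls inside the CIDR block `cidr`.
function inCidr(ip: string, cidr: string): boolean {
  const match = /^(.+)\/(\d{1,2})$/.exec(cidr);
  if (!match) return false;
  const [, base = '', prefixText = ''] = match;
  const prefix = Number(prefixText);
  const ipInt = ipv4ToInt(ip);
  const baseInt = ipv4ToInt(base);
  if (ipInt === null || baseInt === null || prefix > 32) return false;
  // Dividing by the block size discards the host bits, so two addresses
  // in the same block produce the same quotient.
  const blockSize = 2 ** (32 - prefix);
  return Math.floor(ipInt / blockSize) === Math.floor(baseInt / blockSize);
}

// Return the provider label for the first matching range, or null.
function findVpnProvider(
  ip: string,
  ranges: VpnRange[] = SAMPLE_VPN_RANGES,
): string | null {
  const hit = ranges.find((range) => inCidr(ip, range.cidr));
  return hit ? hit.provider : null;
}
```

A linear scan is fine for a sketch. With thousands of ranges, a sorted interval list or a prefix trie would likely be a better fit.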
Protocol-Level Detection
Some proxy types can be detected through protocol behavior:
- HTTP proxy headers: X-Forwarded-For, Via, and X-Real-IP headers that leak the proxy chain
- TCP/IP fingerprinting: Mismatches between the claimed OS (via User-Agent) and the TCP stack behavior can indicate tunneling
- WebRTC leaks: In browser contexts, WebRTC can reveal the real IP behind a VPN (though modern VPNs block this)
- DNS leak detection: When the DNS resolver does not match the expected location for the IP
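Of the techniques above, the header check is the simplest to illustrate. The sketch below assumes the request headers are available as a plain object. `inspectProxyHeaders` and `ProxyHeaderResult` are hypothetical names. If your own load balancer or CDN adds these headers, they will appear on every request, so this only works on headers as received at your outermost edge.

```typescript
// Headers that forwarding proxies commonly add. Header names are
// case-insensitive, so keys are lowercased before lookup.
const PROXY_HEADERS = ['x-forwarded-for', 'via', 'x-real-ip'];

interface ProxyHeaderResult {
  hasProxyHeaders: boolean;
  headersFound: string[];   // lowercased names of the proxy headers present
  forwardedChain: string[]; // addresses listed in X-Forwarded-For, in order
}

function inspectProxyHeaders(
  headers: Record<string, string | undefined>,
): ProxyHeaderResult {
  // Normalize header names so "Via" and "via" are treated the same.
  const normalized = new Map<string, string>();
  for (const name of Object.keys(headers)) {
    const value = headers[name];
    if (value !== undefined) normalized.set(name.toLowerCase(), value);
  }

  const headersFound = PROXY_HEADERS.filter((h) => normalized.has(h));

  // X-Forwarded-For is a comma-separated list, client first.
  const forwardedChain = (normalized.get('x-forwarded-for') ?? '')
    .split(',')
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);

  return {
    hasProxyHeaders: headersFound.length > 0,
    headersFound,
    forwardedChain,
  };
}
```

These headers are trivially omitted or forged by the sender, so their presence is a hint and their absence proves nothing.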
Behavioral VPN Detection
Even when you cannot identify the specific VPN provider, behavioral signals can indicate VPN usage:
- Timezone mismatch: The browser reports a timezone inconsistent with the IP's geolocation
- Language mismatch: Accept-Language headers do not match the IP's country
- Latency anomalies: Round-trip times inconsistent with the supposed geographic location
For more on geographic inconsistency detection, see our article on timezone anomalies and impossible geographic signals.
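The timezone and language checks can be sketched as follows. This assumes a geo-IP lookup has already produced an IANA time zone and an ISO country code for the IP, and that the client reported its zone (for example via `Intl.DateTimeFormat().resolvedOptions().timeZone`). All function names here are hypothetical. Offsets are compared instead of zone names because different zone names can share the same offset.

```typescript
// Offset from UTC, in minutes, for an IANA time zone at a given instant.
function utcOffsetMinutes(timeZone: string, at: Date): number {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone,
    hour12: false,
    year: 'numeric', month: 'numeric', day: 'numeric',
    hour: 'numeric', minute: 'numeric', second: 'numeric',
  }).formatToParts(at);
  const get = (type: string): number =>
    Number(parts.find((p) => p.type === type)?.value ?? NaN);
  // Some engines render midnight as hour 24 when hour12 is false.
  const wallClockAsUtc = Date.UTC(
    get('year'), get('month') - 1, get('day'),
    get('hour') % 24, get('minute'), get('second'),
  );
  return Math.round((wallClockAsUtc - at.getTime()) / 60000);
}

// True when the browser's zone and the IP's zone have different offsets.
// Unknown zone names are treated as "no signal" rather than a mismatch.
function timezoneMismatch(
  ipTimeZone: string,
  browserTimeZone: string,
  at: Date = new Date(),
): boolean {
  try {
    return utcOffsetMinutes(ipTimeZone, at) !== utcOffsetMinutes(browserTimeZone, at);
  } catch {
    return false;
  }
}

// Extract region subtags from an Accept-Language header,
// e.g. "en-US,en;q=0.9,fr-CA;q=0.8" yields ["US", "CA"].
function acceptLanguageRegions(header: string): string[] {
  const regions: string[] = [];
  for (const item of header.split(',')) {
    const tag = (item.split(';')[0] ?? '').trim();
    const region = tag.split('-').slice(1).find((s) => /^[A-Za-z]{2}$/.test(s));
    if (region) regions.push(region.toUpperCase());
  }
  return regions;
}

// True only when the header names at least one region and none match.
function languageMismatch(acceptLanguage: string, ipCountry: string): boolean {
  const regions = acceptLanguageRegions(acceptLanguage);
  return regions.length > 0 && regions.indexOf(ipCountry.toUpperCase()) === -1;
}
```

Both checks produce false positives for travelers and expatriates, which is one reason to treat them as weak signals that feed a score instead of triggering a block on their own.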
Building an IP Reputation Database
Individual signals (datacenter, VPN, proxy) are useful, but the real power comes from aggregating behavioral data over time. An IP reputation database tracks:
- Historical fraud rate: What percentage of signups from this IP (or IP range) were later confirmed as fraudulent?
- Signup velocity: How many signups have come from this IP in the last hour, day, week?
- Cross-client correlation: If you process validation for multiple clients, is this IP creating accounts across many different services?
- Abuse reports: Has this IP been reported to abuse databases like AbuseIPDB or Spamhaus?
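As an illustration of the signup velocity item, here is a minimal in-memory sketch that counts signups per IP within a time window. `SignupVelocityTracker` is a hypothetical name, and a production system would presumably back this with a shared store instead of process memory.

```typescript
// Tracks signup timestamps per IP and answers "how many in the last N ms?".
class SignupVelocityTracker {
  private readonly signups = new Map<string, number[]>();

  // Record one signup for `ip` at the given time (milliseconds since epoch).
  record(ip: string, timestampMs: number): void {
    const times = this.signups.get(ip) ?? [];
    times.push(timestampMs);
    this.signups.set(ip, times);
  }

  // Count signups for `ip` in the window (nowMs - windowMs, nowMs].
  countWithin(ip: string, windowMs: number, nowMs: number): number {
    const times = this.signups.get(ip) ?? [];
    return times.filter((t) => t > nowMs - windowMs && t <= nowMs).length;
  }

  // Drop timestamps older than maxAgeMs so memory use stays bounded.
  prune(maxAgeMs: number, nowMs: number): void {
    this.signups.forEach((times, ip) => {
      const kept = times.filter((t) => t > nowMs - maxAgeMs);
      if (kept.length === 0) this.signups.delete(ip);
      else this.signups.set(ip, kept);
    });
  }
}

const HOUR_MS = 60 * 60 * 1000;
const DAY_MS = 24 * HOUR_MS;
```

Querying the same tracker with `HOUR_MS`, `DAY_MS`, and `7 * DAY_MS` gives the hour, day, and week counts described in the list.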
interface IPReputationRecord {
  ip: string;
  classification: 'residential' | 'commercial' | 'datacenter' | 'proxy';
  vpnProvider: string | null;
  abuseScore: number; // 0-100, higher = worse
  signupCount30d: number; // signups in last 30 days
  fraudRate30d: number; // 0-1, fraction that were fraudulent
  firstSeen: string; // ISO date
  lastSeen: string; // ISO date
  country: string; // ISO country code
  region: string;
}

function ipReputationScore(record: IPReputationRecord): number {
  let score = 0;

  // Classification impact
  const classificationScores: Record<string, number> = {
    residential: 10,
    commercial: 5,
    datacenter: -25,
    proxy: -20,
  };
  score += classificationScores[record.classification] ?? 0;

  // VPN penalty
  if (record.vpnProvider) {
    score -= 15;
  }

  // Abuse score impact
  if (record.abuseScore > 75) score -= 20;
  else if (record.abuseScore > 50) score -= 10;
  else if (record.abuseScore > 25) score -= 5;

  // Velocity checks
  if (record.signupCount30d > 50) score -= 20;
  else if (record.signupCount30d > 20) score -= 10;
  else if (record.signupCount30d > 10) score -= 5;

  // Historical fraud rate
  if (record.fraudRate30d > 0.5) score -= 25;
  else if (record.fraudRate30d > 0.2) score -= 15;
  else if (record.fraudRate30d > 0.1) score -= 5;

  return score;
}

The VPN Nuance: Not All VPN Users Are Fraudsters
This is critical. Millions of legitimate users run VPNs for privacy, to access region-locked content, or because their company requires it. Blocking all VPN traffic is a terrible idea that will cost you real customers.
The key is to use VPN detection as one signal among many, not as a binary gate. A VPN combined with a legitimate-looking email from a real provider, normal browser fingerprint, and reasonable signup behavior? Probably fine. A VPN combined with a disposable email, datacenter-like browser fingerprint, and three other accounts from the same fingerprint? That is a very different story.
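To make the "one signal among many" idea concrete, here is a small sketch that turns the two scenarios above into a three-way decision. The field names, weights, and thresholds are all illustrative assumptions, not BigShield's actual values.

```typescript
type Decision = 'allow' | 'challenge' | 'block';

// Hypothetical signal bundle for a single signup attempt.
interface SignupSignals {
  vpnDetected: boolean;
  disposableEmail: boolean;
  fingerprintLooksAutomated: boolean;
  accountsSharingFingerprint: number; // other accounts seen with this fingerprint
}

// Sum illustrative risk weights, then map the total to a decision.
// A VPN on its own never reaches the challenge threshold.
function decide(signals: SignupSignals): Decision {
  let risk = 0;
  if (signals.vpnDetected) risk += 15;
  if (signals.disposableEmail) risk += 30;
  if (signals.fingerprintLooksAutomated) risk += 25;
  if (signals.accountsSharingFingerprint >= 3) risk += 25;

  if (risk >= 70) return 'block';
  if (risk >= 40) return 'challenge';
  return 'allow';
}

// VPN user with otherwise clean signals.
const privacyUser: SignupSignals = {
  vpnDetected: true,
  disposableEmail: false,
  fingerprintLooksAutomated: false,
  accountsSharingFingerprint: 0,
};

// VPN plus disposable email, automated fingerprint, and shared fingerprint.
const likelyAbuse: SignupSignals = {
  vpnDetected: true,
  disposableEmail: true,
  fingerprintLooksAutomated: true,
  accountsSharingFingerprint: 3,
};
```

With these weights, `decide(privacyUser)` returns `'allow'` and `decide(likelyAbuse)` returns `'block'`. The middle `'challenge'` tier gives borderline cases a path through extra verification instead of a hard rejection.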
Some VPN providers have higher abuse rates than others. We track this and weight accordingly. For details, see our analysis of which VPN providers have the highest abuse rates.
Privacy Considerations
IP reputation scoring sits at the intersection of security and privacy. A few principles we follow at BigShield:
- Never store raw IPs longer than necessary. We hash IPs after reputation scoring and only retain the hash for correlation.
- Never use IP data alone to make blocking decisions. It is always combined with other signals.
- Be transparent. If a user is blocked or challenged, they should be able to contact support and get a human review.
- Comply with GDPR and regional regulations. IP addresses are considered personal data in many jurisdictions.
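The first principle, retaining only a hash for correlation, might look like the sketch below. It assumes a Node.js runtime and uses a keyed hash (HMAC-SHA-256) in place of a plain hash, because the IPv4 address space is small enough that unkeyed hashes can be reversed by enumerating every address. The choice of HMAC, the function name `hashIp`, and the key handling are assumptions for illustration, not a description of BigShield's pipeline.

```typescript
import { createHmac } from 'node:crypto';

// Return a stable, keyed hash of an IP address for correlation.
// The same IP and key always produce the same 64-character hex digest.
function hashIp(ip: string, secretKey: string): string {
  // Normalize so "2001:DB8::1" and "2001:db8::1" map to the same hash.
  const normalized = ip.trim().toLowerCase();
  return createHmac('sha256', secretKey).update(normalized).digest('hex');
}

// Hypothetical key for demonstration only. A real key would come from a
// secrets manager and never be committed to source control.
const DEMO_KEY = 'demo-key-do-not-use';
const hashed = hashIp('203.0.113.7', DEMO_KEY);
```

Keyed hashes of IP addresses can generally still be linked back to a person by whoever holds the key, so they are better treated as pseudonymized data than as anonymous data.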
Putting It All Together
IP reputation is one of BigShield's 20+ signals. It contributes to the overall score alongside email pattern analysis, domain reputation, behavioral fingerprinting, and more. No single signal is decisive on its own, but IP reputation consistently ranks as one of the top 3 most impactful signals for catching fraud.
BigShield handles all of this complexity for you. Pass an IP address alongside the email in your validation request, and our scoring engine processes datacenter detection, VPN identification, abuse database lookups, and velocity checks in under 200ms. If you want to see IP reputation in action, sign up at bigshield.app and check the signal breakdown in your dashboard.