Cypher Reference for Splunk
Quick Cypher reference for Splunk queries.
Note: This is a quick reference for using Cypher queries within Splunk. For the complete Cypher language reference, see the Cypher Language Reference. For API details, see the Cypher API Reference.
This page covers writing Cypher queries against the Whisper Knowledge Graph from Splunk, either directly with the whisperquery command or by understanding the queries behind whisperlookup and the pre-built macros.
Basics
Cypher is a declarative graph query language. The general shape of a query:
MATCH (pattern) WHERE condition RETURN fields LIMIT N
Node patterns
(n) -- any node
(n:HOSTNAME) -- node with label HOSTNAME
(n:HOSTNAME {name: "x.com"}) -- node with label and property
Relationship patterns
(a)-[:RESOLVES_TO]->(b) -- directed relationship
(a)-[:ALIAS_OF*1..5]->(b) -- variable-length path (1 to 5 hops)
(a)<-[:NAMESERVER_FOR]-(b) -- reverse direction
Property matching
Property values are case-sensitive. Domain names are stored in lowercase, so match them in lowercase:
-- Correct
MATCH (h:HOSTNAME {name: "example.com"})
-- Wrong (will return no results)
MATCH (h:HOSTNAME {name: "Example.com"})
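Because matching is case-sensitive and domain names are stored in lowercase, it can help to normalize indicators before passing them as parameters. The following is a minimal Python sketch, not part of the Splunk app: the helper name normalize_domain is hypothetical, and stripping a trailing root dot is an assumption about how names are stored that this reference does not confirm.

```python
def normalize_domain(raw: str) -> str:
    """Lowercase a domain and trim whitespace so it matches the stored form.

    Assumption: names are stored without a trailing root dot.
    """
    name = raw.strip().lower()
    return name.rstrip(".")


print(normalize_domain("  Example.COM. "))  # prints "example.com"
```

The normalized value can then be supplied through params= rather than pasted into the query text.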
Graph schema
Node labels
| Label | Description | Example values |
|---|---|---|
| HOSTNAME | Fully-qualified domain names, subdomains, mail server names | www.google.com, ns1.cloudflare.com |
| IPV4 | IPv4 addresses | 1.1.1.1, 142.250.64.100 |
| IPV6 | IPv6 addresses | 2606:4700::6810:84e5 |
| PREFIX | IP CIDR blocks | 142.250.64.0/24 |
| ASN | Autonomous system numbers | AS13335, AS15169 |
| ASN_NAME | Human-readable AS organization names | CLOUDFLARENET - Cloudflare, Inc. |
| TLD | Top-level domains | com, net, org, io |
| TLD_OPERATOR | TLD registry operators | VeriSign, Inc. |
| REGISTRAR | Domain registrars (IANA ID format) | iana:292 (MarkMonitor) |
| EMAIL | WHOIS contact email addresses | domains@cloudflare.com |
| PHONE | WHOIS contact phone numbers (E.164) | +14158675825 |
| ORGANIZATION | Organizations from WHOIS records | cloudflare hostmaster |
| CITY | GeoIP city with country code | Mountain View, US |
| COUNTRY | ISO 3166-1 alpha-2 country codes | US, DE, AU |
| RIR | Regional Internet Registries | ARIN, RIPENCC, APNIC, LACNIC, AFRINIC |
| DNSSEC_ALGORITHM | DNSSEC signing algorithms | ECDSAP256SHA256, RSASHA256 |
| FEED_SOURCE | Threat intelligence feed sources (virtual) | Spamhaus DROP, Feodo Tracker |
| CATEGORY | Threat feed categories (virtual) | C2 Servers, Phishing |
Virtual labels (synthesized at query time, global count() returns 0):
| Label | Description |
|---|---|
| REGISTERED_PREFIX | RIR-allocated prefix; has HAS_COUNTRY and REGISTERED_BY edges |
| ANNOUNCED_PREFIX | BGP-announced prefix; has ROUTES (to ASN) and HAS_COUNTRY edges |
Edge types
DNS resolution
| Edge type | Source to Target | Description |
|---|---|---|
| RESOLVES_TO | HOSTNAME to IPV4/IPV6 | DNS A/AAAA records |
| CHILD_OF | HOSTNAME to HOSTNAME/TLD | Domain hierarchy (sub.example.com -> example.com -> com) |
| ALIAS_OF | HOSTNAME to HOSTNAME | CNAME records |
| NAMESERVER_FOR | HOSTNAME to HOSTNAME | NS delegation (nameserver serves the target domain) |
| MAIL_FOR | HOSTNAME to HOSTNAME | MX records (mail server handles mail for the target domain) |
| SIGNED_WITH | HOSTNAME to DNSSEC_ALGORITHM | DNSSEC signing algorithm |
BGP and routing
| Edge type | Source to Target | Description |
|---|---|---|
| ANNOUNCED_BY | IPV4/PREFIX to ASN | BGP announcement (virtual, resolved at query time) |
| ROUTES | ASN to ANNOUNCED_PREFIX | ASN routes this prefix (virtual) |
| BELONGS_TO | IPV4 to PREFIX/REGISTERED_PREFIX/ANNOUNCED_PREFIX | IP membership in a prefix block |
| PEERS_WITH | ASN to ASN | BGP peering session (virtual) |
| HAS_NAME | ASN to ASN_NAME | Network operator name (virtual) |
| HAS_COUNTRY | ASN/PREFIX/CITY/IPV4/HOSTNAME/PHONE to COUNTRY | Country association |
WHOIS and registration
| Edge type | Source to Target | Description |
|---|---|---|
| HAS_REGISTRAR | HOSTNAME to REGISTRAR | Current domain registrar |
| PREV_REGISTRAR | HOSTNAME to REGISTRAR | Previous domain registrar |
| REGISTERED_BY | HOSTNAME/ASN/PREFIX to ORGANIZATION | WHOIS / RIR registration |
| HAS_EMAIL | HOSTNAME to EMAIL | WHOIS contact email |
| HAS_PHONE | HOSTNAME to PHONE | WHOIS contact phone |
GeoIP
| Edge type | Source to Target | Description |
|---|---|---|
| LOCATED_IN | IPV4 to CITY | GeoIP city location |
| LOCATED_IN | CITY to COUNTRY | City to country mapping |
Threat intelligence
| Edge type | Source to Target | Description |
|---|---|---|
| LISTED_IN | IPV4/HOSTNAME to FEED_SOURCE | IP or hostname appears in this threat feed (virtual) |
| BELONGS_TO | FEED_SOURCE to CATEGORY | Feed classified under this category |
Web
| Edge type | Source to Target | Description |
|---|---|---|
| LINKS_TO | HOSTNAME to HOSTNAME | Hyperlink between hostnames (from web crawl data) |
SPF
| Edge type | Source to Target | Description |
|---|---|---|
| SPF_INCLUDE | HOSTNAME to HOSTNAME | SPF include: mechanism |
| SPF_IP | HOSTNAME to PREFIX | SPF ip4: / ip6: mechanism |
| SPF_A | HOSTNAME to HOSTNAME | SPF a: mechanism |
| SPF_MX | HOSTNAME to HOSTNAME | SPF mx: mechanism |
| SPF_REDIRECT | HOSTNAME to HOSTNAME | SPF redirect= modifier |
| SPF_EXISTS | HOSTNAME to HOSTNAME | SPF exists: mechanism |
Other
| Edge type | Source to Target | Description |
|---|---|---|
| OPERATES | TLD_OPERATOR to TLD | Registry operator manages this TLD (virtual) |
Entity relationship diagram
[Figure: entity relationship diagram]
Solid lines are physical edges stored on disk. Dashed lines are virtual edges computed at query time from live infrastructure and threat intelligence data.
Traversal chains
Common multi-hop paths through the graph. These are the patterns used by whisperlookup and the pre-built macros.
| Chain | Path |
|---|---|
| DNS resolution | HOSTNAME -> RESOLVES_TO -> IPV4 -> ANNOUNCED_BY -> ANNOUNCED_PREFIX -> ROUTES -> ASN -> HAS_NAME -> ASN_NAME |
| GeoIP (from IP) | IPV4 -> LOCATED_IN -> CITY -> LOCATED_IN -> COUNTRY |
| GeoIP (from domain) | HOSTNAME -> RESOLVES_TO -> IPV4 -> LOCATED_IN -> CITY -> LOCATED_IN -> COUNTRY |
| BGP routing | ASN -> ROUTES -> ANNOUNCED_PREFIX, ASN -> PEERS_WITH -> ASN |
| DNS hierarchy | HOSTNAME -> CHILD_OF -> HOSTNAME(parent) -> CHILD_OF -> TLD |
| DNS security | HOSTNAME <- NAMESERVER_FOR <- HOSTNAME, HOSTNAME -> SIGNED_WITH -> DNSSEC_ALGORITHM |
| WHOIS | HOSTNAME -> HAS_REGISTRAR -> REGISTRAR, HOSTNAME -> HAS_EMAIL -> EMAIL, HOSTNAME -> REGISTERED_BY -> ORGANIZATION |
| Threat intel | IPV4/HOSTNAME -> LISTED_IN -> FEED_SOURCE -> BELONGS_TO -> CATEGORY |
| SPF chain | HOSTNAME -> SPF_INCLUDE -> HOSTNAME, HOSTNAME -> SPF_IP -> PREFIX |
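A chain written in the table's arrow notation maps mechanically onto a Cypher MATCH pattern. The sketch below is an illustrative Python helper (chain_to_pattern is a hypothetical name, and the n0, n1, ... variable names are arbitrary). It only handles simple forward chains of the form LABEL -> EDGE -> LABEL, not reverse (<-) steps or combined labels such as IPV4/HOSTNAME.

```python
def chain_to_pattern(chain: str, anchor_param: str = "value") -> str:
    """Turn 'A -> EDGE -> B -> EDGE -> C' into a Cypher MATCH pattern."""
    parts = [p.strip() for p in chain.split("->")]
    if len(parts) < 3 or len(parts) % 2 == 0:
        raise ValueError("expected LABEL -> EDGE -> LABEL [-> EDGE -> LABEL ...]")
    labels, edges = parts[0::2], parts[1::2]
    # Anchor the first node on its name so the lookup starts from one node.
    pattern = f"(n0:{labels[0]} {{name: ${anchor_param}}})"
    for i, (edge, label) in enumerate(zip(edges, labels[1:]), start=1):
        pattern += f"-[:{edge}]->(n{i}:{label})"
    return pattern


print(chain_to_pattern("IPV4 -> LOCATED_IN -> CITY -> LOCATED_IN -> COUNTRY", "ip"))
# prints (n0:IPV4 {name: $ip})-[:LOCATED_IN]->(n1:CITY)-[:LOCATED_IN]->(n2:COUNTRY)
```

Prefix the result with MATCH and add RETURN and LIMIT clauses to obtain a complete query.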
Language reference
For the complete language reference, see the Cypher Language Reference.
Supported functions
For the complete list of supported functions, see the Cypher Language Reference.
Procedures
For the complete list of procedures, see the Cypher Language Reference.
Query cookbook
All examples use the whisperquery command. Where a pre-built macro exists, it is noted.
Incident investigation
Trace an IP through DNS, network, and routing layers:
Identify the network owner:
| whisperquery query="MATCH (ip:IPV4 {name: $ip})-[:ANNOUNCED_BY]->(ap:ANNOUNCED_PREFIX)-[:ROUTES]->(a:ASN)-[:HAS_NAME]->(n:ASN_NAME) RETURN ip.name AS ip, ap.name AS prefix, a.name AS asn, n.name AS asn_name LIMIT 10" params="ip=142.250.64.100"
ANNOUNCED_BY vs BELONGS_TO:
ANNOUNCED_BY uses live BGP routing data for current routing info. BELONGS_TO returns the registered RIR allocation block. They may give different results. For IPs where BGP data is unavailable, fall back to BELONGS_TO.
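The fallback can be applied client-side once both queries have been run. This is a rough Python sketch under assumptions: pick_prefix is a hypothetical helper, and each result set is assumed to arrive as a list of dicts keyed by the RETURN aliases.

```python
from typing import Dict, List, Optional


def pick_prefix(announced: List[Dict], registered: List[Dict]) -> Optional[Dict]:
    """Prefer the live BGP result; fall back to the registered allocation."""
    if announced:
        return {"source": "ANNOUNCED_BY", **announced[0]}
    if registered:
        return {"source": "BELONGS_TO", **registered[0]}
    return None  # neither query returned a row


# Hypothetical rows: BGP data missing, RIR allocation present.
print(pick_prefix([], [{"ip": "142.250.64.100", "prefix": "142.250.64.0/24"}]))
```

Recording the source alongside the prefix keeps it visible downstream whether a value reflects current routing or only the registered block.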
GeoIP lookup:
| whisperquery query="MATCH (ip:IPV4 {name: $ip})-[:LOCATED_IN]->(city:CITY)-[:LOCATED_IN]->(country:COUNTRY) RETURN ip.name AS ip, city.name AS city, country.name AS country LIMIT 1" params="ip=142.250.64.100"
Count domains on an IP (co-hosting):
| whisperquery query="MATCH (ip:IPV4 {name: $ip})<-[:RESOLVES_TO]-(h:HOSTNAME) RETURN count(h) AS cohosted_domains LIMIT 1" params="ip=104.16.123.96"
Check threat feeds:
| whisperquery query="CALL explain($ip)" params="ip=104.16.123.96"
Tip: For a full investigation in one command:
| `whisper_full_investigation("malware-c2.evil.com")`
Domain to infrastructure
Full trace from hostname through DNS, network, and routing layers:
| whisperquery query="MATCH (h:HOSTNAME {name: $domain})-[:RESOLVES_TO]->(ip:IPV4)-[:ANNOUNCED_BY]->(ap:ANNOUNCED_PREFIX)-[:ROUTES]->(a:ASN)-[:HAS_NAME]->(n:ASN_NAME) RETURN h.name AS hostname, ip.name AS ip, ap.name AS prefix, a.name AS asn, n.name AS asn_name LIMIT 10" params="domain=www.google.com"
Or use whisperlookup for inline enrichment of events:
index=dns sourcetype=dns
| whisperlookup field=query type=domain
| table query whisper_ip whisper_prefix whisper_asn whisper_asn_name whisper_country
IP to ASN
| whisperquery query="MATCH (ip:IPV4 {name: $ip})-[:ANNOUNCED_BY]->(ap:ANNOUNCED_PREFIX)-[:ROUTES]->(a:ASN)-[:HAS_NAME]->(n:ASN_NAME) RETURN ip.name AS ip, ap.name AS prefix, a.name AS asn, n.name AS asn_name LIMIT 10" params="ip=8.8.8.8"
Co-hosted domains
Find other domains sharing the same IP address. Low co-hosting density is a common sign of attacker-controlled infrastructure.
| whisperquery query="MATCH (h:HOSTNAME {name: $domain})-[:RESOLVES_TO]->(ip:IPV4)<-[:RESOLVES_TO]-(cohost:HOSTNAME) WHERE cohost.name <> $domain RETURN ip.name AS ip, cohost.name AS cohost LIMIT 100" params="domain=suspicious-site.com"
Tip:
| `whisper_cohosted_domains("suspicious-site.com")`
Shared nameservers
Shared nameservers often indicate common ownership or compromised hosting.
| whisperquery query="MATCH (h:HOSTNAME {name: $domain})<-[:NAMESERVER_FOR]-(ns:HOSTNAME)-[:NAMESERVER_FOR]->(other:HOSTNAME) WHERE other.name <> $domain RETURN ns.name AS nameserver, other.name AS shared_domain LIMIT 100" params="domain=phishing-target.com"
Tip:
| `whisper_shared_nameservers("phishing-target.com")`
CNAME chain
Follow CNAME alias chains up to 5 hops. Useful for detecting dangling CNAMEs (subdomain takeover risk).
| whisperquery query="MATCH path = (h:HOSTNAME {name: $domain})-[:ALIAS_OF*1..5]->(target:HOSTNAME) RETURN [n IN nodes(path) | n.name] AS chain, last(nodes(path)).name AS target, length(path) AS depth LIMIT 10" params="domain=www.example.com"
Tip:
| `whisper_cname_chain("www.example.com")`
Subdomain discovery
Find subdomains using the domain hierarchy index (more efficient than ENDS WITH):
| whisperquery query="MATCH (sub:HOSTNAME)-[:CHILD_OF]->(h:HOSTNAME {name: $domain}) RETURN sub.name AS subdomain LIMIT 50" params="domain=google.com"
Or by prefix/suffix:
| whisperquery query="MATCH (h:HOSTNAME) WHERE h.name STARTS WITH $prefix RETURN h.name AS hostname LIMIT 50" params="prefix=mail.google"
SPF include chain
Trace SPF includes to audit RFC 7208 compliance (max 10 DNS lookups).
| whisperquery query="MATCH path = (h:HOSTNAME {name: $domain})-[:SPF_INCLUDE*1..3]->(included:HOSTNAME) RETURN [n IN nodes(path) | n.name] AS spf_chain, length(path) AS depth LIMIT 50" params="domain=example.com"
Full SPF mechanism breakdown:
| whisperquery query="MATCH (h:HOSTNAME {name: $domain})-[r:SPF_INCLUDE|SPF_IP|SPF_A|SPF_MX|SPF_REDIRECT|SPF_EXISTS]->(target) RETURN type(r) AS mechanism, target.name AS authorized LIMIT 20" params="domain=microsoft.com"
Warning: SPF_REDIRECT replaces the entire SPF policy with another domain's policy. If the target domain is permissive, your effective policy is too.
Tip:
| `whisper_spf_chain("example.com")`
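To relate the mechanism breakdown to the RFC 7208 lookup limit, the rows can be tallied client-side. The Python sketch below assumes rows arrive as dicts with a mechanism key (matching the type(r) AS mechanism alias used above); count_spf_lookups is a hypothetical helper. It counts only the terms directly on one domain, so lookups inside nested includes are not included and the result is a lower bound.

```python
# Edge types whose SPF terms trigger a DNS lookup under RFC 7208 section 4.6.4.
# SPF_IP (ip4: / ip6:) does not count toward the limit.
LOOKUP_EDGES = {"SPF_INCLUDE", "SPF_A", "SPF_MX", "SPF_REDIRECT", "SPF_EXISTS"}
MAX_DNS_LOOKUPS = 10


def count_spf_lookups(rows):
    """Count first-level SPF terms that each cost one DNS lookup."""
    return sum(1 for row in rows if row["mechanism"] in LOOKUP_EDGES)


# Hypothetical result rows for one domain.
rows = [
    {"mechanism": "SPF_INCLUDE", "authorized": "_spf.example.net"},
    {"mechanism": "SPF_IP", "authorized": "192.0.2.0/24"},
    {"mechanism": "SPF_MX", "authorized": "example.com"},
]
used = count_spf_lookups(rows)
print(used, "of", MAX_DNS_LOOKUPS, "lookups used at the first level")
```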
DNS infrastructure audit
Nameservers:
| whisperquery query="MATCH (h:HOSTNAME {name: $domain})<-[:NAMESERVER_FOR]-(ns:HOSTNAME) RETURN ns.name AS nameserver LIMIT 50" params="domain=google.com"
Mail servers:
| whisperquery query="MATCH (h:HOSTNAME {name: $domain})<-[:MAIL_FOR]-(mx:HOSTNAME) RETURN mx.name AS mail_server LIMIT 50" params="domain=google.com"
DNSSEC status:
| whisperquery query="MATCH (h:HOSTNAME {name: $domain})-[:SIGNED_WITH]->(algo:DNSSEC_ALGORITHM) RETURN h.name AS domain, algo.name AS algorithm LIMIT 1" params="domain=cloudflare.com"
WHOIS investigation
Current registrar:
| whisperquery query="MATCH (h:HOSTNAME {name: $domain})-[:HAS_REGISTRAR]->(r:REGISTRAR) RETURN r.name AS registrar LIMIT 1" params="domain=google.com"
Contact emails:
| whisperquery query="MATCH (h:HOSTNAME {name: $domain})-[:HAS_EMAIL]->(e:EMAIL) RETURN e.name AS email LIMIT 10" params="domain=google.com"
Registrant organization:
| whisperquery query="MATCH (h:HOSTNAME {name: $domain})-[:REGISTERED_BY]->(o:ORGANIZATION) RETURN o.name AS organization LIMIT 5" params="domain=cloudflare.com"
Domain history:
| whisperquery query="CALL whisper.history($domain)" params="domain=cloudflare.com"
Attack surface mapping
Map all infrastructure behind an ASN:
Routed prefixes:
| whisperquery query="MATCH (a:ASN {name: $asn})-[:ROUTES]->(p) RETURN a.name AS asn, p.name AS prefix LIMIT 200" params="asn=AS13335"
Hostnames on the ASN's infrastructure:
| whisperquery query="MATCH (a:ASN {name: $asn})-[:ROUTES]->(p)<-[:BELONGS_TO]-(ip:IPV4)<-[:RESOLVES_TO]-(h:HOSTNAME) RETURN h.name AS hostname, ip.name AS ip LIMIT 50" params="asn=AS13335"
BGP peers:
| whisperquery query="MATCH (a:ASN {name: $asn})-[:PEERS_WITH]->(peer:ASN)-[:HAS_NAME]->(n:ASN_NAME) RETURN peer.name AS peer_asn, n.name AS peer_name LIMIT 100" params="asn=AS13335"
ASN profile (prefix count + peer count):
| whisperquery query="MATCH (a:ASN {name: $asn}) OPTIONAL MATCH (a)-[:ROUTES]->(p) WITH a, count(p) AS prefix_count OPTIONAL MATCH (a)-[:PEERS_WITH]->(peer:ASN) RETURN a.name AS asn, prefix_count, count(peer) AS peer_count" params="asn=AS13335"
Tip:
| `whisper_asn_infrastructure("AS13335")` and | `whisper_bgp_peers("AS13335")`
Threat intelligence
Check if an indicator is listed in any threat feed:
| whisperquery query="MATCH (n {name: $indicator})-[:LISTED_IN]->(f:FEED_SOURCE) RETURN n.name AS indicator, f.name AS feed_name LIMIT 20" params="indicator=185.220.101.1"
Threat assessment via explain():
| whisperquery query="CALL explain($indicator)" params="indicator=185.220.101.1"
Returns threat score (0-100), level (NONE/INFO/LOW/MEDIUM/HIGH/CRITICAL), explanation, contributing factors, and source feed breakdown. Works with IPs, domains, ASNs, and CIDR ranges.
Tip:
| `whisper_explain("185.220.101.1")`
Threat properties available on enriched nodes:
| Property | Type | Description |
|---|---|---|
| threatScore | Double | Computed threat score (0-100) |
| threatLevel | String | NONE, INFO, LOW, MEDIUM, HIGH, CRITICAL |
| isThreat | Boolean | Listed in any threat feed |
| isTor | Boolean | Tor exit/relay node |
| isC2 | Boolean | Command-and-control infrastructure |
| isMalware | Boolean | Malware distribution |
| isPhishing | Boolean | Phishing infrastructure |
| isAnonymizer | Boolean | Anonymizer/proxy/VPN |
These properties are exposed by whisperlookup as whisper_threat_score, whisper_threat_level, etc.
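When post-processing enriched events outside Splunk, threat levels can be compared by rank. The Python sketch below assumes the levels are listed in ascending severity (NONE lowest, CRITICAL highest), which this reference implies but does not state outright. The helper at_least and the sample events are hypothetical.

```python
# Assumed ascending severity order, taken from the order the levels are listed in.
LEVELS = ["NONE", "INFO", "LOW", "MEDIUM", "HIGH", "CRITICAL"]


def at_least(level: str, minimum: str) -> bool:
    """Return True if `level` is at or above `minimum` in the assumed order."""
    return LEVELS.index(level.upper()) >= LEVELS.index(minimum.upper())


# Hypothetical enriched events carrying the whisper_threat_level field.
events = [
    {"src": "198.51.100.7", "whisper_threat_level": "HIGH"},
    {"src": "203.0.113.9", "whisper_threat_level": "INFO"},
]
flagged = [e for e in events if at_least(e["whisper_threat_level"], "MEDIUM")]
print([e["src"] for e in flagged])  # prints ['198.51.100.7']
```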
Batch lookup
Enrich a list of indicators in a single query:
| whisperquery query="UNWIND $hosts AS h MATCH (n:HOSTNAME {name: h})-[:RESOLVES_TO]->(ip:IPV4) RETURN n.name AS hostname, ip.name AS ip" params='{"hosts": ["www.google.com", "cloudflare.com", "example.com"]}'
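The JSON params value for a batch lookup can be generated rather than typed by hand. A minimal Python sketch, assuming the single-quoted JSON form shown above is what whisperquery expects for list parameters; unwind_params is a hypothetical helper, and it rejects values containing a single quote instead of guessing at an escaping rule.

```python
import json


def unwind_params(hosts):
    """Build the params='...' argument for a query that does UNWIND $hosts."""
    payload = json.dumps({"hosts": [h.strip().lower() for h in hosts]})
    if "'" in payload:
        # Escaping rules for the command are not documented here, so refuse.
        raise ValueError("single quotes are not supported in this sketch")
    return f"params='{payload}'"


print(unwind_params(["www.google.com", "Cloudflare.com"]))
# prints params='{"hosts": ["www.google.com", "cloudflare.com"]}'
```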
Web link analysis
Find outbound links from a domain (Common Crawl data):
| whisperquery query="MATCH (h:HOSTNAME {name: $domain})-[:LINKS_TO]->(target:HOSTNAME) RETURN target.name AS linked_domain LIMIT 50" params="domain=google.com"
Campaign infrastructure mapping
Find related infrastructure through shared hosting:
| whisperquery query="MATCH (h1:HOSTNAME {name: $domain})-[:RESOLVES_TO]->(ip:IPV4)<-[:RESOLVES_TO]-(h2:HOSTNAME) WHERE h1 <> h2 RETURN h2.name AS related_domain, ip.name AS shared_ip LIMIT 20" params="domain=paypal--confirm.com"
Pivot further through shared WHOIS contact emails:
| whisperquery query="MATCH (h1:HOSTNAME {name: $domain})-[:HAS_EMAIL]->(e:EMAIL)<-[:HAS_EMAIL]-(h2:HOSTNAME) WHERE h1 <> h2 RETURN h2.name AS related_domain, e.name AS shared_email LIMIT 20" params="domain=cloudflare.com"
Note: Check whether a shared email is a privacy-service proxy address before drawing attribution conclusions.
Brand protection
Search for domains that impersonate your brand:
| whisperquery query="MATCH (h:HOSTNAME) WHERE h.name CONTAINS $brand RETURN h.name AS hostname LIMIT 20" params="brand=paypal"
Then check if any are already threat-listed:
| whisperquery query="MATCH (h:HOSTNAME {name: $domain})-[:LISTED_IN]->(f:FEED_SOURCE) RETURN h.name, f.name LIMIT 20" params="domain=paypal--confirm.com"
Note: An empty result means the domain is not yet in any feed, not that it is clean. New phishing domains often predate feed coverage.
Threat intelligence feeds
Whisper Graph indexes 40+ threat intelligence feeds across 18 categories, with hourly incremental and daily full refresh cycles.
Feed sources
| Feed | Category |
|---|---|
| AlienVault Reputation | Reputation |
| Binary Defense Banlist | General Blacklists |
| Blocklist.de All | General Blacklists |
| Blocklist.de Mail | Spam |
| Blocklist.de SSH | Brute Force |
| Botvrij Domains | Malicious Domains |
| Botvrij Dst IPs | C2 Servers |
| Brute Force Blocker | Brute Force |
| C2 Intel 30d | C2 Servers |
| C2 Tracker | C2 Servers |
| CERT.pl Domains | Malicious Domains |
| CINS Score | General Blacklists |
| Cloudflare Radar Top 1M | Popularity/Trust |
| DNS RD Abuse | General Blacklists |
| Dan Tor Exit | TOR Network |
| ET Compromised IPs | General Blacklists |
| Feodo Tracker | C2 Servers |
| FireHOL Abusers 1d | General Blacklists |
| FireHOL Anonymous | Proxies |
| FireHOL Level 1 | General Blacklists |
| FireHOL Level 2 | General Blacklists |
| FireHOL Level 3 | General Blacklists |
| FireHOL WebClient | General Blacklists |
| GreenSnow Blacklist | General Blacklists |
| Hagezi Light | Ad/Tracking Blocklists |
| Hagezi Pro | Ad/Tracking Blocklists |
| IPsum | General Blacklists |
| InterServer RBL | General Blacklists |
| MalwareBazaar Recent | Malware Distribution |
| OpenPhish Feed | Phishing |
| SSH Client Attacks | Brute Force |
| SSH Password Auth | Brute Force |
| SSL IP Blacklist | General Blacklists |
| Spamhaus DROP | General Blacklists |
| Spamhaus EDROP | General Blacklists |
| StevenBlack Hosts | Ad/Tracking Blocklists |
| ThreatFox IOCs | C2 Servers |
| Tor Exit Nodes | TOR Network |
| Tranco Top 1M | Popularity/Trust |
| URLhaus Recent | Malware Distribution |
Categories
Ad/Tracking Blocklists, Anonymization Infrastructure, Attack Sources, Brute Force, C2 Servers, General Blacklists, Malicious Domains, Malicious Infrastructure, Malware Distribution, Phishing, Popularity/Trust, Proxies, Reference Data, Reputation, Spam, TOR Network, Threat Intelligence, VPNs.
Query rules
| Rule | Reason |
|---|---|
| Always anchor with {name: "value"} | Prevents full scans across billions of nodes |
| Always use LIMIT | Controls result set size |
| Use $parameter syntax with params= | Prevents injection, enables index optimization |
| Bound variable-length paths (*1..N) | Prevents unbounded traversals |
| Keep path length within your plan limit | See plan depth limits below |
| Use STARTS WITH / ENDS WITH / CONTAINS | Much faster than regex (=~) on large labels |
| Use CHILD_OF for subdomain queries | Indexed edge; faster than ENDS WITH |
| Use OPTIONAL MATCH for WHOIS fields | Avoids losing rows when sparse fields are missing |
| Use ANNOUNCED_BY for current BGP routing | BELONGS_TO returns registered allocation instead |
| No write operations | CREATE, DELETE, SET, MERGE are blocked; the graph is read-only |
| ENDS WITH suffix must start with . | .google.com is indexed; google.com requires a full scan |
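Several of these rules can be checked mechanically before a query is sent. The Python sketch below is a heuristic linter, not a Cypher parser: lint_query is a hypothetical helper, it works on plain text matching, and it can report false positives (for example, a keyword such as SET appearing inside a string literal).

```python
import re

WRITE_CLAUSE = re.compile(r"\b(CREATE|DELETE|SET|MERGE)\b", re.IGNORECASE)
LIMIT_CLAUSE = re.compile(r"\bLIMIT\s+\d+", re.IGNORECASE)
# A variable-length marker inside relationship brackets, e.g. [:ALIAS_OF*1..5] or [*].
VAR_LENGTH = re.compile(r"\[[^\]]*?\*\s*(\d+)?\s*(\.\.)?\s*(\d+)?\s*\]")


def lint_query(query: str) -> list:
    """Return a list of rule violations found by simple text checks."""
    problems = []
    if WRITE_CLAUSE.search(query):
        problems.append("write clause (graph is read-only)")
    if not LIMIT_CLAUSE.search(query):
        problems.append("no LIMIT")
    if "=~" in query:
        problems.append("regex match (=~)")
    for low, dots, high in VAR_LENGTH.findall(query):
        # "*" alone or "*N.." has no upper bound; "*N" and "*N..M" are bounded.
        if (dots and not high) or (not dots and not low):
            problems.append("unbounded variable-length path")
    return problems


bad = 'MATCH (n:HOSTNAME)-[*]->(m) WHERE n.name =~ ".*evil.*" RETURN m.name'
print(lint_query(bad))
```

Procedure calls and aggregate-only queries will be flagged for a missing LIMIT, so treat the output as a prompt for review rather than a hard failure.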
API plan depth limits
The API enforces traversal depth limits based on your plan. Queries that exceed the limit return a QueryDepthExceeded error.
| Plan | Max traversal depth |
|---|---|
| Anonymous | 2 hops |
| Free | 3 hops |
| Professional | 5 hops |
The pre-built macros and whisperlookup queries are set to depths supported by the Free plan or higher (CNAME *1..5 requires Professional, SPF_INCLUDE *1..3 requires Free). If you see a depth-exceeded error, either reduce your *1..N bound or upgrade your plan.
Warning: A 400 QueryDepthExceeded error means your query uses a variable-length path that exceeds your plan's hop limit. Reduce the upper bound in *1..N or upgrade to a higher plan.
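A query's variable-length bounds can be compared against the plan table before it is sent. The Python sketch below uses hypothetical helper names and inspects only the upper bound N in *M..N patterns. Whether fixed single hops also count toward the limit is not stated here, so this may underestimate the depth the API computes.

```python
import re

# Limits copied from the plan table above.
PLAN_MAX_HOPS = {"Anonymous": 2, "Free": 3, "Professional": 5}
UPPER_BOUND = re.compile(r"\*\s*\d*\s*\.\.\s*(\d+)")


def max_variable_depth(query: str) -> int:
    """Largest upper bound among the *M..N patterns in the query (0 if none)."""
    bounds = [int(n) for n in UPPER_BOUND.findall(query)]
    return max(bounds, default=0)


def fits_plan(query: str, plan: str) -> bool:
    """True if every variable-length bound is within the plan's hop limit."""
    return max_variable_depth(query) <= PLAN_MAX_HOPS[plan]


cname = "MATCH p = (h:HOSTNAME {name: $domain})-[:ALIAS_OF*1..5]->(t:HOSTNAME) RETURN t.name LIMIT 10"
print(fits_plan(cname, "Free"), fits_plan(cname, "Professional"))  # prints False True
```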
Performance tips
| Query pattern | Typical response | Risk |
|---|---|---|
| Anchored lookup {name: "..."} | ~96 ms | Low |
| STARTS WITH / ENDS WITH / CONTAINS + LIMIT | ~97-102 ms | Low |
| Single-hop traversal | ~99 ms | Low |
| Multi-hop (2-5 hops) | ~98 ms | Low |
| Threat intel (LISTED_IN, explain()) | ~140-182 ms | Low |
| NAMESERVER_FOR / MAIL_FOR traversal | ~438 ms | Medium |
| Variable-length path [*1..3] | ~123 ms | Medium |
| Regex (=~) on HOSTNAME | 7 s+ or timeout | Avoid |
| Unanchored label scan | ~30 s | Avoid |
All timings include network latency (~90-140 ms round trip).
Best practices
- Anchor your starting node. MATCH (n:HOSTNAME {name: "example.com"}) does an indexed lookup. MATCH (n:HOSTNAME) scans billions of nodes.
- Always use LIMIT. Especially on traversals that could fan out (LINKS_TO, RESOLVES_TO on CDN IPs).
- Use OPTIONAL MATCH for WHOIS fields. Not every domain has a registrar, email, or phone.
- Use count() before pulling large result sets. Check cardinality first to avoid unexpectedly large responses.
- Use UNWIND for batch lookups. Pass lists of indicators in a single request rather than one request per indicator.
- Specify edge types explicitly. [:RESOLVES_TO] is faster than [r] because the engine does not need to check all edge types.
- Anchor LINKS_TO queries. The web link graph is one of the largest datasets. Queries without an anchored starting node will time out.
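For long indicator lists, the UNWIND advice is usually combined with batching so that no single request grows too large. A small Python sketch: chunked is a hypothetical helper, and the default batch size of 100 is an arbitrary assumption, since this reference does not document a maximum list size.

```python
from typing import Iterator, List


def chunked(indicators: List[str], size: int = 100) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` indicators."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(indicators), size):
        yield indicators[start:start + size]


hosts = ["www.google.com", "cloudflare.com", "example.com"]
for batch in chunked(hosts, size=2):
    print(batch)  # each batch would feed one UNWIND $hosts request
```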
| Do this | Not that | Why |
|---|---|---|
| MATCH (h:HOSTNAME {name: "example.com"}) | MATCH (h:HOSTNAME) WHERE h.name = "example.com" | Inline property gets an indexed lookup |
| Always add LIMIT | Open-ended traversals | Prevents timeout on billion-scale labels |
| OPTIONAL MATCH for WHOIS fields | MATCH for sparse relationships | Avoids losing rows when fields are missing |
| MATCH (sub)-[:CHILD_OF]->(h {name: "x.com"}) | WHERE h.name ENDS WITH ".x.com" | CHILD_OF uses an indexed edge; ENDS WITH scans |
Warning:
- MATCH (n:HOSTNAME) RETURN n.name without LIMIT -- scans billions of rows
- ORDER BY on unfiltered large scans -- sorts billions of rows
- =~ regex on HOSTNAME -- full scan, may time out at 60 s
- Unbounded [*] without LIMIT -- explosive traversal