Cypher Reference for Splunk

Quick Cypher reference for Splunk queries.

Updated April 2026 | Splunk Integration

Note: This is a quick reference for using Cypher queries within Splunk. For the complete Cypher language reference, see the Cypher Language Reference. For API details, see the Cypher API Reference.

This page covers writing Cypher queries against the Whisper Knowledge Graph from Splunk, either directly with the whisperquery command or by understanding the queries behind whisperlookup and the pre-built macros.

Basics

Cypher is a declarative graph query language. The general shape of a query:

MATCH (pattern) WHERE condition RETURN fields LIMIT N

Node patterns

(n)                          // any node
(n:HOSTNAME)                 // node with label HOSTNAME
(n:HOSTNAME {name: "x.com"}) // node with label and property

Relationship patterns

(a)-[:RESOLVES_TO]->(b)      // directed relationship
(a)-[:ALIAS_OF*1..5]->(b)    // variable-length path (1 to 5 hops)
(a)<-[:NAMESERVER_FOR]-(b)   // reverse direction

Property matching

Property values are matched case-sensitively, and domain names are stored in lowercase:

// Correct
MATCH (h:HOSTNAME {name: "example.com"})

// Wrong (will return no results)
MATCH (h:HOSTNAME {name: "Example.com"})
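
Because matching is case-sensitive, it can help to normalize indicators before passing them as query parameters. The following is a minimal Python sketch, not part of the Whisper tooling: the helper name normalize_hostname is hypothetical, and dropping a trailing root dot is an assumption about how names are stored.

```python
def normalize_hostname(raw: str) -> str:
    """Return a hostname in the lowercase form expected by {name: ...} matches."""
    # Trim whitespace, drop a trailing root dot (assumption), then lowercase.
    return raw.strip().rstrip(".").lower()


if __name__ == "__main__":
    print(normalize_hostname("Example.com"))
```

Normalizing on the client side keeps the query itself an anchored lookup instead of forcing a case-insensitive comparison inside the graph.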

Graph schema

Node labels

Label | Description | Example values
HOSTNAME | Fully-qualified domain names, subdomains, mail server names | www.google.com, ns1.cloudflare.com
IPV4 | IPv4 addresses | 1.1.1.1, 142.250.64.100
IPV6 | IPv6 addresses | 2606:4700::6810:84e5
PREFIX | IP CIDR blocks | 142.250.64.0/24
ASN | Autonomous system numbers | AS13335, AS15169
ASN_NAME | Human-readable AS organization names | CLOUDFLARENET - Cloudflare, Inc.
TLD | Top-level domains | com, net, org, io
TLD_OPERATOR | TLD registry operators | VeriSign, Inc.
REGISTRAR | Domain registrars (IANA ID format) | iana:292 (MarkMonitor)
EMAIL | WHOIS contact email addresses | domains@cloudflare.com
PHONE | WHOIS contact phone numbers (E.164) | +14158675825
ORGANIZATION | Organizations from WHOIS records | cloudflare hostmaster
CITY | GeoIP city with country code | Mountain View, US
COUNTRY | ISO 3166-1 alpha-2 country codes | US, DE, AU
RIR | Regional Internet Registries | ARIN, RIPENCC, APNIC, LACNIC, AFRINIC
DNSSEC_ALGORITHM | DNSSEC signing algorithms | ECDSAP256SHA256, RSASHA256
FEED_SOURCE | Threat intelligence feed sources (virtual) | Spamhaus DROP, Feodo Tracker
CATEGORY | Threat feed categories (virtual) | C2 Servers, Phishing

Virtual labels (synthesized at query time, global count() returns 0):

Label | Description
REGISTERED_PREFIX | RIR-allocated prefix; has HAS_COUNTRY and REGISTERED_BY edges
ANNOUNCED_PREFIX | BGP-announced prefix; has ROUTES (to ASN) and HAS_COUNTRY edges

Edge types

DNS resolution

Edge type | Source to Target | Description
RESOLVES_TO | HOSTNAME to IPV4/IPV6 | DNS A/AAAA records
CHILD_OF | HOSTNAME to HOSTNAME/TLD | Domain hierarchy (sub.example.com -> example.com -> com)
ALIAS_OF | HOSTNAME to HOSTNAME | CNAME records
NAMESERVER_FOR | HOSTNAME to HOSTNAME | NS delegation (nameserver serves the target domain)
MAIL_FOR | HOSTNAME to HOSTNAME | MX records (mail server handles mail for the target domain)
SIGNED_WITH | HOSTNAME to DNSSEC_ALGORITHM | DNSSEC signing algorithm

BGP and routing

Edge type | Source to Target | Description
ANNOUNCED_BY | IPV4/PREFIX to ASN | BGP announcement (virtual, resolved at query time)
ROUTES | ASN to ANNOUNCED_PREFIX | ASN routes this prefix (virtual)
BELONGS_TO | IPV4 to PREFIX/REGISTERED_PREFIX/ANNOUNCED_PREFIX | IP membership in a prefix block
PEERS_WITH | ASN to ASN | BGP peering session (virtual)
HAS_NAME | ASN to ASN_NAME | Network operator name (virtual)
HAS_COUNTRY | ASN/PREFIX/CITY/IPV4/HOSTNAME/PHONE to COUNTRY | Country association

WHOIS and registration

Edge type | Source to Target | Description
HAS_REGISTRAR | HOSTNAME to REGISTRAR | Current domain registrar
PREV_REGISTRAR | HOSTNAME to REGISTRAR | Previous domain registrar
REGISTERED_BY | HOSTNAME/ASN/PREFIX to ORGANIZATION | WHOIS / RIR registration
HAS_EMAIL | HOSTNAME to EMAIL | WHOIS contact email
HAS_PHONE | HOSTNAME to PHONE | WHOIS contact phone

GeoIP

Edge type | Source to Target | Description
LOCATED_IN | IPV4 to CITY | GeoIP city location
LOCATED_IN | CITY to COUNTRY | City to country mapping

Threat intelligence

Edge type | Source to Target | Description
LISTED_IN | IPV4/HOSTNAME to FEED_SOURCE | IP or hostname appears in this threat feed (virtual)
BELONGS_TO | FEED_SOURCE to CATEGORY | Feed classified under this category

Web

Edge type | Source to Target | Description
LINKS_TO | HOSTNAME to HOSTNAME | Hyperlink between hostnames (from web crawl data)

SPF

Edge type | Source to Target | Description
SPF_INCLUDE | HOSTNAME to HOSTNAME | SPF include: mechanism
SPF_IP | HOSTNAME to PREFIX | SPF ip4: / ip6: mechanism
SPF_A | HOSTNAME to HOSTNAME | SPF a: mechanism
SPF_MX | HOSTNAME to HOSTNAME | SPF mx: mechanism
SPF_REDIRECT | HOSTNAME to HOSTNAME | SPF redirect= modifier
SPF_EXISTS | HOSTNAME to HOSTNAME | SPF exists: mechanism

Other

Edge type | Source to Target | Description
OPERATES | TLD_OPERATOR to TLD | Registry operator manages this TLD (virtual)

Entity relationship diagram

[Diagram: entity relationship diagram of the node labels and edge types]

Solid lines are physical edges stored on disk. Dashed lines are virtual edges computed at query time from live infrastructure and threat intelligence data.

Traversal chains

Common multi-hop paths through the graph. These are the patterns used by whisperlookup and the pre-built macros.

Chain | Path
DNS resolution | HOSTNAME -> RESOLVES_TO -> IPV4 -> ANNOUNCED_BY -> ANNOUNCED_PREFIX -> ROUTES -> ASN -> HAS_NAME -> ASN_NAME
GeoIP (from IP) | IPV4 -> LOCATED_IN -> CITY -> LOCATED_IN -> COUNTRY
GeoIP (from domain) | HOSTNAME -> RESOLVES_TO -> IPV4 -> LOCATED_IN -> CITY -> LOCATED_IN -> COUNTRY
BGP routing | ASN -> ROUTES -> ANNOUNCED_PREFIX, ASN -> PEERS_WITH -> ASN
DNS hierarchy | HOSTNAME -> CHILD_OF -> HOSTNAME(parent) -> CHILD_OF -> TLD
DNS security | HOSTNAME <- NAMESERVER_FOR <- HOSTNAME, HOSTNAME -> SIGNED_WITH -> DNSSEC_ALGORITHM
WHOIS | HOSTNAME -> HAS_REGISTRAR -> REGISTRAR, HOSTNAME -> HAS_EMAIL -> EMAIL, HOSTNAME -> REGISTERED_BY -> ORGANIZATION
Threat intel | IPV4/HOSTNAME -> LISTED_IN -> FEED_SOURCE -> BELONGS_TO -> CATEGORY
SPF chain | HOSTNAME -> SPF_INCLUDE -> HOSTNAME, HOSTNAME -> SPF_IP -> PREFIX
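
The chain notation maps directly onto a Cypher path pattern: each label becomes a node and each edge name becomes a directed relationship. The Python sketch below illustrates that mapping for forward-only chains. The function name chain_to_pattern and the generated variable names (n0, n1, ...) are hypothetical, and the helper does not handle reverse (<-) steps.

```python
def chain_to_pattern(chain: list[str]) -> str:
    """Turn [LABEL, EDGE, LABEL, ...] into a directed Cypher path pattern."""
    if len(chain) % 2 == 0:
        raise ValueError("chain must alternate labels and edges, ending on a label")
    labels, edges = chain[0::2], chain[1::2]
    parts = [f"(n0:{labels[0]})"]
    # Pair every edge with the label of the node it points to.
    for i, (edge, label) in enumerate(zip(edges, labels[1:]), start=1):
        parts.append(f"-[:{edge}]->(n{i}:{label})")
    return "".join(parts)


if __name__ == "__main__":
    geoip = ["IPV4", "LOCATED_IN", "CITY", "LOCATED_IN", "COUNTRY"]
    print("MATCH " + chain_to_pattern(geoip))
```

In a real query the first node would still need a {name: $value} anchor, as in the cookbook examples.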

Language reference

For the complete language reference, see the Cypher Language Reference.

Supported functions

For the full list of supported functions, see the Cypher Language Reference.

Procedures

For the full list of procedures, see the Cypher Language Reference.

Query cookbook

All examples use the whisperquery command. Where a pre-built macro exists, it is noted.

Incident investigation

Trace an IP through DNS, network, and routing layers:

Identify the network owner:

| whisperquery query="MATCH (ip:IPV4 {name: $ip})-[:ANNOUNCED_BY]->(ap:ANNOUNCED_PREFIX)-[:ROUTES]->(a:ASN)-[:HAS_NAME]->(n:ASN_NAME) RETURN ip.name AS ip, ap.name AS prefix, a.name AS asn, n.name AS asn_name LIMIT 10" params="ip=142.250.64.100"

ANNOUNCED_BY vs BELONGS_TO: ANNOUNCED_BY uses live BGP routing data for current routing info. BELONGS_TO returns the registered RIR allocation block. They may give different results. For IPs where BGP data is unavailable, fall back to BELONGS_TO.
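
The fallback described above can be sketched in Python as "try ANNOUNCED_BY first, then BELONGS_TO". This is an illustration under assumptions: run_query is a hypothetical caller-supplied function that executes a Cypher query with parameters and returns a list of row dictionaries, and both query strings are assembled from the schema tables in this reference rather than taken from the shipped macros.

```python
from typing import Callable, Optional

# Assumed query shapes, built from the schema tables; not verified against the API.
ANNOUNCED_QUERY = (
    "MATCH (ip:IPV4 {name: $ip})-[:ANNOUNCED_BY]->(p:ANNOUNCED_PREFIX) "
    "RETURN p.name AS prefix LIMIT 1"
)
REGISTERED_QUERY = (
    "MATCH (ip:IPV4 {name: $ip})-[:BELONGS_TO]->(p:REGISTERED_PREFIX) "
    "RETURN p.name AS prefix LIMIT 1"
)


def lookup_prefix(
    ip: str, run_query: Callable[[str, dict], list]
) -> Optional[tuple]:
    """Return (prefix, source_edge), preferring live BGP data over the RIR block."""
    candidates = (
        ("ANNOUNCED_BY", ANNOUNCED_QUERY),
        ("BELONGS_TO", REGISTERED_QUERY),
    )
    for source, query in candidates:
        rows = run_query(query, {"ip": ip})
        if rows:
            return rows[0]["prefix"], source
    return None  # neither BGP nor registration data is available
```

Returning the edge name alongside the prefix lets downstream reports state whether the value reflects current routing or the registered allocation.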

GeoIP lookup:

| whisperquery query="MATCH (ip:IPV4 {name: $ip})-[:LOCATED_IN]->(city:CITY)-[:LOCATED_IN]->(country:COUNTRY) RETURN ip.name AS ip, city.name AS city, country.name AS country LIMIT 1" params="ip=142.250.64.100"

Count domains on an IP (co-hosting):

| whisperquery query="MATCH (ip:IPV4 {name: $ip})<-[:RESOLVES_TO]-(h:HOSTNAME) RETURN count(h) AS cohosted_domains LIMIT 1" params="ip=104.16.123.96"

Check threat feeds:

| whisperquery query="CALL explain($ip)" params="ip=104.16.123.96"

Tip: For a full investigation in one command: | `whisper_full_investigation("malware-c2.evil.com")`

Domain to infrastructure

Full trace from hostname through DNS, network, and routing layers:

| whisperquery query="MATCH (h:HOSTNAME {name: $domain})-[:RESOLVES_TO]->(ip:IPV4)-[:ANNOUNCED_BY]->(ap:ANNOUNCED_PREFIX)-[:ROUTES]->(a:ASN)-[:HAS_NAME]->(n:ASN_NAME) RETURN h.name AS hostname, ip.name AS ip, ap.name AS prefix, a.name AS asn, n.name AS asn_name LIMIT 10" params="domain=www.google.com"

Or use whisperlookup for inline enrichment of events:

index=dns sourcetype=dns
| whisperlookup field=query type=domain
| table query whisper_ip whisper_prefix whisper_asn whisper_asn_name whisper_country

IP to ASN

| whisperquery query="MATCH (ip:IPV4 {name: $ip})-[:ANNOUNCED_BY]->(ap:ANNOUNCED_PREFIX)-[:ROUTES]->(a:ASN)-[:HAS_NAME]->(n:ASN_NAME) RETURN ip.name AS ip, ap.name AS prefix, a.name AS asn, n.name AS asn_name LIMIT 10" params="ip=8.8.8.8"

Co-hosted domains

Find other domains sharing the same IP address. Low co-hosting density is a common sign of attacker-controlled infrastructure.

| whisperquery query="MATCH (h:HOSTNAME {name: $domain})-[:RESOLVES_TO]->(ip:IPV4)<-[:RESOLVES_TO]-(cohost:HOSTNAME) WHERE cohost.name <> $domain RETURN ip.name AS ip, cohost.name AS cohost LIMIT 100" params="domain=suspicious-site.com"

Tip: | `whisper_cohosted_domains("suspicious-site.com")`

Shared nameservers

Shared nameservers often indicate common ownership or compromised hosting.

| whisperquery query="MATCH (h:HOSTNAME {name: $domain})<-[:NAMESERVER_FOR]-(ns:HOSTNAME)-[:NAMESERVER_FOR]->(other:HOSTNAME) WHERE other.name <> $domain RETURN ns.name AS nameserver, other.name AS shared_domain LIMIT 100" params="domain=phishing-target.com"

Tip: | `whisper_shared_nameservers("phishing-target.com")`

CNAME chain

Follow CNAME alias chains up to 5 hops. Useful for detecting dangling CNAMEs (subdomain takeover risk).

| whisperquery query="MATCH path = (h:HOSTNAME {name: $domain})-[:ALIAS_OF*1..5]->(target:HOSTNAME) RETURN [n IN nodes(path) | n.name] AS chain, last(nodes(path)).name AS target, length(path) AS depth LIMIT 10" params="domain=www.example.com"

Tip: | `whisper_cname_chain("www.example.com")`

Subdomain discovery

Find subdomains using the domain hierarchy index (more efficient than ENDS WITH):

| whisperquery query="MATCH (sub:HOSTNAME)-[:CHILD_OF]->(h:HOSTNAME {name: $domain}) RETURN sub.name AS subdomain LIMIT 50" params="domain=google.com"

Or by prefix/suffix:

| whisperquery query="MATCH (h:HOSTNAME) WHERE h.name STARTS WITH $prefix RETURN h.name AS hostname LIMIT 50" params="prefix=mail.google"

SPF include chain

Trace SPF includes to audit RFC 7208 compliance (max 10 DNS lookups).

| whisperquery query="MATCH path = (h:HOSTNAME {name: $domain})-[:SPF_INCLUDE*1..3]->(included:HOSTNAME) RETURN [n IN nodes(path) | n.name] AS spf_chain, length(path) AS depth LIMIT 50" params="domain=example.com"

Full SPF mechanism breakdown:

| whisperquery query="MATCH (h:HOSTNAME {name: $domain})-[r:SPF_INCLUDE|SPF_IP|SPF_A|SPF_MX|SPF_REDIRECT|SPF_EXISTS]->(target) RETURN type(r) AS mechanism, target.name AS authorized LIMIT 20" params="domain=microsoft.com"

Warning: SPF_REDIRECT replaces the entire SPF policy with another domain's policy. If the target domain is permissive, your effective policy is too.

Tip: | `whisper_spf_chain("example.com")`
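
To relate the mechanism breakdown to the RFC 7208 limit of 10 DNS lookups, the rows can be tallied by edge type. Under RFC 7208 section 4.6.4 the include, a, mx, and exists mechanisms and the redirect modifier each cost a lookup, while ip4 and ip6 do not. The Python sketch below assumes rows shaped like the output of the breakdown query (a "mechanism" field holding the edge type); the function names are hypothetical, and the count covers only the rows passed in, so lookups inside included policies have to be added by the caller.

```python
# Edge types whose SPF terms trigger a DNS lookup (RFC 7208 section 4.6.4).
# SPF_IP (ip4: / ip6:) is deliberately absent: it costs no lookup.
LOOKUP_EDGES = {"SPF_INCLUDE", "SPF_A", "SPF_MX", "SPF_REDIRECT", "SPF_EXISTS"}
SPF_LOOKUP_LIMIT = 10


def count_spf_lookups(rows: list[dict]) -> int:
    """Count rows whose mechanism consumes one of the 10 allowed lookups."""
    return sum(1 for row in rows if row["mechanism"] in LOOKUP_EDGES)


def exceeds_spf_limit(rows: list[dict]) -> bool:
    """True when the tallied lookups go past the RFC 7208 limit."""
    return count_spf_lookups(rows) > SPF_LOOKUP_LIMIT
```

This is an approximation for triage: it does not model the ptr mechanism (there is no matching edge type in the schema) or the separate per-mechanism limits on MX and PTR name resolution.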

DNS infrastructure audit

Nameservers:

| whisperquery query="MATCH (h:HOSTNAME {name: $domain})<-[:NAMESERVER_FOR]-(ns:HOSTNAME) RETURN ns.name AS nameserver LIMIT 50" params="domain=google.com"

Mail servers:

| whisperquery query="MATCH (h:HOSTNAME {name: $domain})<-[:MAIL_FOR]-(mx:HOSTNAME) RETURN mx.name AS mail_server LIMIT 50" params="domain=google.com"

DNSSEC status:

| whisperquery query="MATCH (h:HOSTNAME {name: $domain})-[:SIGNED_WITH]->(algo:DNSSEC_ALGORITHM) RETURN h.name AS domain, algo.name AS algorithm LIMIT 1" params="domain=cloudflare.com"

WHOIS investigation

Current registrar:

| whisperquery query="MATCH (h:HOSTNAME {name: $domain})-[:HAS_REGISTRAR]->(r:REGISTRAR) RETURN r.name AS registrar LIMIT 1" params="domain=google.com"

Contact emails:

| whisperquery query="MATCH (h:HOSTNAME {name: $domain})-[:HAS_EMAIL]->(e:EMAIL) RETURN e.name AS email LIMIT 10" params="domain=google.com"

Registrant organization:

| whisperquery query="MATCH (h:HOSTNAME {name: $domain})-[:REGISTERED_BY]->(o:ORGANIZATION) RETURN o.name AS organization LIMIT 5" params="domain=cloudflare.com"

Domain history:

| whisperquery query="CALL whisper.history($domain)" params="domain=cloudflare.com"

Attack surface mapping

Map all infrastructure behind an ASN:

Routed prefixes:

| whisperquery query="MATCH (a:ASN {name: $asn})-[:ROUTES]->(p) RETURN a.name AS asn, p.name AS prefix LIMIT 200" params="asn=AS13335"

Hostnames on the ASN's infrastructure:

| whisperquery query="MATCH (a:ASN {name: $asn})-[:ROUTES]->(p)<-[:BELONGS_TO]-(ip:IPV4)<-[:RESOLVES_TO]-(h:HOSTNAME) RETURN h.name AS hostname, ip.name AS ip LIMIT 50" params="asn=AS13335"

BGP peers:

| whisperquery query="MATCH (a:ASN {name: $asn})-[:PEERS_WITH]->(peer:ASN)-[:HAS_NAME]->(n:ASN_NAME) RETURN peer.name AS peer_asn, n.name AS peer_name LIMIT 100" params="asn=AS13335"

ASN profile (prefix count + peer count):

| whisperquery query="MATCH (a:ASN {name: $asn}) OPTIONAL MATCH (a)-[:ROUTES]->(p) WITH a, count(p) AS prefix_count OPTIONAL MATCH (a)-[:PEERS_WITH]->(peer:ASN) RETURN a.name AS asn, prefix_count, count(peer) AS peer_count" params="asn=AS13335"

Tip: | `whisper_asn_infrastructure("AS13335")` and | `whisper_bgp_peers("AS13335")`

Threat intelligence

Check if an indicator is listed in any threat feed:

| whisperquery query="MATCH (n {name: $indicator})-[:LISTED_IN]->(f:FEED_SOURCE) RETURN n.name AS indicator, f.name AS feed_name LIMIT 20" params="indicator=185.220.101.1"

Threat assessment via explain():

| whisperquery query="CALL explain($indicator)" params="indicator=185.220.101.1"

Returns threat score (0-100), level (NONE/INFO/LOW/MEDIUM/HIGH/CRITICAL), explanation, contributing factors, and source feed breakdown. Works with IPs, domains, ASNs, and CIDR ranges.

Tip: | `whisper_explain("185.220.101.1")`

Threat properties available on enriched nodes:

Property | Type | Description
threatScore | Double | Computed threat score (0-100)
threatLevel | String | NONE, INFO, LOW, MEDIUM, HIGH, CRITICAL
isThreat | Boolean | Listed in any threat feed
isTor | Boolean | Tor exit/relay node
isC2 | Boolean | Command-and-control infrastructure
isMalware | Boolean | Malware distribution
isPhishing | Boolean | Phishing infrastructure
isAnonymizer | Boolean | Anonymizer/proxy/VPN

These properties are exposed by whisperlookup as whisper_threat_score, whisper_threat_level, etc.
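
When enriched results are post-processed outside Splunk, the threat level strings need an explicit ordering before they can be compared. The Python sketch below assumes the levels are listed in ascending severity (NONE lowest, CRITICAL highest) and that events are dictionaries carrying a whisper_threat_level field; the function names and the event shape are hypothetical.

```python
# Severity order as listed for threatLevel in this reference (assumed ascending).
THREAT_LEVELS = ["NONE", "INFO", "LOW", "MEDIUM", "HIGH", "CRITICAL"]


def at_least(level: str, threshold: str) -> bool:
    """True when `level` is as severe as `threshold` or worse."""
    rank = {name: i for i, name in enumerate(THREAT_LEVELS)}
    return rank[level.upper()] >= rank[threshold.upper()]


def filter_events(events: list[dict], threshold: str = "HIGH") -> list[dict]:
    """Keep events whose whisper_threat_level meets the threshold."""
    # Events without the field are treated as NONE.
    return [
        e for e in events
        if at_least(e.get("whisper_threat_level", "NONE"), threshold)
    ]
```

An unknown level string raises KeyError rather than being silently ranked, which makes unexpected values visible early.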

Batch lookup

Enrich a list of indicators in a single query:

| whisperquery query="UNWIND $hosts AS h MATCH (n:HOSTNAME {name: h})-[:RESOLVES_TO]->(ip:IPV4) RETURN n.name AS hostname, ip.name AS ip" params='{"hosts": ["www.google.com", "cloudflare.com", "example.com"]}'
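
The JSON value for params= can be generated rather than typed by hand, which avoids quoting mistakes in longer lists. The Python sketch below builds the {"hosts": [...]} object used by the UNWIND query above. The helper name build_batch_params is hypothetical, and lowercasing plus de-duplication are assumptions layered on top of the documented format.

```python
import json


def build_batch_params(hosts: list[str]) -> str:
    """Return the JSON string to pass as params= for the UNWIND $hosts query."""
    cleaned = (h.strip().lower() for h in hosts)
    # dict.fromkeys de-duplicates while preserving first-seen order.
    unique = list(dict.fromkeys(h for h in cleaned if h))
    return json.dumps({"hosts": unique})


if __name__ == "__main__":
    print(build_batch_params(["www.google.com", "cloudflare.com", "example.com"]))
```

The resulting string is meant to be wrapped in single quotes in the search, as in the example above, so the double quotes inside the JSON survive.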

Find outbound links from a domain (Common Crawl data):

| whisperquery query="MATCH (h:HOSTNAME {name: $domain})-[:LINKS_TO]->(target:HOSTNAME) RETURN target.name AS linked_domain LIMIT 50" params="domain=google.com"

Campaign infrastructure mapping

Find related infrastructure through shared hosting:

| whisperquery query="MATCH (h1:HOSTNAME {name: $domain})-[:RESOLVES_TO]->(ip:IPV4)<-[:RESOLVES_TO]-(h2:HOSTNAME) WHERE h1 <> h2 RETURN h2.name AS related_domain, ip.name AS shared_ip LIMIT 20" params="domain=paypal--confirm.com"

Pivot further through shared WHOIS contact emails:

| whisperquery query="MATCH (h1:HOSTNAME {name: $domain})-[:HAS_EMAIL]->(e:EMAIL)<-[:HAS_EMAIL]-(h2:HOSTNAME) WHERE h1 <> h2 RETURN h2.name AS related_domain, e.name AS shared_email LIMIT 20" params="domain=cloudflare.com"

Note: Check whether a shared email is a privacy-service proxy address before drawing attribution conclusions.

Brand protection

Search for domains that impersonate your brand:

| whisperquery query="MATCH (h:HOSTNAME) WHERE h.name CONTAINS $brand RETURN h.name AS hostname LIMIT 20" params="brand=paypal"

Then check if any are already threat-listed:

| whisperquery query="MATCH (h:HOSTNAME {name: $domain})-[:LISTED_IN]->(f:FEED_SOURCE) RETURN h.name AS hostname, f.name AS feed_name LIMIT 20" params="domain=paypal--confirm.com"

Note: An empty result means the domain is not yet in any feed, not that it is clean. New phishing domains often predate feed coverage.


Threat intelligence feeds

Whisper Graph indexes 40+ threat intelligence feeds across 18 categories, with hourly incremental and daily full refresh cycles.

Feed sources

Feed | Category
AlienVault Reputation | Reputation
Binary Defense Banlist | General Blacklists
Blocklist.de All | General Blacklists
Blocklist.de Mail | Spam
Blocklist.de SSH | Brute Force
Botvrij Domains | Malicious Domains
Botvrij Dst IPs | C2 Servers
Brute Force Blocker | Brute Force
C2 Intel 30d | C2 Servers
C2 Tracker | C2 Servers
CERT.pl Domains | Malicious Domains
CINS Score | General Blacklists
Cloudflare Radar Top 1M | Popularity/Trust
DNS RD Abuse | General Blacklists
Dan Tor Exit | TOR Network
ET Compromised IPs | General Blacklists
Feodo Tracker | C2 Servers
FireHOL Abusers 1d | General Blacklists
FireHOL Anonymous | Proxies
FireHOL Level 1 | General Blacklists
FireHOL Level 2 | General Blacklists
FireHOL Level 3 | General Blacklists
FireHOL WebClient | General Blacklists
GreenSnow Blacklist | General Blacklists
Hagezi Light | Ad/Tracking Blocklists
Hagezi Pro | Ad/Tracking Blocklists
IPsum | General Blacklists
InterServer RBL | General Blacklists
MalwareBazaar Recent | Malware Distribution
OpenPhish Feed | Phishing
SSH Client Attacks | Brute Force
SSH Password Auth | Brute Force
SSL IP Blacklist | General Blacklists
Spamhaus DROP | General Blacklists
Spamhaus EDROP | General Blacklists
StevenBlack Hosts | Ad/Tracking Blocklists
ThreatFox IOCs | C2 Servers
Tor Exit Nodes | TOR Network
Tranco Top 1M | Popularity/Trust
URLhaus Recent | Malware Distribution

Categories

Ad/Tracking Blocklists, Anonymization Infrastructure, Attack Sources, Brute Force, C2 Servers, General Blacklists, Malicious Domains, Malicious Infrastructure, Malware Distribution, Phishing, Popularity/Trust, Proxies, Reference Data, Reputation, Spam, TOR Network, Threat Intelligence, VPNs.


Query rules

Rule | Reason
Always anchor with {name: "value"} | Prevents full scans across billions of nodes
Always use LIMIT | Controls result set size
Use $parameter syntax with params= | Prevents injection, enables index optimization
Bound variable-length paths (*1..N) | Prevents unbounded traversals
Keep path length within your plan limit | See plan depth limits below
Use STARTS WITH / ENDS WITH / CONTAINS | Much faster than regex (=~) on large labels
Use CHILD_OF for subdomain queries | Indexed edge; faster than ENDS WITH
Use OPTIONAL MATCH for WHOIS fields | Avoids losing rows when sparse fields are missing
Use ANNOUNCED_BY for current BGP routing | BELONGS_TO returns registered allocation instead
No write operations | CREATE, DELETE, SET, MERGE are blocked; the graph is read-only
ENDS WITH suffix must start with . | .google.com is indexed; google.com requires a full scan
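
Several of the rules above can be checked mechanically before a query is sent. The Python sketch below is a naive textual linter, not a Cypher parser: the function name lint_query is hypothetical, and keyword checks can produce false positives when a keyword appears inside a string literal.

```python
import re

# Write operations listed as blocked in the query rules.
WRITE_KEYWORDS = ("CREATE", "DELETE", "SET", "MERGE")


def lint_query(query: str) -> list[str]:
    """Return a list of rule violations found by simple pattern matching."""
    problems = []
    upper = query.upper()
    if not re.search(r"\bLIMIT\b", upper):
        problems.append("missing LIMIT")
    for keyword in WRITE_KEYWORDS:
        if re.search(rf"\b{keyword}\b", upper):
            problems.append(f"write operation: {keyword}")
    if "=~" in query:
        problems.append("regex match (=~)")
    # Matches [*], [:TYPE*], and [*1..] (no upper bound); *1..5 and *3 pass.
    if re.search(r"\*\s*(\d+\s*\.\.\s*)?\]", query):
        problems.append("unbounded variable-length path")
    return problems
```

An empty list means none of these textual checks fired; it does not guarantee that the query is anchored or that it fits a plan's depth limit.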

API plan depth limits

The API enforces traversal depth limits based on your plan. Queries that exceed the limit return a QueryDepthExceeded error.

Plan | Max traversal depth
Anonymous | 2 hops
Free | 3 hops
Professional | 5 hops

The pre-built macros and whisperlookup queries are set to depths supported by the Free plan or higher (CNAME *1..5 requires Professional, SPF_INCLUDE *1..3 requires Free). If you see a depth-exceeded error, either reduce your *1..N bound or upgrade your plan.

Warning: A 400 QueryDepthExceeded error means your query uses a variable-length path that exceeds your plan's hop limit. Reduce the upper bound in *1..N or upgrade to a higher plan.
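
A query's variable-length bounds can be compared against the plan limits before it is submitted. The Python sketch below extracts the largest upper bound N from *M..N segments and checks it against the table above. The function names are hypothetical, and the check covers only variable-length bounds; whether the API also counts fixed hops toward the limit is not stated here, so treat a passing result as necessary rather than sufficient.

```python
import re

# Limits copied from the plan table in this section.
PLAN_MAX_HOPS = {"Anonymous": 2, "Free": 3, "Professional": 5}


def max_variable_length(query: str) -> int:
    """Largest upper bound N found in *M..N or *..N segments (0 if none)."""
    bounds = re.findall(r"\*\s*\d*\s*\.\.\s*(\d+)", query)
    return max((int(b) for b in bounds), default=0)


def fits_plan(query: str, plan: str) -> bool:
    """True when every variable-length bound is within the plan's hop limit."""
    return max_variable_length(query) <= PLAN_MAX_HOPS[plan]
```

For example, the CNAME chain query (*1..5) passes for Professional but not for Free, which matches the note above about which macros need which plan.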

Performance tips

Query pattern | Typical response | Risk
Anchored lookup {name: "..."} | ~96 ms | Low
STARTS WITH / ENDS WITH / CONTAINS + LIMIT | ~97-102 ms | Low
Single-hop traversal | ~99 ms | Low
Multi-hop (2-5 hops) | ~98 ms | Low
Threat intel (LISTED_IN, explain()) | ~140-182 ms | Low
NAMESERVER_FOR / MAIL_FOR traversal | ~438 ms | Medium
Variable-length path [*1..3] | ~123 ms | Medium
Regex (=~) on HOSTNAME | 7 s+ or timeout | Avoid
Unanchored label scan | ~30 s | Avoid

All timings include network latency (~90-140 ms round trip).

Best practices

  • Anchor your starting node. MATCH (n:HOSTNAME {name: "example.com"}) does an indexed lookup. MATCH (n:HOSTNAME) scans billions of nodes.
  • Always use LIMIT, especially on traversals that could fan out (LINKS_TO, RESOLVES_TO on CDN IPs).
  • Use OPTIONAL MATCH for WHOIS fields. Not every domain has a registrar, email, or phone.
  • Use count() before pulling large result sets. Check cardinality first to avoid unexpectedly large responses.
  • Use UNWIND for batch lookups. Pass lists of indicators in a single request rather than one request per indicator.
  • Specify edge types explicitly. [:RESOLVES_TO] is faster than [r] because the engine does not need to check all edge types.
  • Anchor LINKS_TO queries. The web link graph is one of the largest datasets. Queries without an anchored starting node will time out.

Do this | Not that | Why
MATCH (h:HOSTNAME {name: "example.com"}) | MATCH (h:HOSTNAME) WHERE h.name = "example.com" | Inline property gets an indexed lookup
Always add LIMIT | Open-ended traversals | Prevents timeout on billion-scale labels
OPTIONAL MATCH for WHOIS fields | MATCH for sparse relationships | Avoids losing rows when fields are missing
MATCH (sub)-[:CHILD_OF]->(h {name: "x.com"}) | WHERE h.name ENDS WITH ".x.com" | CHILD_OF uses an indexed edge; ENDS WITH scans

Warning: Avoid the following patterns.

  • MATCH (n:HOSTNAME) RETURN n.name without LIMIT -- scans billions of rows
  • ORDER BY on unfiltered large scans -- sorts billions of rows
  • =~ regex on HOSTNAME -- full scan, may time out at 60 s
  • Unbounded [*] without LIMIT -- explosive traversal