What Is a Knowledge Graph?

A knowledge graph is a data structure that represents real-world entities as nodes and the relationships between them as edges. Both nodes and edges carry typed properties. The shape is purpose-built for the multi-hop questions analysts actually ask — "show me everything connected to X within N steps along these specific relationships."

Why Not a Relational Database?

Relational databases excel at filtering and aggregating rows in well-defined tables. They are much weaker at recursive joins. Asking "give me all entities reachable from X via any number of hops along these edges" forces a relational engine into a chain of self-joins (or a recursive query) whose cost explodes as the number of hops and the size of the graph grow.

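Graph databases store relationships as first-class objects, so traversing an edge is the cheap operation, not the expensive one. That inverts the relational cost model, which matters in a typical security investigation, where the questions are about following connections rather than filtering rows.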
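To make the "reachable within N hops along specific relationships" question concrete, here is a minimal in-memory sketch of a bounded breadth-first traversal. The edge list, the hostnames, and the function names `build_adjacency` and `reachable` are hypothetical and chosen only for illustration; a real graph database would use its own storage and query engine.

```python
from collections import deque

# Hypothetical edge list: (source, edge_type, target). Values are made up.
EDGES = [
    ("example.com", "RESOLVES_TO", "192.0.2.10"),
    ("192.0.2.10", "ANNOUNCED_BY", "AS64500"),
    ("evil.example", "RESOLVES_TO", "192.0.2.10"),
    ("example.com", "REGISTERED_BY", "registrar-a"),
]


def build_adjacency(edges):
    """Index outgoing edges by source node so each hop is a dictionary lookup."""
    adjacency = {}
    for source, edge_type, target in edges:
        adjacency.setdefault(source, []).append((edge_type, target))
    return adjacency


def reachable(adjacency, start, allowed_types, max_hops):
    """Return {node: hop_count} for nodes reachable from `start` in at most
    `max_hops` forward hops, following only edges whose type is allowed."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        depth = seen[node]
        if depth == max_hops:
            continue  # do not expand past the hop limit
        for edge_type, neighbor in adjacency.get(node, []):
            if edge_type in allowed_types and neighbor not in seen:
                seen[neighbor] = depth + 1
                queue.append(neighbor)
    return {node: depth for node, depth in seen.items() if node != start}


adjacency = build_adjacency(EDGES)
print(reachable(adjacency, "example.com", {"RESOLVES_TO", "ANNOUNCED_BY"}, 2))
# → {'192.0.2.10': 1, 'AS64500': 2}
```

Each hop here is an adjacency lookup rather than a join over a full table, which is the cost difference the section describes. The sketch only follows edges in their stored direction, so `evil.example` is not returned even though it resolves to the same address.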

Anatomy of a Knowledge Graph

  • Nodes — entities with a label (HOSTNAME, IPV4, ASN) and properties (e.g. name, firstSeen, country).
  • Edges — typed relationships (RESOLVES_TO, ANNOUNCED_BY, REGISTERED_BY) with their own properties (e.g. validFrom, validTo).
  • Schema — the set of allowed labels and edge types, often loose enough to absorb new data without migrations.
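The three parts above can be sketched as plain Python data classes. This is an illustrative model only: the class names, the `validate` helper, and the sample property values are assumptions, while the labels, edge types, and property names are taken from the examples in the list.

```python
from dataclasses import dataclass, field
from typing import Any

# Schema: the set of allowed node labels and edge types (from the list above).
ALLOWED_LABELS = {"HOSTNAME", "IPV4", "ASN"}
ALLOWED_EDGE_TYPES = {"RESOLVES_TO", "ANNOUNCED_BY", "REGISTERED_BY"}


@dataclass
class Node:
    label: str
    properties: dict[str, Any] = field(default_factory=dict)


@dataclass
class Edge:
    type: str
    source: Node
    target: Node
    properties: dict[str, Any] = field(default_factory=dict)


def validate(edge: Edge) -> None:
    """Raise ValueError if the edge or either endpoint is outside the schema.

    Properties are deliberately not checked, which mirrors a loose schema
    that can absorb new attributes without a migration.
    """
    for node in (edge.source, edge.target):
        if node.label not in ALLOWED_LABELS:
            raise ValueError(f"unknown node label: {node.label}")
    if edge.type not in ALLOWED_EDGE_TYPES:
        raise ValueError(f"unknown edge type: {edge.type}")


# Sample values are invented for the example.
host = Node("HOSTNAME", {"name": "example.com", "firstSeen": "2024-01-01"})
ip = Node("IPV4", {"name": "192.0.2.10", "country": "US"})
resolves = Edge("RESOLVES_TO", host, ip, {"validFrom": "2024-01-01", "validTo": None})
validate(resolves)  # passes silently because every label and type is allowed
```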

How Knowledge Graphs Are Queried

Cypher is among the most widely used query languages for property-graph databases. Its pattern syntax, for example (a)-[:EDGE]->(b), reads like a sentence and maps directly to how analysts already think. SPARQL (for RDF stores) and Gremlin (Apache TinkerPop) are the alternatives in other ecosystems.
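To show what a one-hop pattern such as (a)-[:EDGE]->(b) asks for, the sketch below pairs an illustrative Cypher query (in a comment) with a plain-Python equivalent over a tiny in-memory graph. The node ids, sample data, and the `match_one_hop` helper are hypothetical; the labels and edge types reuse the examples from this page.

```python
# Illustrative Cypher for the same question (labels and properties assumed):
#   MATCH (h:HOSTNAME)-[:RESOLVES_TO]->(ip:IPV4)
#   RETURN h.name, ip.name

# Hypothetical in-memory graph: node id -> properties, plus a typed edge list.
NODES = {
    "n1": {"label": "HOSTNAME", "name": "example.com"},
    "n2": {"label": "IPV4", "name": "192.0.2.10"},
    "n3": {"label": "ASN", "name": "AS64500"},
}
EDGES = [
    ("n1", "RESOLVES_TO", "n2"),
    ("n2", "ANNOUNCED_BY", "n3"),
]


def match_one_hop(nodes, edges, src_label, edge_type, dst_label):
    """Yield (source, target) node dicts for every edge that fits the pattern
    (src_label)-[:edge_type]->(dst_label)."""
    for src_id, etype, dst_id in edges:
        src, dst = nodes[src_id], nodes[dst_id]
        if (
            etype == edge_type
            and src["label"] == src_label
            and dst["label"] == dst_label
        ):
            yield src, dst


rows = [
    (a["name"], b["name"])
    for a, b in match_one_hop(NODES, EDGES, "HOSTNAME", "RESOLVES_TO", "IPV4")
]
print(rows)
# → [('example.com', '192.0.2.10')]
```

The Python version scans every edge, which is fine for a demonstration; a graph engine would typically start from indexed nodes and follow their edges instead.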

Where Knowledge Graphs Win

  • Investigations — pivot from one indicator to a full campaign in one query.
  • Lineage and provenance — trace how a piece of data was assembled from sources.
  • Recommendation and matching — proximity in the graph encodes similarity.
  • Compliance and audit — explainable evidence chains, not black-box scores.
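The idea of an explainable evidence chain can be sketched as a shortest-path search that returns the edges it followed, not just a yes or no answer. The data and the `evidence_chain` function below are hypothetical, and breadth-first search is used here only as one simple way to recover such a chain.

```python
from collections import deque

# Hypothetical edges: (source, edge_type, target). Values are made up.
EDGES = [
    ("phish.example", "RESOLVES_TO", "192.0.2.10"),
    ("192.0.2.10", "ANNOUNCED_BY", "AS64500"),
    ("phish.example", "REGISTERED_BY", "registrar-a"),
]


def evidence_chain(edges, start, goal):
    """Return the shortest list of edges leading from `start` to `goal`,
    or None if `goal` cannot be reached by following edges forward."""
    adjacency = {}
    for edge in edges:
        adjacency.setdefault(edge[0], []).append(edge)

    parent = {start: None}  # node -> the edge that first reached it
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            chain = []
            while parent[node] is not None:
                edge = parent[node]
                chain.append(edge)
                node = edge[0]  # step back to the edge's source
            return chain[::-1]
        for edge in adjacency.get(node, []):
            target = edge[2]
            if target not in parent:
                parent[target] = edge
                queue.append(target)
    return None


for source, edge_type, target in evidence_chain(EDGES, "phish.example", "AS64500"):
    print(f"{source} -[{edge_type}]-> {target}")
# prints:
# phish.example -[RESOLVES_TO]-> 192.0.2.10
# 192.0.2.10 -[ANNOUNCED_BY]-> AS64500
```

Every step in the returned chain is a stored edge, so the result can be reviewed hop by hop.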

Whisper as a Knowledge Graph

Whisper is a knowledge graph of the internet's infrastructure — billions of nodes for hostnames, IPs, ASNs, certificates, and threat indicators, billions of edges for resolution, routing, ownership, and reputation. Read more about the engine we built.