How To: Neo4j Data Import - Minimal Example

We want to import data into Neo4j, but the many resources on the topic contain so much information that they become confusing. Here is the minimum you need to know.

Imagine the data comes from an export of a relational or legacy system: plain CSV files without headers (this time), one for the "people" table and one for the "friendships" table.

people.csv

1,"John"
10,"Jane"
234,"Fred"
4893,"Mark"
234943,"Anne"

friendships.csv

1,234
10,4893
234,1
4893,234943
234943,234
234943,1
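
Before importing, it can help to confirm that every id in friendships.csv also appears in people.csv, because a friendship row that points at an unknown id has nothing to connect to. The following is a minimal sketch, not part of the original walkthrough: it recreates the two sample files in a scratch directory (the use of `mktemp` and the variable names `workdir` and `missing` are assumptions made for illustration) and lists any friendship rows with unknown ids.

```shell
#!/bin/sh
# Recreate the two sample files in a scratch directory (location is an assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > people.csv <<'EOF'
1,"John"
10,"Jane"
234,"Fred"
4893,"Mark"
234943,"Anne"
EOF

cat > friendships.csv <<'EOF'
1,234
10,4893
234,1
4893,234943
234943,234
234943,1
EOF

# First pass (people.csv): remember every person id.
# Second pass (friendships.csv): print rows that reference an id not seen before.
missing=$(awk -F, 'NR == FNR { known[$1] = 1; next }
                   !($1 in known) || !($2 in known)' people.csv friendships.csv)

if [ -z "$missing" ]; then
  echo "all friendship ids resolve to a person"
else
  echo "rows with unknown ids:"
  echo "$missing"
fi
```

The check splits on commas naively, which is enough here because the id columns never contain quoted commas.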

Graph Model

Our graph model is very simple:

(p1:Person {userId:10, name:"Jane"})-[:KNOWS]->(p2:Person {userId:4893, name:"Mark"})

Import with Neo4j Server & Cypher

Download, install and start Neo4j Server.
Open http://localhost:7474
Run the following statements one by one:

I used http URLs here so that this runs as an interactive, live Graph Gist.

CREATE CONSTRAINT ON (p:Person) ASSERT p.userId IS UNIQUE;

LOAD CSV FROM "https://gist.githubusercontent.com/jexp/d8f251a948f5df83473a/raw/people.csv" AS row
CREATE (:Person {userId: toInt(row[0]), name:row[1]});

USING PERIODIC COMMIT
LOAD CSV FROM "https://gist.githubusercontent.com/jexp/d8f251a948f5df83473a/raw/friendships.csv" AS row
MATCH (p1:Person {userId: toInt(row[0])}), (p2:Person {userId: toInt(row[1])})
CREATE (p1)-[:KNOWS]->(p2);

Note: You can also use file URLs, ideally with absolute paths such as `file:/path/to/data.csv`; on Windows use `file:c:/path/to/data.csv`.

If you also want to find people quickly by name, not only by id, run:

CREATE INDEX ON :Person(name);

For instance, find all second-degree friends of "Anne" and the number of ways each of them can be reached:

MATCH (:Person {name:"Anne"})-[:KNOWS*2..2]-(p2)
RETURN p2.name, count(*) as freq
ORDER BY freq DESC;

Bulk Data Import

Use this approach for tens of millions up to billions of rows.

Shut down the server first!

Create two additional header files:

people_header.csv

userId:ID,name

friendships_header.csv

:START_ID,:END_ID
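
The two header files can be written from the terminal as well. The sketch below assumes it runs in the directory that holds people.csv and friendships.csv; the helper `field_count` is a hypothetical name introduced here, and it only compares column counts when the data file is actually present.

```shell
#!/bin/sh
# Write the two one-line header files next to the data files.
printf '%s\n' 'userId:ID,name' > people_header.csv
printf '%s\n' ':START_ID,:END_ID' > friendships_header.csv

# Count comma-separated fields on the first line of a file.
# Naive split: assumes no commas inside quoted values, as in the sample data.
field_count() {
  awk -F, 'NR == 1 { print NF }' "$1"
}

# Compare header and data column counts for each file pair that exists.
for name in people friendships; do
  if [ -f "$name.csv" ]; then
    if [ "$(field_count "${name}_header.csv")" = "$(field_count "$name.csv")" ]; then
      echo "$name: header and data column counts match"
    else
      echo "$name: header and data column counts differ"
    fi
  fi
done
```

A header with a different number of columns than its data file is a common cause of confusing import errors, so this check is cheap insurance before starting a long bulk import.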

Execute from the terminal:

path/to/neo/bin/neo4j-import --into path/to/neo/data/graph.db  \
--nodes:Person people_header.csv,people.csv --relationships:KNOWS friendships_header.csv,friendships.csv

After starting your database again, run:

CREATE CONSTRAINT ON (p:Person) ASSERT p.userId IS UNIQUE;
