How to set up a Python environment with Neo4j

Note: At the time of this writing, the latest versions are Python 3.7.2 and Neo4j 3.5.3.

Neo4j is used to create and maintain Graph Databases, which are queryable using the Cypher language.

Neo4j itself is written in Java, but there are drivers for other languages, along with additional Python work from the community. As of now, IMO the best choice is py2neo, a community-maintained client library rather than the official Neo4j driver.

Installation of the virtual environment & packages

I assume you already have Python. If you don’t, download the latest version and follow the installation steps.

To manage virtual environments I use pipenv. It’s a Python package that combines the features of pip and virtualenv. It also borrows ideas from pew (another Python virtual environment manager) and npm (the official package manager of Node.js), so you can write pipenv commands the same way you would with pip and also use pew magic to manage your projects.

Pipenv uses pip, so we need to make sure our pip version is up to date. Don’t forget to also check your versions of Python and pip by typing python -V and pip -V before executing any command.

If the pip command itself is not recognized, you have a PATH problem to fix before you can access your libraries. Alternatively, you can use python -m pip instead of pip in the instructions below:

mkdir my_neo4j_project/
cd my_neo4j_project/
pip install --upgrade --user pip pipenv
pipenv install py2neo
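
To double-check the result of these commands from inside Python, a small script along the following lines can help. It is only a sketch: the helper names python_version and is_installed are made up for this example, and it only tells you whether a package can be found by the interpreter that runs it, so run it with pipenv run python to inspect the virtual environment.

```python
import importlib.util
import sys


def python_version() -> str:
    """Return the running interpreter's version as 'major.minor.micro'."""
    return ".".join(str(part) for part in sys.version_info[:3])


def is_installed(package: str) -> bool:
    """Return True if the top-level `package` can be found by this interpreter."""
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    print("Python", python_version())
    # pipenv itself usually lives outside the virtual environment,
    # so it may legitimately show up as missing here.
    for name in ("pip", "pipenv", "py2neo"):
        print(name, "found" if is_installed(name) else "missing")
```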

The py2neo package depends on Neo4j’s driver package, so that will be installed too if you do not already have it. If this is the first time you use pipenv in your current directory, it will generate two files:

  • Pipfile: it stores the package names and any constraints (version number, git link, …) given in previous pipenv install [packages] commands.
  • Pipfile.lock: it stores the pinned versions and hashes of the installed packages. It is updated automatically at every new installation, and you can update it manually using pipenv update. The command pipenv install without arguments will install packages as they are described in the Pipfile.lock file.
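
Since Pipfile.lock is a JSON file, you can peek at what it pins with a few lines of Python. The sketch below assumes the usual layout, with "default" and "develop" sections mapping each package name to its version and hashes. The sample content is made up for illustration (placeholder hashes, an arbitrary version), and pinned_versions is just a name chosen for this example.

```python
import json


def pinned_versions(lock_text: str, section: str = "default") -> dict:
    """Map each package in `section` of a Pipfile.lock to its pinned version."""
    lock = json.loads(lock_text)
    return {
        name: meta.get("version", "unpinned")
        for name, meta in lock.get(section, {}).items()
    }


# Made-up sample, shaped like a real Pipfile.lock but with placeholder hashes.
SAMPLE_LOCK = """
{
  "_meta": {"hash": {"sha256": "<hash of the Pipfile>"}},
  "default": {
    "py2neo": {"hashes": ["sha256:<package hash>"], "version": "==4.2.0"}
  },
  "develop": {}
}
"""

print(pinned_versions(SAMPLE_LOCK))
```

On a real project you would pass in the content of the file instead, for example open("Pipfile.lock").read().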

To enter (“activate”) your virtual environment, just use pipenv shell. It will load everything you need: the right Python version, the packages, environment variables, …

If you just want to activate the environment for one command, you can use pipenv run [cmd], like pipenv run python main.py.

Connect to Neo4j database

If you do not have Neo4j already, I suggest you download the latest Server edition on Neo4j’s official download page and follow the official installation instructions.

Once the installation is finished, you should have your $NEO4J_HOME environment variable set. You can then start your Neo4j server using the start command (stop and restart work too):

sudo $NEO4J_HOME/bin/neo4j start
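
Before going further, it can be handy to check that something is actually listening on the Bolt port. The sketch below assumes a local server on 127.0.0.1 and port 7687 (adjust both if your configuration differs), and is_port_open is a helper name invented for this example. It only tells you that a TCP connection succeeds, not that the listener is really Neo4j.

```python
import socket


def is_port_open(host: str = "127.0.0.1", port: int = 7687, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("Bolt port reachable:", is_port_open())
```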

Now that we have both Neo4j and py2neo, we can use the latter to connect to our Neo4j database.

import os
from py2neo import Graph

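class GraphDriver:
    """Thin wrapper that holds a py2neo Graph connection."""

    @classmethod
    def from_env(cls) -> "GraphDriver":
        # Read the connection settings from environment variables;
        # a missing variable raises KeyError.
        uri: str = os.environ["NEO4J_URI"]
        usr: str = os.environ["NEO4J_USR"]
        pwd: str = os.environ["NEO4J_PASSWORD"]
        return cls(uri, usr, pwd)

    def __init__(self, uri: str, usr: str, pwd: str) -> None:
        # Graph takes the server URI and a (user, password) tuple.
        self.graph: Graph = Graph(uri, auth=(usr, pwd))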

This code lets you connect to a graph database using either explicit values or environment variables. If you have followed along so far, you may be thinking: convenient, since Pipenv loads them for us!

Typically, those variables are stored in a .env file (pronounced “dot env”) at the root of the project’s folder or repository. You only have to create this file and write key=value pairs, as in this example:

NEO4J_URI=bolt://127.0.0.1:7687
NEO4J_USR=laure
NEO4J_PASSWORD=neofoxes

Those key-value pairs will be accessible next time you launch pipenv shell. You can list them from your terminal by typing the command env or printenv, or from your Python code using import os; print(os.environ).
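
If one of these variables is missing, os.environ["..."] fails with a bare KeyError, which is not always obvious to diagnose. A small check like the following can run first. It is a sketch only: missing_neo4j_vars and REQUIRED_VARS are names invented for this example, and the variable names are the ones used in this post.

```python
import os
from typing import List, Mapping

# The three variables this post's connection code expects to find.
REQUIRED_VARS = ("NEO4J_URI", "NEO4J_USR", "NEO4J_PASSWORD")


def missing_neo4j_vars(environ: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are absent or empty in `environ`."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_neo4j_vars()
    if missing:
        print("Missing variables:", ", ".join(missing))
    else:
        print("All Neo4j variables are set.")
```

Run it with pipenv run python so that the .env file has been loaded before the check happens.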

Perhaps you’ve noticed the URI starts with bolt instead of the usual http(s). Neo4j engineers created this binary protocol to overcome HTTP(S) limitations in speed, payload size and overall performance.

Figure: Bolt vs HTTP

As you can see, Bolt requests and responses are significantly smaller than their HTTP equivalents. Also, because a Bolt connection is stateful, information such as authentication does not have to be sent again with every request.
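
Coming back to the connection URI itself, the standard library can show how a value such as bolt://127.0.0.1:7687 breaks down into scheme, host and port. This is only an illustration of the URI format, and describe_uri is a name made up for the example; py2neo does its own parsing when you hand it the URI.

```python
from urllib.parse import urlparse


def describe_uri(uri: str) -> dict:
    """Split a connection URI into its scheme, host and port."""
    parts = urlparse(uri)
    return {"scheme": parts.scheme, "host": parts.hostname, "port": parts.port}


print(describe_uri("bolt://127.0.0.1:7687"))
```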

Let’s play!

Now that you have everything set up, it’s time to populate your graph with tons of nodes and relationships! I personally suggest reading through the py2neo documentation to discover the different available objects (Nodes, Relationships, but also Graphs, Subgraphs, Record, Cursor, …) and how they interact.
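
As a first experiment, here is a minimal sketch that writes two nodes and one relationship using a parameterized Cypher query. It assumes graph is a py2neo Graph, whose run method accepts a Cypher string plus parameters as keyword arguments. The Person label, the KNOWS relationship type and the add_friendship helper are invented for this example.

```python
from typing import Any

# MERGE creates the nodes and the relationship only if they do not exist yet.
FRIENDSHIP_QUERY = (
    "MERGE (a:Person {name: $a_name}) "
    "MERGE (b:Person {name: $b_name}) "
    "MERGE (a)-[:KNOWS]->(b)"
)


def add_friendship(graph: Any, a_name: str, b_name: str) -> None:
    """Ensure both people exist and that a KNOWS relationship links them."""
    # Passing values as parameters avoids building Cypher by string concatenation.
    graph.run(FRIENDSHIP_QUERY, a_name=a_name, b_name=b_name)
```

With the GraphDriver class from earlier, a call would look like add_friendship(GraphDriver.from_env().graph, "Alice", "Bob"). The py2neo Node and Relationship objects offer another way to do the same thing, and the documentation covers them in detail.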

In addition, you can read the py2neo API overview by Nigel Small, who was (and probably still is) the leader of the team that developed py2neo.
