Memory mapped I/O configuration

Memory-mapped I/O can be used for reads and writes to every file in the Neo4j store. Performance is best when a file can be mapped into memory completely; if there is not enough memory for that, Neo4j tries to make the best use of the memory it has been given.
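To make the mechanism concrete, here is a minimal sketch of reading and writing a file through a memory mapping. It uses Python's standard mmap module purely as an illustration; it is not Neo4j's java.nio code, and the temporary file, the offset, and the payload are hypothetical stand-ins for a store file and a record.

```python
import mmap
import os
import tempfile


def write_and_read_mapped(path, offset, payload):
    """Write payload at offset through a memory mapping, then read it back."""
    with open(path, "r+b") as f:
        # Length 0 maps the whole file; the file must not be empty.
        with mmap.mmap(f.fileno(), 0) as mapped:
            mapped[offset:offset + len(payload)] = payload
            mapped.flush()  # ask the OS to write the changed pages to the file
            return bytes(mapped[offset:offset + len(payload)])


# Hypothetical stand-in for a store file: 4 KiB of zero bytes.
fd, demo_path = tempfile.mkstemp()
os.write(fd, b"\x00" * 4096)
os.close(fd)

result = write_and_read_mapped(demo_path, 9, b"node")

# Read the same bytes with ordinary file I/O to confirm they reached the file.
with open(demo_path, "rb") as f:
    on_disk = f.read()[9:13]
os.remove(demo_path)
```

The mapped region lives outside the interpreter's ordinary object memory, which mirrors the point made in the note that follows: mapped memory is separate from the heap and has to be budgeted separately.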

Note

Neo4j makes extensive use of the java.nio package. This native I/O package allows memory to be allocated outside the Java heap, and that memory has to be accounted for separately. How much of it is available also depends on the other processes using memory on the system. The memory Neo4j uses is the sum of the JVM heap and the memory-mapping allocations; whatever remains is left for system processes.

For this reason, it is not a good idea to assign all of the available system memory to the heap. The Neo4j data store (in the Neo4j database directory) keeps its data in separate files, outlined as follows:

  • nodestore: Stores node information
  • relationshipstore: Stores relationship information
  • propertystore: Stores all simple properties of nodes and relationships, that is, properties of primitive types
  • propertystore strings: Stores properties of string type
  • propertystore arrays: Stores properties of array type

You can configure memory mapping for each of these files separately using the corresponding mapped_memory setting, for example:

neostore.nodestore.db.mapped_memory=75M
neostore.relationshipstore.db.mapped_memory=100M
neostore.propertystore.db.mapped_memory=180M
neostore.propertystore.db.strings.mapped_memory=210M
neostore.propertystore.db.arrays.mapped_memory=210M

Tip

If traversal speed is the highest priority, memory map as much of the node and relationship stores as possible.

Traversal speed optimization example

Let us look at an example that Neo4j uses to illustrate mapped memory allocation. To tune the memory-mapping settings, we first need to look up the sizes of the store files in the Neo4j database directory. Suppose the file sizes are found to be as follows:

neostore.nodestore.db: 14MB
neostore.propertystore.db: 510MB
neostore.propertystore.db.strings: 1.2GB
neostore.relationshipstore.db: 304MB
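Looking up these sizes can be scripted. The following is a minimal sketch, assuming the store files sit directly in the database directory and that their names start with neostore. as in the listing above; the function name store_file_sizes is hypothetical and not part of Neo4j.

```python
import os


def store_file_sizes(db_dir):
    """Return {file name: size in bytes} for the store files in db_dir."""
    sizes = {}
    for name in sorted(os.listdir(db_dir)):
        path = os.path.join(db_dir, name)
        # Assumption: store files are regular files named neostore.*
        if name.startswith("neostore.") and os.path.isfile(path):
            sizes[name] = os.path.getsize(path)
    return sizes
```

Calling store_file_sizes with the path of the database directory returns a dictionary that can be compared against the mapped_memory settings. Any other file in the directory whose name starts with neostore. is listed as well.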

Let us say the system has 4 GB of memory in total, with 50 percent reserved for system programs. The Java heap is allocated 1.5 GB, leaving about 0.5 GB for memory mapping. To obtain optimum traversal speed, you can configure memory mapping as follows:

neostore.nodestore.db.mapped_memory=15M
neostore.relationshipstore.db.mapped_memory=285M
neostore.propertystore.db.mapped_memory=100M
neostore.propertystore.db.strings.mapped_memory=100M
neostore.propertystore.db.arrays.mapped_memory=0M

Since the data store has no file for array-based properties, we can safely allocate no memory for mapping them.
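The arithmetic behind this example can be checked with a short script. This is only a sketch: it assumes that the K, M, and G suffixes denote binary units (1024-based), which the text does not state, and the helper names parse_size and total_mapped are hypothetical.

```python
# Assumption: size suffixes are binary units.
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(value):
    """Convert a size such as '285M' into a number of bytes."""
    value = value.strip()
    suffix = value[-1].upper()
    if suffix in UNITS:
        return int(float(value[:-1]) * UNITS[suffix])
    return int(value)


def total_mapped(config_text):
    """Sum every *.mapped_memory setting found in the config text."""
    total = 0
    for line in config_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip().endswith(".mapped_memory"):
            total += parse_size(value)
    return total


CONFIG = """\
neostore.nodestore.db.mapped_memory=15M
neostore.relationshipstore.db.mapped_memory=285M
neostore.propertystore.db.mapped_memory=100M
neostore.propertystore.db.strings.mapped_memory=100M
neostore.propertystore.db.arrays.mapped_memory=0M
"""

total_ram = 4 * 1024 ** 3          # 4 GB system
reserved = total_ram // 2          # 50 percent for system programs
heap = int(1.5 * 1024 ** 3)        # 1.5 GB Java heap
budget = total_ram - reserved - heap
requested = total_mapped(CONFIG)
fits = requested <= budget
```

Under these assumptions the settings add up to 500M, which fits within the roughly 0.5 GB left over for memory mapping.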

Batch insert example

Memory mapping can also be used to optimize batch insertion speed. Let us take a look at an example that Neo4j uses to demonstrate this. Suppose we have a graph with 10M nodes connected by 100M relationships. Every node and relationship has distinct properties of primitive and string types. For simplicity, let us say there are no array-based properties. We need to give more memory to the node and relationship stores. The allocations can be made as follows:

neostore.nodestore.db.mapped_memory=90M
neostore.relationshipstore.db.mapped_memory=3G
neostore.propertystore.db.mapped_memory=50M
neostore.propertystore.db.strings.mapped_memory=100M
neostore.propertystore.db.arrays.mapped_memory=0M

The configuration is intended to hold the entire graph in memory. A naive way to calculate the memory needed for mapping the nodes is the formula number_of_nodes * 9 bytes; for relationships, it is number_of_relationships * 33 bytes. You will know why if you have read about storage basics in the previous chapter. It is important to note that the above configuration requires a Java heap of more than 3.3G because, in batch inserter mode, normal Java buffers, which are allocated on the JVM heap, are used in place of memory-mapped ones.
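The naive formulas above can be written out as a short calculation. This is a sketch using the 9-byte and 33-byte record sizes quoted in the text; the function name estimate_store_bytes is hypothetical.

```python
NODE_RECORD_BYTES = 9            # per-node size quoted in the text
RELATIONSHIP_RECORD_BYTES = 33   # per-relationship size quoted in the text


def estimate_store_bytes(number_of_nodes, number_of_relationships):
    """Naive estimate of the bytes needed to map each store completely."""
    return {
        "nodestore": number_of_nodes * NODE_RECORD_BYTES,
        "relationshipstore": number_of_relationships * RELATIONSHIP_RECORD_BYTES,
    }


# The batch insert example: 10M nodes and 100M relationships.
estimate = estimate_store_bytes(10_000_000, 100_000_000)
```

For the example graph this gives 90,000,000 bytes for the node store, in line with the 90M setting, and 3,300,000,000 bytes for the relationship store, which is somewhat more than the 3G setting shown above.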
