Memory-mapped I/O can be used for reads and writes to every file in the Neo4j store. Performance is best when each file can be mapped into memory in its entirety; when there is not enough memory for that, Neo4j tries to make the best use of the memory it has been given.
Neo4j makes extensive use of the java.nio package. This package allows memory to be allocated outside the Java heap, so that memory has to be budgeted separately from the heap. How much of it is available also depends on the memory used by other processes on the system. The memory Neo4j needs is therefore the sum of the JVM heap and the memory-mapping allocation, and whatever remains is left for system processes.
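To make memory-mapped file access concrete, here is a minimal sketch. It uses Python's mmap module as a stand-in for java.nio, so it illustrates the general technique rather than Neo4j's own code. The 9-byte record size, the file name demo.store, and all helper names are assumptions made for the illustration.

```python
import mmap
import os
import tempfile

RECORD_SIZE = 9  # assumed fixed record size, for illustration only


def create_store(path, record_count):
    """Create a zero-filled file holding record_count fixed-size records."""
    with open(path, "wb") as f:
        f.write(bytes(RECORD_SIZE * record_count))


def write_record(path, index, payload):
    """Overwrite one record in place through a writable memory mapping."""
    if len(payload) != RECORD_SIZE:
        raise ValueError("payload must be exactly RECORD_SIZE bytes")
    with open(path, "r+b") as f:
        # A length of 0 maps the whole file.
        with mmap.mmap(f.fileno(), 0) as mapped:
            start = index * RECORD_SIZE
            mapped[start:start + RECORD_SIZE] = payload
            mapped.flush()  # push the change back to the file


def read_record(path, index):
    """Read one record through a read-only memory mapping."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            start = index * RECORD_SIZE
            return bytes(mapped[start:start + RECORD_SIZE])


def demo():
    """Write record 1 of a three-record file, then read records 1 and 0 back."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "demo.store")  # hypothetical file name
        create_store(path, 3)
        write_record(path, 1, b"node-0001")
        return read_record(path, 1), read_record(path, 0)
```

Calling demo() returns the record that was written and an untouched, zero-filled neighbour. The point of the sketch is that records are read and changed by indexing into the mapping rather than by issuing explicit read and write calls on the file.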
For this reason, it is not a good idea to give all of the available system memory to the heap. The Neo4j data store (in the Neo4j database directory) keeps its data in separate files for nodes, relationships, properties, string property values, and array property values.
You can configure the memory mapping for each of these files separately using the mapped_memory option, with parameters such as the following:
neostore.nodestore.db.mapped_memory=75M
neostore.relationshipstore.db.mapped_memory=100M
neostore.propertystore.db.mapped_memory=180M
neostore.propertystore.db.strings.mapped_memory=210M
neostore.propertystore.db.arrays.mapped_memory=210M
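Because the mapped memory comes on top of the JVM heap, it helps to know what the settings add up to. The following sketch parses settings written in the form shown above and totals them. The function name, the restriction to the M and G suffixes, and the use of binary multipliers (1M = 1024 * 1024 bytes) are assumptions made for this illustration, not a description of Neo4j's own parser.

```python
UNIT_BYTES = {"M": 1024 ** 2, "G": 1024 ** 3}  # assumed binary multipliers


def parse_mapped_memory(text):
    """Return {setting name: bytes} for every *.mapped_memory entry in text."""
    settings = {}
    for token in text.split():
        name, _, value = token.partition("=")
        if not name.endswith(".mapped_memory") or not value:
            continue  # ignore anything that is not a mapped_memory setting
        number, unit = value[:-1], value[-1].upper()
        if unit not in UNIT_BYTES:
            raise ValueError(f"unsupported size suffix in {token!r}")
        settings[name] = int(number) * UNIT_BYTES[unit]
    return settings


EXAMPLE = """
neostore.nodestore.db.mapped_memory=75M
neostore.relationshipstore.db.mapped_memory=100M
neostore.propertystore.db.mapped_memory=180M
neostore.propertystore.db.strings.mapped_memory=210M
neostore.propertystore.db.arrays.mapped_memory=210M
"""

settings = parse_mapped_memory(EXAMPLE)
total_mb = sum(settings.values()) // 1024 ** 2  # 75 + 100 + 180 + 210 + 210
```

For the values above, total_mb is 775, which is the amount that must be available outside the heap.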
Let us see an example that Neo4j uses to illustrate mapped memory allocation. In order to tune the settings for memory mapping, we need to first look up the size of the files in the data store in the Neo4j database directory. Let us take a case where the size of the files is found to be as follows:
neostore.nodestore.db: 14MB
neostore.propertystore.db: 510MB
neostore.propertystore.db.strings: 1.2GB
neostore.relationshipstore.db: 304MB
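Looking up the file sizes can be scripted. The sketch below lists the size of every file in a directory whose name starts with neostore. (the prefix shared by the files listed above). The function name is hypothetical, and the directory you pass in is whatever your Neo4j database directory happens to be.

```python
import os


def store_file_sizes(db_dir, prefix="neostore."):
    """Return {file name: size in bytes} for store files directly inside db_dir."""
    sizes = {}
    for name in sorted(os.listdir(db_dir)):
        path = os.path.join(db_dir, name)
        # Skip subdirectories and files that do not carry the store prefix.
        if name.startswith(prefix) and os.path.isfile(path):
            sizes[name] = os.path.getsize(path)
    return sizes
```

Sizes are returned in bytes, so divide by 1024 * 1024 to compare them with mapped_memory values given in megabytes.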
Let us say the system has 4 GB of memory in total, with 50 percent (2 GB) reserved for system programs. The Java heap is allocated 1.5 GB, leaving about 0.5 GB for memory mapping. To obtain the best traversal speed, you can configure the memory mapping as follows:
neostore.nodestore.db.mapped_memory=15M
neostore.relationshipstore.db.mapped_memory=285M
neostore.propertystore.db.mapped_memory=100M
neostore.propertystore.db.strings.mapped_memory=100M
neostore.propertystore.db.arrays.mapped_memory=0M
Since our data store has no file for array-based properties, no memory needs to be mapped for them.
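The budget in this example can be checked with a few lines of arithmetic. The sketch below treats 1 GB as 1024 MB, which is an assumption, and the function and variable names are made up for the illustration.

```python
def mapping_budget_mb(total_mb, system_fraction, heap_mb):
    """Memory left for mapping after the system reserve and the JVM heap."""
    reserved_mb = total_mb * system_fraction
    return total_mb - reserved_mb - heap_mb


# 4 GB in total, 50 percent reserved for system programs, 1.5 GB of heap.
budget_mb = mapping_budget_mb(total_mb=4096, system_fraction=0.5, heap_mb=1536)

# Mapped-memory values from the traversal example, in MB.
traversal_config_mb = {
    "nodestore": 15,
    "relationshipstore": 285,
    "propertystore": 100,
    "strings": 100,
    "arrays": 0,
}
requested_mb = sum(traversal_config_mb.values())
fits = requested_mb <= budget_mb
```

Here budget_mb comes to 512 and requested_mb to 500, so the configuration stays within the roughly 0.5 GB that is left for memory mapping.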
Memory mapping can also be used to optimize batch insertion speed. Let us take a look at an example that Neo4j uses to demonstrate this. Suppose we have a graph with 10M nodes that are connected by 100M relationships. Every object has its own properties of primitive and string types. For simplicity, let's say there are no array-based properties. In this case, we need to give more memory to the node and relationship stores. The allocations can be made as follows:
neostore.nodestore.db.mapped_memory=90M
neostore.relationshipstore.db.mapped_memory=3G
neostore.propertystore.db.mapped_memory=50M
neostore.propertystore.db.strings.mapped_memory=100M
neostore.propertystore.db.arrays.mapped_memory=0M
The configuration is intended to hold the entire graph in memory. A naive way to calculate the memory needed for mapping the nodes is number_of_nodes * 9 bytes; for relationships, it is number_of_relationships * 33 bytes. The record sizes behind these figures are explained in the discussion of storage basics in the previous chapter. It is important to note that the above configuration requires a Java heap of more than 3.3G because, in batch inserter mode, normal Java buffers allocated on the JVM heap are used in place of memory-mapped ones.
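The naive formula is easy to apply to the graph in this example. The sketch below uses the 9-byte and 33-byte record sizes quoted above; the function and constant names are hypothetical.

```python
NODE_RECORD_BYTES = 9           # per-node figure quoted in the text
RELATIONSHIP_RECORD_BYTES = 33  # per-relationship figure quoted in the text


def naive_mapping_bytes(node_count, relationship_count):
    """Return (node store bytes, relationship store bytes) under the naive formula."""
    return (
        node_count * NODE_RECORD_BYTES,
        relationship_count * RELATIONSHIP_RECORD_BYTES,
    )


# 10M nodes connected by 100M relationships, as in the batch insertion example.
node_bytes, relationship_bytes = naive_mapping_bytes(10_000_000, 100_000_000)
```

This gives 90,000,000 bytes (about 90M) for the node store and 3,300,000,000 bytes (about 3.3G) for the relationship store, which is where the figures of 90M and more than 3.3G come from.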