Part of the problem with any powerful technology is how it is perceived. It might be ahead of its time, or it may just need years of development and use before the market catches up to its potential. That is true of graph databases like Neo4j, which now has a new graphical interface that helps people map relationships between different people, places or things.
There is one simple way to think about graph databases, said Emil Eifrem, CEO of Neo Technology and one of the original developers of the graph database technology. And that is to explore how graph databases treat relationships as first-class citizens. The knowledge is all about relationships. When more connections are made, the view fills out. Graph databases find the relationships, the missing pieces that help form connections.
Graph databases are still known by relatively few people, but they are gaining acceptance as the use cases increase. The reasons are clear when you consider how much data is now being created. With that scale comes a growing demand for new types of analytics capabilities.
Graph databases are becoming more popular for the variety of data they can aggregate and analyze. They treat everything as a node, whether that is a street light or a person. Properties describe the nodes. A graph database also has “edges” that connect the nodes and define the relationships between them. The value comes from analyzing the patterns among the nodes, their properties and the edges that link them.
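To make the node, property and edge vocabulary concrete, here is a minimal sketch in Python. It is not how Neo4j stores data; the class names, the `REPORTED_OUTAGE` relationship and the sample values are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str
    target: str
    rel_type: str
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Toy in-memory graph: nodes keyed by id, outgoing edges kept per node."""

    def __init__(self):
        self.nodes = {}
        self.outgoing = {}

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = Node(node_id, properties)
        self.outgoing.setdefault(node_id, [])

    def add_edge(self, source, target, rel_type, **properties):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before connecting them")
        self.outgoing[source].append(Edge(source, target, rel_type, properties))

    def neighbors(self, node_id, rel_type=None):
        # Follow outgoing edges, optionally filtered by relationship type.
        return [
            edge.target
            for edge in self.outgoing[node_id]
            if rel_type is None or edge.rel_type == rel_type
        ]


graph = PropertyGraph()
graph.add_node("alice", kind="person", name="Alice")
graph.add_node("lamp_17", kind="street_light", street="Main St")
graph.add_edge("alice", "lamp_17", "REPORTED_OUTAGE", year=2013)
print(graph.neighbors("alice"))
```

The point of the sketch is that the relationship is stored as data in its own right, with a type and properties, rather than being reconstructed at query time.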
Neo Technology has put a heavy emphasis on the user interface to make it more accessible and easier to query and build graphs from large data sets. It took the developers 10 years to get to Neo4j 1.0, said Eifrem. They built it to be fully transactional and ACID compliant, meaning transactions are processed reliably. But it lacked a decent user interface. So that became the focus as they developed the 2.0 release.
In this new version, the goal is to make it possible for developers to create their own recommendation engines using visual guides and simple queries. Cypher, the query language, has been updated and streamlined, making it more accessible to someone like a business analyst. It also now supports labels that refer to subsets of nodes in a graph, introducing a form of schema into the technology. This means that the data can be indexed better, allowing the developer to tell the database more about the data.
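One way to picture why labels help with indexing is the rough Python sketch below. It assumes a label is simply a named subset of nodes and that an index maps a label plus a property value to matching nodes. The `LabeledStore` class and its methods are hypothetical and do not reflect Neo4j's internals or Cypher syntax.

```python
from collections import defaultdict


class LabeledStore:
    """Hypothetical store that groups nodes by label and indexes their properties."""

    def __init__(self):
        self.properties = {}                   # node id -> property dict
        self.by_label = defaultdict(set)       # label -> node ids
        self.by_label_prop = defaultdict(set)  # (label, key, value) -> node ids

    def add_node(self, node_id, labels, **properties):
        # Property values must be hashable to be usable as index keys.
        self.properties[node_id] = properties
        for label in labels:
            self.by_label[label].add(node_id)
            for key, value in properties.items():
                self.by_label_prop[(label, key, value)].add(node_id)

    def find(self, label, **criteria):
        # With no criteria, return every node carrying the label.
        if not criteria:
            return set(self.by_label[label])
        matches = [self.by_label_prop[(label, k, v)] for k, v in criteria.items()]
        return set.intersection(*matches)


store = LabeledStore()
store.add_node(1, ["Person"], name="Emil")
store.add_node(2, ["Person"], name="Matt")
store.add_node(3, ["Company"], name="Neo Technology")
print(store.find("Person", name="Emil"))
```

Because the lookup starts from the label, a search for a person named Emil never has to look at company nodes at all.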
Snap-Interactive uses Neo4j to build social graphs that find the patterns in the data to recommend potential matches on dating sites. Graph databases also apply to network and data center management. A cloud infrastructure, for example, is a connected system. The shape of the data can help track down the root cause of a data center outage.
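A recommendation of this kind can be sketched as a two-hop walk over a social graph. The Python below is an illustration only, not Snap-Interactive's method: it assumes connections are symmetric, ranks candidates by how many connections they share with the person, and uses invented names.

```python
from collections import Counter


def recommend(connections, person, limit=3):
    """Suggest people two hops away, ranked by number of shared connections."""
    direct = connections.get(person, set())
    counts = Counter()
    for friend in direct:
        for candidate in connections.get(friend, set()):
            # Skip the person themselves and anyone already connected.
            if candidate != person and candidate not in direct:
                counts[candidate] += 1
    # Sort by shared connections (descending), then by name for stable output.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:limit]]


connections = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "dan", "eve"},
    "cat": {"ann", "dan"},
    "dan": {"bob", "cat"},
    "eve": {"bob"},
}
print(recommend(connections, "ann"))  # ['dan', 'eve']
```

Here dan ranks first because both of ann's connections know him, while eve is reachable through only one.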
Neo4j is the most popular graph database, said Matt Aslett, a research director for data management and analytics at 451 Research. But these are early days, and there are competitors like Objectivity as well as established providers such as Teradata, which has layered a graph capability onto its Aster database.
Orchestrate takes a different approach, Aslett said. It is doing it in reverse. The company has a single API that lets the user sync multiple databases with its service, which has a graph capability. It does not layer on top of the existing database. Instead, it syncs with it. With Orchestrate, users get a graph capability that is connected to their pre-existing database.
Time To Visualize
Data has been locked up in rows and columns on a spreadsheet for years but that has to change as we increasingly need to visualize what the data means. There is just too much of it for our minds to comprehend unless it is shown in some way through pictures and graphics.
The visualization in Neo4j reveals relationships between different data sets that are hard to see by looking at the code alone. This visualization capability is supported by Cypher, the query language developed by Neo Technology. Each query represents an interaction that is shown in the visualization.
Traditional SQL engines are not meant to collect data and seek out relationships. They are instead designed more for transaction-oriented systems. Graph databases handle such loads with far more ease as they model, store and query the data. Everything is connected. A relational database will often deteriorate in performance as the data set grows, typically because of queries that rely on joins across many tables. That can make it quite slow compared with a graph database, where densely connected data is easy to query.
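The difference can be illustrated with a deliberately simplified comparison in Python. The join version assumes an unindexed table of (source, target) rows that is scanned once per hop; real relational engines use indexes and query planners, so this is a sketch of the idea rather than a benchmark. The function names and sample data are invented.

```python
def two_hops_join(rows, start):
    """Self-join an unindexed edge table; returns (result, rows examined)."""
    examined = 0
    first_hop = set()
    for source, target in rows:
        examined += 1
        if source == start:
            first_hop.add(target)
    second_hop = set()
    for source, target in rows:
        examined += 1
        if source in first_hop:
            second_hop.add(target)
    return second_hop, examined


def two_hops_traversal(adjacency, start):
    """Follow stored neighbor lists; returns (result, edges examined)."""
    examined = 0
    second_hop = set()
    for friend in adjacency.get(start, ()):
        examined += 1
        for candidate in adjacency.get(friend, ()):
            examined += 1
            second_hop.add(candidate)
    return second_hop, examined


rows = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "f")]
adjacency = {}
for source, target in rows:
    adjacency.setdefault(source, []).append(target)

print(two_hops_join(rows, "a"))
print(two_hops_traversal(adjacency, "a"))
```

Both functions find the same node two hops from "a", but the scan-based version touches every row twice while the traversal touches only the edges it actually follows. In the scan version the work grows with the size of the table; in the traversal it grows with the size of the neighborhood being explored.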
As sensors become more widely used in wearables such as Google Glass, the demand for graph databases will increase. It will be important to correlate the data from any number of sensors that might be in a house, a car or city street. There will also be the need to analyze increasing amounts of text from medical records, contracts, etc.
It will be a long time before relational databases ebb into oblivion. But their role is no longer universal. Graph databases are here to stay and for now Neo4j is setting the standard for the rest of the market.
“There is a tipping point but that will take some time,” Aslett said.
This article was first posted on TechCrunch.