The market for graph databases, which structure data according to a network of relationships, is tiny. Strong growth might see it hit $2.4bn in 2023.
By comparison, Oracle’s annual revenue is $39.5bn, and open source relational databases have many more instances than its licensed products. But plucky graph technology can have a big impact when used to solve the right problem.
Business risk insight company Dun & Bradstreet has been working with graph database slinger Neo4j since 2017 and, having completed the latest upgrade over Christmas, says graph DBs have proved their worth in solving queries. Product manager Paul Westcott told The Register: “We were able to solve customer’s problem in seconds rather than days.”
Dun & Bradstreet uses graph databases to help answer a seemingly simple question which becomes fiendishly complex upon closer examination: Who owns which company? “A desktop researcher might start looking for a limited company and find it is owned by Company B, Inc. in America, and then they find that the company in America is owned by a company in Luxembourg, so they buy some information from Luxembourg, and then find it was based in the Netherlands,” Westcott said.
In this approach, the user requests data for each part of the puzzle, without seeing the bigger picture.
But the approach is slow, he said. “You would go on to several different registries across the world, pay for that information, bring it all together, try and connect all of the dots and put it into an Excel spreadsheet and then do some kind of brute-force calculation,” he said.
It might take between five and 10 working days, he said, which is expensive and inefficient. But the issue also becomes pressing as money laundering legislation (AMLD5) from the EU requires banks to understand which company owns which as they accept new corporate clients.
Dun & Bradstreet was already helping these banks to untangle the web, but with a graphDB, it thought it could do better. “What our customers were asking for was some way of building relationships between the information we had on businesses and people, and a normal SQL database was not driving that level of network analysis. It was pretty obvious to us that we needed some form of graph to deliver that level of relationship,” Westcott said.
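To illustrate why the ownership question suits a graph, here is a minimal Python sketch that walks a chain of shareholdings up to the ultimate owners and multiplies the stakes along the way. It is not Dun & Bradstreet's implementation and does not use Neo4j; the company names, the percentages and the function names `build_owner_index` and `ultimate_owners` are all hypothetical, invented for the example.

```python
from collections import defaultdict

# Hypothetical ownership edges: (owner, owned company, fraction held).
# The names and stakes are made up for illustration only.
OWNERSHIP = [
    ("Company B, Inc.", "UK Ltd", 1.0),
    ("Lux Holdco", "Company B, Inc.", 0.8),
    ("NL Parent", "Lux Holdco", 0.5),
    ("Private Investor", "Lux Holdco", 0.5),
]


def build_owner_index(edges):
    """Map each company to the list of (owner, fraction) pairs that hold it."""
    owners = defaultdict(list)
    for owner, owned, fraction in edges:
        owners[owned].append((owner, fraction))
    return owners


def ultimate_owners(company, owners, _path=()):
    """Return {ultimate_owner: effective_fraction} for `company`.

    An ultimate owner is any entity with no recorded owner of its own.
    Stakes are multiplied along each chain and summed across chains.
    """
    if company in _path:
        # Circular holding: stop rather than recurse forever.
        return {}
    totals = defaultdict(float)
    for owner, fraction in owners.get(company, []):
        if not owners.get(owner):
            # No recorded owner above this entity, so the chain ends here.
            totals[owner] += fraction
        else:
            upstream = ultimate_owners(owner, owners, _path + (company,))
            for top, top_fraction in upstream.items():
                totals[top] += fraction * top_fraction
    return dict(totals)


if __name__ == "__main__":
    index = build_owner_index(OWNERSHIP)
    for name, stake in ultimate_owners("UK Ltd", index).items():
        print(f"{name}: {stake:.0%}")
```

In a relational store, each hop up the chain is typically another self-join on the ownership table, whereas a graph model treats the walk as a single traversal. The sketch above only mimics that idea in memory.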
Developers were keen to learn a new technology, but content and data governance teams took more convincing. With rules in place to ensure the integrity of data and comply with data protection regulations, content teams were unsure about pivoting to a new data model. But once they saw that graph data could help them by revealing gaps in knowledge that relational databases could not, they were convinced to adjust their governance models to the new technology, Westcott said.
Agile? OK… but wagile – really!
Dun & Bradstreet is currently running Neo4j 3.4 and planning to move to 3.5. Deploying the highly distributed architecture on AWS and its clients’ private clouds meant the system was able to scale up or down as queries came in. It was able to handle individual requests from a UI and also scale to hundreds of thousands of requests sent in JSON format.
But, crucially, all queries address the same data through a series of “sync causal clusters” developed based on the Raft protocol, which supports both large clusters and different cluster topologies in the cloud and data centres. “Both internal reporting and analytics were using a cluster and our customers were using a cluster and we had customers that were working on a private cloud so that they could maintain their security… and from that could be continually synced back to the same data,” he said.
The Dun & Bradstreet tech team used an agile scrum approach to developing the solution. It originally had three services in mind, but began by building a single microservice and gathering customer validation from that. It was a good job it did, because feedback led to the development of two more services different from those originally planned. In this way, a truly agile approach proved its worth.
“For the Dun & Bradstreet, it really solidified that agile [strategy] and working with small scrum teams was a much better way of working than waterfall or worse still [hybrid approach] ‘wagile’. I don’t know if that is really a word but it drives me insane,” Westcott said.
A small fraction of database implementations may be in graph, but Dun & Bradstreet is a $1.7bn company specialising in data. Its success with the approach shows the technology is ready for big business. ®