NASA is increasingly turning towards graph databases to store and map its research documents, making it easier for agency employees to find mission-critical information and helping the agency prove its worth to the general public.
David Meza, chief knowledge architect at NASA, spoke to Computerworld UK from Houston, Texas last week to explain the shift.
Unlike a traditional relational database, a graph database is a flavour of NoSQL database built upon graph theory, a branch of mathematics and computer science which models data points, known as objects or nodes, and the connections between them as a 'graph'.
Getting started with graph databases
Meza's role, as he puts it, is "to develop and implement a tech roadmap to transfer data into knowledge." A few years ago graph databases started to pique his interest as a way of mapping the complex web of documents across NASA and making connections more easily discoverable for researchers and external stakeholders.
Back in 2015 the agency turned to Neo4j's commercial graph database to re-architect its vital 'lessons learned' database of documents relating to previous missions.
Meza wrote a technical breakdown of the initial project in 2015 for the Neo4j website, stating: "Recently I had a project engineer ask me if we could search our lessons learned using a list of 22 key terms the team was interested in. Our current keyword search engine would require him to enter each term individually, select the link and save the document for review [...] This would not do."
Now researchers within NASA can search the lessons learned themselves using the Neo4j database and a web-based visualisation front end from Linkurious - the same technology used to map the Panama Papers leaks for the International Consortium of Investigative Journalists (ICIJ) last year.
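To illustrate the kind of multi-term search the engineer asked for, here is a minimal Python sketch. It is not NASA's implementation and does not use the Neo4j API: it assumes a toy in-memory graph in which each lesson is linked to the terms it mentions, and the lesson IDs, terms and the `find_lessons` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical lessons and the key terms each one mentions (illustrative data only).
MENTIONS = {
    "lesson-001": {"valve", "corrosion"},
    "lesson-002": {"battery", "thermal"},
    "lesson-003": {"valve", "thermal", "software"},
}


def build_term_index(mentions):
    """Invert document -> term edges into term -> document edges."""
    index = defaultdict(set)
    for doc_id, terms in mentions.items():
        for term in terms:
            index[term].add(doc_id)
    return index


def find_lessons(index, key_terms):
    """Return each matching lesson with the requested terms it mentions."""
    hits = defaultdict(set)
    for term in key_terms:
        for doc_id in index.get(term, ()):
            hits[doc_id].add(term)
    # Rank lessons by how many of the requested terms they cover, then by ID.
    return sorted(hits.items(), key=lambda item: (-len(item[1]), item[0]))


index = build_term_index(MENTIONS)
# One call covers the whole term list instead of one search per term.
results = find_lessons(index, ["valve", "thermal", "parachute"])
for doc_id, matched in results:
    print(doc_id, sorted(matched))
```

The point of the sketch is that the whole list of terms is resolved in a single pass over the term-to-document links, rather than by running and saving one keyword search per term.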
The best-documented case of this new approach paying dividends involved an engineering issue with the uprighting mechanism for the Orion capsule. "They knew Apollo capsules had a similar uprighting system and [were] trying to find information to repair and fix issues. With the current system they couldn't find the information. Using our system they were able to find what they wanted in hours instead of months," Meza explained.
Now Meza is looking to use the graph database technology to make all of NASA's research documents more easily searchable. NASA has 10 centres across the US, each with its own document store.
A graph database maps these documents in a way that makes it simpler to search for what they contain beyond simple keywords, improving knowledge discovery, trending analysis, and risk-based decision making. It also helps the agency find documents that reference each other and keep citations up to date.
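As a rough illustration of how following reference links can help keep citations up to date, the Python sketch below walks a small citation graph to find every document that cites a changed document, directly or indirectly. The report names and the `affected_by_update` helper are hypothetical, and a plain dictionary stands in for the graph database.

```python
from collections import deque

# Hypothetical citation edges: document -> documents it references.
CITES = {
    "report-A": ["report-B", "report-C"],
    "report-B": ["report-C"],
    "report-C": [],
    "report-D": ["report-A"],
}


def cited_by(cites):
    """Reverse the edges so each document maps to the documents that cite it."""
    reverse = {doc: [] for doc in cites}
    for source, targets in cites.items():
        for target in targets:
            reverse.setdefault(target, []).append(source)
    return reverse


def affected_by_update(cites, changed_doc):
    """Breadth-first walk of everything citing changed_doc, directly or indirectly."""
    reverse = cited_by(cites)
    seen = set()
    queue = deque([changed_doc])
    while queue:
        current = queue.popleft()
        for citing in reverse.get(current, []):
            if citing not in seen:
                seen.add(citing)
                queue.append(citing)
    return seen


affected = affected_by_update(CITES, "report-C")
print(sorted(affected))
```

In a relational store the same question typically needs repeated self-joins; in a graph it is a traversal along existing relationships.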
Meza would then like users to be able to query the database in natural language, and is even looking at Amazon's recently released Lex service to enable voice search.
Meza is currently focusing on getting all of the research documents related to the International Space Station (ISS) into a graph database. The ISS acts as an orbiting research lab for the agency, and crew members often conduct experiments in biology, human biology, physics, astronomy, meteorology and other fields.
"We want to see how this impacts across academia and across the economy and industry. So patents developed, how the research impacts education and how to show to the taxpayer the value they get from the ISS," he said.
NASA is a federally funded agency whose share of the federal budget has fallen sharply since its heyday in the space race during the 1960s. It has a dedicated technology transfer programme but hasn't been great at sharing its success stories up to this point.
As administrator Charles Bolden writes in his foreword for the latest Spinoff report: "NASA technologies can be found in your mobile devices, in self-driving tractors that work the fields, and in the latest 3D printers used by makers and hackers.
"They are making brain surgery safer and spotting rainforest fires before they spread. Spinoffs are even more diverse than the broad array of NASA missions they come from."
A graph model allows Meza to easily incorporate external data, such as Thomson Reuters financial reports or US patent office records, to map the connections between ISS research and the outside world without having to constantly change the schema or re-interrogate the data. "With Neo4j if you don't like the model you can easily change it," Meza said.
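The schema flexibility described here can be sketched with a tiny in-memory property graph in Python. This is an assumption-laden illustration, not Neo4j's API or NASA's data model: the node IDs, labels, relationship types and helper functions are all invented for the example.

```python
# A tiny in-memory property graph: nodes carry free-form properties,
# and relationships are (source, type, target) triples.
nodes = {}
edges = []


def add_node(node_id, label, **properties):
    """Store a node with a label and any properties it happens to have."""
    nodes[node_id] = {"label": label, **properties}


def add_edge(source, rel_type, target):
    """Record a typed, directed relationship between two nodes."""
    edges.append((source, rel_type, target))


def neighbours(node_id, rel_type):
    """Follow outgoing relationships of one type from node_id."""
    return [t for s, r, t in edges if s == node_id and r == rel_type]


# Existing model: an ISS experiment and a paper describing it (hypothetical data).
add_node("exp-1", "Experiment", title="Example experiment")
add_node("paper-1", "Paper", year=2015)
add_edge("exp-1", "DESCRIBED_IN", "paper-1")

# Later: bring in an external patent record. No table is altered and no
# existing node is rewritten; a new label and relationship type are enough.
add_node("patent-1", "Patent", office="USPTO")
add_edge("patent-1", "CITES", "paper-1")

# Which patents trace back to exp-1 through its papers?
papers = neighbours("exp-1", "DESCRIBED_IN")
patents = [s for s, r, t in edges if r == "CITES" and t in papers]
print(patents)
```

Adding the patent data required only new nodes and relationships, which is the property Meza is pointing to when he says the model is easy to change.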
Meza expects the entire project to take between three and four years to complete, starting with information relating to patents. He wants to be able to publish some insight into how many patents NASA contributes to in the USA within the next six months.