SPARQL and Amazon Web Services' Neptune database

Storage Architecture 2017-12-31

Amazon recently announced Neptune as an AWS service. As its home page describes it,

Amazon Neptune is a fast, scalable graph database service. Neptune efficiently stores and navigates highly connected data. Its query processing engine is optimized for leading graph query languages, Apache TinkerPop™ Gremlin and the W3C's RDF SPARQL. Neptune provides high performance through the open and standard APIs of these graph frameworks. And, Neptune is fully managed, so you no longer need to worry about database management tasks such as hardware provisioning, software patching, setup, configuration, or backups.

Apart from the practical aspects of the scalable yet convenient use of RDF and SPARQL that Neptune will enable, it's exciting to see such a high-profile acknowledgment of SPARQL as a serious development tool. Many organizations already knew this, but judging from the reaction to the Neptune announcement on Twitter, many more people are finally understanding this.


Rumors have been flying that the Blazegraph triplestore may play some role in Amazon's new graph store. As Stardog CEO Kendall Clark wrote on ycombinator recently, "Amazon acquired the domains, etc. Many former Blazegraph engineers are now Amazon Neptune engineers according to LinkedIn, etc. It was rumored widely in the graph db world fwiw." Yahoo Knowledge Graph science and data lead Nicolas Torzec responded to Kendall's comment with a link showing that Amazon now owns the Blazegraph trademark. (Blazegraph's website hasn't shown much activity in a while, with the latest post on their Press page being from May of last year.)

May of last year was also when I wrote Trying out Blazegraph about my positive experiences with this graph store, and after the recent announcement I tweeted that if Blazegraph was part of Neptune, it would be very cool if that included Blazegraph's inferencing. Pavel Klinov replied by pointing out a Neptune announcement video where they explicitly say that inferencing is not supported.

This hour-long "AWS re:Invent 2017: NEW LAUNCH! Deep dive on Amazon Neptune" video included some other interesting points. Because Neptune supports property graphs via TinkerPop as well as SPARQL, early in the video the speaker provides some background on property graphs versus RDF. He devotes a good portion of his presentation to talking through an SQL query for people who are unfamiliar with graph databases and then covering comparable SPARQL and TinkerPop Gremlin queries.

The plug from Thomson Reuters early in the video was nice to see, coming from a large, well-known organization that has been taking SPARQL seriously for a while. Later in the video, one slide's use of Thomson Reuters' PermID vocabulary with the GeoNames vocabulary in the same triple was especially nice to see. While the extent of RDF's usage continues to be a pleasant surprise for me, I'm also surprised by how many people use it only for the simplicity of the triples data model: they're missing the data integration power of the ability to mix and match the wide variety of existing vocabularies (and hence data sources) with their own data.
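To make the mix-and-match point concrete, here is a minimal sketch in plain Python rather than a real triplestore. It is not taken from the slide in the video: the namespace URIs, the `isIncorporatedIn` and `name` property names, and the organization identifier are all assumptions chosen for illustration, so check the published PermID and GeoNames vocabularies before relying on any of them. The idea is only that a single triple can pair a predicate from one vocabulary with a resource from another, and that a query can then follow that link into the second data source.

```python
# Namespaces and property names below are assumptions for illustration only.
PERMID_ORG = "http://permid.org/ontology/organization/"  # assumed PermID vocabulary namespace
GEONAMES = "http://www.geonames.org/ontology#"           # assumed GeoNames vocabulary namespace

# Hypothetical identifiers: a made-up organization and a GeoNames-style place URI.
ACME = "https://permid.org/1-0000000000"
PLACE = "http://sws.geonames.org/6252001/"

TRIPLES = [
    # One triple mixing vocabularies: a PermID-style predicate
    # whose object is a GeoNames-style resource.
    (ACME, PERMID_ORG + "isIncorporatedIn", PLACE),
    # A triple of the kind that would come from the GeoNames data itself.
    (PLACE, GEONAMES + "name", "United States"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def incorporation_place_names(triples, org):
    """Follow the PermID-style link into GeoNames-style data to get place names."""
    names = []
    for _, _, place in match(triples, s=org, p=PERMID_ORG + "isIncorporatedIn"):
        for _, _, name in match(triples, s=place, p=GEONAMES + "name"):
            names.append(name)
    return names


print(incorporation_place_names(TRIPLES, ACME))
```

In SPARQL the same lookup would be two triple patterns sharing a variable for the place. The point is that no schema mapping step is needed: the shared URI is the join key between the two data sources.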

The video's second speaker talks more about Neptune's enterprise features such as fast failover, encryption at rest and in transit, and backup and restore, which are all great things to see in a cloud-based triplestore. Neptune offers a lot of room; as this speaker mentions, "Storage volumes are not required to be statically allocated; they actually grow automatically up to a maximum size of 64 terabytes." The ability to restore a dataset to its state from a previous point in time also sounds very useful.

Once the speakers started taking questions, it looked to me like there were more questions about RDF and SPARQL than there were about TinkerPop and Gremlin. The former included the question about inferencing, which got a response (as Pavel had pointed out to me) of "we do not have in-database inference currently... we are very interested in use cases for inferencing." They also said that Neptune's underlying graph engine was custom-built by Amazon as a graph system, which left me more curious about the potential role of Blazegraph in the released version of Neptune. (Maybe "by Amazon" includes former Blazegraph engineers.)

Some more interesting facts from the question and answer session:

I'm looking forward to playing with SPARQL on AWS Neptune and will certainly be reporting back about my experiences here.

Please add any comments to this Google+ post

Content source: Planet RDF (original link). Thank you for your support!

