Hello folks. I’m very new to graph databases and I’d like to share some recent insights that I hope you can validate:
Neptune is closer to a transactional (OLTP) database than an OLAP one
Because of the transactional nature, graph analytics operations like PageRank, Label Propagation, and forms of clustering tend to not be done in Neptune + Gremlin
For analytical operations, it is more common to export Neptune data to Spark (EMR/Glue) and perform said operations there
Curious if any of these assumptions are incorrect before I tattoo them on my left arm.
I would say you are correct but would also check with Ora L.
Neptune was indeed designed as an OLTP database, and performs best in transactional workloads where queries touch a limited “graph neighborhood”. That said, there are already many examples of companies using Neptune (say, with Gremlin) for some analytics tasks, although large-scale graph algorithms (such as PageRank) are not what the current version of Neptune is optimized for. A recent addition to Neptune’s capabilities is integration with Amazon SageMaker and its ML capabilities (including the Deep Graph Library), which may be applicable here.
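To make the OLTP vs. analytics distinction concrete, here is a minimal sketch of PageRank run outside the database on an exported edge list. It is plain Python rather than Spark, and the `pagerank` function, its parameters, and the tiny `edges` list are all hypothetical stand-ins for illustration, not anything Neptune or its export tooling provides. The point is that every iteration touches every node and edge, which is the opposite of a query over a limited graph neighborhood.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is spread evenly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        # Each node passes an equal share of its rank along every out-edge.
        for src in nodes:
            targets = out_links[src]
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
        rank = new_rank
    return rank


# Hypothetical edge list standing in for data exported from the graph.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "a")]
ranks = pagerank(edges)
```

On a real export the same whole-graph iteration would be handed to a distributed engine such as Spark, as described in the question above.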