The Knowledge Graph Conference

Victor Mariano Leite

Victor Mariano Leite commented on "Optimizing SPARQL Queries in AWS Neptune to Avoid..." · Posted in Ask

Andrea R. thanks a lot for the tip! I was using a numeric ID, but you gave me an idea: since the fileIds are generated sequentially, keeping only the ones where fileId % 10 == 0 is approximately a "random" sample in my case, because there is (I hope, haha) no bias in being the 10th item. It helped a lot 🙂
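For anyone trying the same idea, here is a minimal sketch of the "every 10th fileId" filter. Standard SPARQL 1.1 has no modulo operator, so the sketch emulates it with FLOOR. The predicate `:fileId` (holding the numeric id) and the helper names are hypothetical, and the `:` prefix is assumed to be declared as in the original query. Whether this avoids the memory error on Neptune has not been verified here.

```python
import math


def modulo_sample_query(step: int = 10, limit: int = 100) -> str:
    """Build a SPARQL query keeping only sources whose numeric fileId is a multiple of `step`."""
    if step < 1 or limit < 1:
        raise ValueError("step and limit must be positive integers")
    return f"""
SELECT DISTINCT ?sourceId WHERE {{
    ?sourceId :hasX ?object .
    # Hypothetical predicate holding the sequential numeric id
    ?sourceId :fileId ?fileId .
    # SPARQL 1.1 has no modulo operator, so emulate fileId % {step} == 0
    FILTER(?fileId - {step} * FLOOR(?fileId / {step}) = 0)
}}
LIMIT {limit}
""".strip()


def is_multiple(file_id: int, step: int) -> bool:
    """Same arithmetic as the FILTER above, in Python, to sanity-check the emulation."""
    return file_id - step * math.floor(file_id / step) == 0


if __name__ == "__main__":
    print(modulo_sample_query(step=10, limit=100))
```

Because there is no ORDER BY, the engine does not have to sort all the matches before applying the LIMIT. The sample is only as unbiased as the assumption that position in the id sequence is unrelated to the data.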

Victor Mariano Leite commented on "Optimizing SPARQL Queries in AWS Neptune to Avoid..." · Posted in Ask

Andrea R. Hmm, I'm thinking about how I could do that, because I wanted to do a "kind of" stratified sampling. There are 3 fields I want to get, for example: fileId, pageId, pagemetadataId. I want to sample X fileIds and get all of their pages and page metadata. I don't know if there is a way to filter on fileId if I want to sample it beforehand. I was going to sample it first in a subquery, then use the returned fileIds as a constraint for the outer query. Does that make sense? Hahaha
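The subquery idea could look roughly like the sketch below. SPARQL 1.1 evaluates the inner SELECT first and joins its ?fileId bindings with the outer pattern, so only the sampled files are expanded to pages and metadata. The predicates `:hasPage` and `:hasMetadata`, the variable names, and the helper itself are hypothetical stand-ins for the real schema. The inner LIMIT alone returns an arbitrary subset rather than a random one, so the optional `inner_filter` is where a sampling condition (for example a FILTER on a numeric id) would go.

```python
def stratified_sample_query(sample_size: int = 100, inner_filter: str = "") -> str:
    """Build a SPARQL query that picks files in a subquery, then fetches all their pages."""
    if sample_size < 1:
        raise ValueError("sample_size must be a positive integer")
    return f"""
SELECT ?fileId ?pageId ?pageMetadataId WHERE {{
    {{
        # Inner query: choose the files first, before touching pages
        SELECT DISTINCT ?fileId WHERE {{
            ?fileId :hasPage ?anyPage .
            {inner_filter}
        }}
        LIMIT {sample_size}
    }}
    # Outer pattern: every page and page metadata of the chosen files
    ?fileId :hasPage ?pageId .
    ?pageId :hasMetadata ?pageMetadataId .
}}
""".strip()


if __name__ == "__main__":
    print(stratified_sample_query(sample_size=100))
```

The outer pattern has no LIMIT on purpose, so each sampled file comes back with all of its pages. Pages without metadata would be dropped by this pattern unless the last triple is wrapped in OPTIONAL.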

Posted in Ask · Victor Mariano Leite

Optimizing SPARQL Queries in AWS Neptune to Avoid Memory Issues

Hi guys! Good evening. Does anyone know how I can write a more efficient SPARQL query to sample some IDs than this:

SELECT DISTINCT ?sourceId WHERE {
     ?sourceId :hasX ?object.
}
ORDER BY RAND()
LIMIT 100

It returns out-of-memory errors inconsistently, and it seems that the smaller the limit, the more likely it is to run out of memory (???). I'm running it on AWS Neptune.

