The Knowledge Graph Conference
Ask

Publishing Knowledge Graph Datasets in HDT Format: Insights?

Andrew P.
· Jul 30, 2024 02:04 PM

Hi, has anyone published KG datasets using the HDT format (https://www.rdfhdt.org/)? Thoughts?

👀 1
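For context, HDT (Header-Dictionary-Triples) packs an RDF graph into a shared term dictionary plus compact integer-encoded triples. Below is a toy Python sketch of that dictionary-encoding idea only; it is hypothetical illustration code, not the real HDT binary format or its API:

```python
# Toy sketch of HDT's dictionary + triples idea (hypothetical, not the
# actual binary format from rdfhdt.org): each RDF term is stored once in
# a sorted dictionary, and triples become small sorted integer tuples.

def encode(triples):
    """Map RDF terms to integer IDs; return (term list, ID triples)."""
    terms = sorted({t for triple in triples for t in triple})
    term_to_id = {term: i for i, term in enumerate(terms)}
    id_triples = sorted(tuple(term_to_id[t] for t in triple) for triple in triples)
    return terms, id_triples

def search(terms, id_triples, s=None, p=None, o=None):
    """Triple-pattern match over the ID encoding; None is a wildcard."""
    term_to_id = {term: i for i, term in enumerate(terms)}
    # Unknown terms map to -1, which matches nothing.
    pattern = tuple(None if t is None else term_to_id.get(t, -1) for t in (s, p, o))
    for triple in id_triples:
        if all(q is None or q == v for q, v in zip(pattern, triple)):
            yield tuple(terms[v] for v in triple)

triples = [
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:name", '"Alice"'),
    ("ex:bob", "foaf:name", '"Bob"'),
]
terms, ids = encode(triples)
print(list(search(terms, ids, p="foaf:name")))
```

The sorted ID tuples are what make the format so compact and cheap to scan from plain file storage, which is where the comparison to Parquet-style columnar files in the comments below comes from.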

5 comments

· Sorted by Oldest
  • Ryan B.
    ·

    Looks really interesting. It reminds me a lot of modern ARCO (analysis-ready, cloud-optimized) approaches like kerchunk. There is quite a bit of interest in tests and studies of federated query over file serializations (e.g. Parquet, Zarr) versus managed database solutions, with contextual headers storing mapping files (R2RML/RML, kerchunk). I could see a lot of performance potential here versus, say, virtualized queries via Athena over JSON-LD files, while still maintaining the benefits of file-store data over a managed service. Thanks for the share.

  • Andrew P.
    ·

    Yep, reminds me of Parquet.

  • Andrew P.
    ·

    Good start towards separating storage and compute for graphs.

    🙌 1
  • Ryan B.
    ·

    It's definitely a piece of the puzzle.

  • Denny V.
    ·

    One issue we had was that HDT wasn't great at supporting dynamic KGs that change often, because any update needed a complete recompression, IIRC. For reasonably static datasets this sounds like a good, efficient approach.

    👍 1
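Denny's recompression point can be seen directly in any sorted-dictionary encoding: inserting a new term shifts the IDs of every term that sorts after it, which invalidates the compact triple IDs and forces a full re-encode. A minimal illustration (hypothetical toy code, not the actual HDT implementation):

```python
# Why HDT-style formats dislike frequent updates: the term dictionary is
# sorted, so a single new term can renumber existing terms, and every
# ID-encoded triple referencing them must then be rebuilt.

def build_dictionary(triples):
    """Sorted list of all distinct terms; a term's ID is its position."""
    return sorted({t for triple in triples for t in triple})

before = build_dictionary([("ex:b", "ex:p", "ex:d")])
# Adding one triple introduces "ex:a" and "ex:c", which sort into the
# middle of the dictionary and shift the IDs of existing terms.
after = build_dictionary([("ex:b", "ex:p", "ex:d"), ("ex:a", "ex:p", "ex:c")])

print(before.index("ex:b"), after.index("ex:b"))  # the ID of "ex:b" shifts
```

This is why HDT fits write-once, read-many publishing well, while frequently updated graphs are usually better served by a store with incremental indexes.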