The Knowledge Graph Conference

Dimitris K.

Commented on Exploring Blank Nodes and Local Scopes in RDF Grap... · Posted in Ask

Sven V., these algorithms compute a consistent hash for a dataset, but in order to do that they also compute deterministic IDs for blank nodes (depending on the algorithm you choose). One option would be to use these IDs as a suffix in the skolem namespace you create, e.g. http://example.com/.well-known/genid/{bnode_id}
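A toy sketch of that idea in Python (this is not one of the actual canonicalization algorithms, and all names and IRIs are illustrative): derive a deterministic ID for each blank node by hashing the triples it appears in, then mint a skolem IRI from it. Note that this naive scheme gives two structurally identical blank nodes the same ID, which the real algorithms are designed to avoid.

```python
import hashlib

SKOLEM_NS = "http://example.com/.well-known/genid/"  # assumed namespace

def deterministic_bnode_ids(triples):
    """Map each blank node (terms written '_:...') to a stable hash-based ID."""
    ids = {}
    bnodes = {t for triple in triples for t in triple if t.startswith("_:")}
    for b in bnodes:
        # Hash the sorted triples mentioning this bnode, with every blank
        # node replaced by a placeholder so labels do not affect the hash.
        lines = sorted(
            " ".join("_:" if t.startswith("_:") else t for t in triple)
            for triple in triples
            if b in triple
        )
        ids[b] = hashlib.sha256("\n".join(lines).encode()).hexdigest()[:16]
    return ids

def skolemize(triples):
    """Replace every blank node with a skolem IRI built from its stable ID."""
    ids = deterministic_bnode_ids(triples)
    sub = lambda t: SKOLEM_NS + ids[t] if t.startswith("_:") else t
    return [tuple(sub(t) for t in triple) for triple in triples]

triples = [
    ("ex:alice", "ex:knows", "_:b0"),
    ("_:b0", "ex:name", '"Bob"'),
]
```

Because the IDs depend only on the triples, running this twice over the same data yields the same skolem IRIs.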

Commented on Exploring Blank Nodes and Local Scopes in RDF Grap... · Posted in Ask

There are also algorithms for assigning consistent IDs (or skolem IRIs) to these blank nodes; see the references in https://w3c.github.io/rch-wg-charter/explainer.html

Commented on Exploring Blank Nodes and Local Scopes in RDF Grap... · Posted in Ask

RDF 1.1 defines the recommended way of generating IRIs for blank nodes (skolemization): https://www.w3.org/TR/rdf11-concepts/#section-skolemization
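A minimal sketch of that pattern (the authority IRI here is an assumption): mint one fresh, globally unique IRI under the /.well-known/genid/ path per distinct blank node, and substitute it everywhere that blank node appears.

```python
import uuid

def skolem_iri(authority="https://example.com"):
    # Fresh, globally unique IRI under the well-known genid path.
    return f"{authority}/.well-known/genid/{uuid.uuid4().hex}"

def skolemize_graph(triples):
    mapping = {}  # one fresh IRI per distinct blank node
    def sub(term):
        if term.startswith("_:"):
            mapping.setdefault(term, skolem_iri())
        return mapping.get(term, term)
    return [tuple(sub(t) for t in triple) for triple in triples]
```

Unlike the hash-based variant, these IRIs are fresh on every run; use a deterministic ID scheme if you need stability across runs.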

Commented on Enforcing Unique Triples in RDF for sdo:Corporatio... · Posted in Ask

You can also look here for more details: https://book.validatingrdf.com/
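The book covers SHACL and ShEx; purely as a hand-rolled illustration of the same kind of check, this flags subjects that carry more than one value for a property (akin to a `sh:maxCount 1` constraint). The data and names are made up.

```python
from collections import defaultdict

def violations_max_count_1(triples, prop):
    """Return subjects that have more than one distinct value for prop."""
    values = defaultdict(set)
    for s, p, o in triples:
        if p == prop:
            values[s].add(o)
    return {s: vals for s, vals in values.items() if len(vals) > 1}

data = [
    ("ex:acme", "sdo:legalName", '"ACME Inc."'),
    ("ex:acme", "sdo:legalName", '"ACME Incorporated"'),
    ("ex:beta", "sdo:legalName", '"Beta Ltd."'),
]
# ex:acme has two distinct legalName values, so it gets flagged
bad = violations_max_count_1(data, "sdo:legalName")
```

In practice you would express this declaratively as a SHACL shape and let a validator produce the report.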

Commented on Integrating Tabular Data with Neptune Graph: Seeki... · Posted in Ask

There is some work on mapping RDF to JPA (Java persistence). I am not very familiar with the details, but maybe there is a way to use that approach and persist the data to different storage systems, i.e. use the Java object as the intermediate form to map between storage systems: https://hobbitdata.informatik.uni-leipzig.de/quweda/quweda2017/QuWeDa_2017_paper_8.pdf
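I have not used that JPA mapping myself, but the "object as intermediate form" idea can be sketched in Python: load the triples about a subject into a plain object, then persist that object to a second store (here a dict standing in for another storage system). All names are illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class Corporation:
    iri: str
    properties: dict = field(default_factory=dict)

def from_triples(iri, triples):
    """Collect all triples about `iri` into one intermediate object."""
    obj = Corporation(iri)
    for s, p, o in triples:
        if s == iri:
            obj.properties.setdefault(p, []).append(o)
    return obj

def to_kv_store(obj, store):
    # Stand-in for persisting the same object to a second storage system.
    store[obj.iri] = obj.properties
    return store
```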

Commented on Recommendations for Lifting Tools to Load XML Data... · Posted in Ask

You can check https://rml.io/
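RML expresses such liftings declaratively; purely to illustrate the idea, here is a hand-rolled sketch that walks an XML document and emits (subject, predicate, object) triples using only the standard library. The vocabulary and IRIs are made up.

```python
import xml.etree.ElementTree as ET

XML = """
<companies>
  <company id="acme"><name>ACME Inc.</name><city>Athens</city></company>
</companies>
"""

def lift(xml_text):
    """Turn each <company> element into triples about an ex: subject."""
    triples = []
    root = ET.fromstring(xml_text)
    for company in root.findall("company"):
        subject = f"ex:{company.get('id')}"
        triples.append((subject, "rdf:type", "sdo:Corporation"))
        for child in company:
            triples.append((subject, f"ex:{child.tag}", child.text))
    return triples
```

An RML mapping would capture the same iterator ("each company element") and term rules in a declarative document instead of code.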

Commented on Best Practices for Handling Entity Resolution in D... · Posted in Ask

The general steps of a KG pipeline are data extraction, data enrichment, linking, and fusion (with data quality checks at various points in the pipeline). How these steps are implemented (technically) depends heavily on how you acquire and maintain your sources and on whether you need manual curation; e.g. creating a KG from a few static input sources can take a different approach than one whose input sources change with high frequency. But, overall, what you describe sounds correct.
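The steps above can be sketched as a pipeline skeleton; every stage here is a placeholder to be replaced with a real implementation, and the record shapes and matching key are made up.

```python
def extract(sources):
    # Pull raw records from each source.
    return [rec for src in sources for rec in src]

def enrich(records):
    # Add derived or looked-up fields.
    return [{**r, "enriched": True} for r in records]

def link(records):
    # Group records that refer to the same entity (toy key: lowercased name).
    groups = {}
    for r in records:
        groups.setdefault(r["name"].lower(), []).append(r)
    return groups

def fuse(groups):
    # Merge each group into a single entity (naive last-writer-wins).
    return [{k: v for r in recs for k, v in r.items()} for recs in groups.values()]

def quality_check(entities):
    # Quality checks belong at several points; one example gate shown here.
    assert all("name" in e for e in entities), "entity missing a name"
    return entities

sources = [[{"name": "ACME"}], [{"name": "acme", "city": "Athens"}]]
entities = quality_check(fuse(link(enrich(extract(sources)))))
```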

Commented on Best Practices for Handling Entity Resolution in D... · Posted in Ask

For prototyping, I would say that picking any ontology close to your model should work fine. To take it to the next level, you should decide whether to stick with an external ontology or build your own; the former makes things easy at first but can get complicated if you need to extend your model in the future.

Commented on Best Practices for Handling Entity Resolution in D... · Posted in Ask

A good provenance scheme is also something you may want to pay attention to; it can help you identify the source of data quality issues that surface in your data.
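One simple provenance scheme, sketched: store each statement as a quad that carries its source, so a suspicious value can be traced back to where it came from. The predicate and source names are illustrative.

```python
def add(quads, s, p, o, source):
    # Each statement carries the source it was asserted by.
    quads.append((s, p, o, source))

def sources_of(quads, s, p):
    """Which sources asserted which values for (s, p)? Useful on conflicts."""
    return {(o, src) for subj, pred, o, src in quads if (subj, pred) == (s, p)}

quads = []
add(quads, "ex:acme", "sdo:numberOfEmployees", "100", "crm_export")
add(quads, "ex:acme", "sdo:numberOfEmployees", "250", "web_scrape")
```

In an RDF store the same idea is usually realized with named graphs, one graph per source.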

Commented on Best Practices for Handling Entity Resolution in D... · Posted in Ask

With regard to linking, this again depends on your use case and on how you want to store or query your data. You may decide to do a "hard" deduplication, where you take all the duplicates and replace them with a single entity by fusing/merging all fields, or you may decide to keep the data in its original form and do the fusion/merging on demand or in real time.
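The two options can be sketched as follows (a "hard" merge up front versus keeping the originals and fusing at query time); the last-writer-wins fusion rule is only a placeholder for a real conflict-resolution strategy.

```python
def hard_merge(records):
    # Replace all duplicates with one fused record, stored in merged form.
    fused = {}
    for r in records:
        fused.update(r)  # last writer wins; real fusion would be smarter
    return fused

def fuse_on_demand(records, field):
    # Originals stay untouched; pick a value only when asked.
    values = [r[field] for r in records if field in r]
    return values[-1] if values else None

dups = [{"id": "ex:acme", "name": "ACME"}, {"id": "ex:acme", "phone": "123"}]
merged = hard_merge(dups)              # one record with all fields
phone = fuse_on_demand(dups, "phone")  # originals kept, fused at query time
```

Hard merging simplifies querying but loses the original records; on-demand fusion keeps them at the cost of more work per query.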

