2020 LDP Conference on Linked Data

A Machine Learning Approach for Classifying Sinopia's RDF

https://ld4p.github.io/classify-rdf-2020/

Next Steps

Improving the RDF Model

Although the current accuracy of the Production model is over 75%, there are a number of ways we can adjust the model's parameters to improve it.

Because the training and validation data come from catalogers using Sinopia to generate RDF, the models should get better at predicting resource templates as more data accumulates over time.

Moving to Production

Once the model's accuracy is consistently over 90%, we will create a simple web service (likely using AWS Lambda) that accepts an external RDF payload and returns a best guess of which existing resource template most closely matches the RDF entity.
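
As a rough illustration of what such a service could look like, here is a minimal sketch of a Lambda-style handler. It is not the project's implementation. It assumes the payload arrives as N-Triples in the request body, and it uses a simple predicate-overlap score as a stand-in for the trained model. The template IDs, the predicate sets, and the helper names `extract_predicates` and `predict_template` are all hypothetical.

```python
import json
import re

# Hypothetical template IDs and predicate sets, for illustration only.
# A real service would load the trained model instead of this table.
TEMPLATE_PREDICATES = {
    "example:template:Work": {
        "http://id.loc.gov/ontologies/bibframe/title",
        "http://id.loc.gov/ontologies/bibframe/subject",
    },
    "example:template:Instance": {
        "http://id.loc.gov/ontologies/bibframe/title",
        "http://id.loc.gov/ontologies/bibframe/provisionActivity",
    },
}

# Captures the predicate IRI of a simple N-Triples line: <s> <p> <o> .
TRIPLE_RE = re.compile(r"^\s*(?:<[^>]*>|_:\S+)\s+<([^>]*)>\s+")


def extract_predicates(ntriples):
    """Collect the distinct predicate IRIs found in an N-Triples string."""
    predicates = set()
    for line in ntriples.splitlines():
        match = TRIPLE_RE.match(line)
        if match:
            predicates.add(match.group(1))
    return predicates


def predict_template(predicates):
    """Stand-in for the model: pick the template with the highest Jaccard overlap."""
    best_id, best_score = None, 0.0
    for template_id, expected in TEMPLATE_PREDICATES.items():
        union = predicates | expected
        score = len(predicates & expected) / len(union) if union else 0.0
        if score > best_score:
            best_id, best_score = template_id, score
    return best_id, best_score


def lambda_handler(event, context):
    """Accept an RDF payload in event['body'] and return a best-guess template."""
    predicates = extract_predicates(event.get("body") or "")
    if not predicates:
        return {
            "statusCode": 400,
            "body": json.dumps({"error": "no triples found in payload"}),
        }
    template_id, score = predict_template(predicates)
    return {
        "statusCode": 200,
        "body": json.dumps({"resourceTemplate": template_id, "score": round(score, 3)}),
    }
```

Returning a score alongside the template ID would let a caller decide whether the guess is strong enough to act on. The response shape here is an assumption, not a defined API.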

The source code repository for the RDF Classify project is at https://github.com/LD4P/rdf-classify


Thank you!

Jeremy Nelson
jpnelson@stanford.edu