Logo Logo
Help
Contact
Switch language to German
Semantic-guided predictive modeling and relational learning within industrial knowledge graphs
Semantic-guided predictive modeling and relational learning within industrial knowledge graphs
The ubiquitous availability of data in today’s manufacturing environments, mainly driven by the extended usage of software and built-in sensing capabilities in automation systems, enables companies to embrace more advanced predictive modeling and analysis in order to optimize processes and usage of equipment. While the potential insight gained from such analysis is high, it often remains untapped, since integration and analysis of data silos from different production domains requires high manual effort and is therefore not economic. Addressing these challenges, digital representations of production equipment, so-called digital twins, have emerged leading the way to semantic interoperability across systems in different domains. From a data modeling point of view, digital twins can be seen as industrial knowledge graphs, which are used as semantic backbone of manufacturing software systems and data analytics. Due to the prevalent historically grown and scattered manufacturing software system landscape that is comprising of numerous proprietary information models, data sources are highly heterogeneous. Therefore, there is an increasing need for semi-automatic support in data modeling, enabling end-user engineers to model their domain and maintain a unified semantic knowledge graph across the company. Once the data modeling and integration is done, further challenges arise, since there has been little research on how knowledge graphs can contribute to the simplification and abstraction of statistical analysis and predictive modeling, especially in manufacturing. In this thesis, new approaches for modeling and maintaining industrial knowledge graphs with focus on the application of statistical models are presented. First, concerning data modeling, we discuss requirements from several existing standard information models and analytic use cases in the manufacturing and automation system domains and derive a fragment of the OWL 2 language that is expressive enough to cover the required semantics for a broad range of use cases. The prototypical implementation enables domain end-users, i.e. engineers, to extend the basis ontology model with intuitive semantics. Furthermore it supports efficient reasoning and constraint checking via translation to rule-based representations. Based on these models, we propose an architecture for the end-user facilitated application of statistical models using ontological concepts and ontology-based data access paradigms. In addition to that we present an approach for domain knowledge-driven preparation of predictive models in terms of feature selection and show how schema-level reasoning in the OWL 2 language can be employed for this task within knowledge graphs of industrial automation systems. A production cycle time prediction model in an example application scenario serves as a proof of concept and demonstrates that axiomatized domain knowledge about features can give competitive performance compared to purely data-driven ones. In the case of high-dimensional data with small sample size, we show that graph kernels of domain ontologies can provide additional information on the degree of variable dependence. Furthermore, a special application of feature selection in graph-structured data is presented and we develop a method that allows to incorporate domain constraints derived from meta-paths in knowledge graphs in a branch-and-bound pattern enumeration algorithm. Lastly, we discuss maintenance of facts in large-scale industrial knowledge graphs focused on latent variable models for the automated population and completion of missing facts. State-of-the art approaches can not deal with time-series data in form of events that naturally occur in industrial applications. Therefore we present an extension of learning knowledge graph embeddings in conjunction with data in form of event logs. Finally, we design several use case scenarios of missing information and evaluate our embedding approach on data coming from a real-world factory environment. We draw the conclusion that industrial knowledge graphs are a powerful tool that can be used by end-users in the manufacturing domain for data modeling and model validation. They are especially suitable in terms of the facilitated application of statistical models in conjunction with background domain knowledge by providing information about features upfront. Furthermore, relational learning approaches showed great potential to semi-automatically infer missing facts and provide recommendations to production operators on how to keep stored facts in synch with the real world.
Machine Learning, Knowledge Graph, Manufacturing
Ringsquandl, Martin
2019
English
Universitätsbibliothek der Ludwig-Maximilians-Universität München
Ringsquandl, Martin (2019): Semantic-guided predictive modeling and relational learning within industrial knowledge graphs. Dissertation, LMU München: Faculty of Mathematics, Computer Science and Statistics
[img]
Preview
PDF
Ringsquandl_Martin.pdf

11MB

Abstract

The ubiquitous availability of data in today’s manufacturing environments, mainly driven by the extended usage of software and built-in sensing capabilities in automation systems, enables companies to embrace more advanced predictive modeling and analysis in order to optimize processes and usage of equipment. While the potential insight gained from such analysis is high, it often remains untapped, since integration and analysis of data silos from different production domains requires high manual effort and is therefore not economic. Addressing these challenges, digital representations of production equipment, so-called digital twins, have emerged leading the way to semantic interoperability across systems in different domains. From a data modeling point of view, digital twins can be seen as industrial knowledge graphs, which are used as semantic backbone of manufacturing software systems and data analytics. Due to the prevalent historically grown and scattered manufacturing software system landscape that is comprising of numerous proprietary information models, data sources are highly heterogeneous. Therefore, there is an increasing need for semi-automatic support in data modeling, enabling end-user engineers to model their domain and maintain a unified semantic knowledge graph across the company. Once the data modeling and integration is done, further challenges arise, since there has been little research on how knowledge graphs can contribute to the simplification and abstraction of statistical analysis and predictive modeling, especially in manufacturing. In this thesis, new approaches for modeling and maintaining industrial knowledge graphs with focus on the application of statistical models are presented. First, concerning data modeling, we discuss requirements from several existing standard information models and analytic use cases in the manufacturing and automation system domains and derive a fragment of the OWL 2 language that is expressive enough to cover the required semantics for a broad range of use cases. The prototypical implementation enables domain end-users, i.e. engineers, to extend the basis ontology model with intuitive semantics. Furthermore it supports efficient reasoning and constraint checking via translation to rule-based representations. Based on these models, we propose an architecture for the end-user facilitated application of statistical models using ontological concepts and ontology-based data access paradigms. In addition to that we present an approach for domain knowledge-driven preparation of predictive models in terms of feature selection and show how schema-level reasoning in the OWL 2 language can be employed for this task within knowledge graphs of industrial automation systems. A production cycle time prediction model in an example application scenario serves as a proof of concept and demonstrates that axiomatized domain knowledge about features can give competitive performance compared to purely data-driven ones. In the case of high-dimensional data with small sample size, we show that graph kernels of domain ontologies can provide additional information on the degree of variable dependence. Furthermore, a special application of feature selection in graph-structured data is presented and we develop a method that allows to incorporate domain constraints derived from meta-paths in knowledge graphs in a branch-and-bound pattern enumeration algorithm. Lastly, we discuss maintenance of facts in large-scale industrial knowledge graphs focused on latent variable models for the automated population and completion of missing facts. State-of-the art approaches can not deal with time-series data in form of events that naturally occur in industrial applications. Therefore we present an extension of learning knowledge graph embeddings in conjunction with data in form of event logs. Finally, we design several use case scenarios of missing information and evaluate our embedding approach on data coming from a real-world factory environment. We draw the conclusion that industrial knowledge graphs are a powerful tool that can be used by end-users in the manufacturing domain for data modeling and model validation. They are especially suitable in terms of the facilitated application of statistical models in conjunction with background domain knowledge by providing information about features upfront. Furthermore, relational learning approaches showed great potential to semi-automatically infer missing facts and provide recommendations to production operators on how to keep stored facts in synch with the real world.