Machine learning is powerful, and the results reported in deep-learning papers are often impressive. But can we be confident enough in them to make business decisions with real monetary outcomes? For isolated tasks, such as detecting a face in an image, classifying a credit card transaction as fraudulent or not, or flagging a frequently rebooting device as defective, we can usually construct the relevant low-level features (e.g., pixels, filters, zip codes, vendors, reboots, URLs) and solve the problem with conventional machine learning techniques.

These conventional techniques, which analyze entities in isolation, have their limitations. In the examples above, we may want not only to detect a face in an image but to recognize that it is, say, the face of a tall man who is carrying a pistol and running away from a bank, or of a little boy jumping into a pool. And we may find it more useful to know whether a device is rebooting frequently because a particular firmware version misbehaves when connecting to a particular Internet service, or simply because a test engineer is repeatedly power-cycling it in a lab.

Conventional machine learning techniques make two assumptions that limit their application in relational domains. First, they don't handle inherent uncertainty well. Second, they don't analyze networks of interconnected entities to exploit compositional structure. These two missing pieces are fundamental to understanding and designing AI systems that address large-scale problems of high uncertainty. Many important problem areas are both richly structured and large scale, from social and biological networks, to knowledge graphs and the Internet, to images, video, and natural language.
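To make "analyzing entities in isolation" concrete, here is a minimal sketch of per-entity classification for the rebooting-device example. The feature names, numbers, and hand-built decision rule are all illustrative assumptions standing in for a trained model:

```python
# Conventional, entity-in-isolation classification: each device is
# scored purely from its own features; connections between devices
# (shared firmware, shared Internet service, lab vs. field) are ignored.

def classify_device(features):
    """Flag a device as defective from its own features alone."""
    # A hand-built rule standing in for a trained classifier:
    # many reboots over a short uptime suggests a defect.
    reboots_per_day = features["reboots"] / max(features["uptime_days"], 1)
    return "defective" if reboots_per_day > 2.0 else "healthy"

devices = {
    "dev-1": {"reboots": 30, "uptime_days": 3},   # 10 reboots/day
    "dev-2": {"reboots": 4,  "uptime_days": 14},  # ~0.3 reboots/day
}

labels = {name: classify_device(f) for name, f in devices.items()}
print(labels)
```

Note what this sketch cannot express: it has no way to distinguish a genuinely defective device from one that a test engineer keeps power-cycling in a lab, because that distinction lives in the device's context, not in its own features.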
In relational domains such as intelligence analysis, Internet-connected experiences, bioinformatics, citation analysis, epidemiology, fraud detection, and web analytics, there is often limited information about any one entity in isolation. Instead, it is the connections among entities that are critically important to pattern discovery. Statistical Relational Learning (SRL), an emerging and exciting area of data science, is transforming automated learning and discovery by moving beyond the conventional analysis of entities in isolation to the analysis of networks of interconnected entities. SRL is a combination of statistical learning, which addresses uncertainty in data, and relational learning, which deals with complex relational structures. Statistical relational learning builds on ideas from probability theory and statistics to address uncertainty while incorporating tools from logic, databases, and programming languages to represent structure.
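To illustrate the flavor of combining statistical and relational reasoning, here is a toy sketch, not a full SRL system such as a Markov logic network. Belief that a firmware version is faulty is pooled across all devices that run it, and that pooled belief feeds back into each device's defect score. All device names, probabilities, and the 50/50 blending rule are illustrative assumptions:

```python
# Toy relational inference sketch: evidence flows through the
# device -> firmware relation instead of treating each device alone.

# Per-device evidence: probability of a defect from that device's
# own telemetry, viewed in isolation (illustrative numbers).
local_defect_prob = {"dev-1": 0.9, "dev-2": 0.8, "dev-3": 0.1}

# The relational structure: which firmware version each device runs.
firmware_of = {"dev-1": "fw-2.1", "dev-2": "fw-2.1", "dev-3": "fw-1.9"}

def firmware_fault_belief(fw):
    """Pool evidence from every device running this firmware."""
    probs = [p for d, p in local_defect_prob.items() if firmware_of[d] == fw]
    return sum(probs) / len(probs)

def relational_defect_prob(device):
    """Blend a device's own evidence with its firmware's pooled belief."""
    own = local_defect_prob[device]
    shared = firmware_fault_belief(firmware_of[device])
    return 0.5 * own + 0.5 * shared  # illustrative 50/50 blend

for d in local_defect_prob:
    print(d, round(relational_defect_prob(d), 3))
```

Because dev-1 and dev-2 share firmware fw-2.1, each one's evidence strengthens the case that the firmware, not the individual device, is the problem, while dev-3's score is untouched by them. Chaining more relations (firmware to Internet service, device to lab) extends the same idea.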
Relational data instances are usually more varied and complex, and dependencies may arise either from direct relations or from chaining multiple relations together. Data abundance combined with powerful data science techniques has the potential to dramatically improve our lives by creating delightful experiences, enabling new high-quality products and services, and creating positive financial impact for businesses. Ultimately, we are interested not just in answering an isolated yes/no question, but in understanding the nuances by producing and manipulating structured representations of the data. The challenge is to develop models and algorithms that enable effective, robust reasoning over interconnected entities and relational data at large scale and high performance. At Omny IQ we are happy to share how we apply SRL AI to connected-experience problems of high uncertainty and large scale.

by Vivek Pathela, chief instigator at Omny IQ
7 Comments
Christian
8/13/2019 05:45:43 pm
Bravo, great post! This explains it very well in layman's terms! Thanks!
Sandra W.
8/14/2019 05:18:07 am
Very interested in this space and what you are doing at the frontier.
Mark
8/14/2019 06:43:32 am
Very insightful! Sharing with my team. Thanks
Jason M
8/14/2019 09:35:48 am
Very cool! 😎 Agree that the challenge in developing high-impact ML technologies is balancing the need to model rich, structured domains with the ability to scale to big data. SRL models are the ones to watch.
Ronen Stern
8/14/2019 11:01:09 am
Highly impressed by the concept of determining and adding data to improve accuracy and outcomes, versus continuously fine-tuning and optimizing an existing data set. The challenge is to incorporate dynamic changes while still operating on the existing data set and to determine the factors that impacted performance. Looking forward to further details on how the Omny IQ solution implements this logic for optimal performance. Very cool!!
Vivek
8/14/2019 01:57:52 pm
Yes indeed, Ronen! It's a question of how we've designed the architecture to address dynamic changes, scalability, and performance. We shall soon show you enablement for 55 million plus devices. Thanks!
Eriq
8/15/2019 12:54:30 pm
Wow Vivek, this is great!