The Smart Oil Fields group is dedicated to researching and developing software and models for improving the efficiency of the operation of oil fields. We have frequent communication with people working in the field to understand their needs and work on applications that have high practical value. This work is funded by Chevron as part of CiSoft.

Oil fields generate an immense amount of data every day. Oil companies have installed sensors on oil pumps and in oil pipes to monitor the status of their equipment at all times, and these sensors generate millions of data points per day on each oil field. However, these data are not currently used to their full potential. Many processes and important decisions still depend on on-field operators monitoring particular data streams for certain patterns.

Our work aims to automate these processes so that decisions can be made more accurately and promptly. Solving these problems is a great challenge because of the huge amount of data involved. Some of our previous work includes a system for automatically integrating heterogeneous data across the many insufficiently documented data sets generated from oil fields, and a model for detecting failure events in oil pumps.

Areas of Interest:

Deep Learning, Text Analytics, Time Series Analysis, Data Integration, Event Modeling

Deep learning

Deep learning has taken over the machine learning community in the past few years. It has surpassed other machine learning models in many applications, including image recognition, speech recognition, and bioinformatics.


We are developing a deep neural network framework for oilfield applications. Our vision is to automate decision processes that are currently carried out manually. Specifically, we are working on the problems of steam job prediction and slippage detection. We develop models that capture the correlations between highly interdependent multidimensional sensor data collected from operating oilfields in California and the target decisions we are predicting. We are building a deep convolutional autoencoder that takes multidimensional time series as input and outputs encoded features. We feed these features into a fully connected deep neural network that is trained for the task at hand.
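The data flow described above can be illustrated with a minimal sketch in plain Python. It is not the group's framework: it shows only the encoder half of a convolutional autoencoder (the decoder and all training are omitted), the weights are random and untrained, and every name and size here (conv1d, dense, 32 time steps, 3 sensor channels, the filter counts) is a hypothetical choice made for illustration.

```python
import math
import random

def conv1d(series, kernels, stride=2):
    """Valid 1-D convolution with ReLU over a multichannel time series.

    series:  list of time steps, each a list of C channel values
    kernels: list of F kernels, each a K x C grid of weights
    returns: list of output steps, each a list of F activations
    """
    width = len(kernels[0])
    out = []
    for start in range(0, len(series) - width + 1, stride):
        window = series[start:start + width]
        step = []
        for kernel in kernels:
            total = sum(w * x
                        for w_row, x_row in zip(kernel, window)
                        for w, x in zip(w_row, x_row))
            step.append(max(0.0, total))  # ReLU
        out.append(step)
    return out

def dense(vector, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def random_kernels(rng, n_filters, width, channels):
    return [[[rng.uniform(-0.5, 0.5) for _ in range(channels)]
             for _ in range(width)]
            for _ in range(n_filters)]

rng = random.Random(0)

# Hypothetical input: 32 time steps from 3 synthetic sensor channels.
series = [[math.sin(0.2 * t + c) for c in range(3)] for t in range(32)]

# Encoder: two stacked convolutions that shorten the series.
enc1 = random_kernels(rng, n_filters=4, width=5, channels=3)
enc2 = random_kernels(rng, n_filters=2, width=3, channels=4)
hidden = conv1d(series, enc1)    # 14 steps x 4 filters
encoded = conv1d(hidden, enc2)   # 6 steps x 2 filters

# Flatten the encoded features and pass them to a fully connected head
# that emits one score for a binary decision (untrained here).
features = [value for step in encoded for value in step]
head_weights = [[rng.uniform(-0.5, 0.5) for _ in features]]
probability = sigmoid(dense(features, head_weights, [0.0])[0])
```

In a real system the encoder would first be trained jointly with a decoder to reconstruct its input, and the fully connected head would then be trained on labeled decisions; the sketch only shows how the shapes pass from raw series to encoded features to a prediction.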


Smart Oil Field Safety Net (SOSNet)

The goal of the SOSNet project is to integrate multiple heterogeneous data streams and apply complex analytics to identify patterns that facilitate decision making for asset integrity management.


We are given multiple heterogeneous data sources. The objective is to find correlations and patterns and to perform predictive analytics, with the overall goal of making asset integrity decision making effective and robust. This is only possible if we provide a complete view of the environment.


To achieve this, the first step is to integrate multiple data streams. We use ontologies to model and annotate our raw data sources, which facilitates automatic integration of the data streams. This gives us the web of knowledge, or SOSNet facts. Information (new data, symbols, and classifications) extracted from raw data (e.g., drawings and images) using analytical and information extraction techniques also becomes part of this web of facts. This integrated repository drives applications such as predictive modeling, alarms and alerts, interactive visualizations, and smart search.
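As a rough illustration of how ontology annotations can drive automatic integration, the sketch below maps fields from two differently structured sources onto shared concepts and emits uniform facts. The ontology, source names, field names, and asset identifier are all hypothetical; the actual SOSNet ontologies and tooling are not described here.

```python
# Hypothetical ontology: each concept declares a canonical unit.
ONTOLOGY = {
    "WellheadPressure": {"unit": "kPa"},
    "PumpSpeed": {"unit": "rpm"},
}

# Hypothetical annotations: raw field -> (concept, factor to canonical unit).
ANNOTATIONS = {
    "scada_feed": {
        "whp_psi": ("WellheadPressure", 6.894757),  # psi to kPa
        "speed_rpm": ("PumpSpeed", 1.0),
    },
    "maintenance_log": {
        "pressure_kpa": ("WellheadPressure", 1.0),
    },
}

def to_facts(source, record):
    """Turn one raw record into (asset, concept, value, unit, source) facts."""
    facts = []
    for field, value in record.items():
        if field not in ANNOTATIONS[source]:
            continue  # unannotated fields (such as the asset id) are skipped
        concept, factor = ANNOTATIONS[source][field]
        unit = ONTOLOGY[concept]["unit"]
        facts.append((record["asset"], concept, round(value * factor, 2), unit, source))
    return facts

# Records from both sources land in one repository of facts.
facts = []
facts += to_facts("scada_feed", {"asset": "pump-17", "whp_psi": 100.0, "speed_rpm": 8.5})
facts += to_facts("maintenance_log", {"asset": "pump-17", "pressure_kpa": 702.0})

# A query over the integrated facts no longer needs to know the source schemas.
pressures = [f for f in facts if f[1] == "WellheadPressure"]
```

Because both sources are annotated against the same concept, the pressure readings become directly comparable even though the raw fields used different names and units.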

Information Integration and Data Mining

The major challenge in Big Data analysis stems from varying granularity, incompatible data models, and complex interdependence across content. Existing frameworks for data analysis support specific types of data (for example, real-time streams or social content) and assume homogeneous computing resources, which is rarely the case in real-world complex systems.


We are developing a framework for rapid integration of heterogeneous Big Data information sources. The framework captures complex interrelationships and interdependence across datasets and establishes probabilistic linkages among distributed content. The system is built using Semantic Web technologies, facilitating complex queries to be issued across the integrated data repositories. This approach complements existing techniques by providing probabilistic queries that take into account the discovered structure among the data sources. To facilitate rich analysis of the integrated datasets, we leverage existing statistical learning, machine learning, and data mining techniques. We are also developing algorithms that identify both simple and complex patterns across datasets. Such patterns are also used to improve the integration process.
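One simple way to picture a probabilistic linkage is a scored link between records in two datasets that is kept only above a confidence threshold. The sketch below uses string similarity from Python's standard difflib module as a stand-in for a link score; this is an assumption made for illustration, not the framework's actual linkage model, and the record names and threshold are hypothetical.

```python
from difflib import SequenceMatcher

def link_score(a, b):
    """Similarity in [0, 1], used here as a stand-in for link confidence."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def link_records(left, right, threshold=0.6):
    """Link each left record to its best-scoring right record, if confident."""
    links = []
    for name in left:
        best = max(right, key=lambda candidate: link_score(name, candidate))
        score = link_score(name, best)
        if score >= threshold:
            links.append((name, best, round(score, 2)))
    return links

# Hypothetical equipment names as they appear in two separate datasets.
production_db = ["Well A-12 Pump", "Separator 3"]
maintenance_db = ["well a12 pump", "separator #3", "compressor 7"]

links = link_records(production_db, maintenance_db)
for left_name, right_name, score in links:
    print(f"{left_name!r} <-> {right_name!r} (score {score})")
```

Keeping the score alongside each link lets downstream queries weight or filter results by confidence instead of treating every match as certain.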


Automation achieved in this manner not only reduces the manual effort involved in cleansing and processing large datasets significantly, but also ensures consistency and effective use of compute and storage resources. The framework is being validated on real-world use-cases from the petroleum industry.


Collaboration Analytics

The vast scale (Big Data) of online human interactions poses challenges to the study of interdisciplinary theories that describe collaboration, which are by their nature intertwined in multiple dimensions. This research focuses on modeling such multidimensional data and mining their intra- and inter-dependencies to uncover hidden structures and emergent knowledge. In particular, we examine informal interactions in the workplace as well as on online social media. We study users' communication behavior patterns, dynamics and characteristics, statistical properties, and complex correlations between social and topical structures. We have performed quantitative studies of communication patterns in a corporate microblogging service. Our analysis suggests that users with strong local topical alignment tend to participate in focused interactions, whereas users with dispersed interests contribute to multiple discussions, broadening the diversity of participants.
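The contrast between focused and dispersed interests can be quantified in several ways. The sketch below uses the Shannon entropy of a user's topic distribution, which is one common choice and an assumption here rather than the measure used in the study; the topic labels and post counts are made up.

```python
import math
from collections import Counter

def topic_entropy(topics):
    """Shannon entropy (in bits) of the topics a user posts about."""
    counts = Counter(topics)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Hypothetical users: one posts mostly about a single topic, one spreads evenly.
focused_user = ["drilling"] * 9 + ["safety"]
dispersed_user = ["drilling", "safety", "logistics", "hiring"] * 3

focused = topic_entropy(focused_user)      # low entropy: concentrated interests
dispersed = topic_entropy(dispersed_user)  # high entropy: spread-out interests
```

Lower entropy corresponds to tightly aligned interests, and higher entropy to interests spread across many discussions.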


We are also developing models for predicting communication intention, recipient recommendation in microblogging services, collective opinion mining and sentiment analysis in social media, and expertise identification from heterogeneous datasets.


Recent Publications