Abstract
Recent advances in natural language processing (NLP) and machine learning have made it possible to ingest decades of field history and heterogeneous production records. This paper proposes an analytics workflow that uses artificial intelligence to process thousands of historical workover reports (handwritten and electronic), extract key information, learn patterns in production activity, and train models to quantify workover impact and derive best practices for field operations.
Natural language processing libraries were developed to ingest and catalog gigabytes of field data, identify rich sources of workover information, and extract workover and cost information from unstructured reports. A clustering-based architecture was developed and trained to categorize documents based on the free text describing the activities in each report. The machine learning model learned the patterns and context of recurring words and grouped documents with similar content together. This enabled the user to find a category of documents (e.g., workover intervention reports) instantaneously. Statistical models were built to determine the return on investment from workovers and to rank them by production improvement and payout time.
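The ranking step can be illustrated with a minimal sketch. It is not the statistical model used in this work: it assumes a constant incremental rate after the workover (no decline), a flat net price per barrel, and a fixed evaluation horizon. The names `Workover`, `payout_days`, `roi`, and `rank_workovers` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Workover:
    well: str
    cost: float                # workover cost, USD (illustrative field)
    uplift_bbl_per_day: float  # incremental rate attributed to the job


def payout_days(w: Workover, net_price_per_bbl: float) -> float:
    """Days of incremental production needed to recover the workover cost.

    Simplifying assumption: the uplift is constant over time.
    """
    daily_revenue = w.uplift_bbl_per_day * net_price_per_bbl
    if daily_revenue <= 0:
        return float("inf")  # the job never pays out
    return w.cost / daily_revenue


def roi(w: Workover, net_price_per_bbl: float, horizon_days: int = 365) -> float:
    """Net gain over the horizon divided by the workover cost."""
    gain = w.uplift_bbl_per_day * net_price_per_bbl * horizon_days
    return (gain - w.cost) / w.cost


def rank_workovers(jobs, net_price_per_bbl: float):
    """Sort jobs from shortest to longest payout time."""
    return sorted(jobs, key=lambda w: payout_days(w, net_price_per_bbl))
```

In practice the uplift would come from comparing production before and after each workover, and a decline model would replace the constant-rate assumption.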
Today, 80% of an oilfield expert's time can be spent manually organizing data. When processing decades of historical oilfield production data spread across both structured records (production time series) and unstructured records (e.g., workover reports), experts often face two major challenges: 1) how to rapidly analyze field data with thousands of historical records, and 2) how to use the rich historical information to generate effective insights and take the proper actions to optimize production. In this paper, we analyzed multiple field datasets in a heterogeneous file environment with 20 different file formats (PDF, Excel, and others), 2,000+ files, and production history spanning 50+ years across 2,000+ producing wells. Libraries were developed to extract files from complex folder hierarchies, and machine learning architectures assisted in finding the workover reports among the many documents. Information was extracted from the reports using Python libraries and optical character recognition technology to build a master data source with production history, workover, and cost information. This dataset was then used to analyze episodic workover activity by well and to compute key performance indicators (KPIs) that identify well candidates for production enhancement. The building blocks included quantifying production upside and calculating return on investment for various workover classes.
O&G companies have vast volumes of unstructured data and use less than 1% of it to uncover meaningful insights about field operations. Our workflow describes a methodology to ingest both structured and unstructured documents, capture knowledge, quantify production upside, understand capital spending, and learn best practices in workover operations through an automated process. This process helps optimize forward operating expense (OPEX) plans with a focus on cost reduction and shortened turnaround time for decision making.
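As an illustration of the first ingestion step (extracting files from nested folders and cataloging them by format), the following is a minimal sketch using only the Python standard library. It is an assumption about how such a catalog could be built, not the libraries developed in this work; the function name `catalog_files` and the `"<none>"` label for extensionless files are hypothetical.

```python
from collections import Counter
from pathlib import Path


def catalog_files(root):
    """Recursively list files under root and count them by extension.

    Returns (files, counts), where files is a sorted list of Path objects
    and counts maps a lowercase extension (e.g. ".pdf") to its file count.
    """
    files = sorted(p for p in Path(root).rglob("*") if p.is_file())
    # Files without an extension are grouped under the label "<none>".
    counts = Counter(p.suffix.lower() or "<none>" for p in files)
    return files, counts
```

A catalog of this kind gives a quick view of which formats dominate a dataset before any text extraction or optical character recognition is attempted.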