Reliable predictions of oil formation volume factor based on transparent and auditable machine learning approaches

David A. Wood, Abouzar Choubineh



Neural-network machine-learning algorithms are effective prediction tools, but in many applications they behave as black boxes: they do not readily reveal the exact calculations, or the relationships among the underlying input variables (which may or may not be independent of each other), involved in each of their predictions. The transparent open box (TOB) learning network algorithm overcomes this limitation by providing the exact calculations involved in all its predictions while achieving acceptable and auditable levels of prediction accuracy. The TOB network, based on an optimized data-matching algorithm, can be applied in spreadsheet or fully coded configurations. This algorithm offers significant benefits for the analysis and prediction of many complex and difficult-to-measure non-linear systems. To demonstrate its prediction performance, the algorithm is applied to the prediction of crude oil formation volume factor at bubble point (Bob) using published datasets of 166, 203 and 237 data records involving four variables (reservoir temperature, gas-oil ratio, oil gravity and gas specific gravity). Two of these datasets display uneven and irregular data coverage. The TOB network demonstrates high prediction accuracy for Bob (root mean square error (RMSE) ~ 0.03; R² > 0.95) for the more evenly distributed dataset. The performance of the TOB readily reveals the risk of overfitting such datasets. With its high level of transparency and its resistance to overfitting, the TOB learning network offers an insightful approach to machine learning applied to predicting complex non-linear systems. Its results complement and benchmark the prediction contributions of neural networks and empirical correlations. In doing so, it provides further insight into the underlying data.
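The idea of an auditable, data-matching prediction scored by RMSE and R² can be illustrated with a minimal Python sketch. This is a generic nearest-match estimator, not the TOB algorithm as published: the scaling of each variable to [-1, 1], the inverse-squared-error weighting, the number of matches `q`, the use of the coefficient of determination for R², and all function names are assumptions made for illustration, and the records in the example are invented values rather than data from the study.

```python
import math


def normalize(rows):
    """Scale each column to [-1, 1] by its min and max (an assumed choice)."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]

    def scale(row):
        # Constant columns carry no information, so map them to 0.
        return [2 * (v - l) / (h - l) - 1 if h > l else 0.0
                for v, l, h in zip(row, lo, hi)]

    return [scale(r) for r in rows], scale


def predict(train_x, train_y, query, q=3):
    """Estimate the target from the q best-matching training records.

    Each match is weighted by the inverse of its squared error, so every
    contribution to the prediction can be listed and audited.
    """
    scored = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), y)
        for x, y in zip(train_x, train_y)
    )[:q]
    if scored[0][0] == 0.0:
        # An exact match exists: return its target directly.
        return scored[0][1]
    weights = [1.0 / err for err, _ in scored]
    return sum(w * y for w, (_, y) in zip(weights, scored)) / sum(weights)


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def r_squared(actual, predicted):
    """Coefficient of determination (one common definition of R²)."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Invented records for illustration only:
# [reservoir temperature, gas-oil ratio, oil gravity, gas specific gravity]
records = [
    [180.0, 400.0, 30.0, 0.80],
    [200.0, 600.0, 35.0, 0.85],
    [220.0, 800.0, 38.0, 0.90],
    [240.0, 1000.0, 40.0, 0.95],
]
bob = [1.25, 1.35, 1.45, 1.55]  # invented target values

scaled, scale = normalize(records)
estimate = predict(scaled, bob, scale([210.0, 700.0, 36.0, 0.88]), q=2)
print(f"Estimated Bob: {estimate:.3f}")
```

Because the estimate is a weighted average of a handful of named records, each prediction can be traced back to the records and weights that produced it, which is the kind of auditability the abstract describes.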


Keywords: Machine learning transparency; non-correlation-based machine learning; oil formation volume factor prediction; sparse data impacts; avoidance of overfitting






Copyright (c) 2019 The Author(s)

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
