CESC Adopts Artificial Intelligence & Machine Learning in Predicting Underground HT Cable Faults

Debashis U Banerjee, Managing Director (Distribution), CESC Limited 

Arnab Kundu, Senior Executive, Business Intelligence & Analytics, CESC Limited 


About the Company

CESC Limited is a flagship company of the RP-Sanjiv Goenka Group. It is a fully integrated power utility whose operations span the entire value chain, from generation to distribution of energy. CESC is the sole distributor of electricity within an area of 567 square km covering Kolkata and its neighboring areas, serving around 3.4 million domestic, industrial, and commercial customers, and has been delivering safe, cost-effective, and reliable energy since 1899. CESC switched on its 1000 kW thermal power generation plant at Prinsep Street in Kolkata on 17th April 1899, an event that heralded the era of electricity in the Indian Subcontinent. Recently, CESC was presented with the coveted IEEE Milestone for establishing the first commercial electric supply company in South Asia.


Defining the Problem

To supply power to 3.4 million customers, almost 6800 circuit-km of High Tension (HT) (11/6 kV) distribution network is spread over the entire licensed area. This network consists of 1800 HT feeders emanating from 116 distribution stations, feeding around 1700 HT customers and 8500 distribution transformers, which in turn step down the voltage to the 415/230 V level to feed the low voltage customers. Hence, the HT distribution network plays a key role in maintaining the reliability of the distribution system.


To provide situational awareness and remote controllability, CESC has installed a SCADA (Supervisory Control and Data Acquisition) system across the entire HT network. The system monitors the loading of each feeder in real time. Should any HT feeder become overloaded, SCADA raises an alarm in the control room in real time. Depending on the load, the prevailing network topology, and the available redundancy in the system, the HT field team relieves the overloaded feeder by reorganizing the network and shifting part of its load onto the adjoining HT network. This helps prevent feeder trippings that would otherwise occur when the loading exceeds the overload thresholds of the protective relays installed for that feeder. Despite these precautions, feeder trippings are still observed in several scenarios where the HT cables are operating within their permissible loading limits, primarily owing to cable faults (caused either by breakdown of the HT cable insulation or by discontinuity of the conductor inside a cable joint). Besides overloading, many other factors (e.g. cable aging, cable surrounding conditions, weather, and external influences) also play a significant role in the occurrence of underground cable faults.


The Inspiration Behind the Solution-Exploration

Random HT cable faults, spread across the license area, result in unavoidable short-term disruptions to the distribution system, causing inconvenience to customers at large. At present, HT cable faults are addressed largely in a reactive manner: the moment a fault occurs, the HT feeder trips to stop feeding the fault, the services fed from the tripped feeder are restored automatically, and all efforts are focused on repairing the cable fault promptly. In our continued journey from a relatively reactive regime to a more proactive and predictive one, a need was felt within the organization to predict, and hence pre-empt, HT cable faults using contemporary statistical tools and techniques, further enhancing the customer experience.


Brief Solution to the Problem

A cross-functional team (CFT) working on this issue brainstormed to arrive at several factors influencing the tripping of HT feeders due to cable faults. Factors such as deterioration of cable health due to aging, cable surrounding conditions, weather changes, and external influences contribute to fault occurrences leading to an HT feeder trip. In such circumstances, it has become extremely important to find out the statistical influence of these parameters on cable fault occurrence, so that preventive actions can be initiated or strengthened in order to increase system reliability. By analyzing historical fault occurrences, we have zeroed in on 30 factors that show a moderate or strong correlation with fault occurrence. These factors fall into 4 broad categories: cable-related, cable surrounding-related, load-related, and weather-related, as depicted in Table 1.
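The screening step described above can be illustrated with a minimal sketch. It assumes that each candidate factor is compared with a binary fault flag using the Pearson correlation coefficient and kept when the absolute value crosses a cut-off. The factor names, the toy data, and the 0.5 threshold are hypothetical and are not taken from the CESC study.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no correlation information
    return cov / math.sqrt(var_x * var_y)

def screen_factors(records, fault_flags, threshold=0.5):
    """Return {factor: r} for factors whose |r| with the fault flag meets the threshold."""
    selected = {}
    for name in records[0]:
        values = [row[name] for row in records]
        r = pearson(values, fault_flags)
        if abs(r) >= threshold:
            selected[name] = round(r, 3)
    return selected

# Hypothetical toy data: one row per feeder, with 1 = fault observed, 0 = healthy.
records = [
    {"cable_age_years": 35, "joints": 12, "cable_depth_m": 1.0},
    {"cable_age_years": 30, "joints": 10, "cable_depth_m": 0.9},
    {"cable_age_years": 8, "joints": 3, "cable_depth_m": 1.0},
    {"cable_age_years": 12, "joints": 4, "cable_depth_m": 0.9},
    {"cable_age_years": 28, "joints": 9, "cable_depth_m": 1.0},
    {"cable_age_years": 5, "joints": 2, "cable_depth_m": 0.9},
]
fault_flags = [1, 1, 0, 0, 1, 0]

strong = screen_factors(records, fault_flags)
print(strong)
```

In this toy data set, cable age and joint count pass the cut-off while cable depth does not. With a binary target, the Pearson coefficient is equivalent to the point-biserial correlation, which is one common choice for this kind of screening.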


Table 1

| Static: Cable related parameters | Static: Cable surrounding related parameters | Dynamic: Load related parameters | Dynamic: Weather related parameters |
|---|---|---|---|
| Cable Length | Cable Depth | Maximum / Average Load | Max / Min / Mean Temperature |
| No. of Joints | Parallel Cable Running or not | Co-efficient of Variation in Load | Variance in Temperature |
| Average Length between 2 joints | Cable Disturbance | Peak Load Generation | Max / Min / Mean Humidity |
| Age of Cable | Road Traffic | Peak Load Duration | Rain Perception |
| Cable Size | Soil Condition | Peak Heat Generation | |
| Cable Type | Frequency of Disturbances | Heat Dissipation Factor | |
| Armour Condition | Health Score | | |
| Lead Exposure | | | |



Cable-related and cable surrounding-related parameters are static, whereas load-related and weather-related parameters are dynamic and vary daily. The existing monitoring system is limited in handling this dynamic behavior and in finding the correlation of all these parameters simultaneously. Accurate HT cable fault prediction therefore necessitated the development of a dynamic statistical model for predictive analytics.


Approach Framework  

In our transformation journey towards a digital utility, we leverage contemporary technologies such as predictive analytics using artificial intelligence (AI) and machine learning (ML). A predictive model has been developed for day-ahead fault prediction. The model considers historical fault data (for the last 6 years) and the associated parameters to flag suspected feeders in real time. This hybrid model also handles historical weather and the associated loading parameters for 1800 HT feeders (378 million data points) to forecast the feeder-wise day-ahead loading pattern across 96 intervals in a day. These forecasted loads in turn feed the main component of the predictive model to generate the day-ahead list of suspected feeders. In addition, the output of the model is verified against actual fault occurrences, and the result is fed back to the model in a closed loop for further tuning, as a part of reinforcement learning. The schematic diagram of the entire solution is presented in Fig. 1. The major steps involved in the solution are described next.
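The quoted data volume can be cross-checked with simple arithmetic. The sketch below assumes that the 378 million figure counts one load reading per feeder per 15-minute interval over the six-year history, with 365 days per year. The article does not spell out this breakdown, so treat it as one plausible reading rather than the authors' own accounting.

```python
FEEDERS = 1800                      # HT feeders, from the article
INTERVALS_PER_DAY = 24 * 60 // 15   # 15-minute intervals in a day
DAYS_PER_YEAR = 365                 # assumption: leap days ignored
YEARS = 6                           # length of the historical window, from the article

data_points = FEEDERS * INTERVALS_PER_DAY * DAYS_PER_YEAR * YEARS

print(f"{INTERVALS_PER_DAY} intervals per day")
print(f"{data_points:,} readings, about {data_points / 1e6:.0f} million")
```

Under these assumptions the product comes to roughly 378 million readings, which matches the figure quoted above.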




Fig 1


Step I: Day-ahead load forecasting at 15-minute intervals for 1800 HT feeders

Load-related parameters are among the crucial dynamic parameters and play a significant role in the core fault prediction model. Hence, an accurate 15-minute interval load forecast for each feeder is key to the accuracy of the fault prediction model. Load profile analysis shows a seasonal trend in the time series data. In addition, the repeating cycle of loading values is found to have a strong correlation with the seasonal variation in temperature, humidity, and rainfall.


Significant earlier studies have been performed on Short-Term Load Forecasting (STLF), covering hour-ahead to month-ahead load forecasts, since it has a high impact on the economy. The most popular forecasting methods are statistical methods such as exponential smoothing models (ESM), multiple linear regression (MLR), and auto-regressive moving average (ARMA) [1], and artificial intelligence methods such as artificial neural networks (ANN) [2], fuzzy regression models (FRM), and support vector machines (SVM). A comprehensive review of the models and techniques used in load forecasting is presented in [3,4]. As per [5], the VARMA model outperforms the standard univariate ARMA and Winters' methods for time series forecasting.


In our problem, we have extended the univariate forecasting problem to multivariate time series forecasting. The feeder load and the corresponding weather data are analyzed as correlated time series and are reconstructed in the multivariate phase space. We have deployed a multivariate vector-ARMA (VARMA) model for day-ahead load forecasting at 15-minute intervals for the 1800 HT feeders. The day-ahead load prediction of one such feeder with respect to the actual load is presented in Fig. 2. The day-ahead forecasted load of all HT feeders is used to derive the load-related parameters that serve as inputs to the fault prediction model, whereas the day-ahead weather forecast data is scraped from a website and fed to the fault prediction model.
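As an illustration of how a 96-interval forecast could be condensed into the load-related parameters named in Table 1, the following minimal sketch computes the maximum load, average load, coefficient of variation, and a peak load duration for one feeder. The article does not define these quantities precisely. The function name, the 90% peak threshold, and the sample profile are assumptions made only for this example.

```python
from statistics import mean, pstdev

def load_parameters(forecast, peak_fraction=0.9, interval_minutes=15):
    """Summarise one feeder's day-ahead forecast into load-related features."""
    max_load = max(forecast)
    avg_load = mean(forecast)
    # Coefficient of variation: standard deviation relative to the mean.
    load_cov = pstdev(forecast) / avg_load if avg_load else 0.0
    # Peak load duration (assumed definition): time spent at or above
    # peak_fraction of the day's maximum forecast load.
    peak_intervals = sum(1 for x in forecast if x >= peak_fraction * max_load)
    return {
        "max_load": max_load,
        "avg_load": avg_load,
        "load_cov": load_cov,
        "peak_duration_min": peak_intervals * interval_minutes,
    }

# Hypothetical profile: 80 off-peak intervals at 100 units, 16 peak intervals at 200.
forecast = [100.0] * 80 + [200.0] * 16
params = load_parameters(forecast)
print(params)
```

Each feeder would yield one such feature row per day, which could then be joined with the static cable parameters and the weather forecast before being passed to the classifier.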



Step II: Day-ahead prediction of probable fault-prone feeders

The binary-class dataset used for the fault prediction model suffers from a class imbalance problem, since faulty feeders are far fewer than healthy feeders. Decision trees do not require any prior assumption about the probability distributions of the class and of the other discrete and continuous attributes [6]. Also, the accuracy of a decision tree is not affected by the presence of redundant attributes and outlier data. The C4.5 decision tree classifier was chosen over the ID3 decision tree due to its higher accuracy and lower average CPU execution time [7]. We have used the J48 consolidated decision tree, implemented in Weka, an open-source data mining software developed by the University of Waikato in New Zealand [8]. The J48 consolidated classifier is an implementation of the C4.5 decision tree algorithm [9] with a minor deviation in the knowledge extraction process [10]. The knowledge extraction process is built on the consolidated tree construction (CTC) algorithm [11], which provides better structural stability to the tree. The resampling technique used in the J48 consolidated decision tree also helps balance the class distribution.
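The resampling idea can be illustrated with a short sketch. This is not the Weka J48Consolidated implementation. It is a simplified illustration, under stated assumptions, of drawing several class-balanced subsamples from an imbalanced dataset, which is the kind of input a consolidated tree is built from. The function name, the record layout, and the counts (20 faulty rows out of 1000) are hypothetical.

```python
import random

def balanced_subsamples(rows, label_key, n_subsamples=5, seed=0):
    """Draw subsamples in which both classes appear equally often.

    Every minority-class row is kept, and an equal number of majority-class
    rows is drawn at random (without replacement) for each subsample.
    """
    rng = random.Random(seed)
    faulty = [r for r in rows if r[label_key] == 1]
    healthy = [r for r in rows if r[label_key] == 0]
    minority, majority = sorted((faulty, healthy), key=len)
    subsamples = []
    for _ in range(n_subsamples):
        drawn = rng.sample(majority, len(minority))
        subsamples.append(minority + drawn)
    return subsamples

# Hypothetical imbalanced dataset: 20 faulty rows among 1000 feeder records.
rows = [{"feeder_id": i, "fault": 1 if i < 20 else 0} for i in range(1000)]
subsamples = balanced_subsamples(rows, "fault")

for s in subsamples:
    n_faulty = sum(r["fault"] for r in s)
    print(len(s), "rows,", n_faulty, "faulty")
```

In the CTC approach, a single tree is then grown from such subsamples, with the split at each node agreed across them. The sketch stops at the resampling step and leaves tree construction to the actual classifier.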



The fault prediction model, with 72% prediction accuracy, equips the control room to take corrective actions based on the network topology and orientation of the HT feeders, the suspected feeders in the system, the redundant capacity of reliever feeders, and other feasibility factors. In addition, critical or essential services fed from the suspected feeders can be shifted to healthy feeders. These preventive activities not only increase system availability by improving SAIDI/SAIFI but also increase customer delight by avoiding service interruptions for a large number of customers. Furthermore, a Just-in-Time (JIT) approach to Operational Expenditure (OPEX) and Capital Expenditure (CAPEX) planning has been adopted for the underground HT cable network based on these data-driven insights.
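For readers less familiar with the reliability indices mentioned above, the sketch below computes SAIFI (average number of interruptions per customer served) and SAIDI (average interruption duration per customer served) using their standard definitions. The interruption events and the customer count are hypothetical numbers chosen for illustration, not CESC data.

```python
def reliability_indices(interruptions, customers_served):
    """Compute (SAIFI, SAIDI) from (customers_interrupted, minutes) events.

    SAIFI = total customers interrupted / total customers served
    SAIDI = total customer-minutes of interruption / total customers served
    """
    total_interrupted = sum(n for n, _ in interruptions)
    customer_minutes = sum(n * minutes for n, minutes in interruptions)
    saifi = total_interrupted / customers_served
    saidi = customer_minutes / customers_served
    return saifi, saidi

# Hypothetical events: (customers interrupted, outage duration in minutes).
events = [(1200, 45), (800, 30), (500, 60)]
saifi, saidi = reliability_indices(events, customers_served=10_000)

# Same period if the first fault had been pre-empted by shifting load in advance.
saifi_avoided, saidi_avoided = reliability_indices(events[1:], customers_served=10_000)

print(f"SAIFI {saifi:.2f} -> {saifi_avoided:.2f} interruptions per customer")
print(f"SAIDI {saidi:.1f} -> {saidi_avoided:.1f} minutes per customer")
```

Each fault that is pre-empted removes its customers and customer-minutes from the numerators, which is how day-ahead prediction translates into lower SAIFI and SAIDI.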



1. Y. Chakhchoukh, P. Panciatici, and L. Mili, “Electric load forecasting based on statistical robust methods,” IEEE Transactions on Power Systems, vol. 26, no. 3, pp. 982-991, 2011.
2. D.-x. Niu, H.-f. Shi, and D. D. Wu, “Short-term load forecasting using Bayesian neural networks learned by Hybrid Monte Carlo algorithm,” Applied Soft Computing, vol. 12, no. 6, pp. 1822-1827, 2012.
3. Nti, I.K.; Teimeh, M.; Nyarko-Boateng, O.; Adekoya, A.F. Electricity Load Forecasting: A Systematic Review. J. Electr. Syst. Inf. Technol. 2020, 7, 13.
4. Hong, T.; Fan, S. Probabilistic Electric Load Forecasting: A Tutorial Review. Int. J. Forecast. 2016, 32, 914–938.
5. Patrick Aboagye-Sarfo, Qun Mai, Frank M. Sanfilippo, David B. Preen, Louise M. Stewart, Daniel M. Fatovich, A comparison of multivariate and univariate time series approaches to modeling and forecasting emergency department demand in Western Australia, Journal of Biomedical Informatics, Volume 57, 2015, Pages 62-73, ISSN 1532-0464.
6. Piltaver, Rok, Luštrek, Mitja, Gams, Matjaž and Martinčić-Ipšić, Sanda (2016) “What makes classification trees comprehensible?”, Expert Systems with Applications 62(16): 333-346.
7. Hssina, Badr, Merbouha, Abdelkarim, Ezzikouri, Hanane and Erritali, Mohammed (2014) “A comparative study of decision tree ID3 and C4.5”, International Journal of Advanced Computer Science and Applications, Special Issue on Advances in Vehicular Ad Hoc Networking and Applications: 13-17.
8. Bhargava, Neeraj, Sharma, Girja, Bhargava, Ritu and Mathuria, Manish (2013) “Decision Tree Analysis on J48 Algorithm for Data Mining”, International Journal of Advanced Research in Computer Science and Software Engineering: 1114-1119.
9. Quinlan, J. Ross (1993) “C4.5: Programs for Machine Learning”, Morgan Kaufmann Publishers Inc., San Mateo, California.
10. Arbelaitz, Olatz, Gurrutxaga, Ibai, Lozano, Fernando, Muguerza, Javier and Pérez, Jesús María (2013) “J48Consolidated: An implementation of CTC algorithm for WEKA”, Technical Report EHU-KAT-IK-05-13, University of the Basque Country (UPV/EHU). http://www.sc.ehu.es/aldapa/weka-ctc/Weka-CTC.pdf, Accessed on 21 Dec, 2017.
11. Ibarguren, Igor, Pérez, Jesús M., Muguerza, Javier, Gurrutxaga, Ibai and Arbelaitz, Olatz (2015) “Coverage-based resampling: Building robust consolidated decision trees”, Knowledge-Based Systems 79(C): 51-67.


Mr. Debasish Banerjee, the Managing Director (Distribution) of CESC Limited is an Electrical Engineer with proficiency in Business Management, having 38 years of rich and diverse industry experience. He commenced his professional career at Areva and moved on to Crompton Greaves and Schneider Electric, heading Business Operations in Dealer, Industry & Utility domains. In his last stint as CEO of Reliance Energy, he contributed to improving Operational Efficiency and Optimizing Costs through Business Processes Reengineering & Automation, thus increasing the bottom line and customer delight. In his current capacity as MD (Distribution) of CESC, he has ushered in a transformational journey for leapfrogging CESC to newer heights of excellence with enhanced operational resilience and efficiency for effective business continuity during and after any crisis. Adoption of Industry 4.0 and sensor-based IoT along with Big Data Analytics and Immersive Technologies / Digital Twins has enabled the shift from preventive to predictive maintenance, thus making CESC more agile, customer centric, cost-effective, and a digital utility of the future. Being a firm believer in the process of technology, he has also initiated the integration of various AI/ML-driven applications in key processes, thus transforming them into proactive edge-based decision-making.  In pursuit of his passion to deploy cutting-edge technologies for radical change, he is constantly engaged in embracing disruptive technological innovations, for enhancing Customer & Employee Experience (CX & EX). Rapid deployment of digital platforms like ChatBot, WhatsappBot & unique vernacular VoiceBot embedded with AI/ML and NLL/NLP technologies has further enriched Customer Service delivery. He has been instrumental in demonstrating sustainability, which is one of the core values of CESC through Digitization & Decentralization for providing safe, cost-effective, and reliable electricity. 
He is engaged in developing a responsible and diverse value chain for powering a sustainable future and creating a positive societal impact for a better planet and people.


Mr. Arnab Kundu is a Senior Executive in the Business Intelligence & Analytics department of CESC Limited, Kolkata. He received an M.Tech. in Computer Science from the Indian Statistical Institute in 2016 and a B.E. in Electrical Engineering from Jadavpur University in 2013. He is also the recipient of a gold medal for the best student project award from the Indian Statistical Institute in 2016. Currently, he is responsible for designing and implementing various Business Intelligence & Analytics projects across the organization.