🌍 bp Global Innovation · I&E Digital Science & Engineering
⚑

EV Site Planning & Innovation POCs

bp (British Petroleum) · Global Energy Major

2023 · Role: Innovation Technical Project Manager

13.1M
EVs in Target Market
China 2022, +67% YoY
6
Core ML Models
spatial · density · forecast
5
Data Source Categories
population · NEV · POI · …
5
AI Platform Modules
heatmap · eval · rec · …
$40K
PoC Budget
DS&E global funding
01
Project Overview
AI-driven site selection PoC to accelerate bp's EV charging network expansion in China

Problem Statement

China's EV market reached 13.1M vehicles in 2022 (+67% YoY), creating urgent demand for high-value charging locations. Existing site selection was manual, inefficient, and costly, and candidate sites were difficult to assess at scale. No structured AI modelling tool existed in the industry to systematically identify optimal sites.

Manual & inefficient · High assessment cost · No AI tooling · Fast-moving market

Solution & PoC Objectives

Build a modelling platform combining GIS spatial AI, ML classification, and forecast algorithms to score, rank, and predict the performance of candidate EV charging sites, enabling business teams to make faster, data-driven expansion decisions. Pilot city: Guangzhou. Model output was validated against the Business Development (BD) team's on-site investigation results.

AI site scoring · Performance forecast · GIS visualisation · Validated vs. BD data
02
System Architecture
5-layer AI platform: raw data → derived features → ML engine → AI modules → user interface
USER LAYER
🗺️ Heat Map
Density visualisation
📍 Site Evaluation
Score + grade
🏆 Site Recommendation
Ranked list
⭐ Good Sites
Factor analysis
📈 Perf. Forecast
Revenue · break-even
↕ SaaS Platform API (Alading)
AI MODULES
🎯 Location Evaluation
Classification algorithms
💡 Location Recommendation
Clustering + CF
📊 Performance Forecast
Regression algorithms
↕ Spark MLlib · Model Training Pipeline · Feature Selection
ML ENGINE
🌐 Kernel Density
Spatial distribution
🚗 OD/PA Model
Population flow
Spatial Interpolation
Feature coverage
🕸️ Force Layout
Feature correlation
🏙️ Business District
Commercial analysis
💎 Value Factors
High-value site ID
↕ Data Processing · Deduplication · Normalisation · Derived Feature Engineering
DATA LAYER
👥 Population Portrait
Demographics · finance
📍 AOI / POI
Location attributes
🚘 NEV Data
Static vehicle counts
🛣️ Traffic Flow
Dynamic + real-time
⚑ Existing EVC Sites
Competitor landscape
↕ Multi-source Data Ingestion · Crawler · Telecom APIs · Map APIs
SOURCE DATA
📡 Telecom Providers
🗺️ Map / GIS Companies
🚕 Ride-hailing Platforms
🏦 Insurance Data
🕸️ Web Crawlers
Stack: Spark MLlib · Python · GIS / Spatial AI · SaaS Platform · REST APIs
03
Data Dashboard & Market Context
Market scale, data coverage, and PoC resource breakdown
EV Site Planning PoC: Project Dashboard · 2023 · Guangzhou Pilot
🚗
13.1M
EVs in China (2022)
↑ +67% year-on-year growth
🤖
6
Core ML Models Deployed
↑ Kernel density · OD/PA · Spatial · …
5+
Data Source Categories
↑ Population · NEV · POI · Traffic · EVC
💰
$40K
PoC Investment (DS&E)
↑ SaaS + validation report
ML Algorithm Coverage by Module
Classification: Eval
Clustering: Rec
Recommendation: Rec
Regression/Forecast: Fcst
Dim. Reduction (PCA): Feat
Data Source Diversity
Population Portrait: High
AOI / POI Data: Core
NEV Static Data: High
Traffic Flow (Dynamic): Med
Existing EVC Sites: Med
04
Core ML Models
6 spatial AI and statistical models forming the site-scoring engine
Model 01
🌐 Kernel Density Model
Collects feature data within a defined analysis radius and calculates the density of relevant features (population, POIs, NEVs) in the surrounding area. Produces spatial heatmaps to surface demand hotspots.
Spatial statistics · Bandwidth selection
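As a rough illustration of the density calculation described above, the sketch below scores a location by summing Gaussian-weighted contributions from feature points inside the analysis radius. The function name, the planar kilometre coordinates, the Gaussian kernel, and the bandwidth and radius values are all assumptions made for illustration; the source does not state which kernel or bandwidth the PoC used.

```python
import math

def kernel_density(site, points, bandwidth_km=1.0, radius_km=3.0):
    """Gaussian kernel density at `site` from feature `points`.

    Coordinates are assumed to be planar (x, y) pairs in kilometres.
    Points farther than `radius_km` from the site are ignored.
    """
    sx, sy = site
    total = 0.0
    for px, py in points:
        d = math.hypot(px - sx, py - sy)
        if d <= radius_km:
            total += math.exp(-0.5 * (d / bandwidth_km) ** 2)
    # Divide by the 2-D Gaussian normalising constant so the result reads
    # as a weighted feature count per square kilometre.
    return total / (2 * math.pi * bandwidth_km ** 2)

# Hypothetical POI coordinates: three clustered near the origin, one far away.
pois = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.0)]
dense = kernel_density((0.0, 0.0), pois)
sparse = kernel_density((5.0, 4.0), pois)
```

Evaluating the same function over a grid of locations would give the values behind a heat map; a smaller bandwidth sharpens hotspots, a larger one smooths them.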
Model 02
🚗 OD / PA Flow Model
Processes tens of millions of smart mobile device records to model city-level population travel patterns. Derives Origin-Destination and Production-Attraction matrices to estimate actual demand flows to candidate sites.
Big data processing · Flow estimation
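To make the matrix idea concrete, here is a minimal sketch that aggregates trip records into an Origin-Destination count table and then sums it per zone. The zone labels and trip list are hypothetical. As a simplification, productions are taken to be trip origins and attractions to be trip destinations, which ignores the home-based convention used in formal PA modelling.

```python
from collections import Counter

def build_od_matrix(trips):
    """Count trips for each (origin_zone, destination_zone) pair."""
    return Counter((origin, dest) for origin, dest in trips)

def productions_attractions(od):
    """Sum OD counts into per-zone trips out (productions) and trips in (attractions)."""
    prod, attr = Counter(), Counter()
    for (origin, dest), n in od.items():
        prod[origin] += n
        attr[dest] += n
    return prod, attr

# Hypothetical trips between three zones, e.g. derived from device records.
trips = [("A", "B"), ("A", "B"), ("B", "C"), ("C", "B"), ("A", "C")]
od = build_od_matrix(trips)
prod, attr = productions_attractions(od)
```

A zone with high attraction counts is one that many trips end in, which is the kind of signal a candidate charging site would be scored on.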
Model 03
Spatial Interpolation Model
Analyses the spatial extent of feature distributions and uses the nearest subset of input samples to predict feature values at unsampled locations across the entire study area.
IDW / Kriging · GIS interpolation
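The IDW variant named in the tags can be sketched in a few lines: the estimate at an unsampled location is a weighted average of the nearest samples, with weights falling off as distance grows. The function name, the choice of `k` nearest samples, and the power of 2 are illustrative assumptions, not the PoC's actual settings.

```python
import math

def idw_interpolate(target, samples, k=3, power=2.0):
    """Inverse-distance-weighted estimate at `target`.

    `samples` is a list of ((x, y), value) pairs. Only the `k` nearest
    samples contribute to the estimate.
    """
    tx, ty = target
    by_dist = sorted(
        (math.hypot(x - tx, y - ty), value) for (x, y), value in samples
    )
    nearest = by_dist[:k]
    # An exact hit on a sample location returns that sample's value.
    if nearest[0][0] == 0.0:
        return nearest[0][1]
    weights = [1.0 / d ** power for d, _ in nearest]
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)

# Hypothetical samples: feature value observed at four locations.
samples = [((0, 0), 10.0), ((2, 0), 20.0), ((0, 2), 30.0), ((10, 10), 100.0)]
estimate = idw_interpolate((1, 0), samples)
```

Here the two samples at distance 1 dominate and the distant sample at (10, 10) is excluded, so the estimate lands between 10 and 20, pulled slightly upward by the third neighbour.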
Model 04
🕸️ Force-Directed Layout Model
Builds a topological relationship network from the correlation between every pair of features, then visualises the aggregation and compactness of site-relevant factors to reveal key driver clusters.
Graph layout · Correlation analysis
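The correlation network that feeds such a layout can be sketched as an edge list: compute a correlation for every feature pair and keep the strong ones. The feature names, sample values, and the 0.7 threshold are hypothetical, Pearson correlation is an assumed choice, and the layout step itself (positioning nodes by simulated forces) is left out.

```python
import math
from itertools import combinations

def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def correlation_edges(features, threshold=0.7):
    """Return (feature_a, feature_b, r) for pairs with |r| >= threshold."""
    edges = []
    for a, b in combinations(sorted(features), 2):
        r = pearson(features[a], features[b])
        if abs(r) >= threshold:
            edges.append((a, b, round(r, 3)))
    return edges

# Hypothetical per-site feature values for five sites.
features = {
    "population": [10, 20, 30, 40, 50],
    "poi_count": [12, 18, 33, 41, 48],
    "nev_count": [5, 9, 16, 19, 26],
    "road_grade": [3, 1, 2, 3, 1],
}
edges = correlation_edges(features)
```

In this toy data the three demand-related features end up mutually connected while `road_grade` stays isolated, which is the kind of cluster a graph layout would then make visible.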
Model 05
🏙️ Business District Analysis Model
Analyses customer profile, site conditions, and surrounding operating factors within a defined commercial area to identify the best address within a business district.
Commercial circle · Boundary analysis
Model 06
💎 Value Factors Analysis Model
Identifies high-value sites by analysing the key success factors of existing high-performing sites, then builds a location selection model to systematically discover new high-potential sites.
Feature importance · Good-site learning
05
Data Sources & Feature Engineering
5 data categories across demand signals, supply data, and geospatial intelligence
👥
Population Portrait
Population size, demographics (gender, age), lifestyle preferences (travel, shopping), and financial profiles. Used to model demand density and user segment affinity.
Source: Telecom companies · Insurance companies
📍
AOI / POI (Area of Interest / Point of Interest)
Name, address, coordinates, telephone, and other attributes for commercial zones, residential areas, transit hubs, and destinations. Core inputs for spatial feature extraction.
Source: Map companies · Crawler & data processing
🚘
New Energy Vehicle Data (Static)
NEV count by brand and model segmented by geographic area β€” identifies actual EV ownership concentration and reveals high-demand zones.
Source: Insurance companies Β· Crawler
🛣️
Traffic Flow Data (Dynamic)
Road grade, traffic volumes segmented by time of day, real-time counts of NEVs on the road, e-hailing NEV booking counts, and GPS tracking within each POI. Captures demand dynamics.
Source: Ride-hailing platforms · Telecom companies
⚑
Existing EV Charging Site Data
Competitor and existing site attributes: name, pile count, pricing, real-time availability (free/busy), and power rating. Informs supply gap and competitive landscape analysis.
Source: Public charging networks · Web crawlers
Derived Features (computed)
Commercial circle boundaries · Opportunity zones · Peer radiation areas · Competitive zones · Equidistance circles · Surrounding business density. All are derived from raw source data via the 6 ML models.
06
Feature Selection & Model Training Pipeline
Rigorous ML pipeline from raw data preprocessing to model optimisation
Filter Methods
Statistical Relevance
Variance selection (low-variance removal)
Pearson & Spearman correlation
Chi-square test
ANOVA (F-statistic)
Kendall's tau
Mutual information method
Wrapper Methods
Model-guided Selection
Recursive Feature Elimination (RFE)
Cross-validated RFE with estimator
Embedded Methods
Penalty-based
L1 / Lasso regularisation
Tree-based feature importance
Dimensionality Reduction
Compression & Decorrelation
PCA (Principal Component Analysis)
LDA (Linear Discriminant Analysis)
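Two of the filter methods listed above, low-variance removal and Spearman correlation, can be combined into a simple screening pass. The sketch below is one possible arrangement under stated assumptions: the feature names, sample values, and both thresholds are hypothetical, and the source does not say how the PoC ordered or tuned these steps.

```python
import math
from statistics import pvariance

def ranks(values):
    """Average ranks (1-based), so tied values share the same rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

def filter_features(features, target, min_variance=0.01, min_abs_rho=0.5):
    """Drop low-variance features, then keep those correlated with the target."""
    kept = {}
    for name, values in features.items():
        if pvariance(values) < min_variance:
            continue  # near-constant feature carries no signal
        rho = spearman(values, target)
        if abs(rho) >= min_abs_rho:
            kept[name] = round(rho, 3)
    return kept

# Hypothetical per-site features and a hypothetical revenue target.
features = {
    "nev_density": [1.0, 2.0, 3.5, 4.0, 6.0],
    "is_urban": [1, 1, 1, 1, 1],
    "distance_to_metro": [5.0, 4.0, 3.0, 2.5, 1.0],
    "noise": [3.0, 1.0, 4.0, 1.5, 2.0],
}
site_revenue = [10, 14, 22, 25, 40]
selected = filter_features(features, site_revenue)
```

The constant `is_urban` column is removed by the variance check before any correlation is computed, and `noise` is removed for its weak rank correlation with the target.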
Model Evaluation Metrics
KS statistic & PSI
Accuracy, Recall, F1 Score
AUC / ROC curve
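KS and PSI are less widely known than accuracy or AUC, so a small worked sketch may help. KS measures how far apart the score distributions of two classes are; PSI measures how much a binned distribution has shifted between two samples. The function names, sample scores, and bin counts below are hypothetical.

```python
import math

def ks_statistic(scores_pos, scores_neg):
    """Maximum gap between the empirical CDFs of two score samples."""
    thresholds = sorted(set(scores_pos) | set(scores_neg))
    best = 0.0
    for t in thresholds:
        cdf_pos = sum(s <= t for s in scores_pos) / len(scores_pos)
        cdf_neg = sum(s <= t for s in scores_neg) / len(scores_neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best

def psi(expected_counts, actual_counts, eps=1e-6):
    """Population Stability Index over pre-binned counts."""
    e_total, a_total = sum(expected_counts), sum(actual_counts)
    total = 0.0
    for e, a in zip(expected_counts, actual_counts):
        # Floor the proportions at eps so empty bins do not break the log.
        e_pct = max(e / e_total, eps)
        a_pct = max(a / a_total, eps)
        total += (a_pct - e_pct) * math.log(a_pct / e_pct)
    return total

# Hypothetical model scores for known good and poor sites.
good = [0.9, 0.8, 0.7, 0.6]
poor = [0.5, 0.4, 0.3, 0.65]
ks = ks_statistic(good, poor)

# Hypothetical score-bin counts at training time vs. a later sample.
stable = psi([25, 25, 25, 25], [25, 25, 25, 25])
shifted = psi([25, 25, 25, 25], [10, 20, 30, 40])
```

A higher KS means the model separates the two groups better, while a PSI near zero means the scored population looks like the one the model was trained on.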
Training pipeline: Data preprocessing → Feature extraction & selection → Model training/building → Model testing → Model evaluation → Model optimisation
07
AI Platform: 5 User Modules
End-to-end visual interface for business teams to evaluate and plan EV charging sites
🗺️
Heat Map
Input: filter conditions (area, EV density, POI type)
↓
Output: colour-coded demand density heat map across city
📍
Site Evaluation
Input: candidate site address
↓
Output: overall composite score + recommendation grade (A/B/C)
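One simple way to turn factor scores into a composite score and an A/B/C grade is a weighted average followed by cut-offs. The sketch below is purely illustrative: the factor names, the relative weights, and the 80 and 60 cut-offs are assumptions, and the source does not describe how the platform actually computes its score or grade.

```python
def composite_score(factors, weights):
    """Weighted average of factor scores, each assumed to lie in [0, 100]."""
    total_weight = sum(weights.values())
    return sum(factors[name] * w for name, w in weights.items()) / total_weight

def grade(score, a_cutoff=80.0, b_cutoff=60.0):
    """Map a 0-100 score to a recommendation grade (cut-offs are assumed)."""
    if score >= a_cutoff:
        return "A"
    if score >= b_cutoff:
        return "B"
    return "C"

# Hypothetical relative weights and factor scores for one candidate site.
weights = {"demand_density": 4, "traffic_flow": 3, "competition_gap": 3}
site = {"demand_density": 90, "traffic_flow": 70, "competition_gap": 60}
score = composite_score(site, weights)
site_grade = grade(score)
```

Sorting candidate sites by this score in descending order would give the kind of ranked list the Site Recommendation module returns.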
🏆
Site Recommendation
Input: filter conditions (district, budget, site type)
↓
Output: ranked list of recommended sites sorted by score
⭐
Good Sites Analysis
Input: known high-performing site address
↓
Output: key success factors and their relative weights
📈
Performance Forecast
Input: site address + investment cost
↓
Output: customer unit price · annual revenue split · break-even point
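The break-even output can be illustrated with simple payback arithmetic: forecast annual revenue from session volume and average spend, subtract operating cost, and divide the investment by the resulting margin. The function names and all figures below are hypothetical, and the calculation is undiscounted; the source does not specify the platform's forecasting formula.

```python
def forecast_annual_revenue(sessions_per_day, unit_price, days=365):
    """Revenue from a forecast session count and average spend per session."""
    return sessions_per_day * unit_price * days

def break_even_years(investment, annual_revenue, annual_operating_cost):
    """Simple (undiscounted) payback period in years.

    Returns None when the site never recovers its investment.
    """
    annual_margin = annual_revenue - annual_operating_cost
    if annual_margin <= 0:
        return None
    return investment / annual_margin

# Hypothetical inputs for one candidate site.
revenue = forecast_annual_revenue(sessions_per_day=40, unit_price=25.0)
years = break_even_years(
    investment=500_000,
    annual_revenue=revenue,
    annual_operating_cost=115_000,
)
```

With these inputs the site earns a margin of 250,000 per year against a 500,000 investment, so it breaks even after two years.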
08
Project Management & Delivery
End-to-end PoC ownership from vendor selection to model validation
Phase 1: Market Research
Benchmarking & Vendor Selection
Evaluated multiple China-based technology providers on data quality, coverage, prior practical experience, and cost. Selected Alading as the preferred partner following structured benchmarking.
Phase 2: PoC Design
Solution Architecture & Scope Definition
Defined a 3-module AI system (Evaluation · Recommendation · Forecast), identified 5 data source categories, and scoped Guangzhou as the pilot city. Secured $40K in DS&E global funding.
Phase 3: Contract & Build
Contract Signed → First Deployment
Signed the contract with Alading. Oversaw SaaS platform configuration covering the heat map, site evaluation, site recommendation, good sites analysis, and performance forecast modules.
Phase 4: Validation
Trial Launch & Model Accuracy Validation
Compared AI modelling results against the Business Development team's on-site investigation findings. Produced a validation report covering functions, performance, and model accuracy.

PoC Deliverables

SaaS platform modules: 5 functional
ML models deployed: 6 models
Validation report: Function + accuracy
Total budget: $40K USD
Pilot city: Guangzhou
Risk · High
Data Availability
Real-time traffic and NEV tracking data are subject to provider access restrictions and coverage gaps.
→ Multi-source fallback · data SLA in contract
Risk · Medium
Model Accuracy
Predictions may diverge from BD on-site investigation results, since this is a new domain with limited labelled training data.
→ Iterative retraining · accuracy KPI gates
09
Technology Stack
Spatial AI, distributed ML, and GIS-integrated analytics platform
⚑
Spark MLlib
Distributed ML
🐍
Python
ML + Data Pipeline
🗺️
GIS / Spatial AI
Geospatial analysis
PCA / LDA
Dimensionality reduction
🌐
Kernel Density
Spatial statistics
🕸️
Graph / Network
Feature correlation
🏗️
SaaS Platform
Alading / partner
📡
REST APIs
Data ingestion
🔀
Collab. Filtering
Recommendation
📊
Regression Models
Performance forecast
🗄️
Big Data Pipeline
ETL · preprocessing
🎯
RFE / Feature Sel.
Model optimisation