Urban air pollution, significantly driven by high-emission vehicles, poses a severe threat to public health and environmental sustainability. Traditional enforcement methods are manual, sporadic, and incapable of providing the continuous, real-time surveillance required to effect meaningful change. EcoVision is an advanced, real-time roadside monitoring system that directly addresses this gap. By fusing deep learning-based computer vision with the parallel processing power of an AMD Xilinx Kria KR260 FPGA, EcoVision automatically detects excessive vehicular smoke, identifies offending vehicles via Automatic Number Plate Recognition (ANPR), classifies traffic, and generates immediate, evidence-backed alerts—all at the edge with ultra-low latency. This document presents a comprehensive overview of the EcoVision solution, detailing its motivation, system architecture, intelligent pipeline, and key features that make it a scalable, privacy-conscious, and powerful tool for next-generation smart cities.
1. Introduction and Motivation
Vehicular emissions are a dominant contributor to urban air quality degradation, releasing harmful particulate matter (PM2.5, PM10), nitrogen oxides (NOx), and carbon monoxide. According to the World Health Organization, ambient air pollution accounts for millions of premature deaths annually, with road transport being a primary culprit in dense metropolitan areas. While regulatory standards exist, the enforcement of vehicle emission norms remains a formidable challenge. Roadside inspections are labor-intensive, infrequent, and incapable of monitoring the vast, dynamic traffic flows of a modern city.
There is a critical need for an automated, objective, and always-on monitoring system that can:
- Detect visible smoke emissions in real time, even under adverse weather conditions like fog or rain.
- Identify the specific vehicle responsible with high accuracy.
- Quantify the violation to provide legally defensible evidence.
- Operate with low latency and minimal power at the edge, reducing dependency on bandwidth-heavy cloud processing.
EcoVision was conceived to meet this exact need. It is not merely a smoke detector; it is a comprehensive traffic intelligence platform that combines environmental enforcement with rich traffic analytics. Its deployment offers a scalable pathway for municipalities to enforce clean-air policies, deter violators, and gather actionable data for urban planning.
2. The EcoVision Solution: A System-Level Overview
EcoVision is an embedded AI system that processes live, high-frame-rate video streams from a strategically positioned roadside camera. The entire intelligence pipeline, from pixel to actionable alert, is executed locally on an AMD Xilinx Kria KR260 Robotics Starter Kit, an FPGA-based platform built around the production-ready Kria K26 system-on-module. This ensures deterministic, real-time processing with an end-to-end latency of under 200 milliseconds, all while maintaining a low power envelope suitable for 24/7 outdoor operation.
The core operational cycle is an event-driven, multi-stage cascade:
- Continuous Vehicle Detection & Classification: Every vehicle in the camera’s field of view is detected, tracked, and categorized (e.g., passenger car, motorcycle, bus, heavy goods vehicle) using a high-speed object detection model.
- Concurrent Smoke Analysis: Simultaneously, a dedicated, specialized vision pipeline analyzes the region around each vehicle’s exhaust for the presence of smoke. This model is trained to measure smoke density, color, spatial spread, and persistence, functioning reliably even in high-density fog or ambient smoke conditions.
- Event-Triggered ANPR: Only when a vehicle’s emission exceeds a predefined threshold (approximated via pixel-density Ringelmann opacity scores) does the system trigger its high-accuracy ANPR module. This privacy-by-design approach ensures that only polluting vehicles are fully identified.
- Evidence Capture & Alerting: Upon confirmation, the system captures a complete evidence package: timestamp, GPS location, vehicle type, license plate image and OCR text, smoke intensity metrics, and a short video clip. An automated alert is instantly generated and transmitted to environmental enforcement authorities via Google SMTP (or API integration), enabling rapid, data-driven intervention.
In parallel and independently of violation events, EcoVision continuously compiles anonymized traffic statistics—vehicle counts, class distributions, directional flow, and congestion levels—which are fed into a central dashboard for long-term urban analytics.
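The cascade described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's actual implementation: the `Detection` and `Violation` types, the `process_frame` function, the `read_plate` callback, and the threshold value of 2.0 are all hypothetical names and numbers introduced here. The sketch shows the two key properties of the design: every tracked vehicle is counted once for anonymized statistics, and the ANPR callback runs only for vehicles whose smoke score exceeds the threshold.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Set

# Hypothetical cut-off on a 0-5 Ringelmann-equivalent score; the source gives no value.
VIOLATION_THRESHOLD = 2.0


@dataclass
class Detection:
    track_id: int        # tracker-assigned ID, assumed stable across frames
    vehicle_class: str   # e.g. "car", "motorcycle", "bus", "hgv"
    smoke_score: float   # output of the smoke-analysis stage


@dataclass
class Violation:
    track_id: int
    vehicle_class: str
    smoke_score: float
    plate_text: Optional[str]


def process_frame(
    detections: Iterable[Detection],
    read_plate: Callable[[Detection], Optional[str]],
    traffic_counts: Counter,
    seen_tracks: Set[int],
    threshold: float = VIOLATION_THRESHOLD,
) -> List[Violation]:
    """Run one pass of the cascade over the detections of a single frame."""
    violations = []
    for det in detections:
        # Anonymized statistics: count each tracked vehicle once, no plate involved.
        if det.track_id not in seen_tracks:
            seen_tracks.add(det.track_id)
            traffic_counts[det.vehicle_class] += 1
        # Event trigger: the ANPR stage runs only above the smoke threshold.
        if det.smoke_score > threshold:
            plate = read_plate(det)
            violations.append(
                Violation(det.track_id, det.vehicle_class, det.smoke_score, plate)
            )
    return violations
```

In a deployment, `read_plate` would wrap the plate detector and OCR stage, and a real system would likely also suppress repeated alerts for the same track; both are left out here to keep the gating logic visible.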
The figure below illustrates the logical flow of data through the EcoVision FPGA-accelerated pipeline.
Flowchart Description:
- Frame Ingestion: An HD USB/MIPI camera feeds raw frames into the Kria KR260’s programmable logic.
- AI Acceleration: Custom deep learning processing units (DPUs) in the FPGA fabric accelerate all neural network inferences, from vehicle detection to OCR, ensuring high throughput and low latency.
- Detection & Tracking: A YOLO model identifies and tracks all vehicles. Metadata is passed to both the smoke analysis and traffic counting pipelines.
- Smoke Detection: A dedicated YOLO model, robustly trained on synthetic and real-world data including fog, haze, and dense smoke, analyzes the exhaust region and outputs a quantifiable smoke intensity score.
- ANPR Trigger: If the score crosses the violation threshold, a bounding-box crop of the vehicle is passed to the ANPR stack.
- License Plate Reading: A fine-tuned YOLOv8 model detects the license plate region, which is then processed by PaddleOCR—a highly accurate and lightweight optical character recognition engine—to extract the alphanumeric string.
- Alerting: The complete evidence package is logged locally, and an alert containing all critical information is dispatched via Google’s SMTP server to the designated authorities, ensuring immediate notification without requiring a complex custom backend.
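The alerting step at the end of this flow could look roughly like the following sketch, which uses only Python's standard `smtplib` and `email` modules. The function names `build_alert` and `send_alert`, the message wording, and the attachment filename are assumptions made for illustration. The host `smtp.gmail.com` on port 587 with STARTTLS is Gmail's standard submission endpoint; the account credentials (Gmail typically requires an app password for this kind of login) are placeholders to be supplied by the deployer.

```python
import smtplib
from email.message import EmailMessage
from typing import Optional


def build_alert(
    sender: str,
    recipient: str,
    plate_text: str,
    vehicle_class: str,
    smoke_score: float,
    timestamp_iso: str,
    lat: float,
    lon: float,
    clip_bytes: Optional[bytes] = None,
) -> EmailMessage:
    """Format the evidence fields as an email; optionally attach a video clip."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"EcoVision emission alert: {plate_text}"
    msg.set_content(
        f"Time: {timestamp_iso}\n"
        f"Location: {lat:.6f}, {lon:.6f}\n"
        f"Vehicle type: {vehicle_class}\n"
        f"Plate (OCR): {plate_text}\n"
        f"Smoke score: {smoke_score:.1f}\n"
    )
    if clip_bytes is not None:
        # Hypothetical filename and container; the source does not specify them.
        msg.add_attachment(
            clip_bytes, maintype="video", subtype="mp4", filename="evidence.mp4"
        )
    return msg


def send_alert(
    msg: EmailMessage,
    username: str,
    app_password: str,
    host: str = "smtp.gmail.com",
    port: int = 587,
) -> None:
    """Send the message over an encrypted SMTP session."""
    with smtplib.SMTP(host, port, timeout=10) as server:
        server.starttls()  # upgrade the connection to TLS before logging in
        server.login(username, app_password)
        server.send_message(msg)
```

Keeping message construction separate from sending means the evidence formatting can be tested offline, and a failed send can be retried later from the locally logged package.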
EcoVision’s intelligence is built on a synergy of state-of-the-art models, each optimized for its specific task and for FPGA acceleration:
- YOLO for Vehicle and Smoke Detection: A family of YOLO models forms the backbone of our perception. The vehicle detection model is trained on a diverse dataset of traffic scenes and classifies vehicles into multiple categories. Critically, the smoke detection model has been specifically hardened to differentiate vehicular exhaust from environmental fog, dust, or shadows. It analyzes spatio-temporal consistency over multiple frames, measuring the opacity and dispersion pattern to compute a score against regulatory benchmarks like the Ringelmann scale, delivering reliable performance even in visually degraded conditions.
- YOLOv8 and PaddleOCR for ANPR: The ANPR pipeline is a two-stage process designed for edge efficiency. First, a lightweight YOLOv8 model, fine-tuned on a global plate dataset, accurately localizes the license plate region within the vehicle’s crop, regardless of skew, partial occlusion, or lighting variations. The detected plate patch is then directly fed to PaddleOCR, a production-grade OCR toolkit known for its speed and accuracy on embedded systems. PaddleOCR performs text detection and recognition, outputting the plate string with high confidence.
- Alerting via Google SMTP: For immediate and universal notification, EcoVision utilizes the Simple Mail Transfer Protocol. Upon a confirmed violation, the processor formats the evidence data and sends a secure email alert through Google’s SMTP service. This approach eliminates the need for a dedicated push-notification server, simplifies infrastructure, and ensures that alerts land directly in the inboxes of enforcement personnel as an actionable, documented record.
- FPGA Acceleration on AMD Xilinx Kria KR260: The Kria KR260 is the computational heart of the system, combining a multi-core Arm processor with highly parallel programmable logic. All deep learning models are quantized and deployed on the FPGA’s DPU, enabling concurrent execution of vehicle detection, smoke analysis, and ANPR on a single, low-power device. This hardware acceleration is what makes the sub-200 ms end-to-end latency achievable, allowing the system to keep pace with highway-speed traffic.
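To make the idea of a Ringelmann-equivalent score concrete, here is one simple way such a score might be approximated from pixel data. The Ringelmann scale runs from 0 to 5, with each step corresponding to roughly 20% opacity. The sketch below is an assumption, not the trained model described above: it estimates opacity as the relative darkening of the exhaust region against a smoke-free background level, which only makes sense for dark smoke against a lighter background. The function names and the grayscale inputs are hypothetical.

```python
from statistics import fmean
from typing import Sequence


def opacity_fraction(roi_pixels: Sequence[float], background_level: float) -> float:
    """Estimate opacity in [0, 1] as relative darkening of the exhaust region.

    roi_pixels: grayscale values from the region around the exhaust.
    background_level: mean grayscale value of the same region without smoke
    (assumed to come from a reference frame or a background model).
    """
    if background_level <= 0:
        raise ValueError("background_level must be positive")
    darkening = 1.0 - fmean(roi_pixels) / background_level
    # Clamp: a region brighter than the background counts as no smoke.
    return min(1.0, max(0.0, darkening))


def ringelmann_equivalent(roi_pixels: Sequence[float], background_level: float) -> float:
    """Map opacity to the 0-5 Ringelmann range (one unit per 20% opacity)."""
    return 5.0 * opacity_fraction(roi_pixels, background_level)
```

For example, a region averaging 120 against a background of 200 is 40% darker, which maps to a score of about 2. A learned model would replace this heuristic, but the output range and its link to opacity would stay the same.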
- Real-Time Dual-Mode Detection: Simultaneously performs continuous traffic monitoring and event-triggered emission violation detection with zero frame drops.
- Quantifiable Smoke Measurement: Goes beyond binary detection to provide objective, frame-stable smoke density metrics (Ringelmann-equivalent score, plume area) that form a robust evidential basis.
- High-Accuracy, Event-Triggered ANPR: Privacy-conscious design captures and reads a license plate only upon a confirmed violation, using a YOLOv8 + PaddleOCR pipeline tuned for edge deployment.
- Reliability in Adverse Conditions: Specialized models maintain high detection accuracy under challenging lighting, heavy rain, and dense fog, conditions that degrade conventional systems.
- Ultra-Low Latency Edge AI: The AMD Xilinx FPGA platform guarantees deterministic, real-time processing, enabling immediate violation capture and eliminating cloud dependency for core intelligence.
- Automated Evidence & Alerting: Assembles timestamped, geo-tagged evidence packages and dispatches them instantly to authorities via standard SMTP, creating a seamless enforcement workflow.
- Comprehensive Traffic Analytics: Continuously generates data on vehicle count, type distribution, and traffic flow, providing municipal planners with invaluable, anonymized insights independent of emission events.
- Scalable and Deployable: A single, compact hardware unit (Kria KR260 + camera + enclosure) can be rapidly installed on existing roadside infrastructure, forming a mesh network of monitoring nodes for city-wide coverage.
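The frame-stable smoke measurement listed among these features depends on smoke persisting across frames rather than appearing in a single noisy frame. One common way to express that is a sliding-window vote, sketched below. The class name `PersistenceFilter` and the default window, hit count, and threshold are illustrative assumptions; the source does not state how persistence is computed.

```python
from collections import deque


class PersistenceFilter:
    """Confirm a violation only if enough recent frames exceed the threshold."""

    def __init__(self, window: int = 10, min_hits: int = 7, threshold: float = 2.0):
        if not 0 < min_hits <= window:
            raise ValueError("min_hits must be between 1 and window")
        self.threshold = threshold
        self.min_hits = min_hits
        # Keeps only the most recent `window` per-frame decisions.
        self._hits = deque(maxlen=window)

    def update(self, score: float) -> bool:
        """Record one per-frame smoke score; return True while confirmed."""
        self._hits.append(score > self.threshold)
        return sum(self._hits) >= self.min_hits
```

One filter instance would be kept per tracked vehicle, so that a brief shadow or a single misdetection does not trigger the ANPR stage, while a sustained plume does.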
EcoVision represents a paradigm shift in environmental traffic enforcement, moving from sporadic, manual checks to continuous, intelligent, and automated surveillance. By harnessing the real-time computational power of an AMD Xilinx FPGA and the accuracy of custom YOLO and PaddleOCR models, we have created a system that is not only technically advanced but also pragmatic and ready for real-world deployment. It empowers cities to hold high-emission vehicles accountable, protect public health, and leverage traffic data for smarter urban futures—all from a single, energy-efficient device at the roadside. EcoVision is more than a project; it is a scalable, intelligent building block for the clean, breathable cities of tomorrow.





