Sponsored by TE Connectivity, Team Tuber Divers is addressing a critical manufacturing challenge: polyethylene tubing products develop a “wavy condition” during the expansion process. What is waviness? It is a condition that arises when a heat-shrink tube’s outer diameter (OD) contracts in response to external heat, leaving behind chatter marks and a visibly uneven, bumpy surface. These defects compromise the tube’s performance and quality.
Story- What’s the problem?
TE Connectivity is a manufacturer of many products, including cable tubing. This tubing goes through a lengthy pipeline of operations during manufacturing. It is difficult to precisely configure every variable in this chain (e.g., the tube’s exact composition, the vacuum chamber temperature, the speed at which the individual spools rotate in the chamber as the tube moves along them) while also accounting for hundreds of feet of tubing. As a result, the tubing can develop defects and irregularities in which the outer shell has a varying diameter, which we refer to as “wavy conditions.” The tube’s OD is measured by a laser with high accuracy and precision.
Currently, the on-site engineers sample a few feet of tubing out of hundreds to ensure quality control, and the samples must be manually checked to determine whether a wavy condition has occurred, what went wrong in the production process, and which parameters might need to be modified to ensure these conditions do not recur. This is where AI can come in and automate the process, determining and predicting in advance when wavy conditions might occur, increasing product yield, and cutting down on waste and unnecessary costs.
- What does the current system look like?
The current system involves a process for monitoring and controlling the quality of the heat-shrinking tubing during its manufacturing. It begins with the tube being fed into a vacuum chamber, where it moves along two spools. As the tube passes through the chamber, four lasers scan its OD from different directions to detect any variations and non-uniformity. These lasers are highly accurate and precise, aiming to identify any potential wavy conditions on the tube's surface.
Currently, this system requires an engineer to manually inspect the tube and detect defects by classifying it as “OK” or “NG.” This classification is crucial in determining whether the tube meets quality standards or should be discarded. The engineer assesses a few feet of tubing, examines the variations, and adjusts the control parameters (e.g., the vacuum chamber temperature, the speed at which the individual spools rotate in the chamber as the tube moves along them) to identify the root cause of the issue. This manual process can be quite time-consuming and cumbersome, as it involves troubleshooting and adjusting multiple parameters in real time. See the figure below for a schematic of the current tube inspection process.
At a high-level view, the project is composed of four critical phases: (1) collecting and preprocessing the data, (2) analyzing the data to uncover meaningful patterns and relationships among key control parameters such as cooling rate, air temperature, and spool speed, (3) training and evaluating several AI classification models to identify the most accurate performer, and (4) building an interactive dashboard that presents the model’s results and data trends to support operators and process engineers on-site. Each phase builds logically onto the previous one, gradually transforming the raw manufacturing data into a functional dashboard that provides interpretable insights.
First and foremost, preprocessing the data begins with importing and organizing the data that is currently distributed across multiple Excel spreadsheets generated during each production run. These spreadsheets contain timestamped sensor readings, including OD readings, temperature, line speed, and other machine settings. The data is cleaned to remove any noise and potential outliers. Obvious sensor errors (e.g., sudden spikes caused by logging faults) are filtered out to prevent them from biasing the learning process later and skewing the model’s results. By the end of this phase, the data is consistent and ready to be split into training and testing sets for training the AI classification models.
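As a concrete illustration, the sketch below shows what this cleaning step might look like, assuming illustrative column names (“timestamp”, “od_mm”) and a simple z-score rule for flagging sensor spikes; the actual headers and thresholds used on-site may differ.

```python
import pandas as pd
from pathlib import Path

def load_and_clean(folder: str) -> pd.DataFrame:
    # Combine the per-run Excel exports into one frame.
    frames = [pd.read_excel(p) for p in Path(folder).glob("*.xlsx")]
    df = pd.concat(frames, ignore_index=True)

    # Coerce readings to numeric and drop rows the sensor failed to log.
    df["od_mm"] = pd.to_numeric(df["od_mm"], errors="coerce")
    df = df.dropna(subset=["timestamp", "od_mm"]).sort_values("timestamp")

    # Treat points more than 5 standard deviations from the mean as
    # logging faults and filter them out.
    z = (df["od_mm"] - df["od_mm"].mean()) / df["od_mm"].std()
    return df[z.abs() < 5].reset_index(drop=True)
```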
The second phase focuses on analyzing the data to uncover meaningful patterns and relationships that might exist across different control parameters. Since wavy conditions tend to emerge gradually rather than instantaneously, we implement a sliding window technique that captures the tube’s OD variations over time rather than looking at individual point readings. Within each time window, key statistical features such as the mean absolute deviation, coefficient of variation, and normalized variance are calculated to quantify small fluctuations. This phase captures the distinguishing characteristics of OK versus NG tubing and establishes a strong foundation for training and validating the AI classification models in later stages.
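To make the windowing concrete, here is a minimal sketch of the feature extraction; the window length and stride are placeholders, and the normalized-variance definition (variance divided by the squared mean) is our assumption about how that statistic is computed.

```python
import numpy as np
import pandas as pd

def window_features(od: np.ndarray, window: int = 200, stride: int = 50) -> pd.DataFrame:
    rows = []
    for start in range(0, len(od) - window + 1, stride):
        w = od[start:start + window]
        mean = w.mean()
        rows.append({
            "start": start,
            "mad": np.abs(w - mean).mean(),    # mean absolute deviation
            "cv": w.std() / mean,              # coefficient of variation
            "norm_var": w.var() / mean**2,     # normalized variance
        })
    return pd.DataFrame(rows)
```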
The third phase focuses on training and evaluating AI classification models to identify wavy conditions before defects are visibly detectable. Several models are tested to compare performance, including logistic regression for baseline linear separability, support vector machines for margin-based decision boundaries, and other relevant models such as random forests and gradient boosting for capturing complex nonlinear relationships (if any). We evaluate each model by calculating its accuracy, precision, recall, and F1 score. At this point, the random forest model has demonstrated the highest accuracy and the most promising results.
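A minimal sketch of this comparison loop is shown below, assuming X is the window-feature matrix from the previous phase and y holds binary OK/NG labels; the hyperparameters here are scikit-learn defaults, not our tuned values.

```python
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def compare_models(X, y):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=0)
    models = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "svm": SVC(),
        "random_forest": RandomForestClassifier(n_estimators=200),
        "gradient_boosting": GradientBoostingClassifier(),
    }
    for name, model in models.items():
        pred = model.fit(X_tr, y_tr).predict(X_te)
        print(f"{name}: acc={accuracy_score(y_te, pred):.3f} "
              f"prec={precision_score(y_te, pred):.3f} "
              f"rec={recall_score(y_te, pred):.3f} "
              f"f1={f1_score(y_te, pred):.3f}")
```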
In the fourth and final phase, a dashboard interface is developed in Tkinter with embedded Matplotlib charts to present our analytics in a clear and usable format. While it does not yet connect to a livestream data feed from the manufacturing line, it processes static data files and applies the trained classification models to assess the likelihood of wavy conditions during each run. As the model evaluates the data, the interface displays a real-time classification status indicator along with visual plots of OD trends and relevant statistical metrics to help track the onset of wavy conditions. Each model prediction is accompanied by a confidence score that reflects the strength of the classification between OK and NG tubing. The dashboard also includes a playback feature that allows users to select historical runs and review them step-by-step to diagnose when and where significant variation first arose. To support practical use, we eventually hope to include notification alerts that trigger whenever the model detects an increasing risk of nonuniform tubing, helping operators recognize risk earlier and enabling timely corrective actions.
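For readers unfamiliar with the toolkit, the pattern underlying the dashboard is a Matplotlib figure embedded inside a Tkinter window; the minimal sketch below (with synthetic data) shows the skeleton such an interface builds on.

```python
import tkinter as tk
import numpy as np
from matplotlib.backends.backend_tkagg import FigureCanvasTkAgg
from matplotlib.figure import Figure

root = tk.Tk()
root.title("OD trend (sketch)")

fig = Figure(figsize=(6, 3))
ax = fig.add_subplot(111)
ax.plot(12.0 + 0.01 * np.cumsum(np.random.randn(500)))  # synthetic OD trace
ax.set_xlabel("sample")
ax.set_ylabel("OD (mm)")

# Embed the figure in the Tkinter window and start the event loop.
canvas = FigureCanvasTkAgg(fig, master=root)
canvas.draw()
canvas.get_tk_widget().pack(fill=tk.BOTH, expand=True)
root.mainloop()
```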
Together, these four phases comprise a pipeline that transforms raw production data into a meaningful, real-time decision support tool for on-site engineers. While still in development, the current data-driven system already demonstrates clear advantages over traditional manual inspection by detecting wavy conditions earlier and with greater consistency. Moreover, it shifts defect detection from a reactive process to a proactive one by providing interpretable insights that can guide corrective action before instability escalates. As we continue refining our model’s performance and enhancing our dashboard capabilities and features, this framework lays the groundwork for a scalable, data-driven solution with the potential for deployment across TE Connectivity’s manufacturing lines.
Positive impact of our solution
Our solution aligns with the industry goals of flexibility, adaptability, and sustainability. It carries little deployment risk, as it was built using Python scripting and open analysis tools, requiring no additional capital investment or hardware upgrades. Beyond the initial setup in this prototype environment, it represents a scalable and cost-effective strategy.
As for time and labor savings, previously, quality control for waviness relied heavily on manual inspection, where onsite engineers physically felt the material as it came off the line. This approach is labor-intensive, subjective, and prone to oversight.
With this dashboard, real-time alerts are automatically triggered when waviness is detected or suspected in the data stream. Engineers can focus on critical interventions rather than routine monitoring, which dramatically reduces the time per metric check and improves inspection consistency.
Additionally, this dashboard can be retrained and seamlessly integrated into any other manufacturing pipeline that produces NDC values and needs to visualize waviness. Even though this dashboard was designed for a specific tubing system with NDC laser measurements, its architecture is highly modular and model-agnostic:
Any production line that uses similar sensors (NDC, OD, ovality, concentricity, thickness, etc.) can adopt this system with minor retraining.
The visualization and segmentation pipeline is modular: users only need to update the dictionary mapping and re-run the feature extraction to deploy it across new plants, product types, or machines.
By feeding in domain-specific raw logs from other machines, the system can efficiently retrain on new patterns of “waviness” or tolerance abnormalities.
As for scalability, retraining capabilities allow teams to fine-tune the waviness detection model for varied product geometries and surface finishing types.
Model Diagram- Overall Solution Architecture
For the implementation of the solution, the team built a dashboard with multiple functions that aim to increase the productivity of the engineers in charge of controlling the wavy conditions and reduce the time spent inspecting the tubes. The dashboard includes various features with crucial information for the engineers. Specific parameters and the dashboard design are subject to modification as determined by the lead engineer.
The dashboard provides TE Connectivity process engineers with a fast, Excel-first desktop tool to (a) load OD time-series data, (b) automatically segment the line into meaningful “waviness states,” (c) visualize risk and trends live, and (d) explore relationships between OD and a secondary signal (e.g., ovality) to support root-cause analysis and earlier interventions.
OD streams from NDC/laser systems are long, noisy, and visually ambiguous. Engineers need interpretable states (e.g., STEADY, UNCERTAIN, STRONG, DRIFT, BURSTY) and a health signal to decide when to adjust the process. Floor personnel and engineers also need a quick way to test whether another metric moves with OD and see whether it leads or lags without running a separate analytics stack.
The application is a single-file desktop tool built with Tkinter (GUI) and Matplotlib (plots), using NumPy/Pandas for data handling and Scikit-learn for data analytics. A central DataStore manages time stamps, OD values, optional class segments, and an aligned secondary series. The interface organizes work into four pages (i.e., Data, Results, History, and Analysis), plus a dedicated Correlation window for exploring relationships between OD and any other secondary parameters.
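A hypothetical sketch of the DataStore’s shape is below; the real class carries more responsibilities, but the core idea is a set of aligned arrays plus optional labeled segments and a secondary series.

```python
from dataclasses import dataclass, field
from typing import Optional
import numpy as np

@dataclass
class DataStore:
    t: np.ndarray = field(default_factory=lambda: np.array([]))    # timestamps or index
    od: np.ndarray = field(default_factory=lambda: np.array([]))   # OD readings
    segments: list = field(default_factory=list)                   # (start, end, label) spans
    secondary: Optional[np.ndarray] = None                         # aligned secondary series

    def label_at(self, i: int) -> Optional[str]:
        """Return the class label covering sample i, if any."""
        for start, end, label in self.segments:
            if start <= i < end:
                return label
        return None
```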
The dashboard reads XLS/XLSX files and auto-detects the time and OD columns by common header names (or uses an exact header, if configured). It converts numeric strings containing units, thousands separators, or decimal commas into floats, and drops invalid rows. If timestamps are unusable, it synthesizes a monotonically increasing index to preserve alignment and analysis. Class segments can be loaded from a separate file with flexible headers; if class dates and data dates don’t overlap, the tool shifts class dates by whole days to align with the data span and logs that decision.
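The sketch below illustrates the auto-detection and coercion logic under simplifying assumptions: the header hints are our guesses at plausible candidates, and the numeric cleanup handles only unit suffixes and a decimal comma (the real loader is more thorough).

```python
from typing import Optional
import pandas as pd

TIME_HINTS = ("time", "date", "timestamp")
OD_HINTS = ("od", "diameter", "ndc")

def find_column(df: pd.DataFrame, hints) -> Optional[str]:
    # Return the first column whose header contains one of the hints.
    for col in df.columns:
        if any(h in str(col).lower() for h in hints):
            return col
    return None

def to_float(series: pd.Series) -> pd.Series:
    # "12,34 mm" -> 12.34: strip everything but digits and punctuation,
    # then treat a comma as the decimal mark. Invalid rows become NaN
    # and can be dropped downstream.
    cleaned = (series.astype(str)
                     .str.replace(r"[^\d,.\-]", "", regex=True)
                     .str.replace(",", ".", regex=False))
    return pd.to_numeric(cleaned, errors="coerce")
```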
To work with oneM2M, we first install acmecse using pip on the command line: python -m pip install acmecse. After installing the library, we can run acmecse from the command line. When prompted about settings, for our purposes we can choose the defaults and press Enter to continue. Once the setup is complete, our oneM2M server is established and reachable at an address like http://localhost:8080/cse-in.
We will use this link to send our data to the ACME server; once the server has received the data, we forward it to our dashboard, where we can display live data and have our machine learning model predict wavy conditions. To use the server, we first need to create an Application Entity (AE). Once we have created the AE, we create a Container (CNT) where we store the data being transmitted, and later use the same CNT to transmit the data to the dashboard using content instances.
After completing both steps, we can use curl.exe -X GET to confirm that our AE and CNT have been created. On the server side, once we start to send information, we can watch content instances being created and requested, showing a successful implementation.
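A sketch of these steps over oneM2M’s HTTP binding is below, assuming a default ACME install; the originator (CAdmin), resource names, and release version here are illustrative and would be adapted to the deployment.

```python
import requests

CSE = "http://localhost:8080/cse-in"

def headers(ty: int, ri: str) -> dict:
    # ty selects the resource type (2=AE, 3=CNT, 4=ContentInstance);
    # X-M2M-RI should be unique per request.
    return {"X-M2M-Origin": "CAdmin", "X-M2M-RI": ri, "X-M2M-RVI": "3",
            "Content-Type": f"application/json;ty={ty}"}

# 1. Create the Application Entity.
requests.post(CSE, headers=headers(2, "req-ae"),
              json={"m2m:ae": {"rn": "TubeDashboardAE", "api": "NtubeApp",
                               "rr": False, "srv": ["3"]}})

# 2. Create a Container under the AE to hold OD readings.
requests.post(f"{CSE}/TubeDashboardAE", headers=headers(3, "req-cnt"),
              json={"m2m:cnt": {"rn": "odReadings"}})

# 3. Push one OD sample as a content instance.
requests.post(f"{CSE}/TubeDashboardAE/odReadings", headers=headers(4, "req-cin"),
              json={"m2m:cin": {"con": "12.031"}})
```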
Implementation- Implementation Overview
The objective of this tab on the dashboard is to load the primary file, optionally load a secondary series, and open the correlation view. The page displays chosen columns, row counts, and pairing status.
To upload a new dataset to the dashboard, the user must click Load XLSX. The dashboard reads the names of all sheets present in the loaded XLSX file and automatically detects the sheet containing the OD values, NDC_System_OD_Value. The dashboard also parses timestamps correctly. It only loads data from when the speed is a non-zero value, so that we are not classifying any incoming data logged while the line is standing still.
In the event the user wants to introduce a secondary metric (e.g., ovality) into the program, the user can select that metric in the Select sheet… dropdown. The application recomputes the timeline using the data from both files and aligns them on the nearest possible timestamps, joining them into one series while only looking at times where the speed is non-zero. Since close matches are not always guaranteed, a tolerance based on each stream’s median sample interval is built into the application; this guards against misleading pairings and mismatches. If the results are not what is expected and seem incoherent, the user can re-upload new data.
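In pandas terms, this alignment is essentially a tolerance-bounded nearest-timestamp join; the sketch below assumes datetime timestamps and illustrative column names.

```python
import pandas as pd

def align(od: pd.DataFrame, secondary: pd.DataFrame) -> pd.DataFrame:
    # Keep only samples taken while the line is moving.
    od = od[od["speed"] > 0].sort_values("timestamp")
    secondary = secondary.sort_values("timestamp")

    # Tolerance: the coarser of the two streams' median sample intervals.
    tol = max(od["timestamp"].diff().median(),
              secondary["timestamp"].diff().median())

    # Pair each OD sample with the nearest secondary sample within tol;
    # unmatched rows get NaN and can be excluded from correlation.
    return pd.merge_asof(od, secondary, on="timestamp",
                         direction="nearest", tolerance=tol)
```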
- Correlation Window (OD ↔ Secondary)
Engineers may load a second file (e.g., ovality); the app lines it up with OD on the nearest timestamp, using a tolerance derived from the median sampling rate of both streams. A separate Correlation window includes the following views (a sketch of the lag-scan and rolling computations follows the list):
Time Overlay (z-scored): side-by-side shapes to see clear relationships and trends. Both series are standardized and plotted on the same axis, so you’re comparing shape, not units. If peaks and dips in OD have matching echoes in the secondary trace, you’ll see it immediately. This is usually the most convincing view for operators.
Scatter & Density: Here we drop time and examine geometry. Each paired point becomes a dot, with a hexbin density map to show where the data cluster. A least-squares fit and Pearson r define the direction and strength. A tight, slanted ridge suggests a real relationship, even before you analyze lead/lag.
Correlation vs. Lag: To sort out who moves first, the app shifts one series against the other across a user-set range and recomputes r at each offset. The curve’s peak is the “best lag.” The interface calls it out (e.g., “best lag: 12 samples”), and a dashed line marks it so it’s unmissable.
Rolling Correlation: A sliding window computes r repeatedly to see when coupling strengthens or decays. Sustained plateaus near +1 or −1 suggest reliable coupling; jagged swings or long troughs point to intermittent effects or mode changes.
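Under the hood, the last two views reduce to short pandas computations; the sketch below is our minimal rendition, with max_lag and the window size standing in for the user-set values.

```python
import pandas as pd

def lag_scan(od: pd.Series, sec: pd.Series, max_lag: int = 50):
    # Pearson r between OD and the secondary series shifted by each lag;
    # the lag with the largest |r| is reported as the "best lag".
    lags = list(range(-max_lag, max_lag + 1))
    rs = [od.corr(sec.shift(lag)) for lag in lags]
    best_lag, best_r = max(zip(lags, rs), key=lambda t: abs(t[1]))
    return lags, rs, best_lag, best_r

def rolling_r(od: pd.Series, sec: pd.Series, window: int = 200) -> pd.Series:
    # Correlation recomputed over a sliding window, revealing when the
    # coupling strengthens, decays, or switches sign.
    return od.rolling(window).corr(sec)
```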
- Model
The Model page is where the operator can select which type of model to use (i.e., logit, support vector machine, random forest, XGBoost) to classify the dataset, as well as the desired window size the model should use. The Model page also includes an “Average likelihood of chatter being detected in all windows by each model over different window sizes” plot. This plot helps the operator tell the difference between the model selections and determine which to use for the dataset.
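One plausible way to build that plot is sketched below; window_features is the extractor sketched earlier, and window_labels is a hypothetical helper that maps each window back to an OK/NG label from the class segments.

```python
def likelihood_curve(od, window_labels, models,
                     window_sizes=(100, 200, 400, 800)):
    # For each model and window size, average the predicted chatter
    # probability across all windows in the dataset.
    curves = {name: [] for name in models}
    for w in window_sizes:
        feats = window_features(od, window=w)
        y = window_labels(feats["start"], w)      # OK/NG label per window
        X = feats.drop(columns="start")
        for name, model in models.items():
            proba = model.fit(X, y).predict_proba(X)[:, 1]
            curves[name].append(proba.mean())
    return window_sizes, curves
```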
- Results
The Results page is the operational heart of the prototype. On the left, a live chart called “OD vs. Samples (live)” streams the latest window of the OD signal. You’ll see two lines: a light trace for the raw OD values and a thicker, smoother line that acts as a visual baseline. Behind them, soft color bands translate the signal into plain language. Green means STEADY, amber means MILD_WAVE, red means STRONG_WAVE, blue means DRIFT, purple means BURSTY_NOISY, and gray means UNCERTAIN. Those bands aren’t decorative: they’re produced by DataStore.auto_classify(), which breaks the data into overlapping windows, extracts statistical features (e.g., relative deviation and mean absolute difference), gets the prediction probability from the selected model, and assigns a label using median-based thresholds. Adjacent windows with the same label are merged so the background reads as clean blocks instead of flickering stripes.
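A simplified sketch of that flow is below; the thresholds, the subset of labels, and the merge rule are our assumptions about the behavior described above, with window_features reused from the earlier sketch and model assumed already trained.

```python
import numpy as np

def auto_classify(od, model, window=200, stride=100):
    # Score each overlapping window, threshold the scores against the
    # median and 90th percentile, and merge adjacent same-label spans.
    feats = window_features(od, window=window, stride=stride)
    scores = model.predict_proba(feats.drop(columns="start"))[:, 1]
    lo, hi = np.median(scores), np.percentile(scores, 90)
    spans = []
    for start, s in zip(feats["start"], scores):
        label = "STEADY" if s < lo else ("MILD_WAVE" if s < hi else "STRONG_WAVE")
        if spans and spans[-1][2] == label:
            spans[-1] = (spans[-1][0], start + window, label)  # extend the run
        else:
            spans.append((start, start + window, label))
    return spans
```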
On the right, the OK/NG gauge gives a single, fast answer. The design is intentionally simple: a three-color semicircle, a bold needle, and a short status line. If the current sample sits inside a labeled span, the gauge converts that label to a calibrated risk (e.g., STEADY ≈ 0.05; STRONG_WAVE ≈ 0.90). If there’s no label yet (which is common right after loading a new file), the gauge falls back to a continuous NG score based on recent slope and peak-to-peak relative to a configurable spec band. Either way, the percentage under the needle is a clear “confidence” number, and the red/green call underneath answers the only question that matters in the moment, that is, are we OK or not?
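The gauge’s two-path risk logic can be summarized in a few lines; the calibrations for STEADY and STRONG_WAVE echo the examples quoted above, while the remaining values and the fallback weighting are purely our assumptions.

```python
LABEL_RISK = {"STEADY": 0.05, "MILD_WAVE": 0.40, "DRIFT": 0.55,
              "BURSTY_NOISY": 0.65, "STRONG_WAVE": 0.90, "UNCERTAIN": 0.50}

def gauge_risk(label, slope, p2p, spec_band):
    # Calibrated risk when the sample sits inside a labeled span...
    if label is not None:
        return LABEL_RISK[label]
    # ...otherwise a continuous NG score from recent slope and
    # peak-to-peak amplitude, scaled by the configurable spec band.
    return min(1.0, 0.5 * abs(slope) / spec_band + 0.5 * p2p / spec_band)
```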
- History
If the Results page answers “what’s happening right now, ” the History page explains how we got here. The main panel is a high-performance canvas chart. The same OD signal is plotted with a faint raw line and a darker smoothed trajectory, but here the emphasis is on how things change over time: you see a long, continuous window with the same class colors washed across the background. Unlike Matplotlib, the canvas widget uses light rectangle fills with a stipple to emulate transparency. That design keeps redrawing fast enough to update continuously without tying up the UI thread, which matters on long runs.
On the right, three live Latest Stats readouts give quantitative backing: the current slope in mm/sample, the peak-to-peak amplitude over the most recent window, and the risk score on a 0–100 scale. Those values come from helpers in DataStore (i.e., trend_slope, volatility_p2p, ng_score), so an engineer can reconcile what they see in the bands with the numbers. In practice, when a strong red span appears, you’ll usually see a matching surge in the smoothed line confirmed by higher slope and peak-to-peak values. A small arrow near the title uses the slope’s sign to tag the trend as uptrend, downtrend, or stable, which is helpful when scanning multiple lines during a shift review.
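Hypothetical renditions of those three helpers are sketched below; the real implementations may smooth the signal first, but the quantities match the descriptions above, and the ng_score weighting is an assumption.

```python
import numpy as np

def trend_slope(od, window=200):
    # Least-squares slope of the most recent window, in mm/sample.
    y = np.asarray(od[-window:], dtype=float)
    return np.polyfit(np.arange(len(y)), y, 1)[0]

def volatility_p2p(od, window=200):
    # Peak-to-peak amplitude over the most recent window.
    y = np.asarray(od[-window:], dtype=float)
    return y.max() - y.min()

def ng_score(od, spec_band, window=200):
    # 0-100 risk blending projected drift and volatility against the
    # configurable spec band.
    drift = abs(trend_slope(od, window)) * window
    return 100 * min(1.0, 0.5 * drift / spec_band
                          + 0.5 * volatility_p2p(od, window) / spec_band)
```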
The History page supports the Results page in two ways. First, it provides an early-warning alert: long sequences of amber or blue bands suggest mild oscillation or drift accumulating toward a possible NG state. Second, it is a diagnostic record. If a supervisor asks why the gauge went red, you can show the exact interval where the state changed and how quickly the signal recovered.
- Analysis
The Analysis page features a plot that shows the chatter confidence (i.e., how likely chatter is present in each window, based on the selected model’s prediction) over time. This graph helps analyze the dataset to see exactly what confidence the model is computing over time.
- Load File
Ingest Data: Open Data → XLSX (OD export from NDC/laser). Confirm the status line shows the chosen time/OD columns and row count.
Select Model: Based on the likelihood/confidence plot, pick a model and window size to analyze the dataset with.
Assess Current State: Switch to Results. Read the gauge (OK/NG + confidence) and glance at the live chart with class overlays.
Validate Trend: If confidence is drifting upward (toward NG), jump to History. Check the smoothed curve, colored bands, and Latest Stats.
Stabilize the Line: Make the process adjustment (speed, tooling, alignment). Keep History open; confirm the slope decreases and risk score trends down over ~1 minute.
- Incidence Confirmation
Identify Mishap: On Results, a red/amber band or NG gauge reading appears.
Time-Box the Episode: On History, identify when the state switched and how long it persisted (the shaded spans show onset and recovery). If needed, grab a screenshot of the History panel showing the red span.
- Hypothesis Test
Navigate: Go to Data → Compare to (e.g., Ovality.xlsx). Confirm the paired-row count looks reasonable.
Compare: Click OD ↔ Compare to.
Check:
Time Overlay: Do peaks/valleys align after z-scoring?
Scatter & Density: Is r strong, and does the fitted line have the expected sign?
Lag: Where does r peak? Does OD lead, or does the secondary lead (positive/negative samples)?
Rolling Correlation: Is the relationship stable or episodic?
- Live Demonstration
The live demonstration can be found at the following link:


