
In this study, we propose a simple and efficient approach for detecting referee whistle sounds in the audio of soccer match footage. Our method combines a fixed-length sliding window with feature extraction based on the short-time Fourier transform (STFT). Specifically, a 1-second sliding window with 50% overlap divides the audio input into segments, and each segment is analyzed for whistles, which are characterized by prominent frequency components in the 3300–4200 Hz range. For classification, we employ well-known machine learning algorithms: Naïve Bayes, support vector machines, and neural networks. To reduce misclassification, we add a post-processing step based on power spectral density (PSD) and predefined thresholds. Experimental results indicate that the method detects whistles in soccer match audio effectively.
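
The segmentation and band-focused feature extraction described above can be sketched as follows. This is a minimal illustration, not the implementation used in the study: the 16 kHz sampling rate, the FFT size and hop length, and the choice of a band-energy ratio as the feature are all assumptions, and the function names `frame_signal` and `band_energy_ratio` are hypothetical.

```python
import numpy as np


def frame_signal(x, sr, win_s=1.0, overlap=0.5):
    """Split a 1-D signal into fixed-length windows with fractional overlap."""
    win = int(round(win_s * sr))            # samples per window (1 s by default)
    hop = int(round(win * (1.0 - overlap)))  # 50% overlap -> hop of half a window
    if len(x) < win:
        return np.empty((0, win))
    n = 1 + (len(x) - win) // hop
    return np.stack([x[i * hop: i * hop + win] for i in range(n)])


def band_energy_ratio(segment, sr, f_lo=3300.0, f_hi=4200.0, n_fft=1024, hop=512):
    """Fraction of STFT power that falls inside [f_lo, f_hi] Hz.

    The band limits come from the text; n_fft and hop are assumed values.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(segment) - n_fft) // hop
    frames = np.stack(
        [segment[i * hop: i * hop + n_fft] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2   # STFT power per frame
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)          # bin centre frequencies
    band = (freqs >= f_lo) & (freqs <= f_hi)
    total = power.sum()
    return float(power[:, band].sum() / total) if total > 0 else 0.0


if __name__ == "__main__":
    sr = 16000                                  # assumed sampling rate
    t = np.arange(3 * sr) / sr                  # 3 s of audio
    whistle_like = np.sin(2 * np.pi * 3800.0 * t)  # synthetic tone inside the band
    segments = frame_signal(whistle_like, sr)
    features = [band_energy_ratio(s, sr) for s in segments]
    print(segments.shape, features)
```

In a full pipeline, per-segment features such as this ratio would be passed to the classifier, and the PSD-based thresholds would then be applied to the classifier's output; neither of those stages is shown here.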