Size: | Format: |
---|---|
986.25 KB | Adobe PDF |
Authors
Abstract(s)
The detection and identification of impulsive sounds in a specific context, namely sports events, enables the analysis and synthesis of various metrics and statistics associated with the game or even with the player's performance. In this context, the automatic identification of an impulsive sound, such as the ball striking a player's racket, is an important contribution to building a data source on which game-specific analyses can be performed. Accounting for all the characteristics (features) of a given type of impulsive sound across varied conditions and environments involves many variables, and it is equally difficult to find, efficiently, the hyperparameter values that yield the best configuration of a given algorithm in a machine learning process. The contribution of this work is to explore the hyperparameter (HyP) space in search of values that optimize the performance of the entire automatic impulsive sound classification process. That process begins with the generation of the dataset to be processed, continues with the training of classification models, and ends with the evaluation of the learned models. This work proposes extending the concept of HyP to also encompass the construction of the dataset itself. The experiments address a binary classification problem in which the target event must be distinguished from noise. The process is validated on audio extracted from videos of sports competitions, specifically tennis and padel, where the goal is to identify the sounds of the racket hitting the ball. The experimental results, with a model accuracy of around 93%, show how sound characteristics influence both the construction of the dataset and the overall performance of the automatic impulsive sound classification process.
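As a rough illustration of a hyperparameter space that also covers dataset construction, the following Python sketch searches jointly over two dataset-level settings (window length and feature type) and one model-level setting. It is not the authors' pipeline: the synthetic signal, the `peak` and `rms` features, the threshold classifier, the `bias` parameter, and every function name are assumptions made for illustration only. For simplicity, the simulated hits are placed so that no burst straddles a window boundary.

```python
import itertools
import math
import random


def make_signal(n_samples, hit_positions, rng, noise_sigma=0.2):
    """Synthetic stand-in for match audio: Gaussian noise plus decaying bursts."""
    signal = [rng.gauss(0.0, noise_sigma) for _ in range(n_samples)]
    for pos in hit_positions:
        sign = rng.choice((-1.0, 1.0))
        for k in range(20):  # 20-sample burst with exponential decay
            signal[pos + k] += sign * 0.9 ** k
    return signal


def build_dataset(signal, hit_positions, window_len, feature):
    """Dataset construction, controlled by two hyperparameters."""
    hits = set(hit_positions)
    xs, ys = [], []
    for start in range(0, len(signal) - window_len + 1, window_len):
        window = signal[start:start + window_len]
        if feature == "peak":
            xs.append(max(abs(v) for v in window))
        else:  # "rms"
            xs.append(math.sqrt(sum(v * v for v in window) / window_len))
        # Label 1 when a hit onset falls inside the window, else 0 (noise).
        ys.append(int(any(p in hits for p in range(start, start + window_len))))
    return xs, ys


def train_threshold(xs, ys, bias):
    """Toy 'training': place a threshold between the two class means."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    mean_pos = sum(pos) / len(pos)
    mean_neg = sum(neg) / len(neg)
    return mean_neg + bias * (mean_pos - mean_neg)


def accuracy(xs, ys, threshold):
    """Fraction of windows whose thresholded feature matches the label."""
    correct = sum(int(x > threshold) == y for x, y in zip(xs, ys))
    return correct / len(ys)


def explore(window_lens=(50, 125, 250), features=("peak", "rms"),
            biases=(0.25, 0.5), seed=0):
    """Grid search over dataset-level and model-level hyperparameters."""
    rng = random.Random(seed)
    n_samples = 20000
    hit_positions = list(range(100, n_samples, 500))
    train_signal = make_signal(n_samples, hit_positions, rng)
    test_signal = make_signal(n_samples, hit_positions, rng)
    results = {}
    for window_len, feature, bias in itertools.product(
            window_lens, features, biases):
        # The dataset is rebuilt for every configuration, so its
        # construction is part of the search space.
        x_tr, y_tr = build_dataset(train_signal, hit_positions, window_len, feature)
        x_te, y_te = build_dataset(test_signal, hit_positions, window_len, feature)
        threshold = train_threshold(x_tr, y_tr, bias)
        results[(window_len, feature, bias)] = accuracy(x_te, y_te, threshold)
    return results


results = explore()
best = max(results, key=results.get)
print("best configuration:", best, "held-out accuracy:", results[best])
```

The point of the sketch is structural: because `build_dataset` runs inside the search loop, changing the window length or the feature changes the dataset itself, and the held-out accuracy reflects the whole chain from dataset generation through training to evaluation. An exhaustive grid is used here only for brevity; the abstract does not state which search strategy the work adopts.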
Description
Keywords
Artificial intelligence; Machine learning; Audio analysis; Sound event dataset; Hyperparameter optimization