Abstract This paper introduces a voice command-and-control system based on Large Language Models (LLMs), designed to handle the high dynamics, intense confrontation, and multi-source, heterogeneous information of modern combat environments. The system addresses the limitations of traditional 'point-and-click' interfaces and conventional voice systems, which often suffer from poor noise robustness and limited semantic adaptability. Integrating advanced audio denoising with tactical hot-word enhancement improves speech recognition accuracy and domain adaptability under noisy conditions. Domain-specific prompt engineering, data augmentation, and LoRA fine-tuning further strengthen the LLM's understanding of non-standard expressions and tactical semantics, enabling end-to-end voice-to-command conversion. Experimental results show that the proposed approach outperforms baseline methods under battlefield noise, providing a dependable, natural, and efficient framework for human-machine collaborative command.