Speech and Computer [electronic resource] : 22nd International Conference, SPECOM 2020, St. Petersburg, Russia, October 7-9, 2020, Proceedings /
edited by Alexey Karpov, Rodmonga Potapova.
- 1st ed. 2020.
- XIV, 689 p. 222 illus., 155 illus. in color. online resource.
- Lecture Notes in Artificial Intelligence, 2945-9141 ; 12335.
Lightweight CNN for Robust Voice Activity Detection -- Hate Speech Detection Using Transformer Ensembles on the HASOC Dataset -- MP3 Compression to Diminish Adversarial Noise in End-to-End Speech Recognition -- Exploration of End-to-End ASR for OpenSTT - Russian Open Speech-to-Text Dataset -- Directional Clustering with Polyharmonic Phase Estimation for Enhanced Speaker Localization -- Speech Emotion Recognition using Spectrogram Patterns as Features -- Pragmatic Markers in Dialogue and Monologue: Difficulties of Identification and Typical Formation Models -- Data Augmentation and Loss Normalization for Deep Noise Suppression -- Automatic Information Extraction from Scanned Documents -- Dealing with Newly Emerging OOVs in Broadcast Programs by Daily Updates of the Lexicon and Language Model -- A Rumor Detection in Russian Tweets -- Automatic Prediction of Word form Reduction in Russian Spontaneous Speech -- Formant Frequency Analysis of MSA Vowels in Six Algerian Regions -- Emotion Recognition and Sentiment Analysis of Extemporaneous Speech Transcriptions in Russian -- Predicting a Cold from Speech using Fisher Vectors; SVM and XGBoost as Classifiers -- Toxicity in Texts and Images on the Internet -- An Automated Pipeline for Robust Image Processing and Optical Character Recognition of Historical Documents -- Lipreading with LipsID -- Automated Destructive Behavior State Detection on the 1D CNN-based Voice Analysis -- Rhythmic Structures of Russian Prose and Occasional Iambs (a Diachronic Case Study) -- Automatic Detection of Backchannels in Russian Dialogue Speech -- Experimenting with Attention Mechanisms in Joint CTC-Attention Models for Russian Speech Recognition -- Comparison of Deep Learning Methods for Spoken Language Identification -- Conceptual Operations with Semantics for a Companion Robot -- Legal Tech: Documents' Validation Method Based on the Associative-Ontological Approach -- Audio Adversarial Examples for Robust Hybrid CTC/Attention Speech Recognition --
CTC-Segmentation of Large Corpora for German End-to-End Speech Recognition -- Stylometrics Features under Domain Shift: Do they Really "Context-independent" -- Speech Features of 13-15 Year-old Children with Autism Spectrum Disorders -- Multi-corpus Experiment on Continuous Speech Emotion Recognition: Convolution or Recurrence -- Detection of Toxic Language in Short Text Messages -- Transfer Learning in Speaker's Age and Gender Recognition -- Interactivity-based Quality Prediction of Conversations with Transmission Delay -- Graphic Markers of Irony and Sarcasm in Written Texts -- Digital Rhetoric 2.0: How to Train Charismatic Speaking with Speech-melody Visualization Software -- Generating a Concept Relation Network for Turkish Based on ConceptNet Using Translational Methods -- Bulgarian Associative Dictionaries in the LABLASS Web-based System -- Preliminary Investigation of Potential Steganographic Container Localization -- Some Comparative Cognitive and Neurophysiological Reactions to Code-modified Internet Information -- The Influence of Multimodal Polycode Internet Content on Human Brain Activity -- Synthetic Speech Evaluation by Differential Maps in Pleasure-Arousal Space -- Investigating the Effect of Emoji in Opinion Classification of Uzbek Movie Review Comments -- Evaluation of Voice Mimicking using i-vector Framework -- Score Normalization of x-vector Speaker Verification System for Short-duration Speaker Verification Challenge -- Genuine Spontaneous vs Fake Spontaneous Speech: in Search of Distinction -- Mixing Synthetic and Recorded Signals for Audio-book Generation -- Temporal Concord in Speech Interaction: Overlaps and Interruptions in Spoken American English -- Cognitively Challenging: Language Shift and Speech Rate of Academic Bilinguals -- Toward Explainable Automatic Classification of Children's Speech Disorders -- Recognition Performance of Selected Speech Recognition APIs - A Longitudinal Study --
Does A Priori Phonological Knowledge Improve Cross-Lingual Robustness of Phonemic Contrasts -- Can We Detect Irony in Speech Using Phonetic Characteristics Only? - Looking for a Methodology of Analysis -- Automated Compilation of a Corpus-based Dictionary and Computing Concreteness Ratings of Russian -- Increasing the Accuracy of the ASR System by Prolonging Voiceless Phonemes in the Speech of Patients using the Electrolarynx -- Leverage Unlabeled Data for Abstractive Speech Summarization with Self-Supervised Learning and Back-Summarization -- Uncertainty of Phone Voicing and its Impact on Speech Synthesis -- Grappling with Web Technologies: the Problems of Remote Speech Recording -- Robust Noisy Speech Parameterization Using Convolutional Neural Networks -- More than Words: Cross-Linguistic Exploration of Parkinson's Disease Identification from Speech -- Phonological Length of L2 Czech Speakers' Vowels in Ambiguous Contexts as Perceived by L1 Listeners -- Learning an Unsupervised and Interpretable Representation of Emotion from Speech -- Synchronized Forward-Backward Transformer for End-to-End Speech Recognition -- KazNLP: a Pipeline for Automated Processing of Texts Written in Kazakh Language -- Diarization based on Identification with x-vectors -- Different Approaches in Cross-Language Similar Documents Retrieval in the Legal Domain.
This book constitutes the proceedings of the 22nd International Conference on Speech and Computer, SPECOM 2020, held in St. Petersburg, Russia, in October 2020. The 65 papers presented were carefully reviewed and selected from 160 submissions. The papers present current research in the area of computer speech processing, including speech science, speech technology, natural language processing, human-computer interaction, language identification, multimedia processing, human-machine interaction, deep learning for audio processing, computational paralinguistics, affective computing, speech and language resources, speech translation systems, text mining and sentiment analysis, voice assistants, etc. Due to the COVID-19 pandemic, SPECOM 2020 was held as a virtual event.
9783030602765
10.1007/978-3-030-60276-5 doi
Artificial intelligence.
Social sciences--Data processing.
Education--Data processing.
Data mining.
Application software.
Image processing--Digital techniques.
Computer vision.
Artificial Intelligence.
Computer Application in Social and Behavioral Sciences.
Computers and Education.
Data Mining and Knowledge Discovery.
Computer and Information Systems Applications.
Computer Imaging, Vision, Pattern Recognition and Graphics.
Q334-342 TA347.A78
006.3