Transcripción, indexación y análisis automático de declaraciones judiciales a partir de representaciones fonéticas y técnicas de lingüística forense

  1. Pedro J. Vivancos Vicente
  2. José Antonio García Díaz
  3. Ángela Almela Sánchez-Lafuente
  4. Fernando Molina
  5. Juan Salvador Castejón Garrido
  6. Rafael Valencia García
Journal: Procesamiento del lenguaje natural

ISSN: 1135-5948

Year of publication: 2020

Issue: 65

Pages: 109-112

Type: Article


Abstract

Recent technological advances have made it possible to improve information search in the judicial files that the Ministry of Justice associates with a trial. However, when judicial experts examine evidence in multimedia files, such as videos or audio fragments, they must manually scan the recording to locate the fragment at issue, a tedious and time-consuming task. To ease this task, we propose a system that automatically transcribes and indexes multimedia content using deep-learning technologies, in noisy environments and with multiple speakers, and that applies forensic linguistics techniques to analyse witness statements and provide evidence of their veracity.
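The abstract describes indexing transcripts so that experts can jump straight to the relevant fragment instead of scanning the whole recording. As an illustration only (the paper does not publish its implementation), the sketch below models timestamped, speaker-labelled segments in the spirit of the WebVTT format cited in the references, and a keyword lookup that returns the matching time ranges; all names and the sample transcript are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    """One transcribed cue: time range in seconds, speaker label, text."""
    start: float
    end: float
    speaker: str
    text: str

def search_transcript(segments, keyword):
    """Return the segments whose text contains the keyword (case-insensitive),
    so a reviewer can seek directly to those timestamps in the recording."""
    kw = keyword.lower()
    return [s for s in segments if kw in s.text.lower()]

# Hypothetical diarized transcript of a two-speaker statement
transcript = [
    Segment(0.0, 4.2, "SPK1", "El testigo llego al lugar a las diez."),
    Segment(4.2, 9.8, "SPK2", "No recuerdo haber visto el vehiculo."),
]

hits = search_transcript(transcript, "vehiculo")
for s in hits:
    print(f"{s.start:.1f}-{s.end:.1f} {s.speaker}: {s.text}")
```

In a real system the segments would come from the speech recognizer and diarization stage and be stored in a searchable index; the point here is only that pairing text with time offsets is what turns a transcript into a navigable index of the recording.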

Bibliographic References

  • Almela, A., R. Valencia-García, y P. Cantos. 2012. Detectando la mentira en lenguaje escrito. Procesamiento del lenguaje natural, 48:65–72.
  • Ballesteros, M. C. R. 2011. La necesaria modernización de la justicia: especial referencia al plan estratégico 2009-2012. Anuario jurídico y económico escurialense, (44):173–186.
  • Hannun, A., C. Case, J. Casper, B. Catanzaro, G. Diamos, E. Elsen, R. Prenger, S. Satheesh, S. Sengupta, A. Coates, y others. 2014. Deep speech: Scaling up end-to-end speech recognition. arXiv preprint arXiv:1412.5567.
  • Jiménez-Zafra, S. M., R. Morante, M. Teresa Martín-Valdivia, y L. A. Ureña-López. 2020. Corpora annotated with negation: An overview. Computational Linguistics, 46(1):1–52.
  • Pfeiffer, S. y I. Hickson. 2013. Webvtt: The web video text tracks format. Draft Community Group Specification, W3C.
  • Ravanelli, M. y M. Omologo. 2017. Contaminated speech training methods for robust dnn-hmm distant speech recognition. arXiv preprint arXiv:1710.03538.
  • Salas-Zárate, M. P., M. A. Paredes-Valverde, M. A. Rodríguez-García, R. Valencia-García, y G. Alor-Hernández. 2017. Automatic detection of satire in twitter: A psycholinguistic-based approach. Knowl. Based Syst., 128:20–33.
  • Snyder, D., D. Garcia-Romero, G. Sell, A. McCree, D. Povey, y S. Khudanpur. 2019. Speaker recognition for multispeaker conversations using x-vectors. En ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), páginas 5796–5800. IEEE.
  • Sun, M., D. Snyder, Y. Gao, V. K. Nagaraja, M. Rodehorst, S. Panchapagesan, N. Strom, S. Matsoukas, y S. Vitaladevuni. 2017. Compressed time delay neural network for small-footprint keyword spotting. En INTERSPEECH, páginas 3607–3611.
  • Zalman, M., L. L. Rubino, y B. Smith. 2019. Beyond police compliance with electronic recording of interrogation legislation: Toward error reduction. Criminal Justice Policy Review, 30(4):627–655.
  • Zhang, Y., G. Chen, D. Yu, K. Yao, S. Khudanpur, y J. Glass. 2016. Highway long short-term memory rnns for distant speech recognition. En 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), páginas 5755–5759. IEEE.