Friday, 22 November 2013

Theme 3: Research and Theory post-reflection


Journal Description - IEEE Transactions on Audio, Speech, and Language Processing


The journal covers research in the field of audio, speech and language processing and publishes papers on the design, development and evaluation of such applications and their associated theory. Papers are mainly application-oriented. Papers on applied machine learning or pattern recognition analysis are also welcome, even though the journal does not mention this in its description.

Impact factor: 1.675

Critical Review of "Machine Learning Paradigms for Speech Recognition: An Overview"


The paper's main goal is to present an overview of the machine learning paradigms used for speech or speaker recognition. It gives an overview of the mathematical notation and terminology commonly used in the field. This explanatory exercise is done through a thorough literature search, after which the mathematical notation is "standardised" and the terminology is categorised and explained. The main hypothesis is that the two scientific fields, machine learning and speech recognition, should be more involved with each other, since machine learning is not only a tool for speech recognisers: speech recognition is also a source of new machine learning research.

Reference:

Li Deng and Xiao Li, "Machine Learning Paradigms for Speech Recognition: An Overview," IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 5, pp. 1060-1089, May 2013
doi: 10.1109/TASL.2013.2244083

What is Theory?


Theory can be of different kinds in different fields of knowledge or science. The author Shirley Gregor presents a taxonomy (a classification) in which she divides theory into five types: "Analysis", "Explanation", "Prediction", "Explanation and Prediction" and "Design and Action". The difference between these lies in what they aim to do. Analysis, for example, is theory that aims to analyse something and answer the question "what is it?". Such a theory could be used to determine whether somebody is sick or not, based on certain symptoms. Theories that explain or predict also involve analysis, but in addition they try either to explain why something is so or to predict what will follow from it, e.g. why the symptoms are such and such, or what the symptoms will lead to if the patient is not medicated. Design and action is also linked to the others, but it is more like a recipe for success. A patient could receive such a theory from a doctor in the form of a prescription or a guide to getting better.

In other words, theory is not just a collection of data or of other people's theories; it is built on observation and on other people's theories. The way from observation, through others' theories, to one's own theory is a logical path where one has to justify each step taken in order to be sure that the theory is in fact not false.

What Kind of Theory is Found in the Article? 

The article reviewed above contains a lot of theory, since it is an overview of the field of speech recognition and speaker recognition. But all these theories, and the nature of them, also bring the authors to a conclusion, i.e. their own theory. The theory they present is a mix of what Gregor calls "Explanation and Prediction" and "Design and Action", meaning that it aims to analyse what the field is and how to use it, and how the field could benefit from this. The theory is not the data they collected nor the diagrams they show; it is what they can see in the diagrams and the data.

Benefits and Limitations


The clear problem with using theory the way the article does is that the theories are only touched on at the surface: they are mentioned and briefly explained. To understand the outlined theories properly, one has to dig deeper into the references supplied in the text in order to see why such theories should be used. The article only gives an overview of the field and should not be the only source of theory when trying to include machine learning in one's speech or speaker recognition application. If, on the other hand, one's aim is to understand the connection between the two fields and the relevance of including machine learning research in the speech community (and vice versa), as suggested in the text, then the article suffices.
