Dominic Widdows

Work place: Microsoft Bing, Bellevue WA, 98004, USA

E-mail: dwiddows@gmail.com

Research Interests: Quantum Computing Theory

Biography

Dominic Widdows works principally on information extraction from the web for Bing local search. A mathematician by training, Dominic has worked on differential and algebraic geometry at Oxford (1996-2000), natural language processing and search at Stanford (2001-2004), distributed databases and collaboration at MAYA Design (2004-2007), and information extraction at Google and Microsoft Bing (2007-2015). His main theoretical research focus for a number of years has been on vector models for learning and reasoning, and the interaction between this area and quantum theory. He continues to contribute research papers in a range of areas including quantum informatics and concept learning, and works on several program committees and review panels. As well as being part of the Google Sky Map team, his main contributions to open source projects have been in the areas of semantic mapping and semantic search, including the Semantic Vectors package, initially created in partnership with the University of Pittsburgh, and now maintained by a small group of researchers and developers internationally.

Author Articles
Exploring Semantic Relatedness in Arabic Corpora using Paradigmatic and Syntagmatic Models

By Adil Toumouh, Dominic Widdows, Ahmed Lehireche

DOI: https://doi.org/10.5815/ijieeb.2016.01.05, Pub. Date: 8 Jan. 2016

In this paper we explore two paradigms: first, the paradigmatic representation via the native HAL model, including a model enriched with word order information using the permutation technique of Sahlgren et al. [21]; and second, the syntagmatic representation via a words-by-documents model constructed using the Random Indexing method. We demonstrate that these kinds of word space models, which were originally designed to extract similarity, can also be efficient for extracting relatedness from Arabic corpora. For a given word, the proposed models search for the words related to it. A result is counted as a failure when the number of related words given by a model is less than or equal to 4; otherwise it is counted as a success. To decide whether a word is related to another, we rely on an expert in the economic domain and on a glossary of the domain. We first compare the native HAL model with the term-document model. The simple HAL model records the better result, with a success rate of 72.92%. In a second stage, we try to boost the HAL model results by adding word order information via the permutation technique of Sahlgren et al. [21]. The success rate of the enriched HAL model reaches 79.2%.
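
The success criterion described in the abstract can be sketched in a few lines of Python. This is only an illustrative reading, not the authors' code: it assumes that "related words given by a model" means the returned neighbours that the expert or glossary judged related, and the function names, the English placeholder words, and the judgments are all hypothetical.

```python
from typing import Dict, List, Set

# From the abstract: 4 or fewer related words counts as a failure.
FAILURE_THRESHOLD = 4


def is_success(candidates: List[str], related: Set[str],
               threshold: int = FAILURE_THRESHOLD) -> bool:
    """Return True when more than `threshold` candidates are judged related."""
    hits = sum(1 for word in candidates if word in related)
    return hits > threshold


def success_rate(results: Dict[str, List[str]],
                 judgments: Dict[str, Set[str]],
                 threshold: int = FAILURE_THRESHOLD) -> float:
    """Percentage of target words whose result list counts as a success."""
    if not results:
        return 0.0
    successes = sum(
        is_success(candidates, judgments.get(target, set()), threshold)
        for target, candidates in results.items()
    )
    return 100.0 * successes / len(results)


# Hypothetical toy data (not taken from the paper): neighbours returned by a
# model for two target words, and the subset an expert judged to be related.
toy_results = {
    "bank": ["loan", "credit", "interest", "deposit", "account", "river"],
    "market": ["stock", "price", "tree", "cloud"],
}
toy_judgments = {
    "bank": {"loan", "credit", "interest", "deposit", "account"},
    "market": {"stock", "price"},
}

toy_rate = success_rate(toy_results, toy_judgments)
```

In this toy data, "bank" has five related neighbours and counts as a success, while "market" has two and counts as a failure, so the rate is 50%. The figures reported in the abstract (72.92% and 79.2%) come from the authors' own evaluation and are not reproduced by this sketch.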

Other Articles