Thursday, February 28, 2008
The Voronoy (or Voronoi) Tessellation (Voronoy 1908) is a technique for dividing a multi-dimensional space into subspaces. It defines the geometric regions that make up the subspaces by choosing several vectors as their centres. Any other vector in the space can then be attributed to the closest centre vector, effectively dividing the whole [...]
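To make the "closest centre" idea concrete, here is a minimal sketch in Python. It assumes Euclidean distance as the measure of closeness, and the function names `nearest_centre` and `voronoi_partition`, as well as the sample points, are made up for illustration; nothing here is taken from a particular library or paper.

```python
import math


def nearest_centre(vector, centres):
    """Return the index of the centre closest to `vector`.

    Assumption: closeness is measured with Euclidean distance.
    """
    return min(range(len(centres)), key=lambda i: math.dist(vector, centres[i]))


def voronoi_partition(vectors, centres):
    """Group vectors by their closest centre (one Voronoi cell per centre)."""
    cells = {i: [] for i in range(len(centres))}
    for v in vectors:
        cells[nearest_centre(v, centres)].append(v)
    return cells


# Hypothetical 2-D example: two centres, three vectors to assign.
centres = [(0.0, 0.0), (10.0, 10.0)]
vectors = [(1.0, 2.0), (9.0, 8.0), (4.0, 4.0)]
cells = voronoi_partition(vectors, centres)
```

The cells are never built as explicit geometric shapes here; each vector is simply labelled with its nearest centre, which is usually all that is needed in a high-dimensional space.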
Tuesday, February 26, 2008
The INFOMAP project is an older but nevertheless interesting introduction to semantic vector space models. The related software is freely available. It uses a combination of approaches but mostly relies on Schütze’s Automatic word sense discrimination work. However, it does not use context vectors and concentrates on an SVD-compressed HAL matrix.
Tuesday, February 26, 2008
Automatic word sense discrimination was published in 1998 by Hinrich Schütze and can be seen as a further development of the HAL approach. He calls the underlying semantic vector space Word Space, but it is based on the same basic word-by-word matrix of word co-occurrences. His aim is to identify Senses [...]
Tuesday, February 26, 2008
Apperceptual comments on an interesting problem in one of his blog posts. He discusses the importance of high-order co-occurrences for word similarity measures in LSA. The part that interested me was the discussion of Singular Value Decomposition (SVD). My gut feeling has always been that SVD’s most useful characteristic was to amplify the [...]
Monday, February 25, 2008
Also known as semantic memory, it was developed by Kevin Lund and Curt Burgess at the University of California, Riverside. You can download the corresponding paper, Producing high-dimensional semantic spaces from lexical co-occurrence, in PDF format.
The basic premise of the work is that words with similar meanings repeatedly occur close to one another (also known as [...]
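The word-by-word co-occurrence matrix that these posts keep coming back to can be sketched in a few lines of Python. This is a simplified illustration, not the procedure from the paper: it assumes a fixed window of following words, counts every pair inside the window equally, and treats co-occurrence as symmetric. The function name `cooccurrence_counts`, the `window` parameter and the sample sentence are all made up for the example.

```python
from collections import defaultdict


def cooccurrence_counts(tokens, window=2):
    """Count how often two words appear within `window` positions of each other.

    Assumptions: unweighted counts, symmetric (order of the pair is ignored).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(tokens):
        # Look ahead at the next `window` tokens only; symmetry covers the rest.
        for j in range(i + 1, min(i + 1 + window, len(tokens))):
            neighbour = tokens[j]
            counts[word][neighbour] += 1
            counts[neighbour][word] += 1
    return counts


# Hypothetical toy corpus.
tokens = "the cat sat on the mat".split()
counts = cooccurrence_counts(tokens, window=2)
```

Each row `counts[word]` can then be read as a sparse vector for that word, and words whose rows look alike are the ones that tend to share neighbours.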