SPRAAK
cwr_lm_cache.c File Reference

Extra layer on top of the LM-interface.

Data Structures

union  _Union1_CWR_LM_CACHE_
 
struct  SprCwrHashP2XEl
 
struct  SprCwrHashP2X
 
struct  SprCwrLmcHashEl
 
union  SprCwrLMCacheElX
 Trick to allow fast copies.
 
struct  SprCwrLMCacheTbl
 
struct  SprCwrLmcHashTbl
 
struct  SprCwrSearchLmcHash
 

Typedefs

typedef SprCwrLmcHashEl * SprCwrLmcHashElPtr
 

Functions

SprCwrHashP2X * spr_cwr_hash_p2x_alloc (unsigned int max_n_el)
 
SprCwrHashP2XEl * spr_cwr_hash_p2x_find (const SprCwrHashP2X *htbl, const void *ptr)
 Find a given pointer in the hash table.
 
SprCwrHashP2XEl * spr_cwr_hash_p2x_add (SprCwrHashP2X *htbl, const void *ptr)
 Add a given pointer to the hash table.
 
void spr_cwr_lm_cache_init (SprCwrSearchLmcHash *search_lmc_hash)
 
SprCwrHashP2X * spr_cwr_lm_cache_check (const SprCwrSearchLmcHash *search_lmc_hash)
 
SprCwrHashP2X * spr_lm_cache_check_fst (const SprCwrSearchLmcHash *search_lmc_hash)
 
void spr_cwr_lmc_hash_lmcr_free (SprCwrSearchLmcHash *search_lmc_hash)
 
unsigned int spr_cwr_lmc_hash_clear (unsigned int *lost_cnt, SprCwrSearchLmcHash *search_lmc_hash)
 
void spr_cwr_lm_cache_clear (SprCwrSearchLmcHash *search_lmc_hash, int reset_counters)
 
int spr_cwr_lmc_hash_resize (SprCwrSearchLmcHash *search_lmc_hash, unsigned int size)
 
int spr_cwr_lm_cache_resize (SprCwrSearchLmcHash *search_lmc_hash, unsigned int size)
 
int spr_cwr_lmc_hash_lm_install (SprCwrSearchLmcHash *search_lmc_hash, const SprCwrSRMDesc *srm_desc, void *lm, const SprCwrSLex *slex, const int *const *word_sets)
 
int spr_cwr_lmc_hash_lm_uninstall (SprCwrSearchLmcHash *search_lmc_hash)
 Uninstall the LM (close the interface to it).
 
void spr_cwr_lmc_hash_execute_release (SprCwrLmcHashEl *lmc_hash, SprCwrSearchLmcHash *search_lmc_hash)
 
void spr_cwr_lmc_hash_execute_release0 (SprCwrLmcHashEl *lmc_hash, SprCwrSearchLmcHash *search_lmc_hash)
 
float spr_cwr_mx_wset_prob (int word_id, const SprCwrLmcHashEl *lmc_hash, SprCwrSearchLmcHash *search_lmc_hash)
 
float spr_mx_wset_prob_fst (int word_id, const SprCwrLmcHashEl *lmc_hash, SprCwrSearchLmcHash *search_lmc_hash)
 
SprCwrLmcHashEl * spr_cwr_lmc_hash_add (const SprLMContext *lmc, SprCwrSearchLmcHash *search_lmc_hash)
 
void spr_cwr_lmc_hash_cond_release (SprLMContext *lmc, SprCwrSearchLmcHash *search_lmc_hash)
 
const LMCacheEl * spr_cwr_lm_cache_query (SprCwrLMWord *wrd, const SprCwrLmcHashEl *lmc_hash, SprCwrSearchLmcHash *search_lmc_hash)
 

Detailed Description

Extra layer on top of the language model (LM) interface to improve the efficiency of the LM in a token-passing application. The efficiency is improved by means of

  1. caching for fast retrieval of LM probabilities and contexts, and
  2. hashing of the LM-contexts for a fast check of whether two LM-contexts are identical (recombination).
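
The second point can be illustrated with a minimal sketch. It is not the SPRAAK implementation: the types CtxEl and CtxTbl, the functions ctx_hash() and ctx_intern(), the fixed history length CTX_LEN and the table size TBL_SIZE are all hypothetical, and an LM-context is assumed to be a short array of word ids. Every distinct context is stored once in a hash table, so testing whether two tokens share a context reduces to comparing two element pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CTX_LEN   2     /* hypothetical: two-word history (trigram LM) */
#define TBL_SIZE  64    /* hypothetical: power of two */

typedef struct { int words[CTX_LEN]; int used; } CtxEl;
typedef struct { CtxEl el[TBL_SIZE]; unsigned n_used; } CtxTbl;

/* FNV-1a style mixing of the word ids, reduced to a table index. */
static unsigned ctx_hash(const int *words) {
    unsigned h = 2166136261u;
    for (int i = 0; i < CTX_LEN; i++) {
        h ^= (unsigned)words[i];
        h *= 16777619u;
    }
    return h & (TBL_SIZE - 1);
}

/* Return the unique element for this context, inserting it if it is new.
   Uses linear probing; returns NULL when the table is full. */
static CtxEl *ctx_intern(CtxTbl *tbl, const int *words) {
    assert(tbl != NULL && words != NULL);
    unsigned ndx = ctx_hash(words);
    for (unsigned probe = 0; probe < TBL_SIZE; probe++) {
        CtxEl *el = &tbl->el[(ndx + probe) & (TBL_SIZE - 1)];
        if (!el->used) {                      /* free slot: new context */
            memcpy(el->words, words, sizeof el->words);
            el->used = 1;
            tbl->n_used++;
            return el;
        }
        if (memcmp(el->words, words, sizeof el->words) == 0)
            return el;                        /* context already known */
    }
    return NULL;
}
```

With such a table, two tokens can be recombined when the pointers returned by ctx_intern() for their contexts are equal; no word-by-word comparison is needed at recombination time.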

The structure allows LM-contexts to be re-used even after they have gone out of scope (i.e. no tokens with the given LM-context remain). This is implemented with a Least Recently Used (LRU) re-use scheme combined with a conservatively large hash table (twice the required size).
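
A minimal sketch of such a re-use scheme follows. It is an illustration under stated assumptions, not the code in cwr_lm_cache.c: the names El, Tbl, tbl_init(), tbl_acquire() and tbl_release() are hypothetical, an integer key stands in for an LM-context, and a reference count stands in for the number of tokens using the context. Elements whose count drops to zero keep their key and are queued in release order, so a context that comes back into scope is revived instead of being rebuilt, and a new context recycles the element that was released longest ago.

```c
#include <assert.h>
#include <stddef.h>

#define N_EL   4              /* hypothetical pool size */
#define N_BKT  (2 * N_EL)     /* hash table twice the pool size */

typedef struct El {
    int key;                  /* stands in for an LM-context */
    int ref_cnt;              /* number of tokens using this context */
    int valid;                /* holds a key (possibly out of scope) */
    struct El *chain;         /* next element in the same hash bucket */
    struct El *prev, *next;   /* LRU list of unreferenced elements */
} El;

typedef struct {
    El  pool[N_EL];
    El *bkt[N_BKT];
    El  lru;                  /* sentinel; lru.next was released longest ago */
} Tbl;

static void lru_unlink(El *el) {
    el->prev->next = el->next;
    el->next->prev = el->prev;
}

static void lru_append(Tbl *t, El *el) {
    el->prev = t->lru.prev;
    el->next = &t->lru;
    t->lru.prev->next = el;
    t->lru.prev = el;
}

static void tbl_init(Tbl *t) {
    t->lru.prev = t->lru.next = &t->lru;
    for (int i = 0; i < N_BKT; i++) t->bkt[i] = NULL;
    for (int i = 0; i < N_EL; i++) {
        t->pool[i].valid = 0;
        t->pool[i].ref_cnt = 0;
        t->pool[i].chain = NULL;
        lru_append(t, &t->pool[i]);
    }
}

/* Remove an element from the bucket chain of its current key. */
static void bkt_remove(Tbl *t, El *el) {
    El **pp = &t->bkt[(unsigned)el->key % N_BKT];
    while (*pp != el) pp = &(*pp)->chain;
    *pp = el->chain;
}

/* Get the element for key: revive it if it is still stored, otherwise
   recycle the least recently released element. NULL if all are in use. */
static El *tbl_acquire(Tbl *t, int key) {
    unsigned b = (unsigned)key % N_BKT;
    El *el;
    for (el = t->bkt[b]; el != NULL; el = el->chain)
        if (el->key == key) {
            if (el->ref_cnt++ == 0) lru_unlink(el);   /* back in scope */
            return el;
        }
    el = t->lru.next;
    if (el == &t->lru) return NULL;
    lru_unlink(el);
    if (el->valid) bkt_remove(t, el);                 /* evict the old key */
    el->key = key;
    el->valid = 1;
    el->ref_cnt = 1;
    el->chain = t->bkt[b];
    t->bkt[b] = el;
    return el;
}

/* Drop one reference; an unreferenced element keeps its key for re-use. */
static void tbl_release(Tbl *t, El *el) {
    assert(el->ref_cnt > 0);
    if (--el->ref_cnt == 0) lru_append(t, el);
}
```

The factor of two between N_BKT and N_EL mirrors the conservative table size mentioned above; it keeps the bucket chains short even when every element of the pool holds a key.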

The overhead of invalidating LM cache lines when an LM-context goes out of scope (i.e. when the LRU assignment scheme re-uses a non-empty cell) is avoided by numbering the LM-contexts sequentially. Hence, the cache needs to be flushed only if the counter overflows.
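
The numbering idea can be sketched as follows. All names (Line, Cache, cache_init(), cache_new_ctx_seq(), cache_get(), cache_put()) and the cache geometry are hypothetical, not taken from SPRAAK. Each cache line records the sequence number of the context it was filled for; when a cell is re-assigned, the new context receives a fresh number, so stale lines simply stop matching and never need to be cleared one by one.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define N_LINES 256   /* hypothetical: number of cache lines, power of two */

typedef struct { uint32_t ctx_seq; int word_id; float prob; } Line;
typedef struct { Line line[N_LINES]; uint32_t next_seq; unsigned n_flush; } Cache;

static void cache_init(Cache *c) {
    memset(c->line, 0, sizeof c->line);   /* sequence number 0 is never valid */
    c->next_seq = 1;
    c->n_flush = 0;
}

/* Hand out a fresh number each time a context cell is (re)assigned.
   Only when the counter wraps around is the whole cache flushed.
   Assumption: contexts still alive at that point would have to be
   renumbered as well; that step is omitted from this sketch. */
static uint32_t cache_new_ctx_seq(Cache *c) {
    if (c->next_seq == 0) {               /* counter wrapped around */
        memset(c->line, 0, sizeof c->line);
        c->next_seq = 1;
        c->n_flush++;
    }
    return c->next_seq++;
}

/* Direct-mapped slot for a (context, word) pair. */
static Line *cache_slot(Cache *c, uint32_t seq, int word_id) {
    return &c->line[(seq * 31u + (uint32_t)word_id) & (N_LINES - 1)];
}

/* Returns 1 and sets *prob on a hit, 0 on a miss or a stale line. */
static int cache_get(Cache *c, uint32_t seq, int word_id, float *prob) {
    const Line *l = cache_slot(c, seq, word_id);
    if (l->ctx_seq != seq || l->word_id != word_id) return 0;
    *prob = l->prob;
    return 1;
}

static void cache_put(Cache *c, uint32_t seq, int word_id, float prob) {
    assert(seq != 0);                     /* 0 is reserved for empty lines */
    Line *l = cache_slot(c, seq, word_id);
    l->ctx_seq = seq;
    l->word_id = word_id;
    l->prob = prob;
}
```

In this sketch a flush costs one memset over the cache, but with a 32-bit counter it happens only after about four billion context assignments, whereas per-context invalidation would be paid on every re-use of a cell.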

Note
For trivial LMs (LMs that do not depend on a context, such as unigrams or 0-grams), no LM-context hashing or LM-caching structures are allocated.
The 'lmi' and 'lm_data' sub-structures may be used by the calling process.
Author
Kris Demuynck
Date
xx/09/1995 - KD
First implementation (part of cwr_search.c)
20/08/2004 - KD
Extracted from cwr_search so that it could be used in other search implementations as well
13/04/2010 - KD
Added routines for the FLaVoR decoder