Empress Markets

News & Events


Press Releases

Product Profile

Success Stories

Technical News

White Papers

Join Mailing List

For More Info


      Empress Technical News January 2009

    Database Text SearchIndex Fast Text Data Retrieval Part 1



    Empress Ultra Embedded 10.20 offers many additional features to application developers. One of those features is a Text Search Index capability.

    Empress Text Search Index

    Empress Ultra Embedded V10.20 Text Search Index capability empowers application developers to implement an efficient search for database records using keywords/tokens/phrases. The most typical usage would be to associate those keywords with particular character/text based attribute in a database table.

    The search index capability is developed as an additional set of C calls that are used in conjunction with Empress C/C++ Kernel Level API mr Routines. The search index is a user maintained index an index not maintained via Empress database engine calls.

    Application would supply the list (array) of tokens/keywords/phrases on insertion in the Empress database in order to create a text search index.

    Lets take an example of storing song titles in an Empress TEXT data type attribute. For the famous Beatles song I Want To Hold Your Hand the list of tokens/keywords/phrases could look like as follows:

    Hold Your
    Your Hand
    Hold Your Hand

    In the above example the list contains seven tokens. Tokens can contain a single word or multiple words. Words do not necessarily have to be contained in the original song title. This could be beneficial when, for example, using either token Mc or Mac to get a song title Old Macdonald.

    Once created an Empress text search index enables very powerful searches using inserted tokens/keywords.

    In the presence of hundreds of thousands of song titles a search for the song title with a token Hold could result in the result set containing the following titles:

    You Really Got A Hold on Me (Beatles)
    If We Hold On
    (Diana Ross)
    I Want To Hold Your Hand
    Hold Me Back
    Hold Me, Thrill Me, Kill Me

    Retrieval Using Search Index

    The following example provides the actual program code (select_text.c) using Empress C/C++ Kernel Level API mr Routines to show how to retrieve records from the table songs using the search index on the same table. The search index C API calls are in bold. The example performs the retrieval of all the records from the table that contain token Hold. When translated into SQL the example does something like:

    SELECT id, title FROM songs WHERE title HAS Hold

    #include      < mscc.h>
          < tsi_api.hx>

    global_shared_func    (char* format, ...)
                    printf                      mssio_printf
            DATABASE    L" karaokedb"
              msmain (int      argc,
    char*  argv[])

    wchar_t*                                  id_value
    wchar_t*                                  title_value


                  = mropen (DATABASE, L" songs" , 'r')
    songs_recdesc = mrmkrec (songs_tabdesc)
    id_attrdesc = mrngeta (songs_tabdesc, L" id"
    title_attrdesc = mrngeta (songs_tabdesc, L" title" )
    index_handle= mstsi_open (songs_tabdesc , title_attrdesc, 'r')
    id_value = mrspv (id_attrdesc)
    title_value = mrspv (title_attrdesc)
    printf (" Songs that contain token Hold\n\n"
      printf (buf, %-12s %-22s\n" , " Id" Title" )
    record_list = mstsi_get (index_handle, L" Hold" , 0)
    if (record_list == 0)
    printf (" No songs with token Hold\n"
    return 0
    qual = mrqlst (songs_tabdesc, record_list)
    retrieve_desc = mrgetbegin (qual, songs_recdesc, (void*) 0)

                  (mrget (retrieve_desc))
    mrcopyv (songs_recdesc, id_attrdesc, id_value)
    mrcopyv (songs_recdesc, title_attrdesc, title_value)
            sprintf (buf, " %-12ls %-22ls\n" , id_value, title_value)
            printf (" %s" , buf)
    mrgetend (retrieve_desc)
    mrfree (id_value)
            mrfree (title_value)
    mrfrrec (songs_recdesc)
    mstsi_close (index_handle)    
    mrclose (songs_tabdesc)
      printf ( select1_text is done\n" )
    return 0

    Performance Considerations

    When executing the above mr program the performance could be several orders of magnitude better than when performing the same search using an SQL query, e.g.:

    SELECT id, title FROM songs WHERE title LIKE %Hold%


    SELECT id, title FROM songs WHERE title SMATCH *

    In another example a search for a keyword " Embedded" was performed against all titles stored in Empress database containing Wikipedia data. The table pages contained 7,649,051 titles and related information such as and author.

    Using the text search index, the search completed in 0.077 seconds with 131 titles in the result set.

    For the comparison, an SQL query was executed to do the similar search on the TEXT attribute title

    SELECT id, title FROM pages WHERE title SMATCH '

    It required 35.314 seconds for the SQL query to complete.

    The Empress text search index was about 500 times faster than the SQL query!!


    EmpressSoftware Inc.



GSA Contracts
Privacy Policy 
Contact US

Empress RDBMS
Empress Servers
Empress API
Free Trial 
Empress iPad Apps

Mobile Apps Development
Technical Support

Training Courses
Empress Extreme  

Business Alliances
Partner Programs  
GSA Program
Distributors Program

Empress Markets
News and Events
Press releases
Technical News
Success Stories

        USA: 301-220-1919                 Canada & International:       905-513-8888         
Copyright 2014           Empress Software Inc.            Join Empress Mailing List