Question
Suppose you want to create a web page with some search capabilities over a database. The search has to be restricted both by some categorization of the data and by open text entered by the users. Functional and non-functional requirements impose these constraints:
Search has to be fast.
The data representation paradigm (i.e. relational, object-relational, semistructured...) should support big modifications at every stage of the software's lifetime.
The database paradigm should permit easy refactoring of existing client code.
Taking this into account:
Would you recommend a relational database? How would you tackle the constraints in this scenario?
Would you recommend an XML database? How would you tackle the constraints in this scenario?
If you think the above solutions are not useful in light of this problem... what could be a useful approach?
Please give some bibliography (or some websites) supporting your answers.
thanks!
Explanation / Answer
Generally speaking, in some of the big website applications I've written (think server and database clusters), I've used a Lucene/Lucene.NET index to build a search index over the appropriate database fields. The main advantage of this is speed: a search hits an in-memory index instead of making a round trip from the web server to the database and back.
From a code development perspective, Lucene is completely separate from your database implementation, so you get a separation of concerns between how you represent your data and how you index it for easy searching. You can use a NoSQL DB, a relational DB, etc. The search engine won't care, as it gets its results only from the index; only when you need to display the actual contents do you fetch them from the DB by an indexed field, and you're done.
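To make that separation concrete, here is a minimal sketch in plain Python. It is not Lucene: a small in-memory inverted index stands in for the Lucene index, and SQLite stands in for whatever database you choose. The `albums` schema, the sample rows, and the function names `tokenize`, `search`, and `fetch` are all made up for illustration.

```python
import re
import sqlite3
from collections import defaultdict

# Hypothetical schema and sample rows, for illustration only.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE albums (id INTEGER PRIMARY KEY, category TEXT, artist TEXT, title TEXT)"
)
db.executemany(
    "INSERT INTO albums VALUES (?, ?, ?, ?)",
    [
        (1, "rock", "Example Band", "Blue Morning"),
        (2, "jazz", "Sample Trio", "Blue Evening"),
        (3, "rock", "Other Band", "Red Morning"),
    ],
)

def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

# Build the search index once: term -> row ids, and category -> row ids.
term_index = defaultdict(set)
category_index = defaultdict(set)
for row_id, category, artist, title in db.execute(
    "SELECT id, category, artist, title FROM albums"
):
    category_index[category].add(row_id)
    for term in tokenize(f"{artist} {title}"):
        term_index[term].add(row_id)

def search(query, category=None):
    """Return ids of rows containing every query term, optionally within one category."""
    terms = tokenize(query)
    if not terms:
        return []
    # Intersecting the posting sets yields a new set, so the index is not mutated.
    ids = set.intersection(*(term_index.get(t, set()) for t in terms))
    if category is not None:
        ids &= category_index.get(category, set())
    return sorted(ids)

def fetch(ids):
    """Load full rows from the database only for the ids the index returned."""
    if not ids:
        return []
    placeholders = ",".join("?" * len(ids))
    return db.execute(
        f"SELECT id, artist, title FROM albums WHERE id IN ({placeholders}) ORDER BY id",
        ids,
    ).fetchall()
```

The search path never touches the database: `search("blue", category="rock")` is answered from the two dictionaries alone, and `fetch` runs a primary-key lookup only for the rows that will be displayed. Changing the storage schema then only affects the indexing loop and `fetch`, not the search code.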
See the search engine on:
BigPondMusic
There are actually two Lucene indexes behind the scenes: one created specifically for predictive text (the results returned are based on the most-searched artists at the time), and the full DB index, which covers the albums and artists that match the search criteria.
So far the performance of the Lucene index is exemplary: I have > 1 million DB entries indexed by artist and album, and the predictive text index returns N-gram results in < 1 ms.
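As a rough illustration of how a predictive text index of this kind can work, here is a small Python sketch. It assumes edge n-grams (word prefixes) and a simple search counter for popularity; the actual tokenization and ranking used on the site are not described here, and the `SuggestIndex` class and its method names are hypothetical.

```python
from collections import defaultdict

def edge_ngrams(word, max_len=10):
    """Return the prefixes of a word, from 1 character up to max_len characters."""
    word = word.lower()
    return [word[:n] for n in range(1, min(len(word), max_len) + 1)]

class SuggestIndex:
    """Maps typed prefixes to artist names, ranked by how often each was searched."""

    def __init__(self):
        self.by_prefix = defaultdict(set)  # prefix -> artist names
        self.popularity = {}               # artist name -> search count

    def add(self, artist):
        # Index every prefix of every word, so "ba" also matches "Alpha Band".
        for word in artist.split():
            for gram in edge_ngrams(word):
                self.by_prefix[gram].add(artist)

    def record_search(self, artist):
        self.popularity[artist] = self.popularity.get(artist, 0) + 1

    def suggest(self, typed, limit=5):
        candidates = self.by_prefix.get(typed.lower(), set())
        # Most searched first; ties are broken alphabetically for stable output.
        ranked = sorted(candidates, key=lambda a: (-self.popularity.get(a, 0), a))
        return ranked[:limit]
```

Each suggestion is a single dictionary lookup followed by a sort over the matching names, which is why this style of index can answer while the user is still typing. In this sketch, input longer than `max_len` characters matches nothing, since only prefixes up to that length are stored.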