
Project description

This project is still under development (its name has yet to be decided). It should be seen as a proof of concept.

It is an online tool that lets users save and tag URLs. The principles behind this kind of application are not new (see Del.icio.us), but it will add features in search and user group management.

We want to give users the opportunity to bookmark their favorite URLs and build a custom search engine that can search within a set of websites or URLs.

The plan is to host an online version of the product and also to give companies the opportunity to host their own system.

Current status

Current status: Private Alpha

The challenging part of the project is defining an architecture that is both scalable and reliable. This is the work currently in progress.

The architecture running at Tatootags still needs some improvements, though the main components are now clearly identified.

Key open source components of the architecture

  • database (MySQL)
  • index/search manager (Lucene)
  • index/search server (Solr)
  • Java servlet container (Jetty)
  • memory cache (memcached)
  • HTTP server (Apache)

How scalability is ensured

  • Extensive use of caching mechanisms (memory or disk) within each layer (OS, DB, indexer/searcher, application components, presentation components, …)
  • Master/slave architecture (DB, searchers)
  • Parallel processing (split indexes)
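
As an illustration of the "split indexes" idea, here is a minimal Python sketch that sends one query to several index partitions in parallel and merges the partial results. It is not the Tatootags implementation: the in-memory `SHARDS` data, the scores, and the `search_shard` and `search_all` names are all assumptions made for the example. In a real deployment each partition would presumably be a separate searcher (for instance a Solr instance) rather than a dictionary.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq

# Hypothetical split index: each shard maps a tag to (score, url) pairs.
# A real shard would be a remote searcher, not an in-memory dict.
SHARDS = [
    {"news": [(0.9, "http://a.example/"), (0.4, "http://b.example/")]},
    {"news": [(0.7, "http://c.example/")],
     "sports": [(0.8, "http://d.example/")]},
]


def search_shard(shard, tag):
    """Return the hits one shard holds for a tag (empty list if none)."""
    return shard.get(tag, [])


def search_all(tag, limit=10):
    """Query every shard in parallel, then keep the best hits overall."""
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        partials = list(pool.map(lambda shard: search_shard(shard, tag), SHARDS))
    # Flatten the per-shard results and keep the `limit` highest scores.
    hits = [hit for partial in partials for hit in partial]
    return heapq.nlargest(limit, hits)


if __name__ == "__main__":
    for score, url in search_all("news", limit=2):
        print(score, url)
```

The merge step is the part that matters: because each partition only knows its own documents, the final ranking has to be rebuilt from the partial lists after all partitions have answered.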

Hardware

  • Currently, all of this machinery runs on a single machine!

Performance

  • The majority of requests take less than 1 s (0.002 s when served from cache)
  • Some requests take a long time (5-10 s). The “related tags” tab is the one that most needs improvement, especially when tags are combined (e.g. News+Sports)

Current test data set

  • > 200 000 tags
  • > 4 000 000 URLs

This is a view of the Dmoz directory in which each tag maps to a category.

Site Map

The following is a list of existing (or not!) URLs:

 
start.txt · Last modified: 2009/04/20 18:13 by thierry
 