Ambaradan: an open source storage system for language based information, whether textual, spoken or gestured

Ambaradan is an open source storage system for language based information, whether textual, spoken or gestured. It allows both semantic annotation and simple management of individual and co-operative translation processes. It aims to serve third parties, which can use it to store, share and manage information through its API. It is eventually meant to develop into a distributed repository, so that users can become direct donors of server space.

Nowadays you can find dedicated storage systems for most kinds of data, like PostGIS for geographic objects. It is, however, quite uncommon to hear that someone is developing a dedicated storage system for language. The reasons behind this are manifold, but it mostly boils down to the fact that even 8-bit boxes already did text, so dealing with it is considered sort of retro.

Yet, as a matter of fact, coping with language diversity is one of the largest problems facing the Internet and the international economy. As a result of the current refactoring of the world economy, the dominant position of the English language is going to be further eroded, and open source tools that can manage language diversity are bound to become a strategic asset.

Separating content from its rendering technology

Having a centralized repository for content and its semantic annotation means you have content that is not embedded into some application. It means your application may age and get replaced, but the data it shaped will stay. If you come to think of it, it's the same thing that happened when we stopped writing accounting applications and started to use relational databases, so that we could drop our UIs without losing our data.

This aspect cannot be overstated, as the pace of innovation gets quicker and quicker, and the life expectancy of rendering technologies gets accordingly shorter. So moving to a semantic repository is not just about languages, it's about your capability to keep your knowledge safe while changing your technology arsenal.
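To make the idea concrete, here is a minimal sketch of content that lives apart from the code that renders it. It is an illustration only: the names `Content`, `render_plain` and `render_html` are hypothetical and are not part of Ambaradan's API.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass(frozen=True)
class Content:
    """Hypothetical stored item: text plus annotation, no rendering logic."""
    text: str
    language: str
    annotations: dict = field(default_factory=dict)


def render_plain(content: Content) -> str:
    # One possible renderer: plain text, e.g. for a terminal.
    return content.text


def render_html(content: Content) -> str:
    # Another renderer: HTML. It can be replaced without touching the data.
    return f'<p lang="{escape(content.language)}">{escape(content.text)}</p>'


item = Content("Tom & Jerry", "en", {"topic": "cartoons"})
print(render_plain(item))
print(render_html(item))
```

Either renderer can be thrown away or rewritten, and the stored item, together with its annotation, remains exactly as it was.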

Your own private cloud

Cloud computing has caused a lot of noise and doubt, as we all moved our data to an unknown destination. It proved a good idea (all success stories are, and clouds are definitely a success story), but it also raised questions about privacy and ultimate control over information. If we all live with clouds, the main power switch is in the hands of those who control the weather.

So the ultimate goal of Ambaradan is to let end users choose their own clouds, be it a totally private system or a large co-operative environment run by volunteers. This is currently no more than a strategic direction. We know what we want to achieve, but we have not committed to any tactical choice of cloud technology.

A word about less resourced cultures

You may be surprised to find that a charity assisting less resourced cultures ends up developing a general purpose industrial tool. You should not be, as this is a political stance. We believe that the road out of the ghetto is called dignity, and respect is something to be earned.

When solutions to the problems of less resourced languages become a general utility, these solutions may become robust, they can be financed and they can grow and spread. We would never be able to assess the complexity of language-dependent content if we did not study this complexity at large. So this complexity and the needs of less resourced cultures are the engine behind the development of this tool.

The need for tremendous individual productivity is much higher where you have but a handful of native speakers who can volunteer for work. It is by addressing this need for a dramatic surge in productivity that we become useful to everyone. Our message is that minorities' problems are everyone's problems: when we solve them, we generate a positive outcome for everyone.

A spoonful of history

This storage engine was originally born as an upgrade for Omegawiki, but in time it has evolved to such a conceptual distance from Omegawiki that at this stage they can only be presented as remote relatives.

Ambaradan has already undergone six alpha releases, and it has chosen PostgreSQL as its relational backend. After the KDE sprint in Randa it made a strategic choice towards a REST interface that makes it an online resource for third parties, an objective to be achieved by the upcoming 0.7.0 release (codenamed Elastica).

The choice to produce data for KDE is strategic. The development of any storage system aiming for success needs to be driven by the needs of large data consumers, rather than by the vision of a handful of developers. We are aware that many things are going to change as people start to complain about things, and we are happy with this source of quality feedback.

Ambaradan in objects

Everything in Ambaradan is an object, and as in all decent programming families (hopefully you got the self-ironic tone here) our objects are made of hierarchies, inherited values and behaviours.
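For readers who prefer to see it rather than read about it, here is a generic sketch of what a hierarchy with inherited values and behaviours looks like. The class names `StoredObject`, `Text` and `Translation` are invented for illustration and are assumptions, not Ambaradan's actual object model.

```python
class StoredObject:
    """Hypothetical root of the hierarchy."""
    kind = "object"  # a value every descendant inherits unless it overrides it

    def __init__(self, label):
        self.label = label

    def describe(self):
        # A behaviour every descendant inherits.
        return f"{self.kind}: {self.label}"


class Text(StoredObject):
    kind = "text"  # overrides the inherited value, keeps the behaviour


class Translation(Text):
    def __init__(self, label, language):
        super().__init__(label)
        self.language = language

    def describe(self):
        # Extends the inherited behaviour instead of replacing it.
        return f"{super().describe()} [{self.language}]"


print(StoredObject("anything").describe())
print(Text("hello").describe())
print(Translation("ciao", "it").describe())
```

`Translation` never declares `kind`, yet it reports itself as a text: that is the inherited value at work.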

To present such uncanny pieces of black computing magic, the usual geek convention is to start from the root element, write loads of pages on how powerful and generic it is, and then slowly get to things a simple user can understand. Too bad that most simple users have long since quit reading by the time things get interesting for them.

We obviously cannot build a house starting from the roof, yet we did our best to build this guide with things that non-programmers can immediately grasp, without any need to get deeply involved in object-oriented theology. You may want to read the following pages to get a good grasp of how this engine works.