Crowdsourcing in science: Annotating, digitalising, collecting data

By the time the term ‘big data’ entered public consciousness, at the very latest, it had become clear that vast amounts of data are part and parcel of modern life. Science is no exception: ever-larger quantities of data must be collected and then analysed. With the aid of crowdsourcing, large volumes of many different types of data can be generated and processed efficiently. The following examples show how varied the uses of crowdsourcing for scientific purposes can be.

1. Crowdsourcing for science – Measuring light pollution

Anyone who moves from the city to the countryside, looks out of the window during a large-scale power cut at night, or travels to a remote area such as the Sahara soon understands what is meant by the term light pollution. Far from human civilisation, the eye is drawn to a spectacular starry canopy; above our cities, even the brightest stars fade into a grey twilight. The cause is light from countless artificial sources being scattered by particles in the air. Measuring this light pollution is difficult without expensive camera installations and laborious manual analysis. The Loss of the Night app from the Leibniz Institute of Freshwater Ecology and Inland Fisheries collects such data from a number of countries. People download the app and open it during a walk at night or an evening on the balcony. The app guides them to various stars and asks whether they can see each one. The result is an extensive, precise and up-to-date map of light pollution across entire countries.
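To illustrate how a series of "can you see this star?" answers could be turned into a single brightness figure for one location, here is a minimal sketch in Python. It is not the app's actual method: the function name, the input format (pairs of star magnitude and a yes/no answer) and the simple threshold search are all assumptions made for illustration. It relies only on the fact that a smaller magnitude means a brighter star.

```python
from typing import List, Tuple


def estimate_limiting_magnitude(answers: List[Tuple[float, bool]]) -> float:
    """Estimate the faintest visible star magnitude from yes/no answers.

    `answers` is a hypothetical input format: (star magnitude, was it seen?).
    The estimate is the magnitude threshold that contradicts the fewest answers.
    """
    if not answers:
        raise ValueError("need at least one answer")

    # Candidate thresholds: every magnitude the observer was asked about.
    candidates = sorted({magnitude for magnitude, _ in answers})

    best_threshold = candidates[0]
    best_errors = len(answers) + 1
    for threshold in candidates:
        # A star counts as "predicted visible" when it is at least as bright
        # as the threshold, i.e. its magnitude is less than or equal to it.
        errors = sum((magnitude <= threshold) != seen for magnitude, seen in answers)
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


# Example: one observer's answers during a single evening (made-up values).
observations = [(1.0, True), (2.5, True), (3.8, True), (4.4, False), (5.1, False)]
limit = estimate_limiting_magnitude(observations)
```

Many such estimates, each tagged with a place and a time, could then be combined into a map. The threshold search tolerates an occasional contradictory answer, which matters when the data comes from untrained observers.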

2. Annotating cells – Hunting cancerous cells and building proteins

It is often the more entertainment-orientated crowdsourcing applications that promise a particularly rapid accumulation of useful data. One example of analysing relevant data in a playful way is a project from the Technical University of Munich. Players annotate cells from histological tissue specimens and must detect cancerous cells on the basis of their typical characteristics. This takes the form of a computer game in which players must ‘shoot’ the cancerous cells. Because accurate hits are what earn a high score, players are motivated to provide precise results. The data acquired in this manner is fed to a computer program whose purpose is to analyse cell samples for cancerous cells independently. The players therefore assist the software in its learning process. Another example of crowdsourcing in science taking the form of a computer game is Foldit, in which players model the three-dimensional (tertiary) structure of proteins. In 2011, players needed only a few weeks to work out the structure of a key protein, a protease, of a monkey virus that causes an AIDS-like disease, a problem that had eluded researchers for years.
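How annotations from many players might be condensed into training labels for such a program can be sketched with a simple majority vote. The following Python example is an assumption for illustration only, not the Munich project's actual pipeline: the function name, the label strings and the two thresholds are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def aggregate_labels(
    annotations: Iterable[Tuple[str, str]],
    min_votes: int = 3,
    min_agreement: float = 0.6,
) -> Dict[str, str]:
    """Combine crowd annotations into one consensus label per cell.

    `annotations` holds (cell id, label) pairs from different players.
    Cells with too few votes or too little agreement are left out, so that
    only reasonably reliable labels reach the learning software.
    """
    votes = defaultdict(list)
    for cell_id, label in annotations:
        votes[cell_id].append(label)

    consensus = {}
    for cell_id, labels in votes.items():
        if len(labels) < min_votes:
            continue  # not enough players have looked at this cell yet
        label, count = Counter(labels).most_common(1)[0]
        if count / len(labels) >= min_agreement:
            consensus[cell_id] = label
    return consensus


# Example with made-up annotations from several players.
crowd = [
    ("cell-1", "cancerous"), ("cell-1", "cancerous"), ("cell-1", "cancerous"),
    ("cell-2", "healthy"), ("cell-2", "cancerous"), ("cell-2", "healthy"),
    ("cell-3", "cancerous"),
]
training_labels = aggregate_labels(crowd)
```

Leaving out uncertain cells rather than guessing is a deliberate choice in this sketch: a learning program generally benefits more from fewer, cleaner labels than from many noisy ones.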

3. Crowdsourcing for science – Digitalising handwriting

Anyone who thinks that crowdsourcing in science is confined to biology should think again. Historians can benefit too, especially when it comes to transcribing documents handwritten in old German script. In the genpas project, for example, the text of old postcards that are no longer subject to copyright is converted into digital form. Because the postcards were written in the old German style of handwriting, which few people can read today, the researchers at the genealogical postcard archive who run the project rely on the assistance of the crowd. Users experienced in this handwriting transcribe the postcard texts and addresses and also check the work of other users for errors. The result is a digital dataset that genealogists and other historians can search quickly and thoroughly for important information.
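Where two users have transcribed the same postcard independently, a simple automated comparison could help decide which cards most need a human check. The sketch below is a hypothetical supplement to the manual review described above, not part of the genpas project; the function names and the 0.95 threshold are assumptions. It uses `difflib.SequenceMatcher` from the Python standard library.

```python
from difflib import SequenceMatcher


def normalise(text: str) -> str:
    """Lower-case the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def similarity(first: str, second: str) -> float:
    """Return a similarity score between 0.0 and 1.0 for two transcriptions."""
    return SequenceMatcher(None, normalise(first), normalise(second)).ratio()


def needs_review(first: str, second: str, threshold: float = 0.95) -> bool:
    """Flag a postcard when two transcriptions differ by more than the threshold allows."""
    return similarity(first, second) < threshold


# Example with made-up transcriptions of the same card.
same_card = needs_review("Liebe Tante Anna", "Liebe  Tante anna")
different_reading = needs_review("Liebe Tante Anna", "Lieber Onkel Hans")
```

Normalising case and whitespace first keeps harmless differences from triggering a review, so that checkers can concentrate on genuine disagreements about what the handwriting says.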

4. Conclusion

The examples given here represent just a few of the possibilities that crowdsourcing offers the scientific world. Wherever large amounts of data must be processed or prepared manually, wherever new ideas must be selected from a wide range of options, or wherever artificial intelligence must be trained for scientific applications, assistants in the crowd can deliver quick, precise and economical results. The public mostly becomes aware of projects that address emotional issues or are presented in an entertaining way, yet it is often the groundwork that benefits most from crowdsourcing. In such cases, a partnership with a professional crowdsourcing provider such as us here at Crowd Guru can be the best way to get the desired results quickly and economically.