Fake tweet campaigns come under fire from Indiana scientists

Check out the conclusions of a recent analysis of fake versus real tweet memes from scientists at the School of Informatics and Computing at Indiana University. It could have serious implications for corporate social marketing campaigns in the future. Scroll down to the interesting point highlighted in bold. (PDF: Fake-tweets-identifier)

In this work we proposed a framework to deal with the problem of clustering memes in social media streams, Twitter in particular. Our system is based on a pre-clustering procedure, called protomeme detection, aimed at identifying atomic tokens of information contained in each tweet. This strategy only requires text processing, therefore is particularly efficient and well suited for a streaming scenario. Memes are thereafter obtained by aggregating protomemes on the basis of the similarity among them, computed by ad-hoc measures defined according to various dimensions including content, the social network, and information diffusion patterns. Such measures only adopt information that can be extracted in a streaming fashion from observed data, and they allow to build clusters of topically related tweets. The meme clustering is carried out by using a variant of Online K-means, which integrates a memory mechanism to keep track of the least recently updated memes. We used a dataset comprised of trending hashtags on Twitter to systematically evaluate the performance of our algorithm and we showed that our method outperforms a baseline that only uses tweet text, as well as one that assumes full knowledge of the underlying social network.

One crucial feature of our system is that it can be extended to work with any clustering algorithm based on similarity (or distances). In this paper, for example, we chose to present Online K-means because of its simplicity; however, during our design we also tested other methods including density-based and hierarchical data stream clustering algorithms (e.g., DenStream [10] and LiarTree). Although a complete benchmark and tuning of these alternative methods was out of the scope of our analysis, we collected evidence of the ease of extension of our framework to different algorithms.

In the future one could extend the set of features incorporated by our clustering framework, considering for instance entities such as images. Furthermore, our preliminary analysis suggests that the introduction of time series as features may yield significant performance improvements. Our long-term plan is to integrate the meme clustering framework with a meme classifier to distinguish engineered types of social media communication from spontaneous ones. This platform will adopt supervised learning techniques to classify memes and determine their legitimacy, with the aim to detect misinformation and deception campaigns in their early stages. The platform will be optimized to work with the realtime, high-volume streams of messages typical of Twitter and other online social media.
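To make the "protomeme detection" step a little more concrete, here is a minimal Python sketch. It is not the authors' code. It assumes the atomic tokens are hashtags, mentions, and URLs (the quoted passage does not list them), and the function names `extract_protomemes` and `group_by_protomeme` are hypothetical. It covers only the text-processing pre-clustering step; the similarity measures and the Online K-means variant are not shown.

```python
import re
from collections import defaultdict

# Assumption: these three token types stand in for the "atomic tokens"
# mentioned in the quoted passage, which does not enumerate them.
URL_RE = re.compile(r"https?://\S+")
TAG_RE = re.compile(r"#\w+")
MENTION_RE = re.compile(r"@\w+")


def extract_protomemes(text):
    """Return the set of (kind, token) pairs found in one tweet's text."""
    urls = URL_RE.findall(text)
    # Remove URLs first so a '#fragment' inside a link is not read as a hashtag.
    rest = URL_RE.sub(" ", text)
    found = {("url", u) for u in urls}
    found |= {("hashtag", t.lower()) for t in TAG_RE.findall(rest)}
    found |= {("mention", m.lower()) for m in MENTION_RE.findall(rest)}
    return found


def group_by_protomeme(tweets):
    """Map each protomeme key to the ids of the tweets that contain it.

    `tweets` is an iterable of (tweet_id, text) pairs and can be consumed
    one item at a time, which suits a streaming setting.
    """
    groups = defaultdict(list)
    for tweet_id, text in tweets:
        for key in extract_protomemes(text):
            groups[key].append(tweet_id)
    return dict(groups)


sample = [
    (1, "Vote now #Election2012 @alice http://example.com/a"),
    (2, "#election2012 is trending"),
    (3, "hello world"),
]
groups = group_by_protomeme(sample)
```

In this sketch, tweets 1 and 2 end up in the same group because they share a hashtag, while tweet 3 carries no token and joins no group. A real system would then merge these groups using the similarity measures the authors describe.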


A mini case study – growth hacking within the enterprise

‘Growth hacking’ is a fashionable subject with the rise of startups, but it’s not so easy for established marketers to know how to use its insights to improve performance in day-to-day business activity.

Part of this is down to the fact that it’s as much about mindset as it is about using tools to achieve growth.

I therefore wanted to share a mini ‘hack’ I achieved at Shopping.com UK, improving our email subscriber rate by 360% at virtually no cost by taking a growth hacking approach to the problem.

The challenge: I needed to add significantly more subscribers to our email newsletter. I did this by finding and then mining an existing SDC e-marketing database, which contained a historic list of inactive fans.

The result: Mining the database, coupled with design input from a creative marketing executive that improved the content and design of the email, achieved a significant increase (360%) in site subscriber sign-ups: from 4,318 (August 2010 newsletter, with a 12.5% open rate) to 19,934 (July 2011 newsletter, with a 57.9% open rate).
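For anyone who wants to check the headline figure, the arithmetic can be sketched in a few lines of Python. The helper name `percent_increase` is just for illustration; the two subscriber counts are the ones quoted above.

```python
def percent_increase(before, after):
    """Percentage growth from `before` to `after`."""
    return (after - before) / before * 100


subscribers_aug_2010 = 4318
subscribers_jul_2011 = 19934

growth = percent_increase(subscribers_aug_2010, subscribers_jul_2011)
print(f"{growth:.1f}% increase")  # prints "361.6% increase"
```

That works out at roughly 360%, or a list about 4.6 times its original size.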