Wednesday, January 18, 2012

Some statistics on new Twitter accounts

Some of you may have noticed that Twitter is approaching 500 million registered accounts. From the daily rate of new registrations Twopcharts is currently estimating that the moment when 500 million accounts are registered will be sometime in the second half of February.
The question that has already been raised many times is how many of these new Twitter accounts are actually active users. Although only Twitter can tell what the exact number is, and how often people log into the system, we can make an effort to estimate how active new accounts are.

We went back to October 28th, 2011, when @AdenMo registered account number 400 million. So far this account has sent only 1 tweet, just after registration, is following 1 account, and is followed by 3 accounts. After checking this account, we also checked the next 99,999 accounts that registered, and came up with the following results:

Out of the 100,000 accounts we checked, all registered almost 3 months ago, 12% have disappeared. It is safe to assume that most of these accounts were deleted, while some will have been suspended or cancelled by Twitter. Some other facts:

  • Of the 88,052 accounts that still exist, 54,879 (62.3%) have not changed their profile image and are still showing us an egg, in one of 7 different background colors.
  • 20.2% of active accounts have filled in the location field, and 17.3% have filled in the bio field.
  • Of the 88,052 accounts, 3,871 (4.4%) have chosen to hide their tweets from the public eye by protecting their accounts.
  • 19,721 existing accounts are not following anyone and are not followed by anyone.
  • The total number of tweets sent by this sample is 3.8 million, an average of 43.8 tweets per account for the period, or just over 0.5 tweets per day per account.
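The percentages above can be rechecked from the raw counts. The following is a minimal Python sketch, not part of the original analysis; it assumes the observation period runs from October 28th, 2011 to the date of this post, and the helper name pct is made up for illustration.

```python
from datetime import date

# Counts taken from the post
registered = 100_000      # accounts checked
existing = 88_052         # accounts that still exist
default_image = 54_879    # accounts still showing the egg
protected = 3_871         # accounts with protected tweets
avg_tweets = 43.8         # average tweets per existing account

def pct(part, whole):
    """Return part / whole as a percentage, rounded to one decimal."""
    return round(100 * part / whole, 1)

# Assumption: the period ends on the date of this post
days = (date(2012, 1, 18) - date(2011, 10, 28)).days

print("disappeared %:", pct(registered - existing, registered))
print("default image %:", pct(default_image, existing))
print("protected %:", pct(protected, existing))
print("tweets per day per account:", round(avg_tweets / days, 2))
```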

The number of 3.8 million tweets can be analyzed to show the difference in active tweet behavior by these relatively new accounts. Total distribution is as follows:


The table shows that 53.8% of the accounts in our sample have never sent a tweet, while 5.2% have sent more than 100 tweets since October 28th, 2011. The latter group is also the most active: 61.2% sent a tweet less than 2 days ago and 92.4% less than 30 days ago. For the less active accounts these figures are obviously different. Of the accounts that sent only 1 tweet, 83.6% sent it more than 2 months ago, in most cases on the registration date.
On the last summary line, it can be seen that 10.1% of these new accounts sent a tweet in the last week and 17.1% in the last 30 days; these could be regarded as more or less active accounts.

With respect to following and followers the following results were measured, based on the still existing new accounts:


The table shows that of these accounts 52.7% have no followers and 24.3% are not following anyone else. A very significant majority of 96.8% of the accounts have 50 followers or fewer, and 90.6% of the accounts are following 50 accounts or fewer.

Conclusion

Based on a sample of 100,000 accounts that were registered about 3 months ago, one has to be very cautious about drawing general conclusions. It is however likely that the statistics for the huge number of new accounts registered in the last year will not differ greatly from what we found here. The vast majority of new accounts have only a limited number of followers and followings, and based on the number of tweets sent and the dates of the last tweets, probably at most 20%-25% convert to active new Twitter users, with only about 10% actively sending tweets.



Tuesday, December 13, 2011

Some Data on the Top Global Twitter Accounts

Unless you have been hiding under a rock for the last few years you will know about Twitter, the micro-blogging website which has grown at a tremendous pace since its beginning more than 5 years ago. We would like to share some data with you that has been compiled by Twopcharts, the website that tracks Twitter users in a number of languages and cities, and offers a number of tools relating to Twitter.

It also tracks the Top Global Twitter Accounts, and has been doing this for the last two and a half years. Twopcharts shows that there are now at least 618 Twitter accounts that have more than 1 million followers, and 38 Twitter accounts with more than 5 million followers. Leader of the pack is Lady Gaga, who had almost 17 million followers at the time of writing.

The number of followers for the 1,000 most followed accounts is very impressive, with an average of 1.6 million followers per account for the group, and still 633,000 followers for the account ranked number 1,000 at this moment. The following graph shows the average number of followers per 10% group of the top 1,000 Global Twitter accounts:


What you can read in this graph is that for the 7th 10% group, or accounts ranked 601 to 700, the average number of followers per account is just below 1 million, at about 950 thousand.

The average number of new followers was more than 91 thousand per account in November 2011. While some accounts gained more than 1 million new followers, even the group of accounts ranked between number 900 and 1,000 gained an average of more than 38 thousand new followers. The following graph shows the complete overview:


Although the top-100 Global Twitter accounts clearly attract the most new followers, it is obvious from the chart that there is a large group of accounts that add more than 1,000 new followers per day.

The average number of 91 thousand new followers in the month of November for the total group of Top-1000 accounts compares to an amount of about 40 thousand new followers per account at the end of 2010 and 25 thousand at the end of 2009.

At this moment Twitter is still growing very fast, and is registering more than 800 thousand new accounts per day. Not all of these accounts will convert into active and engaging Twitter users, but it is clear that the success story of Twitter is far from over, and we can expect that follower numbers will keep increasing in the near future, which will keep expanding the reach of individual Twitter users.

Friday, July 22, 2011

Some data on Twitter followers

Twopcharts is a website that started two years ago to find interesting Twitter users in the Dutch language. Gradually this was expanded, and the website is now tracking Twitter users in 11 languages. In addition to these languages it is also tracking the most followed Twitter users around the globe, a group dominated by people who tweet mostly or occasionally in English. With the vast amount of historical data that is available, interesting things can be done to analyse Twitter use in the different language groups. For the purpose of this blogpost an analysis was made of the number of followers that the most popular Twitter users have by language. For each language group the average number of followers was calculated for the Top-1000 accounts at the end of June 2010 and 2011. The results of these calculations are the following:


The numbers in the table clearly show how fast Twitter has grown in the last year. The 1,000 most followed accounts in the world now have an average of 1.2 million followers. In the middle of March the Global top-1000 reached a combined total of 1 billion followers, which means an average of 1 million followers each.

The second largest language is Portuguese, with an average number of followers of almost 196 thousand. The popularity of the Portuguese language is almost entirely caused by the popularity of Twitter in Brazil.
The third language, with an average of almost 154 thousand followers, is Spanish. From the Spanish Twopcharts it can be observed that Spanish tweeting Twitter users come from a wide variety of locations like Spain, Argentina, Mexico, Colombia and Venezuela.
With an average of more than 84 thousand followers for the Top-1000 accounts, Twitter is also extremely popular in Indonesia.

Disclaimer: The table only shows languages tracked by Twopcharts. Other languages may or may not be more widely used than the ones represented here. For example, the Arabic language has only been tracked since March 2011. At the end of June the average number of followers for the Arabic Top-1000 accounts was about 11 thousand. Other languages, like Japanese, are also very popular, but are currently not tracked by Twopcharts.

Wednesday, July 13, 2011

A picture says more than a thousand words

For many people the number of followers an account has is an important measure of its success. Most people who use Twitter prefer more followers to fewer, and greet their new followers with excitement. That the follower count is perceived as important is demonstrated by all the services that claim to be able to help an account get more followers. Getting lots of followers is not very difficult, if that is what you are after, but the problem is that the quality of these followers is not very high. In many cases the followers of these accounts do not understand the language of the tweets, or are not actually trying to read those tweets at all. Because of all the tricks that are being used, the number of followers becomes a much less reliable measure of success. Luckily there is a way to determine the quality of followers when an account publishes pictures.

The assumption is that many people who read a tweet with a link to a picture will be interested enough to click on the link. So the more followers an account has who read its tweets, the more clicks it will get on the picture links in those tweets. We checked about 5,000 pictures published on Twitpic in June and counted the number of views. We then compared these views with the number of followers of the account holder. We found that the average reach of a picture was 6.3%, where reach is defined as the total number of views of a picture divided by the number of followers. Two thirds of all investigated pictures have a reach within plus or minus 3.2 percentage points of the average, and so fall within a range of 3.1%-9.5%. Based on these numbers one is tempted to conclude that this range can be called reasonably normal, and that a reach outside this range would be abnormal.
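To make the definition of reach concrete, here is a small Python sketch. It is only an illustration under stated assumptions: the function names and the example account (2,000 followers, 126 views) are hypothetical, and only the 6.3% average and the 3.2-point spread come from the figures above.

```python
AVERAGE_REACH = 6.3   # percent, average reach reported in the post
SPREAD = 3.2          # percentage points around the average

def reach_pct(views, followers):
    """Reach as defined in the post: picture views divided by followers, in percent."""
    if followers <= 0:
        raise ValueError("followers must be positive")
    return 100 * views / followers

def in_normal_range(reach):
    """True when the reach lies within the average plus or minus the spread."""
    return AVERAGE_REACH - SPREAD <= reach <= AVERAGE_REACH + SPREAD

# Hypothetical account: 2,000 followers and 126 views on one picture
example = reach_pct(126, 2000)
print(example, in_normal_range(example))
```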

Before we try to reach conclusions, we did an additional check under the assumption that accounts with a lower amount of followers will have a more engaged group of followers. We therefore calculated the average reach for various groups with an increasing amount of followers. The outcome of these calculations is shown in the table below:


The shape of the graph is very interesting, and indicates that the average reach declines up to about 2,000 followers, after which it gradually increases again. The group with at most 500 followers has the highest average reach, at 8.3%. The group with between 1,500 and 2,000 followers has the lowest average reach, at 4.5%.

Based on these calculations it is of course very tricky to draw hard conclusions about the quality of followers. Pictures may or may not have been very interesting to followers, the number of clicks will differ depending on the time a tweet was posted, and obviously pictures will not be viewed only by the followers of an account. The consistency of the calculations is however very appealing, and we will try to draw some conclusions, within a wide margin of error, based on our findings:
  • The reach of 8.3% in the group of accounts with a maximum of 500 followers, plus or minus about 3%, gives the best indication of what can normally be expected.
  • When accounts grow in followers, it is likely that the level of engagement will decrease and it can be assumed that the lower range of 1.5% - 7.5% reach for accounts up to 2,000 reflects this.
  • In many cases it is "hard work" to reach 2,000 followers, and a low percentage is very likely partly caused by the use of "tricks" to reach those followers.
  • The increasing average reach for larger accounts may have something to do with the fact that these accounts can be interesting to many followers, either because they have very interesting and influential tweets, or simply because the account holder is well known.
  • With larger accounts there is probably also a wider impact of retweets, although we did not investigate this. Because of more retweets the group of people who view the pictures of an account can be much wider than only the followers of an account.  
  • A reach percentage lower than 3% will in most cases indicate that there is low engagement of the followers of that account. We noticed that in many cases where reach was very low, there was also a very high ratio of following/followers, sometimes even higher than 1. This is an important indication that there is low mutual engagement.
  • A reach percentage higher than 9% is in many cases achieved because of retweets. So even in cases where engagement of own followers may be low, this is compensated by reaching followers of other accounts through retweets.
At Twopcharts we try to calculate the number of "true" followers of an account based on the views of its pictures. We are inclined to assume a reach of about 3%-6% and to divide the average number of views by it. If we see an account with 30,000 followers whose pictures average only 150 views, we believe a more accurate number of engaged followers is between 2,500 (150 / 6%) and 5,000 (150 / 3%) for that account.
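The 30,000-follower example can be worked through in a few lines. This is a minimal Python sketch of the arithmetic only, not the code Twopcharts runs; the function name and its parameters are made up for illustration.

```python
def engaged_followers(avg_views, reach_low_pct=3, reach_high_pct=6):
    """Estimate a (low, high) range of engaged followers from average picture views.

    Dividing views by the assumed reach gives the follower count that
    would normally produce that many views.
    """
    low = avg_views * 100 / reach_high_pct    # higher assumed reach, fewer followers
    high = avg_views * 100 / reach_low_pct    # lower assumed reach, more followers
    return low, high

low, high = engaged_followers(150)
print(f"between {low:,.0f} and {high:,.0f} engaged followers")
```

For an account with 30,000 followers this range is far below the headline follower count, which is the point of the example.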

With this little investigation we still have not answered the big question of how many people actually read a tweet. Given that reading a tweet takes less effort than clicking through to a picture, however appealing pictures are to view, a multiplier of 2 on the picture reach may be a defensible assumption. This would mean that your tweets are probably read by a maximum of about 10%-15% of your followers....

Sunday, May 22, 2011

What does 300 million registered Twitter accounts mean?

Last week we observed that Twitter had registered 300 million accounts, based on the assumption that all accounts are registered sequentially, starting with number 1 back in March 2006. From observing the registration dates of Twitter accounts and their IDs, this seems a reasonable assumption.

As some observers noted, registered accounts are not the same as active accounts. In the beginning of February, Twitter communicated that there are “about 200 million accounts on Twitter now”. Currently you can read here that Twitter has 200+ million registered users, 155 million tweets per day and 460,000 new sign-ups every day. With an account base of 200 million in the beginning of February and about 0.5 million sign-ups a day, this should equate to about 260 million accounts currently.

How does this compare with our observations and estimates?

We believe that of all registered accounts only 12% get cancelled, which means that out of 300 million registered accounts 36 million were cancelled and 264 million are theoretically still active. This seems consistent with data communicated by Twitter.

Of the estimated 264 million currently registered accounts, we believe about 45% have never sent a tweet. This does not necessarily mean they are not active users of Twitter. There may be accounts that are only used to follow other accounts, without feeling the need to send tweets themselves. About 20% of the active users have sent more than 10 tweets, while 10% of users have sent more than 100 tweets. While these numbers seem fairly consistent between older and newer accounts, they don't say much about currently active users.

From the estimated 264 million active accounts, it is estimated that almost 40 million have sent a tweet less than 2 days ago, 60 million in the last week and more than 85 million users have sent at least 1 tweet in the last 30 days.

Based on our assumption that 45% of accounts have never sent a tweet, 55% of users must have sent at least 1 tweet. The average of tweets sent per day for this group is estimated to be a little more than 1 tweet per day. This equates to about 150 million tweets per day, not far from the official number by Twitter....
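The chain of estimates above is easy to retrace. The following Python sketch simply redoes the arithmetic with the percentages quoted in this post; the variable names are illustrative, and the 150 million tweets per day is the rounded figure from the text.

```python
registered = 300_000_000

cancelled = registered * 12 // 100   # 12% of registered accounts cancelled
remaining = registered - cancelled   # accounts still registered
tweeting = remaining * 55 // 100     # 55% have sent at least 1 tweet

recent_30d = 85_000_000              # accounts that tweeted in the last 30 days
share_recent = round(100 * recent_30d / remaining)

# Tweets per day per tweeting account implied by about 150 million tweets a day
implied_rate = 150_000_000 / tweeting

print(cancelled, remaining, tweeting)
print(share_recent, round(implied_rate, 2))
```

The implied rate comes out slightly above 1 tweet per day, which matches the "little more than 1 tweet per day" used in the estimate.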

Please note that these estimates are based on limited information and interpretations and may or may not be accurate. Only Twitter, with access to all account information and log-in data, will be able to provide complete and accurate data.

Tuesday, September 28, 2010

The Dutch Twitter population as of September 27, 2010

Since last Saturday the Twitter account khtweets has been active. This account is maintained by the Rijksvoorlichtingsdienst (the Netherlands Government Information Service) and mainly provides information about Queen Beatrix, the Prince of Orange and Princess Máxima. After its first 3 days of activity the account was already followed by more than 20,000 Twitter users, which shows that there is a lot of interest on Twitter in the ins and outs of the Royal House.

What is interesting about this account is that one may assume that virtually all of these 20,000 followers are actively tweeting Dutch people. If we then assume that these Twitter users form a random sample of the total Dutch Twitter population, we can draw conclusions about that total population. Readers who consider this nonsensical can regard the following facts and conclusions as applying only to the followers of khtweets.

The graphs are divided into groups of 10% of the population, from low to high for the subject concerned, with average values per group. First we looked at the Twitter age, in days, of the Twitter population:


The graph above shows that the average age on Twitter is now 380 days. The graph rises fairly evenly: the youngest 10% of Twitter users have been active for one month on average, and the oldest 10% have been active on Twitter for almost 2.5 years on average. From the graph it can be deduced that more than half of the Twitter users have now been active for more than a year. A very small group has been active for more than 4 years.


Looking at the number of followers, it stands out that 80% of Twitter users have fewer than 100 followers. The top 10% have an average of 1,196 followers. Within this 10% the spread is very large, because a limited number of Twitter users have a very high number of followers. The average number of followers across the whole population is 159.


Looking at the number of people that users follow themselves, it stands out that 90% of Twitter users follow more people than they are followed by. One explanation could be that people within a limited group follow each other back, while they also always follow a number of people who have large numbers of followers because of their fame or other specific qualities. The overall average number of people followed is 156.


On average, the whole population has sent 1,068 tweets per account. Naturally this number depends strongly on the daily number of tweets sent and the number of days someone has been tweeting. It is clear, however, that there is a limited group of Twitter users who are very active. The following graph shows this more clearly.


The graph shows that the average number of tweets sent per day is 2.6 per account. This average, however, is determined by a relatively small group of Twitter users. 80% of Twitter users send fewer than 2.3 tweets per day on average, while the most active 10% send almost 16 tweets per day. Within this group too, the frequency varies widely. There are a number of Twitter users who send more than 100 tweets per day and have kept this up for quite some time.

Finally, we look at when Twitter users sent their last tweet:


Here it is clear that Twitter users are very active. 60% sent a tweet no more than 1 day ago, while 90% did so within the last 10 days. The last 10% sent their last tweet 90 days ago on average, but have nevertheless logged in to their account in the last few days. Apparently there is a large group that mainly follows other accounts without actively sending tweets themselves.

To conclude, we would like to make a bold prediction about the number of active Dutch Twitter accounts. We make this prediction under the assumption that the total Dutch Twitter population behaves the same as the Twitter users that we track in the Twopcharts. At this moment 8,738 Twitter users are tracked in these Twopcharts. These are all active Dutch-language accounts that have at least 350 followers and/or follow the account nl_twop_1000.
Of this group of Twitter users, 13.3% now follow khtweets. If we therefore assume that the first 20,000 followers of khtweets also make up 13.3% of the Dutch Twitter population, we arrive at just over 150,000 active Dutch Twitter accounts.

For convenience we assume that the number of Dutch Twitter users who tweet mainly in English (and are not in the Twopcharts, but are Dutch) is roughly equal to the number of Belgian/Flemish Twitter users (who are in the Twopcharts, but are not Dutch).
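The extrapolation to about 150,000 accounts is a simple scaling of the sample by its assumed share. Here is a minimal Python sketch of that step; the function name is hypothetical and the figures are the ones quoted in this post.

```python
def estimate_population(sample_size, share_pct):
    """Scale a sample up to the full population it is assumed to represent."""
    return sample_size * 100 / share_pct

tracked_accounts = 8_738   # accounts tracked in the Dutch Twopcharts
share_pct = 13.3           # percent of tracked accounts following khtweets
khtweets_followers = 20_000

# Tracked accounts that follow khtweets, implied by the 13.3% share
tracked_following = round(tracked_accounts * share_pct / 100)

estimate = estimate_population(khtweets_followers, share_pct)
print(tracked_following, round(estimate))
```

The estimate is only as good as the assumption that the tracked accounts and the wider population follow khtweets at the same rate.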

Monday, September 27, 2010

Using Excel as Twitter client with OAuth

Many users of Twitter have found out that Excel is an excellent client for Twitter, and I am sure that through this post on Chandoo.org, a lot of curious people opened their VBA editor, tried it, and subsequently posted their first automated tweet. For me this was just the beginning, and after studying the Twitter API documentation, I wrote functions for just about all API methods. Using Excel proved to be great for all kinds of tasks, like scheduling tweets, sending tweets for different accounts from one dashboard, archiving tweets, mentions and direct messages, and many other tasks, big and small.

The reason it was so simple to use Excel was basic authorization: with just a username and password it was possible to “talk” to the Twitter API. Unfortunately Twitter switched off Basic Auth at the beginning of September, and many Excel sheets that require authenticated requests no longer function. Since transitioning from Basic Auth to OAuth looks complicated, many users of Excel sheets may have given up and accepted that their work is now lost. I also started and stopped looking at the documentation a couple of times, until I finally decided it just had to be possible to make it work. And indeed after a couple of days of hard work (and plenty of frustration) I was able to transition to OAuth. This means that I can take all my Excel sheets and just change the VBA code where the API requests are made, leaving everything else intact.

I will explain here how I did it, so hopefully other Excel users can take advantage of my efforts. I will use a straightforward example, and if you have some understanding of VBA, you should be able to follow the process and later take the code and optimize it for your own needs. In the example I will build an Excel sheet that allows me to send a tweet, going through the process step by step. You can see all the VBA code I used in the Excel sheet which you can download for your own programming pleasure.....

Step 1: Register an application with Twitter

There is no way around this. In order to use OAuth you need application codes and user codes. By registering an application with Twitter you will obtain a consumer key and a consumer secret. For your Twitter account you will get an access token and an access token secret. You will need these four codes before you can continue.

Step 2: Get all required data

For our example the following information is required:

API method = POST
API resource = http://api.twitter.com/1/statuses.update.xml
Oauth consumer key = your_consumer_key
Oauth consumer secret = your_consumer_secret
Oauth signature method = HMAC-SHA1
Oauth version = 1.0
Oauth token = your_token
Oauth token secret = your_token_secret

The API method and resource relate to what you want to do, which in this case is to send a tweet. The signature method and version are given as per the current requirements.

Step 3: Get a nonce (number used once) and a timestamp

You can calculate these yourself by writing a couple of simple functions. The timestamp is simply the number of seconds that have passed since the Unix epoch, which is January 1, 1970 at 00:00:00 UTC. For the nonce I observed in the Twitter examples that it is 41 characters long and only uses letters and numbers. I just created a string which holds all possible and legal characters. Then I randomly picked a character 41 times from this string and put them all together in an output string which becomes the nonce. Twitter seems happy with that solution. A shorter nonce will probably also be ok, as long as it is unique.
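For readers who want to see the idea outside of VBA, here is a minimal Python sketch of both helpers. It is not the code from the Excel sheet: the function names are made up for illustration, and it uses Python's secrets module for the random picks, which is a substitution for whatever random function the VBA version uses.

```python
import secrets
import string
import time

# All legal nonce characters: letters and digits only
ALPHABET = string.ascii_letters + string.digits

def make_nonce(length=41):
    """Pick `length` random characters from ALPHABET and join them into one string."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def make_timestamp():
    """Whole seconds since the Unix epoch, as the string OAuth expects."""
    return str(int(time.time()))

print(make_nonce())
print(make_timestamp())
```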

Step 4: Calculate the base string

This is where things started to get ugly, because it is so easy to do it wrong, and it is not so easy at first to figure out what went wrong. The base string is used to calculate a signature which Twitter requires as part of the API request. If you follow the instructions from Twitter exactly, you should get the correct signature. It is important to keep in mind that there are three parts connected with an ampersand (&), where the second and third parts need to be url encoded. While I was testing my output, I found a very useful tool on http://quonos.nl/oauthTester/ to check the base string.
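The three-part structure is easier to see in code. Below is a minimal Python sketch of the base string as described in the OAuth Core 1.0 document: the HTTP method, the url encoded resource, and the url encoded, alphabetically sorted parameter string. It is not the VBA from the Excel sheet; the function names, the nonce, the timestamp and the status text are placeholder assumptions.

```python
from urllib.parse import quote

def percent_encode(value):
    """Url encode everything except letters, digits and - . _ ~"""
    return quote(str(value), safe="")

def base_string(method, url, params):
    """Join METHOD, encoded url and encoded parameter string with ampersands."""
    # Encode each name and value first, then sort by name
    pairs = sorted((percent_encode(k), percent_encode(v)) for k, v in params.items())
    param_string = "&".join(f"{k}={v}" for k, v in pairs)
    return "&".join([method.upper(), percent_encode(url), percent_encode(param_string)])

# Placeholder values, for illustration only
params = {
    "oauth_consumer_key": "your_consumer_key",
    "oauth_nonce": "abc123",
    "oauth_signature_method": "HMAC-SHA1",
    "oauth_timestamp": "1285545600",
    "oauth_token": "your_token",
    "oauth_version": "1.0",
    "status": "Hello world",
}
result = base_string("post", "http://api.twitter.com/1/statuses.update.xml", params)
print(result)
```

Note that the status text ends up encoded twice (the space becomes %2520), which is one of the details that is easy to get wrong.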

Step 5: Calculate composite signing key

In this case, where you want to send a status you need to create a string, where your oauth_consumer_secret and oauth_token_secret are combined through an ampersand (&). No url encoding is required here.

Step 6: calculate oauth_signature

After mastering the base string, this was the next part which proved to be tricky. First I needed to understand (a little) what a HMAC-SHA1 signature is and then figure out a way to calculate it. Luckily I found all I needed on http://www.cryptosys.net/, which has a library that you can include as a VBA module. Make sure you do the installation, because there is a DLL file required, and please also check the license document.
When I calculated the signature, it did not match the required signature. The issue was that the obtained signature needed to be converted to bytes first, then base64 encoded and subsequently url encoded. I already had the code for the url encoding but did not have a function to perform the other two tasks. I wrote a simple function to convert the output string to bytes, and also found the correct answer for the base64 encoding. When all this finally worked, I had my moment of joy when the signature was exactly identical to the one calculated in the example in the Twitter documentation.
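As a cross-check for steps 5 and 6, here is a minimal Python sketch using only the standard library. It is not the CryptoSys-based VBA code: the function names are made up, and Python's hmac module returns raw bytes directly, so the separate conversion to bytes is not needed here. It also assumes the two secrets are plain letters and digits, so that url encoding them would change nothing.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote

def signing_key(consumer_secret, token_secret):
    """Step 5: both secrets joined by an ampersand."""
    return f"{consumer_secret}&{token_secret}"

def sign(base_string, key):
    """Step 6: HMAC-SHA1 over the base string, returned base64 encoded."""
    digest = hmac.new(key.encode("utf-8"), base_string.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")

def url_encode(value):
    """Final url encoding of the base64 signature."""
    return quote(value, safe="")

key = signing_key("your_consumer_secret", "your_token_secret")
signature = url_encode(sign("POST&example&base-string", key))
print(signature)
```

The base string passed in above is a dummy value; in practice it is the string calculated in step 4.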

Step 7: calculate authorization header

Creating the authorization header is quite similar to creating the base string. No additional url encoding is done here, and besides reading the Twitter documentation, I also carefully read Appendix A.5 from the OAuth Core 1.0 document.
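The header itself is a comma separated list of name="value" pairs after the word OAuth. The following Python sketch shows one way to assemble it; it is an illustration, not the VBA from the Excel sheet, and the function name and example values are assumptions. In this sketch the values are url encoded while the header is built, so a signature that was already url encoded in step 6 should be passed in raw to avoid encoding it twice.

```python
from urllib.parse import quote

def authorization_header(oauth_params):
    """Build the value of the Authorization header from the oauth_* parameters."""
    parts = []
    for name, value in sorted(oauth_params.items()):
        # Each pair is written as name="value", with both sides url encoded
        parts.append('{}="{}"'.format(quote(name, safe=""), quote(value, safe="")))
    return "OAuth " + ", ".join(parts)

# Placeholder values, for illustration only
header = authorization_header({
    "oauth_version": "1.0",
    "oauth_signature": "a+b/c=",
})
print(header)
```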

Step 8: Send the API request

Finally everything has to be brought together in an API request, which looks very similar to the one you probably used before, when basic auth was still working. In my Excel sheet it looks like this:

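' Create the HTTP object and open a synchronous request
Set xml = CreateObject("MSXML2.XMLHTTP")
' cStatus is assumed to be url encoded already, exactly as in the base string
xml.Open cApi_method, cApi_resource & "?status=" & cStatus, False
' Pass the OAuth header calculated in step 7
xml.setRequestHeader "Authorization", cHeader
xml.Send
' Show the XML response from Twitter in the Immediate window
tResult = xml.responseText
Debug.Print tResult
Set xml = Nothing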

And that's all there is to it!

Please note that my goal was to get to my required end result, which was to send a tweet. I have not attempted to create the most efficient and robust code and process to get there. I am convinced that there are far more talented programmers out there, who will be able to take this code and improve it. Please go ahead and share this with the rest of us.....

You can download the Excel sheet I created here.