When you run a project at Eclipse, you are very likely interested in getting some indicators regarding the health of your community.
And as Randall Munroe suggests, maybe these indicators will be a good way for you to extrapolate on when your project will actually rule the world 🙂
Download stats
One key indicator is the number of downloads for your deliverables. With Koneki, we have three main distribution channels for our Lua IDE, Lua Development Tools:
- Our ready-to-use RCP distro, which is served by download.eclipse.org and its mirrors,
- We are part of the Juno aggregator and serve an Eclipse feature for LDT,
- And, last but not least, we use the awesome Eclipse MarketPlace to reach even more users and simplify the installation process.
Making sure you track your eclipse.org downloads
Prior to even trying to consolidate your download statistics, you have to make sure that the files you deliver are correctly tracked by the eclipse.org infrastructure.
For our RCP product, it means that we have to make sure that we use mirror URLs, which go through the eclipse.org mirror script. Not only is the best mirror then likely to be picked, making the download as fast as possible for the end user, but the “hit” is also tracked by the eclipse.org servers.
We’ll see in just a few moments how you can actually access the data collected by this script.
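As an illustration, here is a minimal Ruby sketch of how a direct download.eclipse.org link can be turned into a mirror URL. It assumes the usual `download.php?file=` form of the eclipse.org mirror script; the `mirror_url` helper and the sample file name are made up for the example.

```ruby
require 'uri'

# Assumption: the eclipse.org mirror script lives at this address and takes
# the path of the file (relative to download.eclipse.org) in "file".
MIRROR_SCRIPT = "http://www.eclipse.org/downloads/download.php"

# Hypothetical helper: turns a direct download.eclipse.org link into a mirror URL.
def mirror_url(direct_url)
  path = URI.parse(direct_url).path
  "#{MIRROR_SCRIPT}?file=#{path}"
end

# The file name below is invented for the example.
puts mirror_url("http://download.eclipse.org/koneki/products/example.zip")
```

Any link you publish on your download page would go through such a helper, so that no download bypasses the tracking.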
When it comes to your update sites, whether you are aggregated in the Simultaneous Release train or have separate repositories, you should configure your p2 repositories so that, again, download.eclipse.org is “pinged” every time a given IU is installed by someone.
You very likely don’t want to track all your IUs (except maybe if you have platform-specific fragments and want to track them), and will usually only track your main feature(s).
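To double-check which IUs are actually tracked, a small Ruby sketch can list the artifacts that carry a stats property in a repository's artifacts.xml. It assumes the p2 download stats mechanism relies on a `p2.statsURI` repository property and a `download.stats` property on each tracked artifact; the sample XML, the ids and the `tracked_artifacts` helper are invented for the example.

```ruby
require 'rexml/document'

# Invented sample of an artifacts.xml, trimmed to what matters here.
SAMPLE = <<~XML
  <repository name='Example' type='org.eclipse.equinox.p2.artifact.repository.simpleRepository' version='1'>
    <properties size='1'>
      <property name='p2.statsURI' value='http://download.eclipse.org/stats/example'/>
    </properties>
    <artifacts size='2'>
      <artifact classifier='org.eclipse.update.feature' id='org.example.feature' version='1.0.0'>
        <properties size='1'>
          <property name='download.stats' value='org.example.feature'/>
        </properties>
      </artifact>
      <artifact classifier='osgi.bundle' id='org.example.bundle' version='1.0.0'>
        <properties size='1'>
          <property name='download.size' value='1234'/>
        </properties>
      </artifact>
    </artifacts>
  </repository>
XML

# Hypothetical helper: returns the ids of the artifacts that have a
# download.stats property (assumed to be what marks an IU as tracked).
def tracked_artifacts(xml)
  doc = REXML::Document.new(xml)
  tracked = []
  doc.elements.each("repository/artifacts/artifact") do |artifact|
    stats = artifact.elements["properties/property[@name='download.stats']"]
    tracked << artifact.attributes["id"] if stats
  end
  tracked
end

puts tracked_artifacts(SAMPLE)
```

With the sample above, only the feature is reported, which is the usual situation when you track your main feature and nothing else.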
Accessing the download stats
The statistics of the files downloaded via the mirror script and p2 downloads stats mechanism mentioned earlier are accessible to all Eclipse committers via the My Foundation Portal.
If you are distributing stuff via the Eclipse MarketPlace, you probably already know that the Metrics tab of your project gives you access to the download stats.
Consolidating
Now that stats are being collected for your downloads, and for installations from your update sites or via the Marketplace, I am sure you’d like to monitor them easily, right?
So hopefully you’ll be interested in the following Ruby script:
#!/usr/bin/env ruby
require 'pp'
require 'date'
content = STDIN.read()
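# One block per RCP product archive: the file path, from which the platform
# triplet (e.g. "win32.win32.x86") is captured, followed by one
# "date<TAB>hits" row per day.
# Note: Ruby group names cannot contain "-", hence product_name.
REGEX = %r{
  (?:
    \/koneki\/products\/.*/?org.eclipse.koneki.ldt.product-
    (?<product_name>.*?\..*?\..*?)\.(?:.*)
    \t
    (?<xxx>(?:
      (?<date>\d+-\d+-\d+) \t (?<hits>\d+) \n
    )+)
    .*\n
  )+
}x
# Same idea for the p2 stats of the Juno repository.
REGEX_SIMREL = %r{
  (?:
    \/stats\/releases\/juno\/
    (?<release>.*?\..*?\..*?)\.(?:.*)
    \t
    (?<xxx>(?:
      (?<date>\d+-\d+-\d+) \t (?<hits>\d+) \n
    )+)
    .*\n
  )+
}x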
# Manual corrections for the days where the reported hits are obviously wrong:
# glitches[platform][day] = corrected number of hits
glitches = {
  "linux.gtk.x86_64" => {
    "2012-10-05" => 1, "2012-10-06" => 1, "2012-10-07" => 2,
    "2012-10-08" => 1, "2012-10-09" => 2, "2012-10-10" => 1
  },
  "macosx.cocoa.x86_64" => {
    "2012-10-05" => 2, "2012-10-06" => 2, "2012-10-07" => 3,
    "2012-10-08" => 2, "2012-10-09" => 1, "2012-10-10" => 1
  },
  "win32.win32.x86_64" => {
    "2012-02-15" => 15, "2012-03-03" => 10,
    "2012-10-05" => 2, "2012-10-06" => 2, "2012-10-07" => 2,
    "2012-10-08" => 3, "2012-10-09" => 2, "2012-10-10" => 2
  },
  "win32.win32.x86" => {
    "2012-02-20" => 60, "2012-03-11" => 30, "2012-03-30" => 20,
    "2012-10-05" => 3, "2012-10-06" => 2, "2012-10-07" => 2,
    "2012-10-08" => 3, "2012-10-09" => 2, "2012-10-10" => 3
  }
}
#####################
# downloads
#####################
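content.scan(REGEX) do |m|
  # m[0] is the platform (e.g. "win32.win32.x86"), m[1] the block of "date<TAB>hits" rows
  res = m[1]
  # prefix every row with the platform
  res.gsub!(/(\d+-\d+-\d+)/, "#{m[0]}\t\\1")
  # overwrite the hits of the days that are known to be wrong
  glitches.each do |platform, days|
    days.each do |day, count|
      # the platform is escaped so that its dots are matched literally
      res.gsub!(/(#{Regexp.escape(platform)}\t#{day}\t)\d+/, "\\1#{count}")
    end
  end
  puts res
end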
#####################
# juno stats
#####################
content.scan(REGEX_SIMREL) do |m|
  puts m[1].gsub(/(\d+-\d+-\d+)/, "juno\t\\1")
end
#####################
# marketplace stats
#####################
marketplace = []
marketplace[2011] = []
marketplace[2012] = []
marketplace[2013] = []
marketplace[2011][11] = 103
marketplace[2011][12] = 225
marketplace[2012][1] = 229
marketplace[2012][2] = 234
marketplace[2012][3] = 247
marketplace[2012][4] = 210
marketplace[2012][5] = 242
marketplace[2012][6] = 257
marketplace[2012][7] = 326
marketplace[2012][8] = 273
marketplace[2012][9] = 281
marketplace[2012][10] = 300
marketplace[2012][11] = 299
marketplace[2012][12] = 284
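marketplace[2013][1] = 164 * 31/14 # partial month (14 days so far), extrapolated to 31 days
# spread every monthly total evenly over its days (a month is approximated to 30 days)
(Date.new(2011,11,1) .. Date.new(2013,1,14)).each do |day|
  puts("marketplace\t" + day.to_s + "\t" + (marketplace[day.year][day.month] / 30).to_s)
end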
As you can see, I need to improve the code/comment ratio 🙂 but I am sure you can tweak it to suit your needs.
The main thing that may not be obvious at first sight is that the script expects, on its standard input, the raw HTML of the download stats you want to parse and consolidate, as served by My Foundation Portal.
Please make sure you are in the “Daily download stats per file” view mode before running your query. For Koneki, I run the query against the partial file name “koneki” to get all the information regarding downloads of files whose names contain “koneki”.
The script will then use some regular-expression black magic to arrange your download stats, p2 repo stats, and MarketPlace stats (whose values, as you can see, are stored in the script itself, in the marketplace array) into “downloadtype-date-# of downloads” triplets.
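The script stores one Marketplace figure per month and spreads it evenly over 30 days. A slightly more careful variant, sketched here under the assumption that a plain per-day average is good enough, divides by the real number of days in the month. The `daily_average` helper is invented for the example.

```ruby
require 'date'

# Hypothetical helper: average installs per day for a monthly total,
# using the actual length of the month instead of a fixed 30 days.
def daily_average(year, month, monthly_total)
  days_in_month = Date.new(year, month, -1).day # -1 means "last day of the month"
  monthly_total.to_f / days_in_month
end

# Monthly figures taken from the script above.
puts daily_average(2012, 2, 234).round(1)
puts daily_average(2012, 7, 326).round(1)
```

The difference is small, but it avoids a visible dip or bump in the charts for short and long months.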
You’ll also see that, probably because of some nasty bots, some download stats are erroneous and have to be fixed manually (this is what the glitches hash is for).
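The correction step can also be written without regular expressions, by working on rows that have already been split. This is only a sketch of the idea, not the code used in the script: the `apply_glitches` helper and the sample rows are invented, and only the two corrected values come from the script.

```ruby
# Two of the manual corrections from the script above.
GLITCHES = {
  "win32.win32.x86" => { "2012-02-20" => 60, "2012-03-11" => 30 }
}

# Hypothetical helper: rows are [channel, date, hits] triplets; the hit count
# is replaced whenever a correction exists for that channel and that day.
def apply_glitches(rows, glitches)
  rows.map do |channel, date, hits|
    [channel, date, glitches.dig(channel, date) || hits]
  end
end

# Invented sample rows.
rows = [
  ["win32.win32.x86", "2012-02-20", 4321],
  ["win32.win32.x86", "2012-02-21", 12],
  ["linux.gtk.x86",   "2012-02-20", 7]
]

apply_glitches(rows, GLITCHES).each { |row| puts row.join("\t") }
```

Rows without a matching correction are passed through untouched, so the table of corrections can stay as short as the list of days you actually had to fix.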
If all goes well, the script will output consolidated stats on stdout, … something like this:
linux.gtk.x86	2012-12-17	26
linux.gtk.x86	2012-12-15	7
linux.gtk.x86	2012-12-12	2
linux.gtk.x86	2012-12-11	1
linux.gtk.x86	2012-12-10	2
win32.win32.x86	2012-12-17	44
win32.win32.x86	2012-12-16	2
win32.win32.x86	2012-12-15	8
win32.win32.x86	2012-12-14	3
win32.win32.x86	2012-12-13	5
juno	2012-09-10	74
juno	2012-09-09	66
juno	2012-09-08	59
juno	2012-09-07	53
marketplace	2011-11-01	3
marketplace	2011-11-02	3
marketplace	2011-11-03	3
marketplace	2011-11-04	3
It should now be trivial for you to feed this into Excel, or BIRT, and create a crosstab that you can use as is, or turn into nice charts.
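If you would rather build the crosstab before leaving Ruby, here is a minimal sketch. It assumes the consolidated output is tab-separated, with one “downloadtype, date, count” row per line; the `crosstab` helper is invented for the example, and the sample rows are taken from the output above.

```ruby
# Hypothetical helper: pivots "type<TAB>date<TAB>count" lines into one row per
# date and one column per download type (missing values are left at 0).
def crosstab(lines)
  table = Hash.new { |hash, date| hash[date] = Hash.new(0) }
  lines.each do |line|
    type, date, count = line.chomp.split("\t")
    table[date][type] += count.to_i
  end
  types = table.values.flat_map(&:keys).uniq.sort
  rows = table.keys.sort.map do |date|
    [date] + types.map { |type| table[date][type] }
  end
  [["date"] + types] + rows
end

sample = [
  "linux.gtk.x86\t2012-12-17\t26",
  "win32.win32.x86\t2012-12-17\t44",
  "linux.gtk.x86\t2012-12-15\t7",
  "win32.win32.x86\t2012-12-15\t8"
]

crosstab(sample).each { |row| puts row.join("\t") }
```

The result is still tab-separated, so it can be pasted into a spreadsheet as is, with dates in chronological order.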
Forum activity
Another great metric for evaluating the success of your community is the activity on your forum.
Since the FUDForum instance hosted at eclipse.org exposes RSS feeds for each forum, it is pretty trivial to use these feeds for knowing who posts on your forum, and when.
Again, a small Ruby script is gonna be of great help for consolidating the number of posts per day, as well as knowing who your top contributors are.
require 'rss'
require 'pp'
require 'rss/dublincore'
$totals = Hash.new(0)
$authors = Hash.new(0)
FORUM_ID = 221 # replace this with your frm_id
# counts the posts per day and per author for one chunk of the feed
def calculate(feed)
  feed.items.reverse.each do |item|
    $totals[item.date.strftime("%d/%m/%Y")] += 1
    $authors[item.dc_creator] += 1
  end
end
# it's not possible to get the whole RSS feed at once, so we fetch it by chunks of 50 items
# you might want to iterate more than 8 times in the following loop, if you want to retrieve more than just the 400 last posts...
8.times do |i|
  rss = RSS::Parser.parse("http://www.eclipse.org/forums/feed.php?mode=m&l=1&basic=1&frm=#{FORUM_ID}&n=50&o=#{i*50}", true)
  calculate(rss) if rss.respond_to?(:items)
end
$totals.each do |k,v| puts "#{k}\t#{v}" end
$authors.sort_by {|k,v| v}.reverse.each do |k,v| puts "#{k}\t#{v}" end
This script is way simpler than the previous one, and you should have nothing to adapt besides using your own frm_id instead of Koneki’s.
As with the downloads, you can feed the output of this script into your favorite spreadsheet, and visualize the activity on your forum.
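Since the script prints dates as dd/mm/YYYY, they will not sort chronologically as plain text. Here is a small sketch of regrouping the per-day totals by month before charting them; the `posts_per_month` helper and the sample figures are invented for the example.

```ruby
require 'date'

# Hypothetical helper: takes "dd/mm/YYYY<TAB>count" lines, as printed by the
# script above, and returns [["YYYY-MM", total], ...] in chronological order.
def posts_per_month(lines)
  totals = Hash.new(0)
  lines.each do |line|
    day, count = line.chomp.split("\t")
    month = Date.strptime(day, "%d/%m/%Y").strftime("%Y-%m")
    totals[month] += count.to_i
  end
  totals.sort
end

# Invented sample figures.
sample = ["03/01/2013\t4", "28/12/2012\t2", "15/12/2012\t5"]
posts_per_month(sample).each { |month, total| puts "#{month}\t#{total}" }
```

Monthly totals tend to be easier to read than daily ones on a forum that only gets a few posts per day.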
eclipse.org resources are precious so please try to avoid running this script for digging into the whole history of your forum, especially if it is pretty large.
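One way to stay gentle with the servers is to stop as soon as a chunk comes back empty, and to pause between requests. The sketch below only shows that loop: `fetch_chunk` is a stand-in for whatever actually downloads and parses one chunk of the feed, and the `pause` and `max_chunks` defaults are arbitrary.

```ruby
# Hypothetical helper: calls fetch_chunk.call(offset) for offsets 0, 50, 100...
# and stops at the first empty chunk or after max_chunks requests.
def fetch_posts(fetch_chunk, chunk_size: 50, max_chunks: 8, pause: 1)
  posts = []
  max_chunks.times do |i|
    chunk = fetch_chunk.call(i * chunk_size)
    break if chunk.nil? || chunk.empty?
    posts.concat(chunk)
    sleep(pause) # leave the server some breathing room between requests
  end
  posts
end

# Fake fetcher standing in for the RSS download: pretends the forum has 120 posts.
fake_forum = (1..120).to_a
fake_fetch = ->(offset) { fake_forum[offset, 50] }

puts fetch_posts(fake_fetch, pause: 0).size
```

Saving the result to a local file and only fetching the most recent chunks on later runs would reduce the load even further.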
I hope you found all of this useful. Feel free to comment, fork, adapt, and improve these scripts, and to share the metrics you are monitoring!