Eclipse IoT Day Singapore Videos are Available!

Last month, we held our very first Eclipse IoT Day in Asia (in Singapore, more specifically), and it was a blast!

Ian Craggs, one of our speakers, did a really great job of writing up his summary of the event and I don’t feel like I have much to add! It was great to see that for this first event in the area, we ended up not only with an impressive line-up of speakers from all around the world but also a room literally packed with attendees.

In case you couldn’t attend, you will certainly be happy to hear that all the sessions were recorded, and are now available on our YouTube Channel. Check them out! You can also find a copy of the presentations here.

How many lines of open source code are hosted at the Eclipse Foundation?

Spoiler alert: 162 million!

That’s right, as of August 1st, there are 330 active open-source projects hosted at the Eclipse Foundation and if you look across the 1120 Git repositories that this represents, you will find over 162 million physical source lines of code. But beyond this number, let’s look at how it was obtained, and what it really means.

I’ve blogged several times about the importance of using metrics to monitor the health (and hopefully, growth!) of an open source project/community, and lines of code are just one. You should always have other metrics on your radar like the number of contributors, diversity, etc.

There are many ways, and many tools available out there, to count source lines of code. Open Hub (previously known as Ohloh) used to be a really good tool, but it doesn’t seem to be actively maintained. For a few years now, I’ve been relying on a home-made script to analyze Eclipse IoT projects, and it’s only recently that I realized I should probably run it against the entire eclipse.org codebase!

In this blog post, I will briefly talk about how the aforementioned script works, explain why you should take these metrics with a pinch of salt, and finally go through some noteworthy findings.

Line counting process

The script used to count the number of lines of code is available on GitHub. It takes a list of Eclipse projects’ identifiers (e.g. ‘iot.paho’) and a given time range as input, and outputs a consolidated CSV file.

The main script (main.js) uses the Eclipse Project Management Infrastructure (PMI) API to retrieve the list of Git repositories for the requested projects and then proceeds to clone the repos and run the cloc command-line tool against each repo. The script also allows computing the statistics for a given time period, in which case it looks at the state of each repository at the beginning of each month for that period.
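The per-repository step can be sketched roughly as follows. This is a minimal Python illustration and not the actual main.js: it assumes git and cloc are installed and on the PATH, that the repository has already been cloned, and it leaves out the PMI lookup entirely. The function names (month_starts, parse_cloc_json, count_repo_at) and the default ref origin/HEAD are hypothetical choices made for the example.

```python
import json
import subprocess
from datetime import date


def month_starts(first, last):
    """Return the first day of every month from `first` to `last`, inclusive."""
    days = []
    year, month = first.year, first.month
    while (year, month) <= (last.year, last.month):
        days.append(date(year, month, 1))
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return days


def parse_cloc_json(text):
    """Map each language in a cloc JSON report to its number of code lines."""
    report = json.loads(text)
    return {
        language: stats["code"]
        for language, stats in report.items()
        if language not in ("header", "SUM")  # skip cloc's metadata and totals
    }


def count_repo_at(repo_dir, day, ref="origin/HEAD"):
    """Run cloc on the last commit reachable from `ref` made before `day`.

    Assumption: `repo_dir` is an existing clone; `ref` is a hypothetical default.
    """
    commit = subprocess.run(
        ["git", "-C", repo_dir, "rev-list", "-1",
         f"--before={day.isoformat()}", ref],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    if not commit:
        return {}  # no commit existed yet at that date
    subprocess.run(
        ["git", "-C", repo_dir, "checkout", "--quiet", "--detach", commit],
        check=True,
    )
    report = subprocess.run(
        ["cloc", "--json", "--quiet", repo_dir],
        check=True, capture_output=True, text=True,
    ).stdout
    # cloc prints nothing when it finds no source files
    return parse_cloc_json(report) if report.strip() else {}
```

Calling count_repo_at once per date returned by month_starts gives one snapshot per month for a repository, which is the kind of data a trend analysis would need.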

Once the main script has completed (and it can obviously take quite some time), the csv-concat.js script can be used to consolidate all the produced metric files into one single CSV file that will contain the detailed breakout of lines of code per project and per programming language, the affiliation of the project to a particular top-level project, the number of blank or comment lines, etc. It is then pretty easy to feed this CSV into Excel or Google Spreadsheets, and use it as the source for building pivot tables for specific breakouts.
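If you would rather not go through a spreadsheet, the consolidation and a simple pivot can also be done in a few lines of Python. This is only a sketch of the idea, not a port of csv-concat.js: the column names used here (project, language, code) are assumptions and may not match the headers the real script produces.

```python
import csv
import io
from collections import defaultdict


def concat_csv(sources):
    """Merge several CSV streams that share a header into one list of row dicts."""
    rows = []
    for source in sources:
        rows.extend(csv.DictReader(source))
    return rows


def pivot_sum(rows, key, value="code"):
    """Sum the `value` column for each distinct value of the `key` column."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += int(row[value])
    return dict(totals)


if __name__ == "__main__":
    # Two tiny in-memory files standing in for per-project metric files
    # (made-up numbers, hypothetical column names).
    file_a = io.StringIO("project,language,code\niot.paho,Java,100\niot.paho,C,40\n")
    file_b = io.StringIO("project,language,code\niot.kura,Java,250\n")
    rows = concat_csv([file_a, file_b])
    print(pivot_sum(rows, "language"))
    print(pivot_sum(rows, "project"))
```

With real files, the io.StringIO objects would simply be replaced by file handles opened with newline="" as recommended by the csv module.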

Caveats

Just like virtually any KPI, you want to take the number of lines of code in your project with a grain of salt. Here are a few things to keep in mind:

All lines of code are not created equal

There is an incredible diversity of projects at Eclipse, and while the majority use Java as their main programming language, there’s also a lot of C, C++, Python, JavaScript, etc. 10M lines of Java code probably don’t carry the same value (i.e. how much effort has been needed to produce them) as 10M lines of C code.

Trends are more important than snapshots

It is nice to know that as of today there are 162 million lines of code in the Eclipse repositories, but it is, in my opinion, more important to look at trends over time. Is a particular programming language becoming more popular? Are all the top-level projects equally active?

I didn’t have a chance to run the scripts for a longer time period yet, but I will make sure to share the results when I get a chance!

Generated code, should it count?

There is a fair amount of generated code in some projects (in the Modeling top-level project in particular, of course), which certainly accounts for a few million lines of code. However, generated code is often customized, so I think it doesn’t necessarily skew the numbers as much as one would think.

Development does not always happen in a single branch

My script just looks at the code stored in the main (HEAD) branch of the Git repository. Some projects may have more than one development stream and may e.g. have a “develop” branch that is ahead of the main stable branch. Therefore, there is very likely more code in our repositories than what this quick analysis shows.

Additional findings

As my script outputs pretty detailed statistics, it is interesting to have a quick look at e.g. how the different top-level projects and programming languages compare.

Top 3 top-level projects: Runtime, Technology & Modeling

Top-level project   Physical SLOC
rt                  54,961,728
technology          28,887,621
modeling            27,140,344
tools               14,214,182
webtools            9,651,900
eclipse             6,401,518
ee4j                5,809,126
ecd                 3,114,768
polarsys            3,105,229
iot                 2,930,217
birt                2,235,624
science             1,670,051
datatools           939,424
mylyn               767,652
soa                 752,774
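To put these numbers in perspective, here is a small Python sketch that computes each top-level project’s share of the total. The figures are copied from the table above; the shares helper is just an illustrative name. The rows add up to 162,582,158 lines, consistent with the “over 162 million” headline figure.

```python
# Physical SLOC per top-level project, copied from the table above.
SLOC_BY_TOP_LEVEL = {
    "rt": 54_961_728,
    "technology": 28_887_621,
    "modeling": 27_140_344,
    "tools": 14_214_182,
    "webtools": 9_651_900,
    "eclipse": 6_401_518,
    "ee4j": 5_809_126,
    "ecd": 3_114_768,
    "polarsys": 3_105_229,
    "iot": 2_930_217,
    "birt": 2_235_624,
    "science": 1_670_051,
    "datatools": 939_424,
    "mylyn": 767_652,
    "soa": 752_774,
}


def shares(counts):
    """Return each entry's fraction of the overall total."""
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


if __name__ == "__main__":
    print(f"total: {sum(SLOC_BY_TOP_LEVEL.values()):,}")  # total: 162,582,158
    for name, share in shares(SLOC_BY_TOP_LEVEL).items():
        print(f"{name:<12}{share:6.1%}")
```

Run as a script, this shows that the top three top-level projects together account for roughly two thirds of all the code.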

Top programming language: Java

Programming language            Physical SLOC
Java                            72,349,870
HTML                            61,119,106
XML                             7,543,689
ANTLR Grammar                   3,161,339
JSON                            2,313,556
JavaScript                      2,251,418
C++                             2,245,759
C                               1,446,013
XMI                             1,355,914
C/C++ Header                    1,019,368
TTCN                            923,098
Maven                           884,271
CSS                             805,073
Assembly                        717,771
XSD                             688,764
PHP                             459,237
Python                          316,553
Markdown                        304,421
XSLT                            256,857
Scala                           229,560
Bourne Shell                    214,142
Go                              184,306
SWIG                            152,062
JSP                             142,190
Gencat NLS                      125,251
Ant                             113,133
TypeScript                      108,217
AsciiDoc                        105,552
Windows Module Definition       64,843
TITAN Project File Information  64,014
Groovy                          55,261
Sass                            53,915
XQuery                          51,432
XHTML                           51,166
DTD                             51,052
make                            48,021
Perl                            43,643
DITA                            42,526
yacc                            39,876
TeX                             36,400
m4                              34,438
AspectJ                         33,717
Ruby                            28,355
Scheme                          27,484
YAML                            26,348
CMake                           25,182
Lua                             23,646
LESS                            18,712
SQL                             16,070
Cucumber                        15,454
IDL                             12,564
INI                             12,171
Bourne Again Shell              11,978
Pascal                          11,915
lex                             11,795
DOS Batch                       11,675
Windows Resource File           10,278
Blade                           8,295
C#                              7,983
Tcl/Tk                          7,611
Stylus                          7,477
Fortran 90                      7,211
ERB                             7,048
Vuejs Component                 6,281
Visualforce Component           5,047
MSBuild script                  4,538
Freemarker Template             4,077
Dockerfile                      3,696
Velocity Template Language      3,649
awk                             3,068
Rust                            2,903
Qt                              2,772
CUDA                            2,533
Puppet                          2,084
diff                            1,880
Haml                            1,819
Oracle PL/SQL                   1,778
ProGuard                        1,739
Objective C                     1,469
ActionScript                    1,459
Visual Basic                    1,365
Mathematica                     1,247
RobotFramework                  1,074
Korn Shell                      1,023
D                               1,007
Smalltalk                       911
R                               887
TOML                            826
Ada                             668
Lisp                            618
Objective C++                   589
Fortran 77                      588
Arduino Sketch                  480
MATLAB                          476
sed                             461
Protocol Buffers                454
WiX source                      446
JavaServer Faces                440
PowerShell                      284
Qt Project                      176
Windows Message File            139
Expect                          120
NAnt script                     110
Smarty                          109
HCL                             78
CoffeeScript                    78
Skylark                         74
Forth                           69
Qt Linguist                     61
WiX include                     52
XAML                            49
QML                             48
Handlebars                      46
Clojure                         38
Prolog                          37
Razor                           32
PO File                         29
Haskell                         27
JSX                             24
ASP.NET                         21
HLSL                            15
F#                              11
Swift                           10
GLSL                            8
Kotlin                          7
C Shell                         7
Mustache                        1

If you end up using my script and have any questions, please let me know in the comments or directly on GitHub!

Eclipse Kura on Steroids with UPM and Eclipse OpenJ9

So it’s been a while since the last time I blogged about a cool IoT demo… Sorry about that! On the bright side, this post covers a couple projects that are really, really, neat so hopefully, this will help you forgive me for the wait! 🙃

UP Squared Grove IoT Development Kit

At the end of last year, a new high-performance IoT developer kit was announced. Built on top of the UP Squared board, it features an Intel Apollo Lake x86-64 processor, plenty of GPIOs, two Ethernet interfaces, USB 3.0 ports, an Altera MAX 10 FPGA, and more. You can get the kit from Seeed Studio for USD 249.

The UP Squared Grove IoT Development Kit

Of course, it wouldn’t be a Grove kit without the Grove shield that can be attached on top of the board to simplify the connection to a wide variety of sensors and actuators (and there are actually a few of them in the kit).

Running Eclipse Kura on the UP Squared board

Enough with the hardware! With all this horsepower, it is of course very tempting to run Eclipse Kura on this. Since the UP Squared is based on an Intel x86-64 processor, it is incredibly easy to start by replacing the default OpenJDK JVM with Eclipse OpenJ9. Here’s your two-step tutorial to get Eclipse OpenJ9 and Eclipse Kura running on your board:

In case you are wondering how much faster OpenJ9 is compared to OpenJDK or Oracle’s JVMs, here’s a quick comparison of the startup time of Eclipse Kura on the UP Squared:

Eclipse Kura start-up time on Intel UP Squared Grove kit

UPM

UPM is a set of libraries for interacting with sensors and actuators in a cross-platform, cross-OS, language-agnostic way.

There are over 400 sensors & actuators supported in UPM. Virtually all the “DIY” sensors you can get from SeeedStudio, Adafruit, etc. are supported, but beyond that, UPM also provides support for a wide variety of industrial sensors.

Thanks to Eclipse Kura Wires and the underlying concept of “Drivers” and “Assets”, Kura provides a way to access physical assets in a generic way.

In the next section, we will see a proof-of-concept of UPM libraries being wrapped as Kura “drivers” in order to make it really simple to interact with the 400+ kinds of sensors/actuators supported by UPM.

Integrating UPM in Kura Wires

UPM drivers are small native C/C++ libraries that expose bindings in several programming languages, including Java, and therefore calling UPM drivers from Kura is pretty simple.

The only thing you need is a few JARs for UPM itself (and for MRAA, the framework that is supporting it), the JARs for the driver(s) of the particular sensor(s) you want to use, and the associated native libraries (.so files) for the above. As you may know, OSGi makes it pretty easy to package native libraries that may go alongside Java/JNI libraries, so there is really no difficulty there.

In order for the UPM drivers to be accessible from Kura Wires, and to expose “channels” corresponding to the methods available on them, they need to be bundled as Kura Drivers. This is also a pretty straightforward task, and while I created the driver for only a few sensor types out of the 400+ supported in UPM, I am pretty confident that Kura drivers can be automatically generated from UPM drivers.

You can find the final result on my GitHub: https://github.com/kartben/org.intellabs.upm.

See it in action!

So what do we end up getting, and why should you care? Just check out the video below and see for yourself!
