Measuring the Source Code

For our case study we selected the ArchLinux software distribution (http://archlinux.org), which contains thousands of packages, all open source. ArchLinux is a lightweight GNU/Linux distribution whose maintainers do not modify the source code they package, in order to drastically reduce the time between the official release of a package and its integration into the distribution. There are two ways to install a package in ArchLinux: using the official precompiled packages, or building from source code with the Arch Build System (ABS).

ABS makes it possible to retrieve the original, pristine source code of all the packages. This is different from other distributions, which make copies of the source code of the packages and often patch it to adapt it to the rest of the distribution. With ABS, we can gather source code from its original location, at the upstream projects’ websites and repositories, in an automatic fashion. This ensures that the source code has not been modified, and therefore that the case studies in our sample are independent. As we will show later in the results section, this property of independence is crucial for the validity of the results.

Because of the size of ArchLinux, using it as a case study gives us access to the original source code of thousands of open source projects, through the build scripts used by ABS (see Example 8-1).

Example 8-1. Header of a sample build script in ArchLinux

pkgname=ppl
pkgver=0.10.2  # (1)
pkgrel=2  # (2)
pkgdesc="A modern library for convex polyhedra and other numerical abstractions."
arch=('i686' 'x86_64')
url="http://www.cs.unipr.it/ppl"
license=('GPL3')
depends=('gmp>=4.1.3')
options=('!docs' '!libtool')
source=(http://www.cs.unipr.it/ppl/Download/ftp/releases/$pkgver/ppl-$pkgver.tar.gz)  # (3)
md5sums=('e7dd265afdeaea81f7e87a72b182d875')  # (4)

(1) Version of the package. Used to build the download URL.

(2) Release number of the ArchLinux package itself. In this example the download URL is built from pkgver only.

(3) Source code download URL.

(4) Checksum of the source tarball.

Example 8-1 shows the header of a sample build script in ABS. The header contains meta-information that is used to gather the sources, retrieve other dependencies from the package archives, and classify the package in the archives once it is built. We have used the fields highlighted in Example 8-1 to retrieve the source code of all the packages in the ArchLinux archives.
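
To make the use of these fields concrete, the following is a minimal sketch of how a download URL could be derived from a header like the one in Example 8-1 and how a downloaded tarball could be checked against its checksum. It is not the tooling used in the study: the function names (parse_header, expand, md5_matches) are hypothetical, and the parser assumes one simple assignment per line with no spaces inside array items, which real build scripts do not guarantee.

```python
import hashlib
import re

# A subset of the header from Example 8-1, without the callout markers.
HEADER = '''\
pkgname=ppl
pkgver=0.10.2
pkgrel=2
source=(http://www.cs.unipr.it/ppl/Download/ftp/releases/$pkgver/ppl-$pkgver.tar.gz)
md5sums=('e7dd265afdeaea81f7e87a72b182d875')
'''


def parse_header(text):
    """Collect simple name=value assignments; (...) arrays become lists.

    Assumption: one assignment per line, no spaces inside array items.
    """
    fields = {}
    for line in text.splitlines():
        match = re.match(r'^(\w+)=(.*)$', line.strip())
        if not match:
            continue
        name, value = match.groups()
        if value.startswith('(') and value.endswith(')'):
            items = value[1:-1].split()
            fields[name] = [item.strip('\'"') for item in items]
        else:
            fields[name] = value.strip('\'"')
    return fields


def expand(template, fields):
    """Substitute $name and ${name} with scalar fields from the header."""
    def lookup(match):
        name = match.group(1) or match.group(2)
        value = fields.get(name)
        # Leave unknown or non-scalar variables untouched.
        return value if isinstance(value, str) else match.group(0)
    return re.sub(r'\$(?:\{(\w+)\}|(\w+))', lookup, template)


def md5_matches(data, expected):
    """Compare the MD5 digest of the downloaded bytes with the header value."""
    return hashlib.md5(data).hexdigest() == expected


fields = parse_header(HEADER)
urls = [expand(url, fields) for url in fields['source']]
print(urls[0])
```

Once a tarball has been fetched from urls[0], its bytes can be passed to md5_matches together with fields['md5sums'][0] to confirm that the pristine upstream source was retrieved.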

For all the source code gathered, we determined the programming language of every file using the SlocCount tool (http://www.dwheeler.com/sloccount). Using only C language code for our sample, we measured several size and complexity metrics using the Libresoft tools’ cmetrics package (http://tools.libresoft.es/cmetrics).
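
As a rough illustration of the kind of size metric involved, the sketch below selects C files by extension and counts source lines of code, that is, lines with some content outside comments. It is a simplified stand-in and not the algorithm of SlocCount or cmetrics: the names is_c_file and count_sloc are hypothetical, language detection by extension is an assumption, and comment markers inside string literals are not handled.

```python
import os

# Assumption: C files are recognized by extension only.
C_EXTENSIONS = {'.c', '.h'}


def is_c_file(path):
    """Return True if the file name carries a C source or header extension."""
    return os.path.splitext(path)[1] in C_EXTENSIONS


def count_sloc(source):
    """Count lines with at least one non-whitespace character outside comments."""
    count = 0
    in_block = False  # True while inside a /* ... */ comment
    for line in source.splitlines():
        code = []
        i = 0
        while i < len(line):
            if in_block:
                end = line.find('*/', i)
                if end == -1:
                    break  # comment continues on the next line
                in_block = False
                i = end + 2
            elif line.startswith('/*', i):
                in_block = True
                i += 2
            elif line.startswith('//', i):
                break  # rest of the line is a comment
            else:
                code.append(line[i])
                i += 1
        if ''.join(code).strip():
            count += 1
    return count


SAMPLE = '''\
/* header comment
   spanning two lines */
#include <stdio.h>

int main(void) {   // entry point
    return 0; /* done */
}
'''

print(count_sloc(SAMPLE))
```

Applied to every file under an unpacked source tree for which is_c_file returns True, a counter of this kind yields one size value per file, which can then be aggregated per package.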
