Scientometric Study of the IEEE Transactions on Software Engineering ...

3 downloads 0 Views 327KB Size Report
about eminent researchers in their field, identify key papers to read (typically re- ... 1 Growth of the TSOFT literature (a) and (b) Overall distribution of co- ...
Scientometric Study of the IEEE Transactions on Software Engineering 1980-2010 Brahim Hamadicharef

Abstract. In this paper a scientometric study on the IEEE Transactions on Software Engineering 1980-2010 is presented. Using the full records from the Thomson Reuters (ISI) Web of Science (WoS), the journal’s bibliometric measures are examined in terms of growth of literature, authorship characteristics, country of origin, distribution of articles’ citations and references, and finally graph network of the research collaborations. A keyword occurrence analysis was carried out on the articles’ title and the results used to create TagClouds showing perceptually their research importance. Furthermore, these TagClouds were created for the full 3 decades, shorter 5-year periods and most recent years, providing insights into potential research trends and help to relate them to major historic contributions in software engineering.

1 Introduction The journal IEEE Transactions on Software Engineering (TSOFT) has been publishing article for more than 30 years and it was timely to carry out a scientometric study on its body of literature. The usefulness of such studies has been demonstrated as it allows not only to identify key bibliometric aspects of a particular research topic [1][2], a journal [3][4], or a series of conferences [5][6]. More than just providing bibliometric measures, it can also help young novice in their field to learn about eminent researchers in their field, identify key papers to read (typically reviews and surveys), journal to publish in. Furthermore, keyword analysis also allow to create network of research based on co-authorship (i.e. collaboration), research topics, etc. The remainder of the paper is organized as follows. In Section 2 we detail the analysis methodology and results from the bibliometric study. Finally in Section 3, we conclude the paper.

Brahim Hamadicharef Tiara, 1 Kim Seng Walk, Singapore 239403 e-mail: [email protected] F.L. Gaol et al. (Eds.): Proc. of the 2011 2nd International Congress on CACS, AISC 144, pp. 101–106. c Springer-Verlag Berlin Heidelberg 2012 springerlink.com 

102

B. Hamadicharef 1100

2500 1000

N = 157.9398xYear0.82734 (R2 = 0.9804) 900

2000

700

Number of Papers

Accumulative number of papers

800

1500

600

500

1000 400

300

500 200

100

Year

(a)

2010

2009

2008

2007

2006

2005

2004

2003

2002

2001

2000

1999

1998

1997

1996

1995

1994

1993

1992

1991

1990

1989

1988

1987

1986

1985

1984

1983

1982

1981

1980

0 0 0

1

2

3

4

5

6

7

8

9

10

11

Number of Coauthors

(b)

Fig. 1 Growth of the TSOFT literature (a) and (b) Overall distribution of co-authorship

2 Methodology and Results The Thomson Reuters ISI Web of Science (WoS) online interface was used to retrieve the records (with abstract) of the journal. Each year has one file with all the entries, saved as text files, and subsequently processed using MATLAB scripts. We present the growth of literature, authorship, citations, countries of origin, number of references, and international collaborations.

2.1 Growth of Literature The growth of the TSOFT literature over the last 30 years (1980 to 2010) is shown in Figure 1(a). A quasi-linear growth can be observe, which is different from other studies [7], which usually showed power law characteristics. Although it is interesting to observe such growth, one useful bibliometric measure to calculate is its obsolescence. In the 70s, researchers established mathematical law for describing the temporal loss of utility [8], an indicator of obsolescence [9] called half-life h (associated to an annual loss), that can be calculated for a set of documents. Using the 30 years of records, the half-life (h) for the TSOFT literature is equal to h = 15.62 with an annual loss of 4.34%.

2.2 Authorship In Figure 1(b) the distribution of co-authors is shown with a peak at two co-authors. Looking at the (color-coded) yearly basis distribution revealed more details, as shown in Figure 2(a). For e.g., the contributions from recent years show a slightly larger number of co-authors compared to early years. In the 80s, in particular, a large number of articles had only one or two co-authors, whereas in recent years

Scientometric Study of the IEEE Transactions on Software Engineering 1980-2010

103

50

45

40

Number of Papers

30

25

20

15

2

10

log(Number of papers)

1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010

35

1

10

10

5

0

10

0

1

2

3

4

5

6

7

8

Number of Coauthors

(a)

9

10

0

10

1

2

10

10

3

10

log(Number of citations)

(b)

Fig. 2 Yearly distribution of co-authorship (a) and (b) Citations of the TSOFT literature

this seems to increase more papers having 5, 6 or 7 co-authors. One paper on Detection of Mutual Inconsistency in Distributed Systems has the highest number (10) of authors [10].

2.3 Citations In Figure 2(b), the citation distribution (i.e. the number of citations versus the number of papers) is shown as a loglog plot. Such typical distribution can be fitted with a regression model of the form y = a × x + b, with a=-1.32 and b=7.20 (goodnessof-fit R2 = 0.90). Notice that the model is a power law but shown as linear due to the loglog scales. The most cited paper is cited 811 times, TSOFT papers are on average cited 22.22 times (median 9.00 times). Some 13.38% of these publications have never been cited.

2.4 Countries of Origin There are in total 46 countries contributing to the TSOFT literature (WoS has 1632 entries with the field country of origin) and include USA (932, 57.11%), Canada (108, 6.62%), Italy (103, 6.31%), U.K. (82, 5.02%), Germany (56, 3.43%), France (42, 2.57%), Norway (35, 2.14%), Japan (30, 1.84%), Taiwan (21, 1.29%), Australia (19, 1.16%), India (18, 1.10%), Israel (17, 1.04%), China (14, 0.86%), Belgium (13, 0.80%), Korea (13), Spain (11, 0.67%), Sweden (11), Greece (10, 0.61%), etc. In Figure 4(a), the graph of international collaborations is shown, with link between countries. Of all countries, the main strength is between the USA and Canada, and with U.K., Italy, Germany and France. In Figure 3(b), the international collaboration is shown at researchers’ level (100 top researchers publishing in TSOFT).

AR

CA

IT

NZ

AU

PL

JP

AT

CZ

KR

* CN

*

G

*

*

BY *

HR

BALSAMO, S

ZAVE, P * DEMILLO, R *

IL

RU

HU

BG

TW

IN

BY *

TR

(a)

OSTERWEIL, L

CERI, S * SARKAR ,D * ZHAN G, X * BINK LEY, D FENT ON, MU N RP HY BA ,G TO * RY GR ,D ISW * TU OL R D, KIM SK W I, W FE , K * * C RR H EU ARI, N D G * ,T *

* * ,R A ER S, LZ EW BA DR C AN NG, D * YA RY, R ,V * PE LICH C J R, RA RE ME KE * E, J AO A ER, PORT T * LEE, * EN, S ZWEB MOK, A *

SG

R

SI

GOEL, A

GOUDA, M * * SINHA, M ,I * AKYILDIZ L, B KORE * S, N ULO PO ,M N SO MA * ROUS HAR ,L KE AR ,P CL BU * N N VA * JI, DE S AV * S, DH G FO MA A K, , S NT VA EE O L N

BALSAMO, S

CH

BAGRODIA, R *

*

US

S HU ELB Y AN W AT G ,R HO ,J ER FF MA S, R * LZ N * MA , TH NN D * OM , G AS * IAN ,A GH OS H, S * RYDE * R, B * JEFF ERY, D FEATHE R, M * GEHANI ,N *

MX

HO

BR

K

SE

DE

LAM, S * BHAR GAVA ,B * DILL ON, L * JAJO DIA, S * PAR NA S, D LIT TL * EW CH O AN OD JA G, ,B L S * KR OTE KIT AME , P * R, C R J AM HE AM NH AM O ,B O R TH Y, C

UK IE

D

*

* N ,B D, EH WIN IB E * SE EID ,W NU HN EN SC WD HO , S U I, F YA TAN ,R S ER * BA ER MM N, M KE NSE GE JOR J HT, IG KN N SON, VE LE D, L BRIAN R, E WEYUKE BASILI, V

IY ER , MA YU R * MR , ITH LUQ C I WO AM SIL OD ,K * BE S * IDE RS CH ,C AT Z, AR A ISH * OLM ROSE NBLU , E M, D ROMA N, G VONB OCHM ANN, G * YU, P NI, L RA

CU

SHIN, K *

NL NO

FI

TIAN, J * C GHEZZI, D, M HARROL KL, P FRAN T * ATA, * MUR ,B EHM K * BO I, TA V R, B IGO , GL AH W ,R OR YL M, G TA G RA L, KA ME ER TH RO

BE

FR

ES

ES

B. Hamadicharef

PT *

104

(b)

Fig. 3 TSOFT (a) countries collaboration and (b) researchers collaboration network

The evolution of accumulated contributions per countries is shown in Figure 4(a). From the very early years, the USA has the most significantly contributed to the TSOFT literature. It is worth noticing the increasing contributions from Canada, U.K, Italy, Germany and France, as well as Norway and Japan.

2.5 Number of References From 1980 to 2010 the yearly average number of references per article grew from 12.89 to 42.85 (median 9.96 to 41.07). The article with the greatest number of reference is by M. Jørgensen and M. Shepperd on a systematic review of software development cost estimation studies [11] with 311 references. On the same year, J. Hannay and colleagues also published a TSOFT paper on a systematic review of theory use in software engineering experiments [12] with 182 references.

2.6 Keywords Analysis The tool WORDLE [13] was used to to create TagClouds from the keywords distribution of the TSOFT articles’ title. As shown in Figure 5, there are many keywords which are all equally sized, showing their importance in the research published in TSOFT. The 30 years of contribution was split into smaller 5-year periods, to be able to compare different advancements in software engineering from the 80s, the 90s and the first decade after the year 2000 (e.g. Figure 5(a) and Figure 5(b)). Focus was given to the recent years as well as an overall picture for the 30 years of publications as shown in Figure 5(c).

Scientometric Study of the IEEE Transactions on Software Engineering 1980-2010

CANADA USA ENGLAND FRANCE NORWAY GERMANY ITALY JAPAN

1600

300

200

1400

100

1200

50

Number of references

Accumulative Contributions

105

1000

800

30

10

600

5 400

3

200

2010

2009

2008

2007

2006

2005

2004

2003

2002

2001

2000

1999

1998

1997

1996

1995

1994

1993

1992

1991

1990

1989

1988

1987

1986

1985

1984

1983

1982

1981

2010

2009

2008

2007

2006

2005

2004

2003

2002

2001

2000

1999

1998

1997

1996

1995

1994

1993

1992

1991

1990

1989

1988

1987

1986

1985

1984

1983

1982

1981

1980

1980

1

0

Year

Year

(a)

(b)

Fig. 4 Accumulated contribution to the TSOFT literature per countries (a) and (b) Distribution of the number of references

(a) 2000-2004

(b) 2005-2009

(c) All

Fig. 5 TagClouds of TSOFT articles’ title

3 Conclusions After 30 years of publications, it was timely to present a scientometric study of the IEEE Transactions on Software Engineering. In this paper, based on bibliometric measures, characteristics such as growth of literature, authorship, country of origin, citations, references, were detailed. A graph was created to present the journal’s network of collaborations between authors. Keywords from on the articles’ title, and in particular their occurrence, were analyzed and the results used to create TagClouds. These are showing the key research topics within the software engineering research field. Furthermore, using the full 3 decades, shorter 5-year periods and most recent years, insights into potential research trends could be investigate and thus help to relate these topics to major historic contributions in software engineering.

106

B. Hamadicharef

References 1. Patra, S.K., Mishra, S.: Bibliometric study of bioinformatics literature. Scientometrics 67(3), 477–489 (2006) 2. Chiu, W.-T., Ho, Y.-S.: Bibliometric analysis of tsunami research. Scientometrics 73(1), 3–17 (2007) 3. Hamadicharef, B.: Scientometric Study of the Journal NeuroImage 1992–2009. In: Proceedings of the 2010 International Conference on Web Information Systems and Mining (WISM 2010), Nanjing, China, October 23-24, pp. 201–204 (2010) 4. Thanuskod, S.: Journal of Social Sciences: A Bibliometric Study. Journal of Social Sciences 24(2), 77–80 (2010) 5. Hamadicharef, B.: Bibliometric Study of the DAFx Proceedings 1998-2009. In: Proceedings of the 13th International Conference on Digital Audio Effects (DAFx 2010), Graz, Austria, September 4–6, pp. 427–430 (2010) 6. Bartneck, C., Hu, J.: Scientometric Analysis of the CHI Proceedings. In: Proceedings of the Conference on Human Factors in Computing Systems (CHI 2009), Boston, USA, April 4–9, pp. 699–708 (2009) 7. Hamadicharef, B.: Brain–Computer Interface (BCI) Literature – A Bibliometric Study. In: Proceedings of the 10th International Conference on Information Science, Signal Processing and their applications (ISSPA 2010), Kuala Lumpur, Malaysia, May 10–13, pp. 626–629 (2010) 8. Brookes, B.C.: The Growth, Utility, and Obsolescence of Scientific Periodical Literature. Journal of Documentation 26(4), 283–294 (1970) 9. Line, M.B.: The ’half-life’ of periodical literature, apparent and real obsolescence. Journal of Documentation 26(1), 46–54 (1970) 10. Parker, D.S., Popek, G.J., Rudisin, G., Stoughton, A., Walker, B.J., Walton, E., Chow, J.M., Edwards, D., Kiser, S., Kline, C.: Detection of Mutual Inconsistency in Distributed Systems. IEEE Transactions on Software Engineering 9(3), 240–247 (1983) 11. Jørgensen, M., Shepperd, M.: A Systematic Review of Software Development Cost Estimation Studies. IEEE Transactions on Software Engineering 33(1), 33–53 (2007) 12. Hannay, J.E., Sjoberg, D.I.K., Dyba, T.: A Systematic Review of Theory Use in Software Engineering Experiments. IEEE Transactions on Software Engineering 33(2), 87–107 (2007) 13. Viegas, F.B., Wattenberg, M., Feinberg, J.: Participatory Visualization with Wordle. IEEE Transactions on Visualization and Computer Graphics 15(6), 1137–1144 (2009)