Interactive Poster: Treemap Visualizations of Newsgroups

Andrew T. Fiore MIT Media Laboratory E15-468 Cambridge, MA 02139 USA +1 617 253 4576 [email protected]

Abstract

In this paper, we describe treemap visualizations of Usenet newsgroups. These images illuminate patterns of behavior and suggest interfaces for easier exploration of large-scale social cyberspaces.

1. Introduction

In our treemaps [1, 2, 3] of Usenet, the size of a box reflects the number of messages that have been posted in the newsgroup that it represents. If a newsgroup has children (e.g., rec.motorcycles contains rec.motorcycles.dirt), they appear inside its bounds, and the parent’s size reflects the cumulative number of messages in the newsgroup itself and in all of its children. The number of posts made in the parent group is reflected in the area of the parent that is not occupied by any children.

Assigning colors to the rectangles that represent newsgroups allows us to encode another dimension of information about the groups. Our first treemaps of Usenet indicated growing groups with green and shrinking groups with red; the intensity of the color revealed the extent of the growth or decline. But color can represent other metrics as well. We have also examined treemaps colored by number of posters, number of replies, average message length, and number of.

The metric that determines the size of each newsgroup’s rectangle (in this case, number of messages) must aggregate up the hierarchy. (That is, if comp.sys has 1,000 messages, then comp must have at least 1,000 messages, because it encompasses comp.sys and, very likely, other groups as well.) For color metrics, this constraint does not hold, so we are free to use descriptors, like the average message length, that do not add up from child to parent. To avoid jarring discontinuities and to capture the flavor of subtrees of newsgroups, we determine the color of a newsgroup that contains other groups by taking the weighted average of its children’s colors.
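The two aggregation rules above can be illustrated with a minimal Python sketch. It is not the code used to produce the treemaps: the `Group` class, its field names, and the message counts are hypothetical, and the choice to weight each child's color by its cumulative message count is an assumption, since the text does not say which weights are used.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Group:
    """Hypothetical record for one newsgroup in the hierarchy."""
    name: str
    own_posts: int            # messages posted directly in this group
    own_metric: float = 0.0   # color metric for this group alone
    children: list[Group] = field(default_factory=list)


def total_posts(group: Group) -> int:
    """Size metric: the group's own posts plus those of all descendants."""
    return group.own_posts + sum(total_posts(c) for c in group.children)


def color_metric(group: Group) -> float:
    """Color metric: a leaf uses its own value; a parent takes the
    weighted average of its children's values. Weighting by cumulative
    post count is an assumption, not something the paper specifies."""
    if not group.children:
        return group.own_metric
    weight = sum(total_posts(c) for c in group.children)
    if weight == 0:
        return group.own_metric
    return sum(total_posts(c) * color_metric(c) for c in group.children) / weight


# Made-up numbers, for illustration only.
dirt = Group("rec.motorcycles.dirt", own_posts=300, own_metric=50.0)
other = Group("rec.motorcycles.example", own_posts=100, own_metric=90.0)
parent = Group("rec.motorcycles", own_posts=200, children=[dirt, other])

size = total_posts(parent)     # 200 own posts + 300 + 100 from children
shade = color_metric(parent)   # (300 * 50 + 100 * 90) / 400
```

In this sketch the parent's rectangle is sized for 600 messages, of which the 200 posted directly in the parent correspond to the area not covered by children.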

Marc A. Smith Microsoft Research One Microsoft Way Redmond, WA 98052 USA +1 425 706-6896 [email protected]

2. Learning from treemaps of Usenet

When printed in a small space at low resolution, treemaps lose much of their detail, but high-level patterns in the structure of Usenet remain visible (cf. micro/macro design [5]). The relative activity levels of the various hierarchies, for example, become immediately apparent.

Hierarchies also vary in the extent to which they are subdivided. Sub-hierarchies like alt.music and soc.culture have fairly flat structures, with few further subdivisions below the third level. In contrast, hierarchies like microsoft.public and comp have grown deeper structures, with more levels but fewer nodes at each level, so they appear as busy, overlapping spaces.

In the following sections, we present some of the observations we gleaned from studying the treemaps and then consulting tables of data from the Netscan project to quantify what we saw.
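One way to quantify the difference between flat and deep hierarchies is to count the distinct nodes at each level of the dotted newsgroup names. The sketch below is only an illustration: the function name `depth_profile` and the group names passed to it are hypothetical, and it does not reproduce the Netscan tables.

```python
from collections import defaultdict


def depth_profile(names):
    """Count the distinct hierarchy nodes at each level of a collection
    of dotted newsgroup names (level 1 is the top-level hierarchy)."""
    nodes = set()
    for name in names:
        parts = name.split(".")
        # Every prefix of a name is a node in the hierarchy.
        for i in range(1, len(parts) + 1):
            nodes.add(tuple(parts[:i]))
    profile = defaultdict(int)
    for node in nodes:
        profile[len(node)] += 1
    return dict(sorted(profile.items()))


# Hypothetical names: a flat sub-hierarchy and a deep, narrow one.
flat = depth_profile(["soc.culture.a", "soc.culture.b", "soc.culture.c"])
deep = depth_profile(["comp.sys.x.y.z"])
```

A flat structure shows many nodes at one level and none below it, while a deep structure shows few nodes spread over many levels.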

Top-level hierarchies

The top-level hierarchies, which correspond to the first part of each newsgroup’s name, provide the crudest topical organization of the groups. Figure 1 shows a treemap of the activity, by number of posts, of all newsgroups in December 2001. The largest separate regions represent the top-level hierarchies. In this treemap, green groups have grown since November 2001, while red groups have shrunk. More intense color indicates greater change. The activity levels of many newsgroups and sub-hierarchies vary dramatically over time, frequently growing or declining by 30 to 40 percent in a month.

In Figure 1, the alt hierarchy looms over the rest, occupying more than half of the image, a massive continent of loosely related newsgroups making up 36 percent of all newsgroups and receiving 47 percent of all messages from 44 percent of all posters. Alt has grown so large in part because its newsgroup creation process is less restrictive than that of the other hierarchies, which follow formally established procedures to create new groups. This means that the most active area of Usenet is not governed by the same political system that rules the others. Activity does not necessarily equal quality, value, or user satisfaction, but it does demonstrate a way in which different patterns of social regulation affect the growth and structure of social cyberspaces [4].

Figure 1. Treemap of all Usenet newsgroups by number of posts in December 2001. Green groups have grown since November 2001; red groups have shrunk. The intensity of the color indicates the degree of growth or decline.
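The green/red encoding described for Figure 1 could be computed along the following lines. This is a hedged sketch, not the mapping actually used for the figure: the function name `growth_color`, the `full_scale` parameter (the relative change at which the color saturates), the black color for unchanged groups, and the handling of groups with no posts in the previous month are all assumptions.

```python
def growth_color(prev_posts, curr_posts, full_scale=0.5):
    """Map month-over-month change in posts to an (r, g, b) triple.

    Growth is green, decline is red, and intensity scales with the
    relative change, saturating at full_scale (an assumed cutoff).
    """
    if prev_posts <= 0:
        # Assumption: a group that was empty last month counts as
        # fully grown if it has any posts now, unchanged otherwise.
        change = 1.0 if curr_posts > 0 else 0.0
    else:
        change = (curr_posts - prev_posts) / prev_posts
    intensity = min(abs(change) / full_scale, 1.0)
    level = round(255 * intensity)
    if change > 0:
        return (0, level, 0)   # growing: green
    if change < 0:
        return (level, 0, 0)   # shrinking: red
    return (0, 0, 0)           # unchanged: black (assumed neutral color)


unchanged = growth_color(100, 100)
grown = growth_color(100, 150)    # +50 percent, saturated green
shrunk = growth_color(100, 75)    # -25 percent, partial red
```

With `full_scale=0.5`, the monthly swings of 30 to 40 percent mentioned in the text would show as strong but not fully saturated colors.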

3. Conclusions

Treemaps hold promise for the study and exploration of large-scale social cyberspaces. These maps capture high-level patterns in such spaces while preserving the ability to examine fine detail. As an on-screen interface, treemaps suffer from the low resolution of typical display devices. The most rewarding use of treemaps of Usenet comes from up-close study of a high-resolution printed copy. We are presently developing an interactive panning and zooming interface that permits “drilling down” through the hierarchy of newsgroups, which should provide an effective alternative to high resolution for making detail accessible. Such an interface could make the process of finding a group to read quicker and more certain as to the quality of the conversation, especially if the maps were colored by some indicator of interactivity, like number of replies.

4. Acknowledgements

We thank the Netscan team, in particular Duncan Davenport; Helena Mentis of Cornell; and Fernanda Viegas of MIT.

5. References

1. Bruls, M., K. Huising, and J. J. van Wijk. Squarified treemaps. In Proceedings of Joint Eurographics and IEEE TCVG Symposium on Visualization (TCVG 2000), IEEE Press, pp. 33-42.
2. Shneiderman, B. and M. Wattenberg. Ordered Treemap Layouts. In Proceedings IEEE Symposium on Information Visualization 2001, October 2001.
3. Shneiderman, B. Tree visualization with treemaps: a 2d space-filling approach. ACM Transactions on Graphics, vol. 11, 1 (January 1992), 92-99.
4. Smith, Marc. “Invisible Crowds in Cyberspace: Measuring and Mapping the Social Structure of USENET.” In Communities in Cyberspace, M. Smith and P. Kollock (eds.), Routledge Press, London (1999).
5. Tufte, E. The Visual Display of Quantitative Information. Graphics Press, Cheshire CT, 1983.