Big Data & Digital Innovation

VU University Amsterdam

Big Data & Digital Innovation Academic Paper

Name: P.V. da Cruz Caria (Pedro)
Student number: 2542322
E-mail: [email protected]
Specialization: BA: Information & Knowledge Management
Supervisor: Dr. M. Agterberg
Date: 26/06/2015


Academic Paper Premaster IKM

P.V. da Cruz Caria

Preface

This research paper is the final report and conclusion of the premaster Business Administration: Information & Knowledge Management at VU University Amsterdam. The copyright of this research paper rests with the author. I would like to thank both companies (referred to as ComA and ComB) and their employees, who made themselves available for interviews. Finally, I would like to thank my supervisor, Dr. Agterberg, who ‘kept me on the right path’ during this research. I hope that this research paper offers deeper insight into and understanding of Big Data and digital innovation.

P.V. da Cruz Caria, Premaster student VU University Amsterdam 26-06-2015


Table of Contents

1 Introduction
  1.1 Research questions
2 Theoretical background
  2.1 Big Data
    2.1.1 Big Data views
  2.2 Innovation
    2.2.1 Data driven innovation
  2.3 Conceptual Model
    2.3.1 Big Data enablers
    2.3.2 Big Data to discovery
3 Methods
  3.1 Research design & Context
  3.2 Data collection & Analysis
4 Results
  4.1 Big Data enablers
    4.1.1 Absorptive Capacity & Technical Connectivity
    4.1.2 Resource Governance
  4.2 Big Data and the development stage
5 Discussion & Conclusion
  5.1 Synthesis
  5.2 Theoretical contributions
  5.3 Practical implications
  5.4 Limitations and implications
  5.5 Conclusion
6 References
7 Appendix
  7.1 Interview Guide
  7.2 Interview transcripts
  7.3 Codebook


1 Introduction

In the continuously changing world of technical innovation, progress and development, a lot of data is ‘created’. People use mobile devices, browse the internet and generate digital transactions. This kind of data (information) can be user generated or machine generated (Power, 2014). Data created in this way can become ‘Big Data’. Big Data caught my attention because various popular news articles have been published about the subject. Many claims have been made that Big Data will revolutionize businesses and allow for the creation of new products (Ekbia, et al., 2014). For example, retail companies can use Big Data to monitor social media sites and obtain a detailed view of customers’ preferences, behavior and product perception. Big Data can be of value for the financial sector as well: banks could use all of their customers’ transaction data (or aggregated data) to create a detailed profile of their (financial) behavior. This profile may be used to create more personalized offers, products and services. In academic articles Big Data is described more precisely as “data sets and analytical techniques in applications that are so large and complex that they require advanced and unique data storage, management, analysis, and visualization technologies” (Chen, Chiang, & Storey, 2012). This is a broad verbal definition of Big Data. In order to determine what Big Data means for innovation processes, a sound definition needs to be chosen from the existing literature. The literature offers several definitions and explanations, and the theoretical background of this paper aims to clarify the differences between them. Two Big Data enablers are identified from the literature as well. These enablers contribute to and stimulate the existence of Big Data (analytics). An analysis of the definition used in this paper is also given.
Research which has been conducted on Big Data mainly focuses on the definition, analytics, storage and implementation of Big Data. As many researchers point out (Ekbia, et al., 2014), Big Data in itself does not automatically represent value to a firm; it only becomes valuable when it is analyzed (and ‘used’) by companies for various processes. Researchers agree that Big Data, with its new capabilities and possibilities, influences the way organizations do business (Chen & Zhang, 2014). Whether this influence affects the innovation process at organizations remains unclear and needs to be studied in more depth. In today’s ever-changing and fast-moving world, organizations need to create lasting and durable value for their stakeholders in order to survive. Organizations need to perform well in order to keep up with or stay ahead of their competition. Digital innovation may be a vital part of the path to success. As Baregheh et al. point out, “[o]rganizations need to innovate in response to changing customer demands and lifestyles and in order to capitalize on opportunities offered by technology and changing marketplaces, structures and dynamics” (Baregheh, Rowley, & Sambrook, 2009). Innovating can be described as the creation of new knowledge or ideas which facilitate new business outcomes (ibid). Digital product innovation is known as “a product, process, or business model that is perceived as new, requires some significant changes on the part of adopters, and is embodied in or enabled by IT” (Fichman, Dos Santos, & Zheng, 2014). An innovation goes through several stages, from the idea itself to the impact it has on society or the market. Four broad stages can be identified in a digital innovation process: discovery, development, diffusion and impact. The theoretical background provides more details about various views on the matter. Based on the theory, the expectation is that Big Data changes the discovery stage of the digital innovation process. This expectation is visualized in a conceptual model, which formed the basis for the data collection. The relations are based on the fact that existing literature has already identified data itself as an innovation ‘driver’ or stimulator (Kusiak, 2009; Jetzek, Avital, & Bjorn-Andersen, 2014). So, ‘normal’ or ‘conventional’ data analytics stimulates the discovery stage of digital innovation (Kusiak, 2009; Jetzek, Avital, & Bjorn-Andersen, 2014). As pointed out earlier in this introduction, Big Data brings new characteristics and possibilities. It possesses more, and more detailed, information. Besides these characteristics (which are further explained in the theoretical background), Big Data also comes with deep analytical tools. These tools, often referred to as Big Data analytics, have tremendous data processing capabilities and are powerful enough to conduct deep statistical analyses (Chen & Zhang, 2014). The results of these analyses are quickly accessible and easy to use for innovation managers. During the data collection the research took a different turn, because the innovation processes at the organizations studied turned out to be different than expected.
Both companies which provided the interviewees do not use Big Data to truly stimulate the discovery stage of innovation, which made it difficult to analyze Big Data’s impact on this stage. Because of the open, exploratory approach of this study, it was nevertheless possible to arrive at interesting findings on two Big Data enablers and on how Big Data influences the development stage at organizations. This research paper aims to clarify what this effect is, not only by analyzing the existing literature but also by conducting qualitative research. Some organizations find it difficult to implement Big Data (analytics) or to utilize Big Data to its full capacity. This paper offers practitioners better insights, so that organizations can adjust their innovation processes to the capabilities of Big Data.


1.1 Research questions

After analyzing the literature on both Big Data and innovation, the following question arises:

RSQ: How can Big Data support innovation processes in organizations and what are the requirements for doing so?

To support this question, the following sub-questions are introduced:

Sub 1: What are the profound characteristics of Big Data and how different is it from ‘normal’ data?
Sub 2: What is digital product innovation and which stages can be identified within digital innovation processes?
Sub 3: What are the enablers of Big Data and how can Big Data enable discovery through ‘openness’?

2 Theoretical background

Before this theoretical background clarifies the subjects Big Data and digital innovation, some examples are given of how Big Data can be used to create new products or services (the act of innovating). Banks generate huge amounts and huge ‘flows’ of customer transaction data. This data (and aggregated data) can be used to create automatic personal offers based on customers’ buying and spending behavior. Another example is an insurance company which can calculate casualty insurance premiums based on where and how people drive their cars. A different example is The Weather Company, which not only uses Big Data to calculate and improve weather predictions but also uses it to predict how shoppers’ habits sway with the weather (ZDNet, 2015). It offers retail companies insight into patterns in which storm forecasts have a direct effect on the sales of goods. This information gives retail companies the ability to adjust their supply chain strategies and their staffing needs. In short, Big Data is mostly about connecting different complex data streams, which require a large amount of storage and considerable computing power before they can be ‘used’ in the form of analytics.

2.1 Big Data

The introduction of this research paper gave a brief definition of Big Data; this section discusses further what Big Data is. Researchers who have studied Big Data agree that it is digital information (Agarwal & Dhar, 2014; Tambe, 2014). Every computer and mobile device contains digital information, but this does not mean that it is automatically Big Data. From this statement the following question can be derived: What are the profound characteristics of Big Data and how different is it from ‘normal’ data?


Many researchers refer to the 3V’s for the definition and characterization of Big Data. The 3V’s were developed by Laney (2001) and stand for volume, velocity and variety. Volume refers to the size of the data set and most literally describes the ‘Big’ in Big Data. Velocity describes the ‘speed’ of the data, which can refer to the data generation and/or the analysis. Variety covers the range, type and source of data sets. It is interesting to note that the 3V’s were first identified in 2001 to explain the three dimensions along which data change. At the time it was not foreseen that these three dimensions would form the basis for the definition of Big Data. In their research on Big Data, Chen & Zhang (2014) add a fourth V, value. Chen, Chiang and Storey (2012) do not identify Big Data as a different field but use the term interchangeably with business intelligence and analytics (BI&A). In their research paper they provide a framework that identifies the evolution, applications, and emerging research areas of BI&A. Two complementary views have been built on the 3V’s: the 4V’s, created by IBM researchers and supported by Chen and Zhang among others, and the five dimensions of Big Data (Power, 2014). The 4V’s are based on four dimensions: volume, velocity, variety, and veracity or value. Veracity or value refers to the quality of the data: can the data which has been collected be trusted? (Chen & Zhang, 2014). The five dimensions are identified by Power (2014) and are (1) data volume, (2) data variety, (3) data velocity, (4) data variability and (5) data complexity. Power used field research papers from Gartner, IDC, IBM and SAS to create the dimensions. He suggests that Big Data “is supposedly at the extreme end of one or more of the [mentioned] dimensions” (Power, 2014). (1) Data Volume refers to how many units of data are stored.
(2) Data Variety refers to all the different types of digital data, for example photos, videos and text documents. (3) Data Velocity concerns both the ‘speed’ at which the data is produced and the ‘speed’ at which the data must be processed in order to be analyzed. (4) Data Variability refers to data flows which “can be highly inconsistent with periodic peaks” (Power, 2014). Data is gathered from various sources, which raises the challenge of how these different types of data can be linked, matched and transformed across data systems. This challenge explains the fifth dimension, (5) Data Complexity. The Five Dimensions of Data are finally summarized as “Data is an expanding ‘box’ with multiple attributes” (Power, 2014). Chen, Chiang and Storey have developed a different view in their research paper. They define Big Data as BI&A (Business Intelligence and Analytics) and use the terms interchangeably (Chen, Chiang, & Storey, 2012). BI&A “is often referred to as the techniques, technologies, systems, practices, methodologies, and applications that analyze critical business data to help an enterprise better understand its business and market and make timely business decisions”. In their research on BI&A they have created a data dimension evolution model. This model contains three levels of ‘data evolution’ and its analytics: BI&A 1.0, BI&A 2.0 and BI&A 3.0. In their comprehensive model Chen, Chiang and Storey (2012) focus, besides data evolution, on the applications and emerging research of BI&A.

BI&A 1.0: Finds its foundation in data management and warehousing. The data at this level is “structured, collected by companies through various legacy systems, and often stored in commercial relational database management systems” (ibid) and consists of structured content. This means the data is mainly created by internal company processes.

BI&A 2.0: At this level the data is collected through the unique data collection and analytical research and development opportunities offered by the Internet and the Web. This evolution level has surfaced since the early 2000s and is also called “web-based unstructured content” (ibid).

BI&A 3.0: The data is derived from the ‘Internet of Things’, which means that the data is collected from all internet-enabled devices equipped with various sensors. In 2012 Chen, Chiang and Storey pointed out that “although the coming of the Web 3.0 era seems certain, the underlying mobile analytics and location and context-aware techniques for collecting, processing, analyzing and visualizing such large-scale and fluid mobile and sensor data are still unknown” (ibid). They call BI&A 3.0 “Mobile and Sensor-based Content” (ibid).

For organizations, their model means that every type of information, or every level of BI&A, could require different analytical applications, which need to be examined first before Big Data is ‘used’ to create value for organizations. This underlines the relevance of BI&A for ‘using’ Big Data.

2.1.1 Big Data views

The following table displays the four views on Big Data:

3V’s (1)    4V’s (2)    Five Dimensions (3)    BI&A (4)
Volume      Volume      Data Volume            BI&A 1.0
Velocity    Velocity    Data Velocity          BI&A 2.0
Variety     Variety     Data Variety           BI&A 3.0
            Value       Data Variability
                        Data Complexity

Table 1: four views on Big Data
(1) (Laney, 2001); (2) (Chen & Zhang, 2014); (3) (Power, 2014); (4) (Chen, Chiang, & Storey, 2012)

After examining the different views on Big Data, it can be concluded that they overlap. The 3V’s, 4V’s and Five Dimensions are all based on volume, velocity and variety. The 4V’s add another V, value. The Five Dimensions include variability and complexity next to the known 3V’s. The BI&A evolution model takes a different approach and focuses on defining different levels of data gathering and analysis rather than on characterizing the Big Data set itself. For this research paper Power’s (2014) definition of Big Data, the ‘Five Dimensions’, is used. This is the most comprehensive and most recent definition. ‘Normal’ data can have some volume, velocity, variety, variability and complexity. So, how does Big Data differ from normal data? Big Data has a higher volume, higher velocity, higher variety and higher variability than ‘normal’ data (Power, 2014; Chen & Zhang, 2014). Together these create a high data complexity. This also means that “data is from multiple sources and it is difficult and challenging to link, match, cleanse and transform data across systems” (Power, 2014). This could be identified as the biggest challenge in Big Data and Big Data analytics. In this research Big Data is visualized as a circle with the five dimensions. The extent to which data ‘moves’ along each dimension makes it ‘big’. When only one of the five dimensions is present, one cannot automatically speak of Big Data. All the dimensions are therefore necessary, and to analyze which stage of digital innovation is influenced it is necessary to ‘take’ Big Data as a whole. A visualization of Big Data and its dimensions is given below.

Figure 1: Big Data and its dimensions
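The reading of the Five Dimensions used in this paper, in which data only counts as ‘big’ when it scores high on every dimension, can be illustrated with a minimal Python sketch. The 0.0 to 1.0 scores, the threshold of 0.7 and the two example data sets are assumptions made purely for illustration; Power (2014) does not give numerical cut-offs.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DataProfile:
    """Hypothetical scores from 0.0 (low) to 1.0 (extreme) on the five dimensions."""
    volume: float
    velocity: float
    variety: float
    variability: float
    complexity: float


def high_dimensions(profile, threshold=0.7):
    """Return the names of the dimensions on which the data scores at or above the threshold."""
    return [f.name for f in fields(profile) if getattr(profile, f.name) >= threshold]


def is_big_data(profile, threshold=0.7):
    """Treat data as 'big' only when all five dimensions are high (assumed rule)."""
    return len(high_dimensions(profile, threshold)) == len(fields(profile))


# Illustrative data sets with invented scores.
crm_export = DataProfile(volume=0.9, velocity=0.2, variety=0.3,
                         variability=0.1, complexity=0.2)
transaction_streams = DataProfile(volume=0.9, velocity=0.8, variety=0.8,
                                  variability=0.7, complexity=0.9)

print(high_dimensions(crm_export), is_big_data(crm_export))
print(is_big_data(transaction_streams))
```

In this sketch a large but static export is high on volume only and is therefore not treated as Big Data, whereas the transaction streams are high on all five dimensions.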


2.2 Innovation

In this research paper Big Data’s influence on digital innovation processes is studied. This raises the following two questions: What is digital product innovation, and which stages can be identified within digital innovation processes? These questions need to be answered before the influence of Big Data on digital product innovation and innovation processes can be determined. Innovation has been the subject of many research papers, which have built an understanding of innovation and the process of innovation. Kadar, Moise & Colomba (2014) view product innovation and process innovation as two different types of innovation. Product innovation is described as “the introduction of a good or service that is new or significantly improved with respect to its characteristics or intended uses” (Kadar, Moise, & Colomba, 2014). These researchers refer to process innovation as “the implementation of a new or greatly improved production or delivery method” (ibid). The terms innovation and innovation process are also used interchangeably, because according to Baregheh, Rowley and Sambrook innovation is the process of creating something new. They conclude that “[i]nnovation is the multi-stage process whereby organizations transform ideas into new/improved products, service or processes, in order to advance, compete and differentiate themselves successfully in their marketplace” (Baregheh, Rowley, & Sambrook, 2009). Fichman, Dos Santos and Zheng (2014) define three areas in the ‘definition’ of digital innovation:

1. Digital Process Innovations: “are significantly new ways of doing things in an organizational setting that are embodied in or enabled by IT” (Fichman, Dos Santos, & Zheng, 2014).
2. Digital Product Innovations: producing new digital products (ibid).
3. Business Model Innovations: “a significantly new way of creating and capturing business value that is embodied in or enabled by IT” (ibid).

Figure 2: Stages of digital innovation: Discovery, Development, Diffusion, Impact (Fichman, Dos Santos, & Zheng, 2014)

Within the digital innovation process Fichman et al. (2014) define four stages: Discovery, Development, Diffusion, and Impact. A visualization of this model is shown above.

Discovery: Fichman et al. (2014) describe this stage as the one in which “new ideas are discovered for potential development into a process, product, or business model innovation”. They define two key activities in the discovery stage: invention and selection. Invention means the “creation of something new through a firm’s own process” (ibid). At this stage several ideas are discovered, and a selection has to be made of the ideas to be developed in the development stage.


Development: After the creation of an idea, it needs to be developed into a usable innovation. “For product and business model innovations, this involves developing and refining the core technology plus packaging. Packaging means surrounding the core technology with complementary products and services that together form a solution that can be effectively used for a given purpose by a target adopter” (McKenna, 1985; Teece, 1986, as cited in Fichman et al., 2014).

Diffusion: The idea has already been developed, and at this stage the new product or service is spread across a population of potential users. Fichman et al. (2014) point out that the central activity of this stage is deployment, which is the “marshaling of the resources necessary to persuade and enable a population of firms or individuals to adopt and use the innovation” (Fichman et al., 2014). Assimilation is the final step in the diffusion stage, which means that “individuals and other units absorb the innovation into their daily routines” (ibid).

Impact: This stage focuses on the intended and unintended effects of the new digital innovation which has been diffused, or deployed for that matter. A new digital innovation can affect individuals, organizations, markets and societies. The impact of digital innovation can positively influence the cost side and the revenue side (ibid).

2.2.1 Data driven innovation

Jetzek, Avital and Bjorn-Andersen (2014) have conducted research on data driven innovation by focusing purely on open government data. Open government data is information which is generated by a (sub-)government and is available for ‘free usage’. Their model of data driven innovation is based on three ‘building blocks’: enabling factors, innovation mechanism and impact. Jetzek, Avital and Bjorn-Andersen point out that this model is applicable not only to open government data but to ‘normal’ data as well. Their model builds on the innovation value chain developed by Hansen and Birkinshaw (2007) (as cited in Kusiak, 2009). The innovation value chain is the part this research is most interested in, as it represents the innovation process. It consists of three main phases: idea generation, idea conversion and idea diffusion (Jetzek, Avital, & Bjorn-Andersen, 2014; Kusiak, 2009). Kusiak (2009) agrees with Jetzek, Avital and Bjorn-Andersen (2014) about the existence of these phases and their meaning. Idea generation “represents the firms’ efforts to acquire different types of knowledge necessary for innovation” (Jetzek et al., 2014). Idea conversion “is the process of transforming this knowledge into new products, services, business processes or behavioral innovations” (ibid). Idea diffusion means “spreading developed ideas within and outside the company” (ibid). These three phases form the innovation mechanism within their data-driven innovation model. The data driven innovation mechanism resembles the stages of digital innovation of Fichman et al. (2014) (explained under 2.2 ‘Innovation’). Idea generation has the same meaning as the stage of discovery. Idea conversion resembles the stage of development, and idea diffusion the stage of diffusion. Whereas the meaning of ‘impact’ is the same for both Fichman et al. (2014) and Jetzek et al. (2014), they each place it at a different position in their model. Fichman et al. (2014) see ‘impact’ as a separate stage within the process of innovation itself, while Jetzek et al. (2014) see ‘impact’ as the final level of their data driven innovation model. Finally, the data driven innovation model of Jetzek et al. consists of three levels:

• Enabling factors: Absorptive capacity, openness, resource governance, technical connectivity.
• Innovation mechanism: Idea generation, idea conversion, idea diffusion.
• Impacts: Social value, economic value.

Both Jetzek et al. (2014) and Fichman et al. (2014) agree that without a ‘successful impact’ the created service, product or organizational change would not be an innovation. This makes the impact stage as crucial as the discovery stage (idea generation); without impact the created ‘thing’ would merely be a creation or invention. Market relevance and market acceptance determine this impact. So, only a creation with high market relevance and market acceptance is an innovation (Kusiak, 2009).

Kusiak (2009) points out that “[t]he most important toll gates of innovation are the generation of new ideas and their evaluation”. As Big Data’s influence on digital innovation has not been researched before, it seems logical to ‘start’ at the first stage, the discovery stage. This choice is strengthened by the fact that ‘normal data’ is accepted as an enabling factor of the discovery stage (and digital innovation) (Jetzek, Avital, & Bjorn-Andersen, 2014). In the next section, 2.3 ‘Conceptual Model’, a conceptual model is created which displays the expected relation between Big Data and the discovery stage of innovation.

2.3 Conceptual Model

A clear overview of both Big Data and digital (product) innovation has been given. The following sub-question is answered in this section: What are the enablers of Big Data and how can Big Data enable discovery through ‘openness’? The conceptual model presented below (Fig. 3) is based on literature and consists of two Big Data enablers and one innovation enabler. Big Data is expected to influence the stage of discovery through that innovation enabler. The two Big Data enablers are explained in 2.3.1 ‘Big Data enablers’. Big Data’s influence through ‘openness’ is explained in 2.3.2 ‘Big Data to discovery’, where ‘openness’ is an innovation enabler. This visualization helps to better understand the relations between the theoretical constructs explained.


Figure 3: Conceptual model ‘from Big Data to discovery’.

2.3.1 Big Data enablers

From the literature two enablers of Big Data can be identified: (i) absorptive capacity & technical connectivity and (ii) resource governance.

2.3.1.1 Absorptive capacity & Technical connectivity

The presence of both internal and external knowledge (data) does not automatically lead to the effective creation of ideas. When a company has a higher absorptive capacity, its capacity for capturing, maintaining and using internal and external information to create ideas for an innovation could be higher. Absorptive capacity can be described more precisely as “a firm’s ability to identify, assimilate, transform and apply valuable external [or internal] knowledge” (Jetzek et al., 2014). With regard to data-driven innovation, Jetzek et al. point out that “the firm’s IT-specific absorptive capacity matters more than its general absorptive capacity” (ibid). Two types of absorptive capacity can thus be identified: (i) ‘general absorptive capacity’ and (ii) ‘IT-specific absorptive capacity’. General absorptive capacity is more concerned with the culture of an organization: to what extent do internal company policies allow employees to access and use all kinds of internal and external data streams? It shows some resemblance to what is discussed in the literature as IT alignment (Setia & Patel, 2013), meaning the extent to which IT infrastructure and implementation match the business strategy and goals of an organization. IT-specific absorptive capacity focuses purely on the IT infrastructure that supports those policies and thereby enables the employees of organizations to access complex internal and external data streams. With regard to IT infrastructure and innovation, Cui, Ye, Hai Teo and Li (2015) point out that ‘with advancements in IT applications, firms are able to actively use these applications to engage in innovation virtually with other distant firms’. With a high quality of IT integration and flexibility, organizations are better at quickly and economically adapting IT applications (data analytics) to identify and ‘use’ internal and external knowledge (Cui, Ye, Hai Teo, & Li, 2015). This involves integrating the IT structures within an organization; some researchers refer to this as IT alignment (ibid), and it is meant to overcome the threat posed by legacy systems. Compatibility is important: datasets from various systems must be in a format which allows them to be analyzed by one analytical tool (Chen & Zhang, 2014). When an organization achieves this, product development managers are able to use information from the whole organization to come up with new ideas for products or product improvements. With conventional data analytics a product development manager is still reliant on his or her own data analytic skills or on analytic reports. When a company has a high-quality Big Data infrastructure with several types of data streams and analytic tools, such as the earlier mentioned BI&A 1.0, 2.0 and 3.0 (Chen, Chiang, & Storey, 2012), Big Data empowers the employees, and therefore the organization, to achieve a high IT-specific absorptive capacity. With these advanced Big Data analytics, product development managers can quickly produce analytic reports to identify patterns and customer preferences, all in order to come up with new ideas for products or product improvements. There is no clear distinction between technical connectivity and absorptive capacity.
Technical connectivity, however, focuses only on the ability “to analyze, mash up and make sense of different types of data” (Jetzek et al., 2014), that is, the actual technical capabilities to access and link data from various sources. These sources can be internal as well as external, which could mean ‘complex’ internal and external data streams from different sources. Technical connectivity can be described more precisely as “the availability of technologies to capture value from data” (ibid), which can help “individuals and organizations to integrate, analyze, visualize and consume the growing torrent of available data” (ibid). This matches the description of IT-alignment (Cui, Ye, Hai Teo, & Li, 2015), which supports the overall absorptive capacity of an organization. The higher the IT-alignment of an organization, the higher the homogeneity of the connections between its different IT systems and their data sources. This makes it easier to collect data and analyze it. These analyses can contribute to the creation of new ideas for products (Kusiak, 2009). In order for a company to ‘possess’ Big Data, these technical connectivity capabilities should actually be present. If all the bank transactions of customers have to be captured, with all the aggregated data, these techniques should already be in place. One can argue that these specific and complex techniques are a prerequisite for Big Data (Chen & Zhang, 2014). With a high-quality Big Data infrastructure, an organization is better at matching different data sources and combining them with powerful analytical tools (Chen, Chiang, & Storey, 2012).
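The compatibility requirement discussed in this section, namely that datasets from various systems must share a format before one analytical tool can process them, can be illustrated with a minimal sketch. The two source systems, their field names and the `Transaction` record are hypothetical and serve only as an illustration; they are not taken from the organizations studied.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Transaction:
    """Common format that a single analytical tool could consume."""
    customer_id: str
    booked_on: date
    amount_eur: float


def from_core_banking(row: dict) -> Transaction:
    # Hypothetical source A: amounts in cents, ISO-formatted dates.
    return Transaction(
        customer_id=str(row["cust"]),
        booked_on=date.fromisoformat(row["date"]),
        amount_eur=row["amount_cents"] / 100,
    )


def from_web_shop(row: dict) -> Transaction:
    # Hypothetical source B: amounts in euros, day-month-year dates.
    day, month, year = (int(part) for part in row["booked"].split("-"))
    return Transaction(
        customer_id=str(row["customer_id"]),
        booked_on=date(year, month, day),
        amount_eur=float(row["eur"]),
    )


def total_per_customer(transactions) -> dict:
    # One analysis that works on the pooled data, whatever its origin.
    totals = {}
    for t in transactions:
        totals[t.customer_id] = totals.get(t.customer_id, 0.0) + t.amount_eur
    return totals
```

Once both streams are mapped to the same record type, a single analysis such as `total_per_customer` can run over the pooled data without knowing which system a record came from.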

2.3.1.2

Resource governance

Resource governance (see Fig. 3) is mainly concerned with organizational policies, data governance procedures, data dissemination skills or data analytic skills of employees. The organizational policies involve data management, which should secure the quality and reliability of the data available within an organization. A free flow of information is important: a product development manager should not be constrained by such policies and should have access to the information that he or she requires to develop a product (Jetzek et al., 2014). Higher and lower management should ensure a high quality of data management. This could mean policies and procedures for employees who are involved in data entry tasks. Indeed, Big Data, which is based on different and large data streams, must be of sufficient quality, which reduces the risks to the validity and relevance of this data (Chen, Chiang, & Storey, 2012; Power, 2014; Chen & Zhang, 2014). As explained in the Big Data part of this theoretical framework (see chapter 2.1), one of the characteristics of Big Data is ‘data which contains a high amount of value to a firm or company’. When an organization has a high level of data management, it has all the policies in place to ensure that the data it possesses is of high quality and reliability (Jetzek et al., 2014). This relates mainly to how data is collected for the various systems, for example, what the policies for data entry are. A high level of data management within an organization is one of the prerequisites for having Big Data (analytics) available (Kambatla, Kollias, Kumar, & Grama, 2014). According to Jetzek et al. (2014), resource governance can also be conceptualized as a function of leadership and skills regarding product development (or innovation for that matter). If an organization has employees with creative skills, this could prove useful in the idea generation stage of innovation. Leadership should provide the right environment and ‘room’ for creativity but is also responsible for the governance of policies.

2.3.2

Big Data to discovery

In this section Big Data’s role or influence as a discovery stimulator through the working factor ‘Openness’ (see Fig. 3) is analyzed.

2.3.2.1

Openness

The factor ‘openness’, which stimulates discovery, is described by Jetzek et al. (2014) as “[that] firms with a more open knowledge search strategy, having access to a larger number of information sources that can provide ideas and resources, tend to be more innovative”. A company may overemphasize internal sources and underemphasize external sources when it lacks openness to the gathering of external data (ibid). Openness is also described in the literature as ‘open innovation approaches’ (Cui, Ye, Hai Teo, & Li, 2015), which means “the practice of leveraging the discoveries of others” (ibid). This is achieved by “searching, acquiring and integrating the discoveries of others [external knowledge]” (ibid) and integrating that knowledge or technology into an organization’s R&D operations. This offers the ability to unlock the potential of internal innovation and to configure or reconfigure the existing knowledge allocation and exploitation for innovation (ibid). An open innovation approach can also be achieved by seeking out external organizations to create partnerships, alliances, cooperation and joint ventures. In this way, the co-sharing of information and R&D sources allows product development managers to come up with new ideas for products or processes. When Big Data analytics is implemented, it offers a high level of internal and external IT systems integration and alignment (Chen, Chiang, & Storey, 2012). These structures and tools allow better integration within or between companies. One can argue that when an organization has a high level of Big Data integration, it is easier to pursue open innovation approaches. An example of external data could be Twitter feeds about a certain subject, combined with the arrival and departure data of all the bus, metro and train lines of public transport in Amsterdam. Big Data analytics allows for the efficient use of these complex data streams, for example to create correlation analyses between them.
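To make the idea of a correlation analysis between two external data streams concrete, the sketch below computes a Pearson correlation coefficient between two series of equal length. The example series in the usage note (hourly tweet counts and hourly delay minutes) are invented for illustration and are not data from this study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / sqrt(var_x * var_y)
```

For instance, `pearson([12, 40, 95, 30], [2, 9, 21, 6])` would relate hypothetical hourly tweet counts about a transport line to hypothetical delay minutes in the same hours. A value close to 1 would indicate that the two streams move together, which by itself says nothing about causation.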

3 Methods

3.1 Research design & Context

In order to answer the research questions stated in chapter 1.2, existing literature was searched for in various peer-reviewed journals and reviewed. In addition to the literature review, four interviews were conducted. The main databases used for this search were Google Scholar, VU E-library and ScienceDirect. Keywords to select articles were: ‘big data’, ‘big data definition’, ‘big data innovation’, ‘definition of innovation’, ‘defining innovation’, ‘digital product innovation’, ‘data driven innovation’, ‘innovation process’, ‘digital innovation’, ‘digital innovation process’. Various combinations of these keywords were used as well. A visualization of the study’s research approach is given in the scheme below:

Figure 4: Research approach. Theory (Big Data, definitions, Big Data analytics, innovation, innovation models, data-driven innovation, conceptual model), followed by Research (interviews, coding, analysis), followed by Results (propositions).


First, the theory and theoretical relevance of Big Data and innovation are analyzed. This is outlined in the theoretical background. A conceptual model is formed, which forms the basis for the empirical research. Second, empirical data is collected by conducting interviews. Third, the collected data is analyzed and, together with the information from the theoretical background, propositions within the conceptual model are formulated. This study takes an inductive approach and can be viewed as an exploratory research paper: a novel topic, ‘Big Data’, and its relation to innovation is investigated. The data necessary for this research is gathered by qualitative means. This method has been chosen because this paper aims to contribute to theory building on innovation in connection with Big Data (gathering, processing and analyzing). According to Myers (2013), qualitative research is best suited for theory building and for a deeper understanding of certain phenomena.

Four large organizations were approached for interviews. Two organizations each agreed to contribute to this research by arranging interviews with two employees. For reasons of anonymity, the names of these two organizations are not disclosed. Both organizations are nationally and internationally oriented, with several branches across Europe, and have a total of 200-350 employees. The unit of analysis in this research is the two organizations mentioned, and the unit of observation is the employees of each organization. Please recall that the research questions focus on the process of innovation at large organizations; this justifies both the unit of observation and the unit of analysis. The in-depth interviews are conducted on-site, at the offices of the organization concerned. A semi-structured interview protocol is used because it leaves room for probing questions during the interview (Myers, 2013). The detailed verbatim transcription of the interviews can be found in the appendix.

An important part of the data collection is that an open view is maintained. This means that the interviews can result in adjusting the factors in the conceptual model or adding others. Another concern with qualitative research is validity and reliability. According to Myers (2013), validity is concerned with the question: are you measuring what you want to measure? Reliability is concerned with the question: is your measurement accurate? Several steps have been taken to maintain both validity and reliability in this research. The audio of all interviews is recorded and transcribed, which makes this study verifiable. The interview topic questions are based on existing literature, which contributes to the validity of this research.

3.2 Data collection & Analysis

As explained, a semi-structured interview protocol is used for the collection of the data. It shows which questions are used to measure the topics Big Data and innovation, focusing on the concepts ‘Absorptive capacity’, ‘Technical Connectivity’, ‘Openness’, ‘Resource Governance’ and ‘Discovery’. A detailed copy of the interview guide can be found in the appendix. In order to analyze the data gathered from the semi-structured interviews, the transcriptions are coded. The coding method used is descriptive open coding, which means that labels are assigned to summarize data in one word (Miles, Huberman, & Saldana, 2014). First, every line is analyzed in the first cycle of coding (open coding); these codes are constantly compared between the interviews conducted. The first level codes are then organized and synthesized into categories (second level codes). These second level codes provide an index of topics that can be used to structure the data, so that patterns and interesting differences can be identified. In total, four employees from two different organizations have been interviewed. The first organization is called ComA and the second ComB. ComA is a financial services company and ComB is a communication & multimedia company.

Type         Company  Function                                    Date collected
Interview 1  ComA     Product Development and project manager     1-6-2015
Interview 2  ComA     Manager Marketing Intelligence              2-6-2015
Interview 3  ComB     Manager Online analytics and optimization   5-6-2015
Interview 4  ComB     E-commerce Product Development              10-6-2015

Table 2: Data collection scheme
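The two coding cycles described in this section can be sketched as a small grouping step: first level codes attached to transcript segments are mapped to second level categories, which yields an index showing which interviews touch on which topic. The codes and the mapping below are hypothetical examples and are not taken from the study's actual codebook.

```python
from collections import defaultdict

# Hypothetical mapping from first level codes to second level categories.
CATEGORY_OF = {
    "data ownership": "Resource Governance",
    "data entry rules": "Resource Governance",
    "linking systems": "Technical Connectivity",
    "hypothesis first": "Discovery",
}


def second_cycle(coded_segments):
    """Group (interview, first_level_code) pairs into category -> interviews."""
    index = defaultdict(set)
    for interview, code in coded_segments:
        # Codes without a category are kept visible instead of being dropped.
        category = CATEGORY_OF.get(code, "Uncategorized")
        index[category].add(interview)
    return {category: sorted(names) for category, names in index.items()}
```

Comparing the resulting index across interviews is one simple way to spot the patterns and differences mentioned above. In practice the mapping itself is refined iteratively while coding.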


4 Results

In this section the results from the conducted interviews are discussed. The innovation processes at the organizations studied turned out to be different than expected. Neither of the companies (or organizations) that provided the interviewees has an ‘open innovation approach’ (or uses ‘openness’ for that matter), which makes it difficult to analyze Big Data’s impact on this approach. Both companies do use external information for their products or in support of their products, but it was not an important starting point for the creation of an idea or product. First, an explanation is given of how the innovation processes started at these organizations. When the product development manager is asked how the discovery of an idea for a new product or innovation starts, he says the following: “yes, we actually always start with an hypotheses or problem statement, which can be data-based or based on a strategy goal from higher management. Then when we have our problem or statement, then we do a quantitative search and a plan of analytics to search for the solution of this problem. This could be a hiccup point. Then when we know what the hiccup is we can start with the qualitative improvement, for example: how does it needs to look? But we always start with an hypotheses or problem, and it does not automatically rolls out of our analytical systems (yet). The statement (hypotheses) is based on the vision of a CEO or higher management”. This means that discovery does not start purely from data analytics or Big Data analytics. The idea for a new product or improvement starts with a statement or strategy goal from higher management. The product development departments then check whether such a goal is realistic. Big Data analytics nevertheless plays an important role in developing a product, as one interviewee illustrates: “We don’t do anything without data. With every improvement, data is needed to make the case ‘hard’ and supply it with proof. One could argue that innovation at our organization is always data-driven, but the start is not based on data. By which is meant the Big Data analytics report”. This quote underlines the importance of Big Data analytical reports for getting to the root of a problem, which in turn is based on a problem statement from higher management. Another setback during the interviews is that the organizations maintain a low level of ‘openness’, that is, of an open knowledge search strategy. This makes it difficult, if not impossible, to check some of the relations presented in the conceptual model. An advantage of this study’s approach, as mentioned in the methods section, is that an open view is maintained during data collection (interviews). This left room for other interesting findings about Big Data and digital innovation.


4.1 Big Data enablers

In this paragraph findings are presented in relation to the Big Data enablers ‘Absorptive Capacity & Technical Connectivity’ and ‘Resource Governance’. For each enabler a proposition is formed. In order to gain these insights a codebook was created, which is added as an appendix (7.3 ‘Codebook’).

4.1.1

Absorptive Capacity & Technical Connectivity

First, please recall that absorptive capacity & technical connectivity is mainly concerned with organizational capabilities, IT capabilities, communication infrastructure, technical implementation and the availability of multiple platforms (Setia & Patel, 2013). One of the organizations uses an analytical Big Data tool which can connect data streams or ‘subjects’ that are not automatically related. It has the power to ‘crunch’ the data quickly, as one of the interviewees puts it. With such a tool it is possible to arrive at new insights for processes. The interviewee raises a valuable point with regard to implementing and maintaining Big Data tools/analytics: “Frankly, I also find Big Data a sort of a pitfall. I believe so because you can simply add data to a platform without a clear business goal or strategy in mind could lead to an ‘analysis paralysis’. Just adding data to a large server with big computer powers to analyze it and just look, gosh what shall we do with it? It can come to the point that the powerful Big Data analytics server is not used because nobody knows what to do with it”. This is definitely something that an organization does not want to happen. As we have seen in the theoretical background, this relates to IT-alignment. IT-alignment means that an organization’s business goals and strategy should be in line with its IT strategy (Cui et al., 2015). An organization implements Big Data to gain new insights and to link many large data streams. But as the interviewee points out, this should be done with IT-alignment in mind, to prevent ending up with an expensive Big Data platform that is not used. IT-alignment, as argued before, is viewed as a part of an organization’s absorptive capacity. Good IT-alignment means that an IT platform or a Big Data platform is implemented with a clear business strategy in mind (Cui et al., 2015). This contributes to an organization’s absorptive capacity.
Another important factor in achieving a Big Data platform is technical connectivity. The various IT systems that deliver different streams of data should be connected in one way or another, to eventually achieve one ‘data pool’ on which a Big Data platform can be built and on which various analyses can be conducted over different types of data. When systems are ‘scattered’, this has a negative impact on the technical connectivity and thus on Big Data. A high technical connectivity is very important but not always easy to achieve, not only because of the costs that come with such improvements and implementations but also because some employees and/or departments are dependent on one particular system for their reports. It could take some time before employees are trained to use certain Big Data applications. “In our organization we applied Hadoop, which is a coding language to build a Big Data platform on. It is flexible so it gives the opportunity that different departments with different needs can use the data on their own way. Which means retrieving valuable information and discovering patterns. But in order for the employees to get used to such new platforms it could take a long time. The advantage is that when the organization grows data-wise it [Hadoop] offers the opportunity to easily grow and match and cleanse this data”. (Strictly speaking, Hadoop is an open-source framework for the distributed storage and processing of large datasets rather than a coding language.) In the interviewees’ experience, before such a system was in place they had many different systems and data streams. Analyses were difficult and sometimes even impossible to conduct. A Big Data platform takes away these problems. An example is also given of how a high technical connectivity and absorptive capacity ‘through Big Data’ contributes to digital innovation: “One of the things we can analyze and prove now is a correlation between how much a customer checks the support site and his contract status with his contract cancellation. This is an mix of website information and call center records. He/she could be unhappy about the how the product works, this could be a wrong installation and be also checking his contract ending date. This is a huge predictor of contract cancellations”. This means they can now act proactively on this cancellation risk, by checking the online visiting information of this customer (about contract and installation) and making sure the product is installed correctly. In order to achieve this they had to match website information and call center records, which come from different systems. This is an instance of technical connectivity.
Another advantage, which was not foreseen when implementing these analytics, is that they were able to improve the support information on their website so that their call center had to process fewer calls, which resulted in lower costs. From the results presented above the following propositions can be formulated:

P1: A high IT alignment has a positive effect on absorptive capacity & technical connectivity.

P2: Absorptive Capacity & Technical Connectivity has a positive effect on Big Data.
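The cancellation example described by the interviewee amounts to matching records from two systems on a shared customer identifier. The sketch below illustrates that matching step only. The field names, the page labels and the threshold of three support visits are assumptions made for illustration and do not describe the rule actually used at ComA or ComB.

```python
from collections import Counter


def flag_cancellation_risk(web_events, call_records, min_support_visits=3):
    """Return customer ids that satisfy a hypothetical risk rule.

    A customer is flagged when he or she (1) visited the support pages at
    least `min_support_visits` times, (2) looked up the contract end date
    and (3) also appears in the call center records.
    """
    support_visits = Counter(
        event["customer_id"] for event in web_events if event["page"] == "support"
    )
    checked_contract = {
        event["customer_id"]
        for event in web_events
        if event["page"] == "contract_end_date"
    }
    called = {record["customer_id"] for record in call_records}
    return sorted(
        customer_id
        for customer_id, visits in support_visits.items()
        if visits >= min_support_visits
        and customer_id in checked_contract
        and customer_id in called
    )
```

The rule is deliberately simple. Its purpose is to show that such a check only becomes possible once website data and call center data can be joined on one customer key.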

4.1.2

Resource Governance

Please recall that resource governance is mainly concerned with organizational policies, data governance procedures, data dissemination skills or data analytic skills of employees. An interesting quote from one of the interviewees regarding resource governance is: "So you have data governance which is a little problem with us, the ownership of the data - that it is recorded in a correct way- a certain department logged the data, and for example telephone records need to be correctly logged. That's a responsibility there for the ownership, for us that is governance. This is very important because it affects your quality data for instance". As we have seen in the theoretical background, a prerequisite for Big Data, besides a high volume, high velocity and high variety, is a high data value. Data management, as part of data governance, plays an important role in achieving this ‘characteristic’ (data value). At the organizations interviewed, data management consisted of policies, data entry rules and data maintenance. The high value of Big Data is important to ensure the quality of Big Data analytics (Chen, Chiang, & Storey, 2012). The systems, which contribute to the technical connectivity, are important for a company to possess Big Data, but the skills of the employees in using those Big Data platforms are as important as the technical connectivity. “I would like to call that datascience; people, employees which are data-aware are necessary besides the systems and applications”. Another important aspect which surfaced during the interviews is ‘awareness’, which could be added as a variable to the construct resource governance. Awareness here means the awareness of different departments within an organization that extensive Big Data analytical reports are available and important. This situation is sketched by an interviewee: “Yes, you can generate your targeted conversion by only mailing 10%, 20% of your clients. So you don’t have to mail the other 80% to reach the same number. It’s inefficient and our marketing department knows that, but we just never get time to really prove and analyze this and only mail the needed 10/20%.
They want to send it sooner than later, and that prevails over the option by only mailing 10/20% and reach the same conversion”. So, this organization could work more efficiently if the full capabilities that are present were also used. The employees should be aware of the advantages of such analytics, as this increases their use. The interviewee from the other organization adds to this: "So you can say here you go, enjoy [the Big Data platform and its analytics]. But you could still get a bit of the effect that you try to enforce the different departments and employees to use the Big Data platform. The problem can evolve that we have this powerful technology and all the data you can do so much with it. But if you are in terms of organization, people and knowledge just not ready, then it is of course a risk that it is going nowhere". Organizations should guard against this. Otherwise the extensive Big Data applications and techniques that have been implemented are still not used, and the organization gains no advantage from such a platform. This awareness factor could be added as a factor of, or a prerequisite to, resource governance.

From the analyses of data above the following proposition can be formulated: P3: Resource governance has a positive effect on Big Data.
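The targeted mailing example from the interviews (reaching the same conversion by mailing only 10 to 20% of clients) is the kind of claim that can be checked against a past campaign. The sketch below assumes that every client has a model score and a recorded outcome. Both the scoring model and the data layout are hypothetical.

```python
def conversions_captured(scored_clients, fraction):
    """Share of all conversions found in the top `fraction` of clients.

    `scored_clients` is a list of (score, converted) pairs from a past
    campaign, where a higher score means a higher expected response.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(scored_clients, key=lambda client: client[0], reverse=True)
    cutoff = max(1, round(len(ranked) * fraction))
    total = sum(1 for _, converted in ranked if converted)
    if total == 0:
        return 0.0
    captured = sum(1 for _, converted in ranked[:cutoff] if converted)
    return captured / total
```

If `conversions_captured(history, 0.2)` is close to 1 for a past campaign, most conversions came from the top 20% of scored clients, which would support the interviewee's point. A low value would suggest that the scores do not separate responders well enough to mail selectively.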

4.2 Big Data and the development stage

As mentioned in the first part of this results section, the companies studied did not use an open knowledge innovation strategy. Therefore the effect of Big Data on discovery through ‘openness’ is almost impossible to investigate. Nevertheless, other findings can be presented in regard to the discovery stage. Surprisingly, Big Data influenced the way these organizations conducted the development stage of innovation. This is the stage in which business models for an idea are developed, or the idea is transformed into a usable innovation (Fichman, Dos Santos, & Zheng, 2014). One of the interviewees gave an example in which they developed and redefined a business model for one of their products: “for product XXX we calculated the pricing on analyzing client behavior. This was only possible with that specific Big Data analytics tool. We analyzed client behavior on almost every aspect, from online behavior to online spending and trading behavior”. This illustrates the usability and advantage of Big Data analytics for calculating the pricing scheme (or plan) of a product. The biggest advantage was the time within which such difficult calculations became possible. This advantage is also reported in other literature: the speed at which huge calculations are possible increases when Big Data analytics is applied (Chen & Zhang, 2014). Another interesting finding is the use of business cases during innovation in the development process of an organization. As one of the participants points out: “As you know, at first an idea is created. Then that idea is transformed into a business case. This will show the subject, consumers, competitors, how does it fit into the company, the financials, what is the expected revenue, what are the expected costs and the risks in regard to legal and compliance. Those are the elements of a business case and is the center point of our development process”.
This gives a clear idea of what a business case is and which information should be included in it. A marketing intelligence manager points out that creating client segments is an important part of a business case, because a specific client segment is needed to target a product at: “We want to know what specific client behavior is, before, during and after an change. What is used by those clients? What are the historical records? Which all helps to decide if it is smart to execute a plan or start developing a product”. With Big Data analytics it becomes possible to combine various data sources with information about clients, to eventually create better and more detailed client segments. These detailed client segments contribute to better business cases and in turn contribute to the development stage of innovation. By combining data from various sources about those clients

From the results in this section the following proposition is created. P4: Through the creation and support of business cases, Big Data has a positive relation to the development stage of innovation.

5 Discussion & Conclusion

In this section the results and outcomes of this research paper are discussed. This discussion includes the conclusion, theoretical contributions, the practical implications and the limitations and implications for further research.

5.1 Synthesis

Before an answer is given to the research question of this paper, every sub question is briefly answered. For the complete explanations please see the theoretical framework.

Sub 1: What are the profound characteristics of Big Data and how different is it from ‘normal’ data?

From the literature it can be concluded that Big Data has five profound characteristics (Power, 2014): ‘volume’, ‘velocity’, ‘value’, ‘variety’ and ‘complexity’. Big Data has a high volume (1), which means that a lot of storage is needed to obtain and maintain it. Data travels at a high rate and large processing power is needed to analyze it, which reflects the high velocity (2) of Big Data. Data needs to represent a high value (3) to an organization. Big Data ‘catches’ many different data streams and makes it possible to analyze them, which reflects the variety (4) of the data. Together, the four previous characteristics make Big Data highly complex (5).

Sub 2: What is digital product innovation and which stages can be identified within digital innovation processes?

Innovation is a process with multiple stages whereby an idea is transformed into new services, products or processes, with a focus on competing and differentiating successfully in the organization’s own marketplace (Baregheh, Rowley, & Sambrook, 2009). When we speak of digital innovation, IT is added to this definition, which means that the innovation is embodied or enabled by IT (information technology) (Jetzek et al., 2014; Fichman et al., 2014). A digital innovation process consists of four stages (Fichman et al., 2014). First, the idea for a new product is generated: the discovery stage (1). Then the idea is transformed into a usable innovation: the development stage (2). Next, the product or service needs to be spread and deployed: the diffusion stage (3). The final stage, the impact stage (4), is the stage in which the impact of the innovation is determined.

Sub 3: What are the enablers of Big Data and how can Big Data enable discovery through ‘openness’?

From theory two enablers of Big Data can be identified: absorptive capacity & technical connectivity (1) and resource governance (2). These enablers are a prerequisite for obtaining and implementing Big Data in an organization. Absorptive capacity & technical connectivity is mainly concerned with organizational capabilities, IT capabilities, communication infrastructure, technical implementation and the availability of multiple platforms (Setia & Patel, 2013). Resource governance is mainly concerned with organizational policies, data governance procedures, data dissemination skills or data analytic skills of employees (Jetzek, Avital, & Bjorn-Andersen, 2014). The literature suggests that the discovery stage is stimulated through an open innovation approach (also called openness in this paper, and explained in chapter ‘2.3.2 Big Data to discovery’). This relation was not found in this research paper, but this does not mean that the relation does not exist. This is elaborated further in ‘5.2 Theoretical contributions’.

5.2 Theoretical contributions

Big Data, and Big Data in relation to digital innovation, is still an underdeveloped and novel topic. Most research is aimed at IT-alignment, data analytics or just digital innovation processes (Chen, Chiang, & Storey, 2012). First, this paper clarified the different definitions of Big Data (analytics). This could provide clarification for further research, as there is discussion about which definition to use. This research paper can be viewed as exploratory research. As already explained in the results section, the innovation processes at the organizations studied turned out to be different than expected. Neither of the companies that provided the interviewees has an ‘open innovation approach’ (or uses ‘openness’ for that matter), which made it difficult to analyze Big Data’s impact on this approach. Both companies do use external information for their products or in support of their products, but it was not an important starting point for the creation of an idea or product/service. As it turned out, Big Data had more influence on the development stage of innovation at these organizations. A reason for this could be that product development managers are not always creatively ‘free’. At every organization in this research it became clear that a strategy goal or hypothesis is an important starting point for a digital innovation. First, a strategy goal or hypothesis is outlined, most of the time by upper management. Second, a business case should be presented with the proper support and ‘proof’ from data in order to achieve that goal or solve that specific problem: why this particular product will ‘work’, the costs, the projected revenue and the cost reductions, if any. Big Data could provide all the support needed for such a business case, but to the creation of a hypothesis or a strategy goal Big Data could only make a small contribution, and the vision of upper management seems more important. It turns out that Big Data is mainly used as support for building business cases or calculating the feasibility of an innovation. This usage of Big Data has been identified by others as well (Chen & Zhang, 2014; Chen, Chiang, & Storey, 2012). At one of the organizations Big Data offers possibilities to calculate new business models or to develop them further; this is confirmed by other researchers as well (Tambe, 2014).

A new finding regarding resource governance is added as well. It consists of two factors. The first is awareness in usage (i): to what extent is the organization aware of the opportunities and capabilities of Big Data, and do the different departments use it? An organization, and in fact every department involved in the development of new products, should be aware of the ‘power’ of Big Data analytics and should know how to use it efficiently. The second is awareness in implementation (ii), which is more in line with IT-alignment: is the Big Data platform developed with a strategy and business goals in mind? The higher these two forms of awareness are, the better the Big Data implementation will be. One can argue that this could be a ‘logical’ evolution of normal IT-alignment (Cui, Ye, Hai Teo, & Li, 2015) and might not be a completely new addition to the theory, but further research is needed to determine this.

This study also aims to clearly identify the earlier mentioned Big Data enablers: absorptive capacity & technical connectivity, and resource governance. These are not completely new, but the combination of absorptive capacity and technical connectivity could be. Some researchers use the terms separately (Setia & Patel, 2013), but the dividing ‘line’ between them is thin and sometimes vague. As argued in the theoretical framework, these terms are viewed as one enabler for Big Data.

5.3 Practical implications

In this section the practical implications of this research are discussed. This report offers managers from various organizations insights into what Big Data is and what the main focus points are when implementing it. These focus points are the two Big Data enablers: (i) absorptive capacity & technical connectivity and (ii) resource governance. It became clear that applying the factors of these enablers to an organization increases the usability and success of a Big Data platform. These factors are explained in section ‘2.3.1 Big Data enablers’. The examples mentioned in the results display some of the advantages organizations can gain when they implement Big Data. Unfortunately, the relation between Big Data and the discovery stage of innovation is not explained by the gathered data, but some insights into how Big Data can be used in the development stage are given. One is the use of Big Data analytics as an extensive tool to determine the profitability of products/services and to create business models for them. The use of business cases in the development stage is an interesting insight as well, as these business cases can be supported by Big Data analytics.

5.4 Limitations and implications

In an ideal world this study would be conducted without any limitations, but this study does have some. Data for this research was collected by qualitative means; more precisely, four interviews were conducted at two organizations from two different sectors. There was a good balance between the job descriptions of the interviewees, namely two product development managers and two data analytics managers. With more interviews of different managers, however, more detailed information could have been collected, and a higher level of theory contribution might have been achieved. The interviews were also translated from Dutch to English, which creates a risk of misinterpretation. Further research is needed to verify the propositions stated and analyzed in this research paper.

Furthermore, this paper focused on the first stage of digital innovation, the discovery stage, and its processes. This relation remains unclear, as it was not possible to analyze it with the interviews conducted. The conceptual model could be used as a starting point for further research on the relation between Big Data and discovery. Some effects of Big Data on the development stage were found; these effects and the stated proposition can be tested further. It remains unclear, however, what the effect or influence of Big Data is on the stages of ‘discovery’, ‘diffusion’ and ‘impact’.

5.5 Conclusion

In the introduction the following research question was stated: How can Big Data support innovation processes in organizations and what are the requirements for doing so? The research question was adjusted from a focus on the discovery stage to innovation processes in general, and the effect of Big Data on discovery remains to be investigated. Nevertheless, some interesting findings emerged about the development stage of innovation. Big Data can be used in the development stage mainly by supporting the creation of business cases and through Big Data analytics. With Big Data analytics it is possible to calculate new business models or to gain new insights into which segments a newly created product can be aimed at. The requirements for doing so mainly concern the two Big Data enablers: resource governance and absorptive capacity & technical connectivity. According to the literature and the findings, if a company has a high level of resource governance and of absorptive capacity & technical connectivity, this positively affects its Big Data implementation.

6 References

Agarwal, R., & Dhar, V. (2014). Big Data, Data Science, and Analytics: The Opportunity and Challenge for IS Research. Information Systems Research, 443-448.
Baregheh, Rowley, & Sambrook. (2009). Towards a multidisciplinary definition of innovation. Management Decision, 1323-1339.
Chen, C. P., & Zhang, C.-Y. (2014). Data-intensive applications, challenges, techniques and technologies: A survey on Big Data. Information Sciences (Elsevier), 314-347.
Chen, H., Chiang, R. H., & Storey, V. C. (2012). Business Intelligence and Analytics: From Big Data to Big Impact. MIS Quarterly, 1165-1188.
Cui, T., Ye, H., Hai Teo, H., & Li, J. (2015). Information technology and open innovation: A strategic alignment perspective. Information & Management, 348-358.
Delen, D., & Demirkan, H. (2013). Data, information and analytics as services. Decision Support Systems; Elsevier, 359-363.
Dougherty, D. (1992). Interpretive Barriers to Successful Product Innovation in Large Firms. The Institute of Management Sciences(3), 179-202.
Edison, H., Ali, N. b., & Torkar, R. (2013). Towards innovation measurement in the software industry. The Journal of Systems and Software, 1390-1407.
Ekbia, Mattioli, Kouper, Arave, Ghazinejad, Bowman, . . . Sugimoto. (2014). Big Data, Bigger Dilemmas: A Critical Review. Journal of the Association for Information Science and Technology, 2-24. doi:10.1002/asi.23294
Fichman, R. G., Dos Santos, B. L., & Zheng, Z. E. (2014, June). Digital Innovation as a Fundamental and Powerful Concept in the Information Systems Curriculum. MIS Quarterly, 329-353.
Jetzek, T., Avital, M., & Bjorn-Andersen, N. (2014). Data-Driven Innovation through Open Government Data. Journal of Theoretical and Applied Electronic Commerce Research, 100-120.
Kadar, M., Moise, I., & Colomba, C. (2014). Innovation Management in the Globalized Digital Society. Procedia; Social and Behavioral Sciences, 1083-1089.
Kambatla, K., Kollias, G., Kumar, V., & Grama, A. (2014). Trends in big data analytics. J. Parallel Distrib. Comput., 2561-2573.
Kusiak, A. (2009). Innovation: A data-driven approach. Elsevier; Int. J. Production Economics, 440-448.
Kwon, O., Lee, N., & Shin, B. (2014). Data quality management, data usage experience and acquisition intention of big data analytics. International Journal of Information Management, 387-394.
Laney, D. (2001). 3D Data Management: Controlling Data Volume, Velocity and Variety. Application Delivery Strategies; Meta Group, 1-2.
Marketing Intelligence, M. (n.d.). Interview 2 ComA. (P. da Cruz Caria, Interviewer)
Miles, M., Huberman, A., & Saldana, J. (2014). Qualitative Data Analysis: A method sourcebook section. London: Sage.
Myers, M. D. (2013). Qualitative Research in Business & Management. London: SAGE Publications Ltd.
Nylén, & Holmström. (2015). Digital innovation strategy: A framework for diagnosing and improving digital product and service innovation. Business Horizons, 57-67.
Online analytics and optimization, M. (n.d.). Interview 3 ComB. (P. da Cruz Caria, Interviewer)
Power, D. J. (2014). Using ‘Big Data’ for analytics and decision support. Journal of Decision Systems, 222-228.
Product Development, M. (n.d.). Interview 1 ComA. (P. da Cruz Caria, Interviewer)
Tambe, P. (2014). Big Data Investment, Skills, and Firm Value. Management Science, 1452-1469.
Yoo, Y., Boland, R. J., Lyytinen, K., & Majchrzak, A. (2012). Organizing for innovation in the Digitized World. Organization Science, 1398-1408.
ZDNet. (2015, May 14). The Cloud for Clouds IBM and the weather company work on Big Data weather forecasts. Retrieved from ZDNet.com: http://www.zdnet.com/article/the-cloud-forclouds-ibm-and-the-weather-company-work-on-big-data-weather-forecasts/

7 Appendix

7.1 Interview Guide

Topic: General/Introduction
Purpose: Put the participant at ease and give context.
Example questions: Getting approval for recording the interview.
Notes: The participant should be at ease and assured of confidentiality. Do not mention the eventual target of the study, only that Big Data and the digital innovation process are being researched. Give a summary of the themes covered in the interview.

Topic: Culture of the organization and baseline questions
Purpose: Providing relevance of this interview.
Example questions:
Can you tell me something about the organization? How long has it existed?
How long have you worked at ‘xxxx’?
What is your job description and what are your tasks?

Topic: Big Data
Purpose: Experience of the interviewee with regard to (Big) Data, analyzing it or ‘using’ it. Don’t always use the term Big Data; ‘data’ can also be used. The researcher can decide afterwards, according to the scientific definition, whether it is Big Data.
Example questions:
Can you tell me more about your daily tasks and work?
Do you use any data for your work? For which tasks?
How do you get that type of data? Which systems?
How do you use it (the data)?
Do you use different types of data (sources) for every different task?
What insights can organizations get when retrieving and analyzing data?
Can you give some examples of advantages companies can get when retrieving and analyzing data?

Topic: Innovation
Purpose: Experience of the interviewee with regard to innovation. Use terms such as ‘the development of a new product’ and provide examples if necessary, e.g. a ‘new website’.
Example questions:
Are you involved in the development of new products? Or new ‘ways of working’?
Can you give an example of that? (Ask about every step and detail of this process, and where it started.)
What were the difficulties you experienced developing product xxx?
Which information did you use during the development?
Who came up with the idea for this product/service?
What stimulated the forming of this idea?

What are the processes a digital innovation goes through from beginning to completion?
Does this company have a standard process when developing a product/service?
What are the biggest ‘traps’ with regard to developing a digital product/service?

Topic: Closing
Purpose: Find out remaining issues.
Example questions:
Are there any remarks or questions?
Would you like to add anything to what we have discussed?
Did I miss anything?

7.2 Interview transcripts

The interview transcripts are classified and only available upon request.

7.3 Codebook

Concept: Big Data

Code: Absorptive capacity
Definition: Strategy prerequisite in order to have Big Data.
Quote: “Frankly, I also find Big Data a sort of pitfall. I believe so because simply adding data to a platform without a clear business goal or strategy in mind could lead to an ‘analysis paralysis’. Just adding data to a large server with big computing power to analyze it, and then just looking: gosh, what shall we do with it? It can come to the point that the powerful Big Data analytics server is not used because nobody knows what to do with it.”

Code: Awareness Big Data
Definition: Awareness within an organization needed to really understand and use Big Data and match it to the overall strategy.
Quote: “So you can say: here you go, enjoy [the Big Data platform and its analytics]. But you could still get a bit of the effect that you try to force the different departments and employees to use the Big Data platform. The problem can arise that we have this powerful technology and all the data, and you can do so much with it, but if you are just not ready in terms of organization, people and knowledge, then there is of course a risk that it goes nowhere.”

Code: Resource governance
Definition: Policies, data management and governance of policies within an organization to implement or maintain Big Data.
Quote: “So you have data governance, which is a bit of a problem for us: the ownership of the data, that it is recorded in a correct way. A certain department logs the data, and telephone records, for example, need to be logged correctly. That responsibility lies with the ownership; for us that is governance. This is very important because it affects your data quality, for instance.”

Code: Technical connectivity
Definition: Pure technical prerequisite to have Big Data.
Quote: “In our organization we applied Hadoop, which is a coding language to build a Big Data platform on. It is flexible, so it gives different departments with different needs the opportunity to use the data in their own way, which means retrieving valuable information and discovering patterns. But it could take a long time for the employees to get used to such new platforms. The advantage is that when the organization grows data-wise, it [Hadoop] offers the opportunity to easily grow and to match and cleanse this data.”

Concept: Innovation

Code: Awareness product development
Definition: Awareness of an organization for product awareness
Quote: “Yes, you can generate your targeted conversion by only mailing 10%, 20% of your clients. So you don’t have to mail the other 80% to reach the same number. It’s inefficient and our marketing department knows that, but we just never get time to really prove and analyze this and only mail the needed 10/20%. They want to send it sooner rather than later, and that prevails over the option of only mailing 10/20% and reaching the same conversion.”

Code: Business case
Definition: Report written when the idea for a product is formed. Consists of all the costs, an explanation of the product and cost savings, if any.
Quote: “As you know, at first an idea is created. Then that idea is transformed into a business case. This will show the subject, consumers, competitors, how it fits into the company, the financials, what the expected revenue is, what the expected costs are and the risks with regard to legal and compliance. Those are the elements of a business case, and it is the center point of our development process.”

Code: Development
Definition: Development stage of innovation.
Quote: “Yes, we actually always start with a hypothesis or problem statement, which can be data-based or based on a strategy goal from higher management. Then, when we have our problem or statement, we do a quantitative search and a plan of analytics to search for the solution of this problem. This could be a hiccup point. Then, when we know what the hiccup is, we can start with the qualitative improvement, for example: how does it need to look? But we always start with a hypothesis or problem, and it does not automatically roll out of our analytical systems (yet). The statement (hypothesis) is based on the vision of a CEO or higher management.”

Concept: Openness

Code: Openness
Definition: Importance of access to data for a department.
Quote: “Yes, that is very important: an advice, a report, analyses or the dataset itself, that accessibility is important.”
