Case Study Analysis of the Use of Cloud Computing for Assessing Big Data Risks



INTRODUCTION
Big data is a key asset for organizations of all sizes. It offers extensive opportunities and intellectual capital [1]. Cloud computing is the computing resource that makes big data accessible and can be used to share, archive, and destroy data of many sizes [2]. However, big data on the cloud carries inherent risks if poorly handled. Security threats such as data breaches, confidentiality issues, and threats to data availability can cause severe damage to companies and their systems [3]. The deliberate misuse of data-driven technology by malevolent players and the risk of falling into corporate insecurity can also threaten companies. Big data refers to large, diversified data sets drawn from sources such as websites, electronic check-ins, and organizational databases, whose diversity and authenticity are processed before value is extracted. Big data leverages cloud computing services, such as those from Amazon or Microsoft, to operate, simplifying decision-support systems [15]. The cloud is the storage chamber, and big data is the item stored in it [14].
Big data management is crucial to meeting the rising demands of enterprises. The cloud's advanced ability to consolidate, integrate, build, and manage large datasets makes it a highly trusted and widely practised resource [16]. However, various risks are associated with managing big data in the cloud. A privacy breach is a common risk that exposes data at an alarming rate [17]. Because cloud infrastructure is owned by third parties, privacy risks can cause disasters for enterprises and private users [18]. Cloud-based service providers offer numerous real-time applications using big data. Still, their servers and nodes with storage clusters can be vulnerable to privacy breaches due to data leaks, irrelevant guest entries, and mass victimization on the cloud [19]. Another risk in managing big data using the cloud is governance and compliance. Security issues concerning interfaces and user access are high due to non-regulated and unorganized service conformity [20]. Further, data availability problems pose huge risks to the technical sub-system, as a large amount of big data is unorganized and scattered [21]. Various advanced methodologies are in practice, such as erasure coding and network coding, but their integration with the system is lacking. New advanced coding and access control systems are being designed to address these concerns, and there is a need for increased intervention to manage big data ethically [22].
Empirical evidence illustrates the risks in big data management using the cloud. Van Der Schyff & Krauss [23] conducted a thematic analysis of semi-structured interviews with twelve experts on security threats in cloud computing in South Africa. The researchers studied virtualization-related security issues by consolidating and classifying them. The findings revealed that data privacy and protection concerns arise from multitenancy, malicious insiders, and shared application usage. Sodikin [24] surveyed sixty-nine participants through a questionnaire and conducted four interviews with respondents from two organizations exposed to the cloud and virtual working, asking them about security issues in cloud computing. The analysis revealed that lack of reliability, integrity, and availability are the major issues that make organizations less likely to implement cloud computing. However, these issues can be handled through authentication, auditing, and authorization to safeguard users' big data. Hammouri & Abu-Shanab [9] studied the importance of cloud computing and the utility of cloud services in Jordan through a quantitative analysis of one hundred and forty-three participants. The findings revealed that cost reduction, service quality, and control enhancement encourage users to adopt cloud services. In the same vein, Machuga [25] researched cloud computing usage in European countries through secondary research based on Eurostat's findings, which suggested that adopting cloud technology was highly effective in European nations. People are familiar with the technology, and the presence of strong infrastructure helped make cloud computing usage safe and secure. Similarly, Flora et al. [26] studied forty-four experts on cyber security crimes and probable reasons for attacks using thematic analysis and the expert elicitation method. The interview findings revealed a need for an integrated sub-system in which human intervention preserves confidentiality, ensures a timely and reliable flow of data, maintains regular compliance, and controls intrusion, thereby building a confidential cyber-system.
The studies reviewed above detail the significant contributions of cloud services, the issues concerning big data, and ways to curb those issues. However, most of them did not emphasize the risk assessment methods commonly used in different nations to evaluate big data in the cloud. The present study intends to bridge this gap and analyse the capabilities of the risk assessment methods in use. The comparative study demonstrates the risks that can be encountered with less competent risk assessment methods compared with the more competent and efficient methods practised in some countries. The paper also sheds light on each country's technological advancement and cloud-based development.
The main research question explored in this study is "What are the risks associated with big data based on cloud computing technologies in Canada, Jordan, South Africa, and the United Kingdom?" Since big data is ever-growing and increasingly vulnerable at the same time, threats and privacy concerns need to be addressed through effective risk assessments. This involves analysing the sources of risk and the potential damage they can cause. Further, it is important to address the risks and security issues and to eradicate or curb them effectively. Therefore, understanding the risk assessment processes for big data services in cloud computing technologies is instrumental in curbing cyber-attacks and other risks to big data. The study compares the risk assessment methods for big data in the cloud in the respective countries. The contribution of this paper is summarized as follows: (1) Analyze the management of big data using the cloud. (2) Study the risks and vulnerabilities associated with cloud services. (3) Study and analyze the risk assessment methods most used in the chosen countries.

METHODOLOGY
This study planned and implemented a comparative analysis of risk assessment methods for safeguarding big data using the cloud. To fulfil the objectives, an interpretive research paradigm with an inductive approach was adopted. Ontologically, this study interprets the relationship between domains according to participants' views, with the aim of analyzing and understanding the risk assessment methods of different nations.

Data Types
The researcher used mixed methods, wherein quantitative and qualitative data were collected and reviewed. Primary data were collected via semi-structured interviews in which the respondents were asked open-ended questions.

Quantitative vs Qualitative Research
Quantitative research investigates numerical data, or data translated into numbers, using statistical methodologies. Statistics refers to the most often used approaches for analyzing numerical data. Statistical procedures deal with the organization, analysis, interpretation, and presentation of numerical data. Statistics is a large field of study with numerous applications, including information systems and data analysis [27]. On the other hand, qualitative research is based on unquantifiable processes and meanings. According to Maanen [28], the label "qualitative techniques" has no definite meaning in the social sciences. Qualitative research refers to the process of collecting, decoding, translating, and comprehending the meaning, rather than the frequency, of naturally occurring events.
In the context of this study, qualitative research allowed the researchers to delve further into the research issue through extensive interviews highlighting professionals' experiences, expertise, and concepts regarding risk assessment procedures in their specialized areas. The qualitative study aided in comprehending specialists' perspectives on big data and cloud computing difficulties. The explanatory data aided in the formation of patterns from concepts and insights. The interactive interviews and the triangulation process assisted in validating the acquired data and providing well-founded, reputable information from credible sources [29]. This study used a four-step process to answer the research question and accomplish the objectives through qualitative research. First, the study defined the field of research connected to big data risk assessment using cloud services and then gathered more information depending on the research topic. This step involved choosing the unit, topic, and analysis. The analysis covers the various methodologies used to handle risk challenges in big data and cloud computing environments. The company itself was used as a supplementary unit of study. Second, the study obtained primary data from participants who were professionals working closely with cloud platforms. This step gathered data on various risk assessment approaches in big data and cloud computing. Data were collected through interviews with respondents from the four targeted countries who work in cloud platform organizations or use cloud services, from secondary information available on the firms' websites, and from information on the Internet in general, as detailed in Table 1.

Table 1. Qualitative research
Data collection: Interviews, the firms' websites, and information from the Internet.
Scope: Canada, Jordan, South Africa, and the UK; 3-4 firms per country and 2-3 interviewees per firm (the target is ten participants from each country).
Analysis: Inductive analysis and pattern finding, with triangulation to add rigour.
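
The sampling figures above can be checked with simple arithmetic. The following is a minimal sketch, not part of the study's procedure: it takes the stated ranges (3-4 firms per country, 2-3 interviewees per firm) and confirms that the target of ten participants per country lies within the range the design can yield. The constant and function names are illustrative assumptions.

```python
# Sampling design figures quoted in the text: 3-4 firms per country,
# 2-3 interviewees per firm, and a target of ten participants per country.
FIRMS_PER_COUNTRY = (3, 4)
INTERVIEWEES_PER_FIRM = (2, 3)
TARGET_PER_COUNTRY = 10
COUNTRIES = ["Canada", "Jordan", "South Africa", "UK"]


def participant_range(firms, interviewees):
    """Return the (minimum, maximum) participants one country can yield."""
    return firms[0] * interviewees[0], firms[1] * interviewees[1]


low, high = participant_range(FIRMS_PER_COUNTRY, INTERVIEWEES_PER_FIRM)
target_reachable = low <= TARGET_PER_COUNTRY <= high
total_target = TARGET_PER_COUNTRY * len(COUNTRIES)

print(f"Participants per country: {low} to {high}")
print(f"Target of {TARGET_PER_COUNTRY} reachable: {target_reachable}")
print(f"Overall target sample: {total_target}")
```

Under these figures, the per-country target is only reachable when most firms contribute three interviewees or a fourth firm is recruited, which is consistent with the recruitment difficulties reported later in the paper.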
Third, following the gathering of information, the evidence was reviewed. The study determined whether the information gathered was useful. The information was analyzed and examined to eliminate contradictory and irrelevant data and to find missing data. Fourth, the study detected patterns after reviewing the evidence. This stage attempted to organize the data to uncover intriguing patterns related to the problem (i.e., risk factors in big data and cloud computing settings). Grounded theory is based on systematically collecting, analyzing, categorizing, and (iteratively) validating data that can aid in defining intriguing phenomena. The study used data and researcher triangulation to increase the research's quality and rigour. The researcher evaluated several factors to ensure that this research meets the applicable ethical criteria. This included obtaining the necessary approvals to carry out the study, securing participation, maintaining data confidentiality at all stages, informing participants about the potential hazards of participating in this research, and adhering to the ethical research process.

Sampling and Data Collection Procedure
Sampling is an important part of research methodology since it helps obtain information from the right respondents. Sampling is the process of selecting a sample of the population from which conclusions can be drawn. Unbiased and error-free sampling is needed to draw meaningful conclusions from the study. The target population is used to determine the sample size. The participants in this research are IT professionals and specialists working in organizations that use big data in the cloud. The study's sample size is 40 experienced individuals representing four countries. To ensure that the respondents had a thorough understanding of the risks and issues within their organizations and of the concerns that have affected the security and confidentiality of their big data in the cloud, the sample was limited to experts with solid experience in cloud-based platforms and IT service-providing firms. In addition, experts in the field of information technology could discuss the most up-to-date national risk assessment techniques and the seriousness with which individual companies have implemented risk assessment approaches in response to the growing prevalence of remote work. The sample was drawn from Canada, Jordan, South Africa (SA), and the UK.
Interviews were conducted with between three and four companies in each country. To validate and generalize the respondents' answers, a goal of ten participants from each country was set. The results of this research have been useful in making IT department managers and staff more familiar with big data's role in the cloud. Candidates were chosen for their expertise in areas such as risk analysis, big data, and cloud-based service management. The four countries were chosen because of the high concentration of cloud service users among their businesses. Canada and the UK can claim to be among the first countries to adopt big data and cloud computing. Canada has adopted big data and cloud computing extensively, and the perceptions of people there who know risk assessment (RA) methods can contribute significantly to the study. The UK was selected as it is a developed European nation where cloud solutions are fully embedded in industrial and private use. Jordan is undergoing rapid digital transformation, but high costs and a lack of training cause issues in implementing the cloud for big data management. The perceptions of participants in Jordan on RA methods must be included to provide a holistic approach to the investigation. South Africa is struggling to adopt RA methods in cloud computing services and still needs advances in infrastructure and internet availability; nevertheless, sales in the Middle East and Africa's big data analytics market are expected to reach $68 billion by 2025, per research by Frost & Sullivan [30].

Interview
The researcher used grounded theory to perform the qualitative data analysis. Interviews with the respondents were recorded and transcribed into text. The major themes were organized according to the evidence gathered, and a comparative study was performed based on the analysis. To support the reliability and validity of the research, the data were double-checked for errors. The completeness, relevance, and timeliness of the data were checked for utility. Respondents from the different nations were interviewed about their perception of big data risks in their nations. Some questions were prepared in advance, and others were asked in response to the replies and content provided by the interviewees. The information was systematically analyzed to find patterns and generalize the findings. Further, the triangulation method confirmed the validity of the inferences drawn with the help of participant observation, research from secondary sources, and data validation.
The interviews were transcribed in Atlas.ti software, which was used to codify the answers. First, word clouds were generated and used to identify the codes. The data were then coded, and Atlas.ti was used to identify the themes relevant to the research questions. Only knowledgeable professionals with relevant exposure and experience in information technology and cloud-based platforms were included in the study. The professionals included came from relevant departments such as IT operations and infrastructure, IT operations and access management, enterprise architecture and support, IT lead (helpdesk, system administrators, network operations and security, and infrastructure/network administrators), and IT information and security operations. All relevant data for answering the research questions were preserved in four categories: occurrences primarily impacting security in cloud computing; significant data contexts; company information; and technology advancements to solve risk issues in cloud computing and big data environments. Large sheets of sketch paper and pencils were the tools used for organizing the data. Drawings, diagrams, and figures were used to categorize the data. Colour coding was used to distinguish between different data types (i.e., types of risk, potential solutions, and best practices).
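
The word-frequency step that underlies a word cloud can be sketched in a few lines. The example below is an illustration only: it does not use Atlas.ti or reproduce its behaviour, the stop-word list is a hypothetical minimal one, and the transcript excerpts are invented rather than taken from the study's interviews. It shows how frequent terms can surface candidate codes before manual coding.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "are",
              "we", "our", "for", "on", "with", "that", "it", "not"}


def candidate_codes(transcripts, top_n=5):
    """Count non-stop-word tokens across transcripts, most frequent first."""
    counts = Counter()
    for text in transcripts:
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(top_n)


# Invented excerpts for illustration only; not quotes from the study.
sample_transcripts = [
    "Encryption of data in transit is our main control.",
    "Staff training on data security is not regular.",
    "We back up data and test recovery of the backup.",
]

for word, freq in candidate_codes(sample_transcripts):
    print(word, freq)
```

Frequent terms are only a starting point; in a grounded theory workflow the researcher still reads the surrounding text to decide whether a term deserves its own code.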

Grounded Theory
Grounded theory uncovers patterns in data. According to Galal-Edeen [31], a grounded theory researcher must gather, classify, and (iteratively) validate data, and formalize the data to assist future data collection and analysis. The grounded theory approach provides the methodologies and strategies needed to achieve this goal [36]. As the name suggests, grounded theory is data-based. The researcher kept the analysis comprehensive and close to the acquired data and established a theoretical framework. In this study, the researcher compared respondents' observations. Theoretical sampling followed coding and data collection. This helped the researcher understand risk assessment for big data in cloud computing. Grounded theory was applied to the interviews and to Internet resources. Its application to big data and cloud risk assessment is justified since it provides a set of processes for classifying and evaluating data that are compatible with the interpretive method. It keeps the analysis close to the data and yields inductive findings about the investigated phenomenon [32]. The grounded theory process is iterative, with frequent movement between concept and data and comparison across sources [33]. The researcher used the following steps proposed by Bernard [34] to carry out grounded theory:
1) Compiling all of the data from the categories and comparing them.
a. Create transcripts of interviews and read a small sample of text.
b. Look for potential analytic categories (that is, themes) that emerge.
2) Thinking about how categories are related to one another.
3) Building theoretical models based on the relationships between categories and regularly evaluating the models against the data, particularly negative cases.
4) Presenting the findings of the analysis, using quotes from the interviews (exemplars) to illustrate the ideas.

Ethical Consideration
Ethical considerations were not a major concern, as this study addresses questions about risk assessment in big data and cloud computing scenarios. However, because this study included individuals from various backgrounds and experiences, the researcher took the following factors into account to ensure the effectiveness of this research:
1) Obtaining the necessary written or verbal approval to perform this research.
2) Ensuring that participation was fully voluntary. Participants were under no obligation to contribute information to this study and could withdraw at any moment if they considered their participation unnecessary.
3) Maintaining data confidentiality at all times. This includes storing and maintaining data properly. It also requires removing the raw data once the research has been successfully conducted.
4) Because this study did not contain any sensitive material, the risks to participants were minimal.

RESULTS AND DISCUSSION
This section presents the qualitative and quantitative analysis of the data gathered to compare big data risk assessment techniques based on cloud computing for Canada, Jordan, SA, and the UK. The qualitative data were analyzed using thematic analysis with Atlas.ti software, and the quantitative data were analyzed using statistical representation. The results obtained are discussed in the subsections that follow.

Level of Knowledge
Knowledge of big data allows individuals to be more aware of methods of data utilization for identifying patterns and trends [35]. The respondents were asked about their level of knowledge of big data. For the purposes of the question, the minimum amount of data considered was the amount that allows a strong conclusion to be drawn. The responses are presented in Figure 1. The responses on the understanding of cloud computing [36] are compiled as frequencies in Figure 2; Canada has 60% of respondents with an advanced understanding of cloud computing. The quantitative analysis of respondents' answers also revealed the level of knowledge of risk management in each country, which helps in identifying an organization's existing risk assessment methods and policies. The responses gathered are presented in Figure 3. Figure 3 reveals that 50% of respondents in SA had advanced knowledge of risk, 50% of respondents in the UK also had advanced knowledge, 60% of the respondents in Canada had expert-level knowledge, and in Jordan, 40% each belonged to the advanced and intermediate groups. Only SA has 20% of respondents in the limited experience domain, and it faces challenges in risk assessment.

Implementation of Big Data and Cloud Services in the Organizations
To assess the implementation of big data and cloud services in organizations across the four countries, respondents were asked whether big data is implemented in their organization; the responses gathered are presented in Figure 4. The responses reveal that cloud computing implementation is 90% in Canada, 100% in the UK, 90% in Jordan, and 90% in SA. The 10% of Canadian respondents who reported no implementation of cloud computing indicated that their organizations are in the adoption stage, preparing themselves for high customer demand and big data handling. In Jordan, according to the respondents, non-implementation arose from the high cost of implementation, the lack of qualified individuals, and the analytical needs involved in comprehending the impacts of the technology. In SA, the respondents working in organizations without cloud computing reported a lack of clarity about why the technology had not been adopted, as it could improve uptime and overall reliability.
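
The reported percentages can be converted into respondent counts and a pooled figure. The sketch below is illustrative and rests on one assumption that the text does not confirm for this question: that each country contributed exactly ten respondents, the stated per-country target. The variable and function names are hypothetical.

```python
# Percentages reported in the text for cloud computing implementation.
reported_pct = {"Canada": 90, "UK": 100, "Jordan": 90, "SA": 90}

# Assumption: ten respondents per country, the stated per-country target.
RESPONDENTS_PER_COUNTRY = 10


def to_counts(pct_by_country, n_per_country):
    """Convert reported percentages into respondent counts."""
    return {c: round(p * n_per_country / 100) for c, p in pct_by_country.items()}


counts = to_counts(reported_pct, RESPONDENTS_PER_COUNTRY)
total_yes = sum(counts.values())
total_n = RESPONDENTS_PER_COUNTRY * len(reported_pct)
pooled_pct = 100 * total_yes / total_n

for country, yes in counts.items():
    print(f"{country}: {yes} of {RESPONDENTS_PER_COUNTRY} report implementation")
print(f"Pooled: {total_yes} of {total_n} ({pooled_pct:.1f}%)")
```

With ten respondents per country, each 10% step corresponds to a single respondent, so differences of 10 percentage points between countries should be read cautiously.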

Existing Risk Assessment Methods and Policies in the Organizations
To gauge the policies in place for risk assessment in the organizations in each country, respondents were first asked about the policies their companies had adopted for terminating or transferring data to the cloud service. The question asked was: "Are specific arrangements (policies) in place concerning your information in case you want to terminate or transfer to the cloud service?" The responses are presented in Figure 6. Figure 6 shows that 90% of respondents in Canada reported the presence of policies for termination or transfer of data to the cloud service, while each of the other countries reported 80%. Respondents in Canada who reported no such policy stated that risk coverage is implemented based on real-life scenarios. In SA, the respondents reported the lack of a clear Information Governance policy. In Jordan, migration is determined by cost, and in the UK, policy implementation has been delayed by factors such as COVID-19. Regarding whether a clear risk management plan is in place, the respondents in SA reported Governance, Risk and Compliance framework policies for management. The respondents in the UK indicated that part of the organizational risk assessment is entwined with a business recovery plan, which includes regular testing of data and network backup and recovery. In Jordan, the plan is not mature enough; however, the employees are trained. In Canada, there is a risk assessment plan that covers the case of a cyber-attack. Regarding what the organizations' risk plans cover, the respondents in Canada identified the prominent themes of data in transit, security, encryption, leakage, and control systems. Canadian organizations had data leakage mitigation and data protection practices in place. In organizations in the UK, the current risk plan covers training, data security and access, SSL, data authentication, data transmission, and the end user. Secure channels are also widely used for transmission and data backups, with TLS (Transport Layer Security) protection that supports end-user authentication and access control. In Jordanian companies, the risk plan covers data access, encryption, security, preventative measures, control systems, and handling of sensitive information. Further, in SA,

Figure 7. Themes covered in risk assessment plans in different countries
The risks related to data protection identified in the four nations are presented in Figure 8. In Canada, the themes were the federal government's role, data protection, and personal information protection legislation such as the Personal Information Protection and Electronic Documents Act (PIPEDA). In Jordan, the risks concerned training, legal threats from clients, and a lack of authority to deal with the illegal use of data. In SA, the threats related to the Protection of Personal Information Act (POPIA), introduced in July 2021, and covered the themes of corruption, data loss, or disclosures that lead to legal issues. Further, in the UK, the reported risks concerned individual rights, data access, regulation and rights, a lack of legal threats, and data protection under the Data Protection Act 2018.

Challenges and Shortcomings of the Risk Assessment Methods
In Canada, the challenges and shortcomings of the risk assessment methods relate to awareness and training in identifying threats and in cybersecurity. Problems also arise because not all parties take cybersecurity seriously, which leads to data breaches. In Jordan, staff training is a must, and data must be used adequately, fairly, transparently, and only for a specific purpose. In SA, more staff training covering data-related risks was advised to overcome the shortcomings of the risk assessment methods. Lastly, in the UK, the limited emphasis on awareness of the acts governing security threat incidents is a challenge for implementing risk assessment. A further challenge arises from the type of data to be protected, as not all data is important enough to be pursued with legal action.

Discussions
The study examines risk assessment methods in each country for managing big data using cloud services. The data analysis reveals that, in Canada, risk plans cover data encryption to prevent data loss and the education of employees about cyber security. In this regard, Ali et al. [37] suggest implementing the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) and Failure Mode and Effect Analysis (FMEA) to counter the problems of database hack risks, broadcast data errors, cyber-attacks, network interruptions, server failures, and virus effects. In the UK, the risk plan covers secure transmission channels, security and awareness training, data encryption in transit, Transport Layer Security, prevention of data leaks, and tracking of suspicious activity. In addition, the General Data Protection Regulation (GDPR) is responsible for transparency, fairness, data minimization, accuracy, accountability, and integrity, among other features [38]. In Jordan, data loss prevention practices, encryption, guidelines, and staff training are maintained to ensure cyber security. Creating management support, funding, technical expertise, alignment with the organization's objectives, and user security awareness is considered imperative [39]. Moreover, the conceptual framework for this study is shown in Fig. 9. It provides a model representation of risk assessment and risk management for big data using cloud services. The framework describes the relationship between big data usage under cloud computing and risk assessment methods. The risk assessment methods and the possible risks prevailing in the cloud-based virtual environment need strategic integration to improve services. In SA, the risk assessment framework lacks a consistent information governance policy and relies on avoidance based on previous learning and training. The findings of Masilela and Nel [40] are similar and suggest adopting measures of

Study limitations
Finding interviewees was a challenge for this project. Jordan and South Africa were the hardest, since the researcher had to contact a large number of people. The researcher used two alternative strategies. First, the researcher created a Google Form with the questionnaire questions and shared it on social media; the link was posted to a LinkedIn page, and network contacts shared it. Second, as a contingency plan, the researcher contacted a US firm that helped share the questionnaires and target a specific population. Due to time constraints and multiple time zones, scheduling was very difficult. The pandemic made potential respondents unwilling to spend 30 minutes answering questionnaires. Covid-19 caused many people to adjust their regular activities, including avoiding others, working from home, and attending school virtually. As a result, face-to-face interviews were impossible, and most potential respondents refused to participate in video calls. When confirmed respondents could not be interviewed, the researcher offered alternatives such as audio or video recordings or typed answers.

Areas for future work
Big data and the issue of risk assessment are becoming crucial to organizations all over the world. This study adds to the understanding of big data and cloud computing risk assessment, focusing primarily on Canada, Jordan, SA, and the UK. It acknowledges that new technology is now a requirement rather than a luxury. Some nations' risk assessment plans fail to incorporate the Fourth Industrial Revolution (4th IR). Big data's function, and how to use it to advantage in a competitive market, has received little attention. To be fully realized, some of the cloud computing and big data concepts outlined in this study will need further development and extension. The researcher suggests the following areas for further research, which are expected to add to the knowledge that will help organizations better identify hazards while deploying big data and cloud computing technologies. First, more study is necessary to properly understand how analytics and information management have evolved in cloud-based analytics. Second, a worrying finding of the study was how little cloud computing and big data were employed in less developed nations like Jordan and South Africa; research on adaptation and mitigation techniques for the dangers in big data and cloud computing is therefore necessary. Third, this study found that one of the major barriers to adopting cloud computing and big data is concern about cyber security threats; future research on developing tactics and solutions to address privacy and security problems is recommended in this regard. Future research is not restricted to the areas stated above and can focus on other areas, as long as it strives to change the cloud system from a mere data management platform into a scalable data analytics platform.

CONCLUSION
The study reveals that UK organizations were leading in knowledge of big data, cloud computing, and risk assessment. However, full implementation of big data and the cloud in organizations has been constrained by the pandemic. In Canada, the adoption of cloud strategies is on the rise; hence, organizations are also expanding their horizons in the systematic handling of risks. In Jordan, organizations are moving towards greater adoption of big data and cloud computing with the government's support. However, the sector faces technological, legal, and organizational barriers. In SA, the adoption of big data is facilitated by acts such as POPIA. In the future, such legislation will allow organizations to advance in the information security domain. The recommendations of the study are as follows:
1) The government needs to explore cost-effective ways to enhance the use of big data. This will allow firms to gain in-depth industry knowledge and multidisciplinary expertise combining technological, data, and analytical possibilities.
2) In Canada, the adoption of risk assessment frameworks can be enhanced by adopting big data technologies and addressing the country's lack of knowledge or skill in implementing the technology.
3) In Jordan, the lack of training, legal threats from clients, and a lack of authority need to be addressed so that risk assessment processes can be better adopted.
4) With legislation in place, SA organizations must focus on encouraging the workforce to become data specialists and on encouraging the use of big data and cloud computing technologies.